In this paper, we report our approach to detecting external plagiarism. In the pre-processing stage, we identify non-English documents and translate them into English using an online translation tool. We then index the documents and retrieve the top-ranked ones that are most similar to the suspicious documents. The retrieved documents are divided into passages of twenty sentences each. Plagiarism is detected by counting the words that overlap between suspicious and source passages.
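The passage-splitting and word-overlap steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the regex-based sentence splitter, the use of distinct lowercase word sets, and the `min_overlap` threshold are all assumptions made for illustration. Only the passage size of twenty sentences comes from the abstract.

```python
import re

def split_passages(text, sentences_per_passage=20):
    """Split text into passages of a fixed number of sentences.

    The sentence splitter is a naive assumption: it breaks after
    '.', '!' or '?' when followed by whitespace.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [
        " ".join(sentences[i:i + sentences_per_passage])
        for i in range(0, len(sentences), sentences_per_passage)
    ]

def word_set(passage):
    """Return the distinct lowercase alphanumeric tokens in a passage."""
    return set(re.findall(r"[a-z0-9]+", passage.lower()))

def overlap_count(suspicious_passage, source_passage):
    """Count the distinct words shared by the two passages."""
    return len(word_set(suspicious_passage) & word_set(source_passage))

def flag_passages(suspicious_text, source_text, min_overlap=5):
    """Compare every suspicious passage with every source passage.

    Returns (suspicious_index, source_index, overlap) for each pair whose
    overlap reaches min_overlap, a hypothetical threshold.
    """
    source_passages = split_passages(source_text)
    hits = []
    for i, susp in enumerate(split_passages(suspicious_text)):
        for j, src in enumerate(source_passages):
            n = overlap_count(susp, src)
            if n >= min_overlap:
                hits.append((i, j, n))
    return hits
```

Calling `flag_passages(suspicious_text, source_text)` on a suspicious document and one retrieved source document yields the candidate passage pairs. Counting distinct words is the simplest reading of "overlapping words"; the same structure would work with word n-grams in place of single tokens.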
|Journal|CEUR Workshop Proceedings|
|Publication status|Published - 1 Jan 2010|
|Event|2010 Cross Language Evaluation Forum Conference, CLEF 2010 - Padua, Italy|
|Duration|22 Sep 2010 → 23 Sep 2010|
- Overlapping n-grams
- Passage retrieval
- Plagiarism detection