Plagiarism detection system for Indonesia text based document by fingerprint method and natural language processing approach

Titin Winarti, Djati Kirani, E. T.P. Lussiana, Kemal Ade Sekarwati

Research output: Contribution to journal › Article › peer-review

Abstract

Plagiarism is widespread in many communities, academia among them. It is therefore a major concern, especially in academic settings, where it can affect both the credibility of an institution and its ability to ensure the quality of its students. In other words, plagiarism may reduce creativity in the community. This research combines a fingerprint method with a natural language processing (NLP) approach. Plagiarism detection can be implemented with various methods, such as Manber's algorithm with similarity computed by the Jaccard coefficient, and the k-gram method as an alternative for detecting document similarity. The system is expected to let a user run the application without choosing the gram size and its window, while still producing an accurate similarity value. Although NLP techniques have been shown to improve the accuracy of detection tasks, other challenges remain. Current plagiarism detection tools are mostly limited to string-level comparisons of suspected plagiarised texts and potential original texts. Applying stemming increased the document similarity measurement by 31% on the documents that were tested.
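The fingerprinting idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the MD5-based hash, the character-level k-grams, and the defaults `k=5` and `p=1` are all assumptions chosen for illustration. The sketch hashes every k-gram of a normalized text, keeps the hashes that are 0 mod `p` (a Manber-style selection), and compares two fingerprint sets with the Jaccard coefficient. Stemming, which the abstract reports as improving the measurement, is not included here.

```python
import hashlib
import re


def normalize(text):
    """Lowercase the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def kgram_hashes(text, k):
    """Return a 32-bit hash for every k-character substring of the text."""
    s = normalize(text)
    return [
        int.from_bytes(hashlib.md5(s[i:i + k].encode()).digest()[:4], "big")
        for i in range(len(s) - k + 1)
    ]


def fingerprint(text, k=5, p=1):
    """Manber-style selection (assumed form): keep hashes that are 0 mod p."""
    return {h for h in kgram_hashes(text, k) if h % p == 0}


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    doc_a = "Saya makan nasi goreng"
    doc_b = "Saya makan nasi putih"
    print(jaccard(fingerprint(doc_a), fingerprint(doc_b)))
```

With `p=1` every k-gram hash is kept; a larger `p` keeps roughly one hash in `p`, trading accuracy for a smaller fingerprint. A stemming step, if added, would run on the words before `normalize` removes the spaces.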

Original language: English
Pages (from-to): 3128-3131
Number of pages: 4
Journal: Advanced Science Letters
Volume: 22
Issue number: 10
Publication status: Published - 1 Oct 2016

Keywords

  • Fingerprint
  • Natural language processing
  • Plagiarism
