Plagiarism Detection System for Indonesia Text Based Document by Fingerprint Method and Natural Language Processing Approach
The practice of plagiarism is very often carried out in a community environment for example in academia. So it can be stated that plagiarism is a major concern, especially in the academic environment, where it can affect both the credibility of the institution and its ability to ensure
the quality of its students. In other words, the act of plagiarism may result in a decrease of creativity in the community. This research uses a combination of fingerprint method with natural language processing (NLP) approach. With the process or plagiarism detection system can be done through
various methods, such as by the method of calculation algorithms Manber the similarities using the Jaccard coefficient and K-gram method as an alternative in the detection of document similarity, is expected to allow a user to use the application this without deciding the value of gram and
its window to produce an accurate similarity value. Although it has been proven NLP techniques can improve the accuracy of detection tasks, there are other challenges remain. Current plagiarism detection tools are mostly limited to comparisons of suspicious plagiarised texts and potential
original texts at string level. By doing stemming, the document similarity measurement process there was an increase of 31% measurement document based on documents that were tested.
Keywords: Fingerprint; Natural Language Processing; Plagiarism
Document Type: Research Article
Affiliations: 1: Faculty of Information Technology and Communication, Semarang University, 50196, Indonesia 2: Faculty of Mathematics and Natural Sciences, Indonesia University, 16424, Indonesia 3: Computer System, School of Information Management and Computer Jakarta, 12140, Indonesia 4: Faculty of Computer Science and Information Technology, Gunadarma University, 16424, Indonesia
Publication date: 01 October 2016
- ADVANCED SCIENCE LETTERS is an international peer-reviewed journal with a very wide-ranging coverage, consolidates research activities in all areas of (1) Physical Sciences, (2) Biological Sciences, (3) Mathematical Sciences, (4) Engineering, (5) Computer and Information Sciences, and (6) Geosciences to publish original short communications, full research papers and timely brief (mini) reviews with authors photo and biography encompassing the basic and applied research and current developments in educational aspects of these scientific areas.
- Editorial Board
- Information for Authors
- Subscribe to this Title
- Ingenta Connect is not responsible for the content or availability of external websites
- Access Key
- Free content
- Partial Free content
- New content
- Open access content
- Partial Open access content
- Subscribed content
- Partial Subscribed content
- Free trial content