TY - GEN
T1 - Fake news identification characteristics using named entity recognition and phrase detection
AU - Al-Ash, Herley Shaori
AU - Wibowo, Wahyu Catur
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/11/13
Y1 - 2018/11/13
AB - The information explosion, to which anyone can contribute, may lead to the spread of fake news not only through news channels but also through social media and other outlets. Detecting fake news has become an urgent need because its spread causes unrest in society. Several related studies have addressed news classification with the aim of deciding whether a news item is fake or original. In this related research, a vector representation of documents is used, and this representation is then passed to an algorithm for further processing. This study aims to model vectors that capture the characteristics of fake news before they are processed further by algorithms, using news in the Indonesian language. Fake news and original news are represented according to the vector space model. Vector models combining term frequency, inverse document frequency, and reversed frequency are evaluated with 10-fold cross-validation using a support vector machine classifier. Variations of phrase detection and named entity recognition are also used in the vector representation. A vector representation that uses term frequency shows promising performance: it correctly identifies news characteristics for 96.74% of 2516 documents across the phrase detection and named entity recognition processes.
KW - Document vector representation
KW - Named entity recognition
KW - News identification
KW - Phrase detection
UR - http://www.scopus.com/inward/record.url?scp=85058389205&partnerID=8YFLogxK
U2 - 10.1109/ICITEED.2018.8534898
DO - 10.1109/ICITEED.2018.8534898
M3 - Conference contribution
AN - SCOPUS:85058389205
T3 - Proceedings of 2018 10th International Conference on Information Technology and Electrical Engineering: Smart Technology for Better Society, ICITEE 2018
SP - 12
EP - 17
BT - Proceedings of 2018 10th International Conference on Information Technology and Electrical Engineering
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th International Conference on Information Technology and Electrical Engineering, ICITEE 2018
Y2 - 24 July 2018 through 26 July 2018
ER -