TY - GEN
T1 - Improving classification performance by extending documents terms
AU - Widodo,
AU - Wibowo, Wahyu Catur
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/3/17
Y1 - 2014/3/17
N2 - Classification is a technique in data mining for categorizing objects. Text classification faces a renewed challenge in classifying very short documents or texts, such as those found in social media collections. This paper proposes a method to improve classification performance on short documents. In this work, we expand the words in every document before the documents are classified. We use the TFIDF model, Hidden Markov Model k-means clustering, and Latent Semantic Indexing (LSI) to expand documents. The results show that extending a document's terms by just 1 word increases its accuracy, while extending by 2, 4, and 8 words tends to give stable results.
AB - Classification is a technique in data mining for categorizing objects. Text classification faces a renewed challenge in classifying very short documents or texts, such as those found in social media collections. This paper proposes a method to improve classification performance on short documents. In this work, we expand the words in every document before the documents are classified. We use the TFIDF model, Hidden Markov Model k-means clustering, and Latent Semantic Indexing (LSI) to expand documents. The results show that extending a document's terms by just 1 word increases its accuracy, while extending by 2, 4, and 8 words tends to give stable results.
KW - Hidden Markov Model k-means
KW - Latent Semantic Indexing
KW - TFIDF model
KW - extend words
KW - text classification
UR - http://www.scopus.com/inward/record.url?scp=84946686368&partnerID=8YFLogxK
U2 - 10.1109/ICODSE.2014.7062657
DO - 10.1109/ICODSE.2014.7062657
M3 - Conference contribution
AN - SCOPUS:84946686368
T3 - Proceedings of 2014 International Conference on Data and Software Engineering, ICODSE 2014
BT - Proceedings of 2014 International Conference on Data and Software Engineering, ICODSE 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 International Conference on Data and Software Engineering, ICODSE 2014
Y2 - 26 November 2014 through 27 November 2014
ER -