TY - GEN
T1 - The Use of Text Mining for Classification of Product Selling Content in Social Media Female Daily
AU - Jonathan, Bern
AU - Budi, Indra
N1 - Funding Information:
This research was supported by (PUTI) Q2 Proceeding Grant (NKB-4060/UN2.RST/HKP.05.00/2020). We would like to express our gratitude to the Directorate of Research and Community Engagement, Universitas Indonesia.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Female Daily Network is a social media company. Its platform, Female Daily, lets users share their experiences with beauty products. Female Daily's rules prohibit using the platform to promote or sell products and services. However, users sometimes violate these rules in their posts, which annoys other users. Admins have difficulty identifying users who violate the rules and banning posts that contain product sales, because the number of admins is small relative to the number of posts submitted each day. Text mining can address this problem by classifying posts automatically with a system that learns from the words in the available posts. The algorithms used for text mining in this research are Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (DT), and Random Forest (RF). The study defines research scenarios that combine feature extraction, contextual features, and data balancing, and analyzes the effect of each. Judged by recall, the best combination of algorithm and features is Random Forest with TF-IDF unigram features plus additional contextual features that detect money and selling words, with balanced data. This combination achieves a recall of 88.37%.
KW - contextual features
KW - data balancing
KW - recall
KW - regex
KW - social media
KW - text mining
UR - http://www.scopus.com/inward/record.url?scp=85123871621&partnerID=8YFLogxK
U2 - 10.1109/ICACSIS53237.2021.9631329
DO - 10.1109/ICACSIS53237.2021.9631329
M3 - Conference contribution
AN - SCOPUS:85123871621
T3 - 2021 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2021
BT - 2021 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 13th International Conference on Advanced Computer Science and Information Systems, ICACSIS 2021
Y2 - 23 October 2021 through 26 October 2021
ER -