TY - GEN
T1 - E-commerce merchant classification using website information
AU - Sahid, Galuh Tunggadewi
AU - Mahendra, Rahmad
AU - Budi, Indra
N1 - Publisher Copyright: © 2019 ACM.
PY - 2019/6/26
Y1 - 2019/6/26
N2 - With the rapid growth of the e-commerce landscape, classifying e-commerce merchants has become an important task, as it is an integral part of various e-commerce processes. One example is merchant onboarding, where the category of an e-commerce merchant has proven to be a good indicator of the merchant's risk. However, since most e-commerce businesses do not have brick-and-mortar stores through which they can be assessed directly, the only source of information about a merchant is its website. Thus, we can view this problem as a web classification problem, in which we classify e-commerce websites into categories. In this research, we aim to build an end-to-end classification system for e-commerce websites. The challenges include the number of pages to be processed, an imbalanced dataset, and the mixed-language content of e-commerce websites. We built a website classification system and experimented with a case study of Indonesian and English e-commerce websites classified into 37 different categories. Our best result achieved an F-score of 0.83.
AB - With the rapid growth of the e-commerce landscape, classifying e-commerce merchants has become an important task, as it is an integral part of various e-commerce processes. One example is merchant onboarding, where the category of an e-commerce merchant has proven to be a good indicator of the merchant's risk. However, since most e-commerce businesses do not have brick-and-mortar stores through which they can be assessed directly, the only source of information about a merchant is its website. Thus, we can view this problem as a web classification problem, in which we classify e-commerce websites into categories. In this research, we aim to build an end-to-end classification system for e-commerce websites. The challenges include the number of pages to be processed, an imbalanced dataset, and the mixed-language content of e-commerce websites. We built a website classification system and experimented with a case study of Indonesian and English e-commerce websites classified into 37 different categories. Our best result achieved an F-score of 0.83.
KW - Classification
KW - E-commerce
KW - Text processing
KW - Web mining
UR - http://www.scopus.com/inward/record.url?scp=85117542311&partnerID=8YFLogxK
U2 - 10.1145/3326467.3326486
DO - 10.1145/3326467.3326486
M3 - Conference contribution
AN - SCOPUS:85117542311
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the 9th International Conference on Web Intelligence, Mining and Semantics, WIMS 2019
PB - Association for Computing Machinery
T2 - 9th International Conference on Web Intelligence, Mining and Semantics, WIMS 2019
Y2 - 26 June 2019 through 28 June 2019
ER -