TY - GEN
T1 - Semi-supervised Named-Entity Recognition for Product Attribute Extraction in Book Domain
AU - Putra, Hadi Syah
AU - Priatmadji, Faisal Satrio
AU - Mahendra, Rahmad
N1 - Funding Information:
Acknowledgments. This research was supported by a research grant from Universitas Indonesia, namely Publikasi Terindeks Internasional (PUTI) Prosiding year 2020, no. NKB-854/UN2.RST/HKP.05.00/2020.
Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
AB - Products sold in today’s marketplace are numerous and varied, and books are one such product. Detailed information about a book, such as its title, author, and publisher, is often presented in an unstructured format in the product title. To be useful for commercial applications, for example catalogs, search functions, and recommendation systems, these attributes need to be extracted from the product title. In this study, we apply a Named-Entity Recognition model in a semi-supervised style to extract the attributes of e-commerce products in the book domain. We experiment with a number of feature types, i.e. lexical, position, word shape, and embedding features. We extract the book attributes from nearly 30K product titles with an F1 measure of 65%.
KW - Attribute extraction
KW - Book
KW - E-commerce
KW - Named-Entity Recognition
KW - Product title
UR - http://www.scopus.com/inward/record.url?scp=85097542629&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-64452-9_4
DO - 10.1007/978-3-030-64452-9_4
M3 - Conference contribution
AN - SCOPUS:85097542629
SN - 9783030644512
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 43
EP - 51
BT - Digital Libraries at Times of Massive Societal Transition - 22nd International Conference on Asia-Pacific Digital Libraries, ICADL 2020, Proceedings
A2 - Ishita, Emi
A2 - Pang, Natalie Lee
A2 - Zhou, Lihong
PB - Springer Science and Business Media Deutschland GmbH
T2 - 22nd International Conference on Asia-Pacific Digital Libraries, ICADL 2020
Y2 - 30 November 2020 through 1 December 2020
ER -