TY - GEN
T1 - Academic Expert Finding in Indonesia using Word Embedding and Document Embedding
T2 - 8th International Conference on Information and Communication Technology, ICoICT 2020
AU - Rampisela, Theresia V.
AU - Yulianti, Evi
N1 - Funding Information:
This work is supported by the Publikasi Ilmiah Terindeks Internasional (PUTI) Prosiding Universitas Indonesia 2020 grant.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/6
Y1 - 2020/6
N2 - Expertise retrieval covers the problems of expert finding and expertise finding. In academia, expert finding can help in identifying a research partner or a potential thesis supervisor. This research finds the experts in the Faculty of Computer Science at Universitas Indonesia (Fasilkom UI) using the thesis abstracts and metadata of Fasilkom UI students. The query and the expertise of the lecturers are represented using a combination of word2vec and doc2vec, which are a word embedding and a document embedding, respectively. Both embeddings are able to model semantic information, which is necessary for solving the problem of vocabulary mismatch in search. Our results show that representing the expertise query with word2vec leads to better performance than using doc2vec. In addition, we found that, in general, the performance of the embedding models is comparable to that of the standard retrieval model BM25 in retrieving experts using expertise queries in both Indonesian and English.
AB - Expertise retrieval covers the problems of expert finding and expertise finding. In academia, expert finding can help in identifying a research partner or a potential thesis supervisor. This research finds the experts in the Faculty of Computer Science at Universitas Indonesia (Fasilkom UI) using the thesis abstracts and metadata of Fasilkom UI students. The query and the expertise of the lecturers are represented using a combination of word2vec and doc2vec, which are a word embedding and a document embedding, respectively. Both embeddings are able to model semantic information, which is necessary for solving the problem of vocabulary mismatch in search. Our results show that representing the expertise query with word2vec leads to better performance than using doc2vec. In addition, we found that, in general, the performance of the embedding models is comparable to that of the standard retrieval model BM25 in retrieving experts using expertise queries in both Indonesian and English.
KW - academic expert
KW - document embedding
KW - expert finding
KW - expertise retrieval
KW - word embedding
UR - http://www.scopus.com/inward/record.url?scp=85090997933&partnerID=8YFLogxK
U2 - 10.1109/ICoICT49345.2020.9166249
DO - 10.1109/ICoICT49345.2020.9166249
M3 - Conference contribution
AN - SCOPUS:85090997933
T3 - 2020 8th International Conference on Information and Communication Technology, ICoICT 2020
BT - 2020 8th International Conference on Information and Communication Technology, ICoICT 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 24 June 2020 through 26 June 2020
ER -