TY - GEN
T1 - Harvesting bibliography multi-thread, safe and ethical web crawling
AU - Achsan, Harry Tursulistyono Yani
AU - Wibowo, Wahyu Catur
AU - Putri, Wahyuningdiah Trisari Harsanti
AU - Achsan, M. Muhtar Baswara
AU - Barcah, Quintin Kumia Dikara
N1 - Publisher Copyright: © 2019 IEEE. All Rights Reserved.
PY - 2018/7/2
Y1 - 2018/7/2
N2 - Web mining is an important technique because it enables the extraction of data and information from the web for further use. Although many web databases hold bibliographic data, the Online Computer Library Center (OCLC) owns the largest web database of bibliographic data in the world. Such a large volume of bibliographic data cannot practically be downloaded manually. In this research we conduct an experiment to harvest bibliographic data using a multi-threaded process that is fast, safe, and ethical. Using the C# programming language and the Visual Studio IDE, we successfully harvested five million bibliographic records without being penalized by the source site.
AB - Web mining is an important technique because it enables the extraction of data and information from the web for further use. Although many web databases hold bibliographic data, the Online Computer Library Center (OCLC) owns the largest web database of bibliographic data in the world. Such a large volume of bibliographic data cannot practically be downloaded manually. In this research we conduct an experiment to harvest bibliographic data using a multi-threaded process that is fast, safe, and ethical. Using the C# programming language and the Visual Studio IDE, we successfully harvested five million bibliographic records without being penalized by the source site.
UR - http://www.scopus.com/inward/record.url?scp=85062416089&partnerID=8YFLogxK
U2 - 10.1109/ICACSIS.2018.8618262
DO - 10.1109/ICACSIS.2018.8618262
M3 - Conference contribution
AN - SCOPUS:85062416089
T3 - 2018 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2018
SP - 355
EP - 360
BT - 2018 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th International Conference on Advanced Computer Science and Information Systems, ICACSIS 2018
Y2 - 27 October 2018 through 28 October 2018
ER -