TY - GEN
T1 - Word level auto-correction for latent semantic analysis based essay grading system
AU - Ratna, Anak Agung Putri
AU - Sanjaya, Randy
AU - Wirianata, Tomi
AU - Purnamasari, Prima Dewi
N1 - Funding Information:
ACKNOWLEDGMENT The authors would like to thank Universitas Indonesia for funding the publication through the PITTA 2017 grant.
Publisher Copyright:
© 2017 IEEE.
PY - 2017/12/5
Y1 - 2017/12/5
N2 - Assessment is an important step in the learning process, in which the assessor evaluates students' level of understanding. One form of assessment is the essay, which raises problems of scoring objectivity and of a drop in human grader performance when many essays must be graded. To ease essay grading and address these problems, a system that can assess documents according to their context is needed. To this end, we developed a Java-based system for grading essays in the Indonesian language using a more efficient and optimal algorithm. The algorithm consists of four stages. The first stage is Latent Semantic Analysis (LSA), which is used to infer the contextual relations between word meanings in a text. The second stage uses Singular Value Decomposition (SVD) to obtain the variance structure of those relations. SVD identifies where the variance is greatest, which makes it possible to find the best approximation of the original data in reduced dimensions. The third stage is Latent Semantic Indexing (LSI), an indexing and retrieval method that identifies patterns in the relations between terms and concepts in an unstructured text collection and produces a vector representing the text. The last stage is Cosine Similarity Measurement (CSM), which yields a similarity value between the text and the answer document. To resolve problems stemming from grammar and vocabulary, we propose an auto-correction technique that checks each word against a word library in order to unify words that have the same meaning or no specific meaning. The Jaro-Winkler distance algorithm is then used to detect word errors caused by accidental mistyping. With this distance, we can determine whether two strings are similar. This is extremely important when scanning text with typos, as they affect the LSA result. Using this system, the score obtained is similar to the score given by a human rater.
With a word library consisting of 97 words for the synonym check and 204 function words, the resulting accuracy is 85.246% ± 13.129.
KW - CSM
KW - Essay grading
KW - Jaro-Winkler
KW - LSA
KW - LSI
KW - SVD
UR - http://www.scopus.com/inward/record.url?scp=85045967612&partnerID=8YFLogxK
U2 - 10.1109/QIR.2017.8168488
DO - 10.1109/QIR.2017.8168488
M3 - Conference contribution
AN - SCOPUS:85045967612
T3 - QiR 2017 - 2017 15th International Conference on Quality in Research (QiR): International Symposium on Electrical and Computer Engineering
SP - 235
EP - 240
BT - QiR 2017 - 2017 15th International Conference on Quality in Research (QiR)
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th International Conference on Quality in Research: International Symposium on Electrical and Computer Engineering, QiR 2017
Y2 - 24 July 2017 through 27 July 2017
ER -