TY - GEN
T1 - Towards an Open NLI LLM-based System for KGs
T2 - 7th International Seminar on Research of Information Technology and Intelligent Systems, ISRITI 2024
AU - Ongris, Jaycent Gunawan
AU - Tjitrahardja, Eduardus
AU - Darari, Fariz
AU - Ekaputra, Fajar J.
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - The rise of large language models (LLMs) has significantly advanced information retrieval, yet challenges like the limitation of knowledge updating ability, lack of openness, and hallucination issues persist. To address these, Retrieval-Augmented Generation (RAG) has been introduced but remains limited in interpretability due to its reliance on vector-based representations. This paper presents a question-answering (QA) system using GraphRAG, a RAG system with knowledge graphs (KGs) as its base. We develop a natural language interface (NLI) for QA over Wikidata, a popular, open, and crowdsourced KG. Our approach employs LLM chaining, i.e., a paradigm that leverages multiple LLM calls sequentially, to generate SPARQL queries, with the aim of creating an open system that ensures transparency and allows direct inspection of its components. Utilizing an experimental research approach, we evaluated the generated SPARQL queries and found that incorporating a broader set of property candidates into the prompts significantly boosts performance, achieving a Jaccard similarity score of 0.7806. These findings demonstrate the system's effectiveness in SPARQL query generation, highlighting its potential for further development. However, we consider the limitation of the LLM's context window and the hallucination phenomenon as the major challenges that limit the system's performance.
KW - GraphRAG
KW - KG
KW - LLM
KW - RAG
KW - Wikidata
UR - http://www.scopus.com/inward/record.url?scp=105004412345&partnerID=8YFLogxK
U2 - 10.1109/ISRITI64779.2024.10963661
DO - 10.1109/ISRITI64779.2024.10963661
M3 - Conference contribution
AN - SCOPUS:105004412345
T3 - 7th International Seminar on Research of Information Technology and Intelligent Systems: Advanced Intelligent Systems in Contemporary Society, ISRITI 2024 - Proceedings
SP - 44
EP - 49
BT - 7th International Seminar on Research of Information Technology and Intelligent Systems
A2 - Wibowo, Ferry Wahyu
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 11 December 2024
ER -