Abstract
Question answering (QA) is a core technology for building conversational AI. In the legal context, conversational AI can answer legal inquiries within a particular jurisdiction. Laws and regulations are structured so that rules and legal statements are organized into law components at different granularity levels. We propose a new task, granularity-aware legal QA, which accounts for the underlying granularity levels of law components. Under the Indonesian jurisdiction, we consider four law component granularity levels: chapters (bab), articles (pasal), sections (ayat), and letters (huruf). We include 15 Indonesian government regulations (Peraturan Pemerintah) related to labor affairs and build a legal QA dataset with granularity information. We then design a solution for this task, the first information retrieval (IR) system to account for legal component granularity. We implement a customized retriever-reranker pipeline in which the retriever accepts law components of multiple granularities and the reranker is trained for granularity-aware ranking. We use BM25 as the retriever and BERT as the reranker, yielding an end-to-end exact match accuracy of 35.68%, a significant improvement (20%) over a strong baseline. Using the reranker also improves the granularity accuracy from 44.86% to 63.24%.
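The retriever-reranker pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the BM25 scorer is a small pure-Python version, the reranker is a word-overlap placeholder standing in for the trained BERT model, and the two metrics assume that exact match compares component identifiers and granularity accuracy compares granularity levels. All names (`LawComponent`, `BM25Retriever`, `rerank`, `overlap_score`, `evaluate`) and the toy component texts are hypothetical.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LawComponent:
    component_id: str  # hypothetical identifier
    granularity: str   # "bab", "pasal", "ayat", or "huruf"
    text: str


def tokenize(text):
    return text.lower().split()


class BM25Retriever:
    """BM25 over one index that mixes components of all granularities."""

    def __init__(self, components, k1=1.5, b=0.75):
        self.components = list(components)
        self.k1, self.b = k1, b
        self.docs = [Counter(tokenize(c.text)) for c in self.components]
        self.lengths = [sum(d.values()) for d in self.docs]
        self.avg_len = sum(self.lengths) / len(self.lengths)
        n = len(self.docs)
        doc_freq = Counter(term for d in self.docs for term in d)
        # Smoothed IDF that stays positive for every indexed term.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in doc_freq.items()}

    def score(self, query_tokens, i):
        doc, length = self.docs[i], self.lengths[i]
        total = 0.0
        for term in query_tokens:
            tf = doc.get(term, 0)
            if tf == 0:
                continue  # term absent from this component
            norm = tf + self.k1 * (1 - self.b + self.b * length / self.avg_len)
            total += self.idf[term] * tf * (self.k1 + 1) / norm
        return total

    def retrieve(self, question, top_k=3):
        q = tokenize(question)
        order = sorted(range(len(self.components)),
                       key=lambda i: self.score(q, i), reverse=True)
        return [self.components[i] for i in order[:top_k]]


def overlap_score(question, component):
    """Placeholder for the trained reranker: Jaccard word overlap."""
    q, d = set(tokenize(question)), set(tokenize(component.text))
    return len(q & d) / len(q | d) if q | d else 0.0


def rerank(question, candidates, score_fn=overlap_score):
    return sorted(candidates, key=lambda c: score_fn(question, c), reverse=True)


def evaluate(predictions, gold):
    """Return (exact match accuracy, granularity accuracy) under the assumed definitions."""
    pairs = list(zip(predictions, gold))
    exact = sum(p.component_id == g.component_id for p, g in pairs) / len(pairs)
    gran = sum(p.granularity == g.granularity for p, g in pairs) / len(pairs)
    return exact, gran


# Invented placeholder texts, not quoted from any regulation.
corpus = [
    LawComponent("bab-1", "bab", "chapter on working time covering hours overtime and rest"),
    LawComponent("pasal-2", "pasal", "article on working hours and overtime arrangements for workers"),
    LawComponent("ayat-2-1", "ayat", "working hours per day are limited"),
    LawComponent("huruf-3-1-a", "huruf", "annual leave entitlement"),
]
question = "what are the working hours per day"
candidates = BM25Retriever(corpus).retrieve(question, top_k=3)
top_answer = rerank(question, candidates)[0]
print(top_answer.component_id, top_answer.granularity)
```

Because a chapter, an article, and a section can all contain the query terms, the retriever returns candidates at several granularities, and the reranker is the step that chooses among them. Swapping `overlap_score` for a learned scoring function is the point where a trained model would plug in.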
Original language | English |
---|---|
Journal | International Journal of Advances in Intelligent Informatics |
Volume | 10 |
Issue number | 3 |
Publication status | Published - 2024 |
Keywords
- Granularity-aware
- Question answering
- Retrieval
- Regulation
- BERT