Completeness and soundness guarantees for conjunctive SPARQL queries over RDF data sources with completeness statements

Fariz Darari, Werner Nutt, Simon Razniewski, Sebastian Rudolph

Research output: Contribution to journal, Article, peer-review

3 Citations (Scopus)

Abstract

RDF generally follows the open-world assumption: information is incomplete by default. Consequently, SPARQL queries cannot retrieve with certainty complete answers, and even worse, when they involve negation, it is unclear whether they produce sound answers. Nevertheless, there is hope to lift this limitation. On many specific topics (e.g., children of Trump, Apollo 11 crew, EU founders), RDF data sources contain complete information, a fact that can be made explicit through completeness statements. In this work, we leverage completeness statements over RDF data sources to provide guarantees of completeness and soundness for conjunctive SPARQL queries. We develop a technique to check whether query completeness can be guaranteed by taking into account also the specifics of the queried graph, and analyze the complexity of such checking. For queries with negation, we approach the problem of query soundness checking, and distinguish between answer soundness (i.e., is an answer of a query sound?) and pattern soundness (i.e., is a query as a whole sound?). We provide a formalization and characterize the soundness problem via a reduction to the completeness problem. We further develop heuristic techniques for completeness checking, and conduct experimental evaluations based on Wikidata, a prominent, real-world knowledge base, to demonstrate the feasibility of our approach.
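As a rough illustration of the idea of a completeness statement, the following minimal Python sketch treats a statement as a triple pattern for which the data source is declared complete, and checks whether a single-triple-pattern query is covered by some statement. This is a deliberately simplified toy, not the checking technique developed in the article: it ignores conjunctive queries, the specifics of the queried graph, and negation. All names (`subsumes`, `match`, `guaranteed_complete`) and the sample data are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple

# A term is a string, or None standing for a wildcard (variable) position.
Term = Optional[str]
Pattern = Tuple[Term, Term, Term]
Triple = Tuple[str, str, str]


def subsumes(statement: Pattern, pattern: Pattern) -> bool:
    """True if every triple matching `pattern` also matches `statement`."""
    return all(s is None or s == p for s, p in zip(statement, pattern))


def match(graph: Set[Triple], pattern: Pattern) -> Set[Triple]:
    """Evaluate a single triple pattern over the graph."""
    return {
        t for t in graph
        if all(p is None or p == v for p, v in zip(pattern, t))
    }


def guaranteed_complete(pattern: Pattern, statements: Iterable[Pattern]) -> bool:
    """Toy check (assumption): the query is complete if one statement covers it."""
    return any(subsumes(s, pattern) for s in statements)


# Hypothetical sample data: the source is declared complete for the Apollo 11 crew only.
graph = {
    ("apollo11", "crew", "Armstrong"),
    ("apollo11", "crew", "Aldrin"),
    ("apollo11", "crew", "Collins"),
    ("apollo13", "crew", "Lovell"),
}
statements = [("apollo11", "crew", None)]

apollo11_query = ("apollo11", "crew", None)
apollo13_query = ("apollo13", "crew", None)

apollo11_answers = match(graph, apollo11_query)
apollo11_complete = guaranteed_complete(apollo11_query, statements)
apollo13_complete = guaranteed_complete(apollo13_query, statements)
```

In this sketch the Apollo 11 query is covered by the statement, so its three answers can be reported as complete, whereas the Apollo 13 query returns an answer with no completeness guarantee under the open-world assumption.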

Original language: English
Pages (from-to): 441-482
Number of pages: 42
Journal: Semantic Web
Volume: 11
Issue number: 3
Publication status: Published - 2020

Keywords

  • data completeness
  • data quality
  • query completeness
  • query soundness
  • RDF
  • SPARQL
