Pontificia Universidad Católica de Chile Pontificia Universidad Católica de Chile
Arenas M. and Ugarte M. (2017)

Designing a Query Language for RDF: Marrying Open and Closed Worlds

Revista : ACM Transactions on Database Systems
Volumen : 42
Número : 4
Páginas : 46pp
Tipo de publicación : ISI Ir a publicación

Abstract

When querying an Resource Description Framework (RDF) graph, a prominent feature is the possibility of extending the answer to a query with optional information. However, the definition of this feature in SPARQL—the standard RDF query language—has raised some important issues. Most notably, the use of this feature increases the complexity of the evaluation problem, and its closed-world semantics is in conflict with the underlying open-world semantics of RDF. Many approaches for fixing such problems have been proposed, the most prominent being the introduction of the semantic notion of weakly monotone SPARQL query. Weakly monotone SPARQL queries have shaped the class of queries that conform to the open-world semantics of RDF. Unfortunately, finding an effective way of restricting SPARQL to the fragment of weakly monotone queries has proven to be an elusive problem. In practice, the most widely adopted fragment for writing SPARQL queries is based on the syntactic notion of well designedness. This notion has proven to be a good approach for writing SPARQL queries, but its expressive power has yet to be fully understood.The starting point of this article is to understand the relation between well-designed queries and the semantic notion of weak monotonicity. It is known that every well-designed SPARQL query is weakly monotone; as our first contribution we prove that the converse does not hold, even if an extension of this notion based on the use of disjunction is considered. Given this negative result, we embark on the task of defining syntactic fragments that are weakly monotone and have higher expressive power than the fragment of well-designed queries. To this end, we move to a more general scenario where infinite RDF graphs are also allowed, so interpolation techniques studied for first-order logic can be applied. With the use of these techniques, we are able to define a new operator for SPARQL that gives rise to a query language with the desired properties (over finite and infinite RDF graphs). It should be noticed that every query in this fragment is weakly monotone if we restrict the semantics to finite RDF graphs. Moreover, we use this result to provide a simple characterization of the class of monotone CONSTRUCT queries, that is, the class of SPARQL queries that produce RDF graphs as output. Finally, we pinpoint the complexity of the evaluation problem for the query languages identified in the article.