Pontificia Universidad Católica de Chile Pontificia Universidad Católica de Chile
Arenas M. Barceló P., Fagin R. and Libkin L. (2013)

Solutions and query rewriting in data Exchange.

Revista : Information and Computation
Volumen : 228–229
Páginas : 28–61
Tipo de publicación : ISI Ir a publicación

Abstract

Data exchange is the problem of taking data structured under a source schema and creatingan instance of a target schema. Given a source instance, there may be many solutions –target instances that satisfy the constraints of the data exchange problem. Previous workhas identified two classes of desirable solutions: canonical universal solutions, and theircores. Query answering in data exchange amounts to rewriting a query over the targetschema to another query that, over a materialized target instance, gives the result that issemantically consistent with the source (specifically, the “certain answers”). Basic questionsare then: (1) how do these solutions compare in terms of query rewriting? and (2) howcan we determine whether a query is rewritable over a particular solution?Our goal is to answer these questions. Our first main result is that, in terms of rewritabilityby relational algebra queries, the core is strictly less expressive than the canonical universalsolution, which in turn is strictly less expressive than the source. To develop techniquesfor proving queries non-rewritable, we establish structural properties of solutions; in factthey are derived from the technical machinery developed in the rewritability proofs. Oursecond result is that both the canonical universal solution and the core preserve the localstructure of the data, and that every target query rewritable over any of these solutionscannot distinguish tuples whose neighborhoods in the source are similar. This gives usa first simple tool for checking whether a query is non-rewritable over the canonicaluniversal solution or over the core. We also show that these tools generalize to arbitrarytransformations that preserve the local structure of the data, and investigate an alternativesemantics of query answering in data exchange.