Pontificia Universidad Católica de Chile
Bourhis P., Puppis G., Riveros C. and Staworko S. (2016)

Bounded Repairability for Regular Tree Languages

Journal : ACM Transactions on Database Systems
Volume : 41
Issue : 3
Pages : 45 pp.
Publication type : ISI

Abstract

We study the problem of bounded repairability of a given restriction tree language R into a target tree language T. More precisely, we say that R is bounded repairable with respect to T if there exists a bound on the number of standard tree editing operations that must be applied to any tree in R to obtain a tree in T. We consider a number of possible specifications for tree languages: bottom-up tree automata (on curry encoding of unranked trees), which capture the class of XML schemas, and document type definitions (DTDs). We also consider a special case when the restriction language R is universal (i.e., contains all trees over a given alphabet). We give an effective characterization of bounded repairability between pairs of tree languages represented with automata. This characterization introduces two tools, synopsis trees and a coverage relation between them, that allow one to reason about tree languages that undergo a bounded number of editing operations. We then employ this characterization to provide upper bounds on the complexity of deciding bounded repairability and show that these bounds are tight. In particular, when the input tree languages are specified with arbitrary bottom-up automata, the problem is coNExp-complete. The problem remains coNExp-complete even if we use deterministic nonrecursive DTDs to specify the input languages. The complexity of the problem can be reduced if we assume that the alphabet, that is, the set of node labels, is fixed: the problem becomes PSpace-complete for nonrecursive DTDs and coNP-complete for deterministic nonrecursive DTDs. Finally, when the restriction tree language R is universal, we show that the bounded repairability problem becomes Exp-complete if the target language is specified by an arbitrary bottom-up tree automaton and becomes tractable (P-complete, in fact) when a deterministic bottom-up automaton is used.
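
As a reading aid, the definition of bounded repairability stated in the abstract can be written as a quantified formula. The symbols k and dist are notation assumed here for illustration and are not taken from the paper: dist(t, t') stands for the minimum number of standard tree editing operations needed to transform the tree t into the tree t'.

```latex
R \text{ is bounded repairable with respect to } T
\;\iff\;
\exists k \in \mathbb{N}\;\; \forall t \in R\;\; \exists t' \in T :\;
\mathrm{dist}(t, t') \le k
```

The order of the quantifiers is the point of the definition: a single bound k must work for every tree in R. It is not enough that each tree in R can be repaired with some finite, tree-dependent number of operations.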