Showing 1–50 of 50 results for author: Amarilli, A

  1. arXiv:2407.01127  [pdf, other]

    cs.DB

    Tractable Circuits in Database Theory

    Authors: Antoine Amarilli, Florent Capelli

    Abstract: This work reviews how database theory uses tractable circuit classes from knowledge compilation. We present relevant query evaluation tasks, and notions of tractable circuits. We then show how these tractable circuits can be used to address database tasks. We first focus on Boolean provenance and its applications for aggregation tasks, in particular probabilistic query evaluation. We study these f…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 15 pages including 12 pages of main text

  2. arXiv:2404.09674  [pdf, ps, other]

    cs.DS cs.DB cs.FL

    A Circus of Circuits: Connections Between Decision Diagrams, Circuits, and Automata

    Authors: Antoine Amarilli, Marcelo Arenas, YooJung Choi, Mikaël Monet, Guy Van den Broeck, Benjie Wang

    Abstract: This document is an introduction to two related formalisms to define Boolean functions: binary decision diagrams, and Boolean circuits. It presents these formalisms and several of their variants studied in the setting of knowledge compilation. Last, it explains how these formalisms can be connected to the notions of automata over words and trees.

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 26 pages

  3. arXiv:2401.16210  [pdf, ps, other]

    math.CO cs.DM

    The Non-Cancelling Intersections Conjecture

    Authors: Antoine Amarilli, Mikaël Monet, Dan Suciu

    Abstract: In this note, we present a conjecture on intersections of set families, and a rephrasing of the conjecture in terms of principal downsets of Boolean lattices. The conjecture informally states that, whenever we can express the measure of a union of sets in terms of the measure of some of their intersections using the inclusion-exclusion formula, then we can express the union as a set from these sam…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 30 pages

  4. arXiv:2310.00731  [pdf, other]

    cs.DB cs.DS cs.LO

    Ranked Enumeration for MSO on Trees via Knowledge Compilation

    Authors: Antoine Amarilli, Pierre Bourhis, Florent Capelli, Mikaël Monet

    Abstract: We study the problem of enumerating the satisfying assignments for circuit classes from knowledge compilation, where assignments are ranked in a specific order. In particular, we show how this problem can be used to efficiently perform ranked enumeration of the answers to MSO queries over trees, with the order being given by a ranking function satisfying a subset-monotonicity property. Assuming…

    Submitted 22 January, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: 26 pages; this is the authors' version of the corresponding ICDT'24 article

  5. Conjunctive Queries on Probabilistic Graphs: The Limits of Approximability

    Authors: Antoine Amarilli, Timothy van Bremen, Kuldeep S. Meel

    Abstract: Query evaluation over probabilistic databases is a notoriously intractable problem -- not only in combined complexity, but for many natural queries in data complexity as well. This motivates the study of probabilistic query evaluation through the lens of approximation algorithms, and particularly of combined FPRASes, whose runtime is polynomial in both the query and instance size. In this paper, w…

    Submitted 9 April, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

    Comments: 20 pages. This article is identical to the ICDT'24 publication up to minor changes (including the correction of a mistake in the proof of Proposition 4.1)

  6. arXiv:2304.06155  [pdf, other]

    cs.DB cs.FL

    Skyline Operators for Document Spanners

    Authors: Antoine Amarilli, Benny Kimelfeld, Sébastien Labbé, Stefan Mengel

    Abstract: When extracting a relation of spans (intervals) from a text document, a common practice is to filter out tuples of the relation that are deemed dominated by others. The domination rule is defined as a partial order that varies along different systems and tasks. For example, we may state that a tuple is dominated by tuples which extend it by assigning additional attributes, or assigning larger inte…

    Submitted 4 March, 2024; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: 42 pages. This is the full version of the ICDT'24 publication, which includes all reviewer feedback; the main body is identical to the ICDT'24 article up to minor changes

  7. arXiv:2302.03461  [pdf, other]

    cs.DM

    Degree-3 Planar Graphs as Topological Minors of Wall Graphs in Polynomial Time

    Authors: Antoine Amarilli

    Abstract: In this note, we give a proof of the fact that we can efficiently find degree-3 planar graphs as topological minors of sufficiently large wall graphs. The result is needed as an intermediate step to fix a proof in my PhD thesis.

    Submitted 22 February, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: V2: Updated to fix an error in the proof pointed out by Mikaël Monet. V3: Updated to point out alternative and simpler proof route following https://cstheory.stackexchange.com/a/52489

  8. arXiv:2212.11362  [pdf, ps, other]

    cs.LO

    Tighter bounds for query answering with Guarded TGDs

    Authors: Antoine Amarilli, Michael Benedikt

    Abstract: We consider the complexity of the open-world query answering problem, where we wish to determine certain answers to conjunctive queries over incomplete datasets specified by an initial set of facts and a set of Guarded TGDs. This problem has been well-studied in the literature and is decidable but with a high complexity, namely, it is 2EXPTIME complete. Further, the complexity shrinks by one expon…

    Submitted 8 January, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: arXiv admin note: text overlap with arXiv:1706.07936

  9. arXiv:2209.14878  [pdf, other]

    cs.FL cs.DS

    Enumerating Regular Languages with Bounded Delay

    Authors: Antoine Amarilli, Mikaël Monet

    Abstract: We study the task, for a given language $L$, of enumerating the (generally infinite) sequence of its words, without repetitions, while bounding the delay between two consecutive words. To allow for delay bounds that do not depend on the current word length, we assume a model where we produce each word by editing the preceding word with a small edit script, rather than writing out the word from scr…

    Submitted 7 January, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: This is the full version with proofs of the STACS'23 article

  10. arXiv:2209.11177  [pdf, other]

    cs.DB cs.CC cs.DM cs.DS cs.LO

    Uniform Reliability for Unbounded Homomorphism-Closed Graph Queries

    Authors: Antoine Amarilli

    Abstract: We study the uniform query reliability problem, which asks, for a fixed Boolean query Q, given an instance I, how many subinstances of I satisfy Q. Equivalently, this is a restricted case of Boolean query evaluation on tuple-independent probabilistic databases where all facts must have probability 1/2. We focus on graph signatures, and on queries closed under homomorphisms. We show that for any su…

    Submitted 17 January, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

    Comments: Full version with proofs of the ICDT'23 article

  11. arXiv:2205.04224  [pdf, ps, other]

    cs.DB

    Worst-case Analysis for Interactive Evaluation of Boolean Provenance

    Authors: Antoine Amarilli, Yael Amsterdamer

    Abstract: In recent work, we have introduced a framework for fine-grained consent management in databases, which combines Boolean data provenance with the field of interactive Boolean evaluation. In turn, interactive Boolean evaluation aims at unveiling the underlying truth value of a Boolean expression by frugally probing the truth values of individual values. The required number of probes depends on the B…

    Submitted 9 May, 2022; originally announced May 2022.

  12. arXiv:2205.00851  [pdf, other]

    cs.CC cs.DM

    Weighted Counting of Matchings in Unbounded-Treewidth Graph Families

    Authors: Antoine Amarilli, Mikaël Monet

    Abstract: We consider a weighted counting problem on matchings, denoted $\textrm{PrMatching}(\mathcal{G})$, on an arbitrary fixed graph family $\mathcal{G}$. The input consists of a graph $G\in \mathcal{G}$ and of rational probabilities of existence on every edge of $G$, assuming independence. The output is the probability of obtaining a matching of $G$ in the resulting distribution, i.e., a set of edges th…

    Submitted 7 January, 2023; v1 submitted 2 May, 2022; originally announced May 2022.

    Comments: This is the full version with proofs of the MFCS'22 article

  13. arXiv:2202.08555  [pdf, ps, other]

    cs.LO cs.AI cs.DB

    Query Answering with Transitive and Linear-Ordered Data

    Authors: Antoine Amarilli, Michael Benedikt, Pierre Bourhis, Michael Vanden Boom

    Abstract: We consider entailment problems involving powerful constraint languages such as frontier-guarded existential rules in which we impose additional semantic restrictions on a set of distinguished relations. We consider restricting a relation to be transitive, restricting a relation to be the transitive closure of another relation, and restricting a relation to be a linear order. We give some natural…

    Submitted 17 February, 2022; originally announced February 2022.

    Comments: This article was originally published at JAIR in 2018: https://www.jair.org/index.php/jair/article/view/11240 (DOI 10.1613/jair.1.11240). This version of the paper includes one modification from the publisher version: we fix an incorrect proof for one of our undecidability results (Theorem 6.2). arXiv admin note: substantial text overlap with arXiv:1607.00813

  14. Efficient Enumeration Algorithms for Annotated Grammars

    Authors: Antoine Amarilli, Louis Jachiet, Martín Muñoz, Cristian Riveros

    Abstract: We introduce annotated grammars, an extension of context-free grammars which allows annotations on terminals. Our model extends the standard notion of regular spanners, and is more expressive than the extraction grammars recently introduced by Peterfreund. We study the enumeration problem for annotated grammars: fixing a grammar, and given a string as input, enumerate all annotations of the string…

    Submitted 17 May, 2022; v1 submitted 3 January, 2022; originally announced January 2022.

    Comments: 54 pages. Full version with proofs of the article to appear at PODS'22. Except formatting and minor differences, this article contains all the contents of the PODS'22 article, plus the technical appendices

  15. arXiv:2102.07728  [pdf, other]

    cs.FL cs.DS

    Dynamic Membership for Regular Languages

    Authors: Antoine Amarilli, Louis Jachiet, Charles Paperman

    Abstract: We study the dynamic membership problem for regular languages: fix a language L, read a word w, build in time O(|w|) a data structure indicating if w is in L, and maintain this structure efficiently under letter substitutions on w. We consider this problem on the unit cost RAM model with logarithmic word length, where the problem always has a solution in O(log |w| / log log |w|) per operation. W…

    Submitted 4 June, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: 34 pages. This is the full version with proofs of the ICALP'21 article

  16. Locality and Centrality: The Variety ZG

    Authors: Antoine Amarilli, Charles Paperman

    Abstract: We study the variety ZG of monoids where the elements that belong to a group are central, i.e., commute with all other elements. We show that ZG is local, that is, the semidirect product ZG * D of ZG by definite semigroups is equal to LZG, the variety of semigroups where all local monoids are in ZG. Our main result is thus: ZG * D = LZG. We prove this result using Straubing's delay theorem, by con…

    Submitted 17 October, 2023; v1 submitted 15 February, 2021; originally announced February 2021.

    Journal ref: Logical Methods in Computer Science, Volume 19, Issue 4 (October 18, 2023) lmcs:11555

  17. arXiv:2003.07316  [pdf, other]

    cs.DB

    Equivalent Rewritings on Path Views with Binding Patterns

    Authors: Julien Romero, Nicoleta Preda, Antoine Amarilli, Fabian Suchanek

    Abstract: A view with a binding pattern is a parameterized query on a database. Such views are used, e.g., to model Web services. To answer a query on such views, the views have to be orchestrated together in execution plans. We show how queries can be rewritten into equivalent execution plans, which are guaranteed to deliver the same results as the query on all databases. We provide a correct and complete…

    Submitted 19 March, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

    Comments: 33 pages including 16 pages of main text. This is the full version of the ESWC'2020 article, which integrates all reviewer feedback, with the same text as the publisher version except minor changes. Several corrections relative to the first version

  18. arXiv:2003.02576  [pdf, ps, other]

    cs.DB cs.IR

    Constant-Delay Enumeration for Nondeterministic Document Spanners

    Authors: Antoine Amarilli, Pierre Bourhis, Stefan Mengel, Matthias Niewerth

    Abstract: We consider the information extraction framework known as document spanners, and study the problem of efficiently computing the results of the extraction from an input document, where the extraction task is described as a sequential variable-set automaton (VA). We pose this problem in the setting of enumeration algorithms, where we can first run a preprocessing phase and must then produce the resu…

    Submitted 7 December, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

    Comments: 29 pages. Extended version of arXiv:1807.09320. Integrates all corrections following reviewer feedback. Outside of some minor formatting differences and tweaks, this paper is the same as the paper to appear in the ACM TODS journal

  19. arXiv:2003.02521  [pdf, ps, other]

    cs.LO cs.DB

    Finite Open-World Query Answering with Number Restrictions

    Authors: Antoine Amarilli, Michael Benedikt

    Abstract: Open-world query answering is the problem of deciding, given a set of facts, conjunction of constraints, and query, whether the facts and constraints imply the query. This amounts to reasoning over all instances that include the facts and satisfy the constraints. We study finite open-world query answering (FQA), which assumes that the underlying world is finite and thus only considers the finite c…

    Submitted 5 March, 2020; originally announced March 2020.

    Comments: 70 pages. Extended journal version of arXiv:1505.04216. This article is the same as what will be published in ToCL, except for publisher-induced changes, minor changes, and reordering of the material (in the ToCL version some detailed proofs are moved from the article body to an appendix)

  20. The Dichotomy of Evaluating Homomorphism-Closed Queries on Probabilistic Graphs

    Authors: Antoine Amarilli, İsmail İlkan Ceylan

    Abstract: We study the problem of query evaluation on probabilistic graphs, namely, tuple-independent probabilistic databases over signatures of arity two. We focus on the class of queries closed under homomorphisms, or, equivalently, the infinite unions of conjunctive queries. Our main result states that the probabilistic query evaluation problem is #P-hard for all unbounded queries from this class. As bou…

    Submitted 6 January, 2022; v1 submitted 4 October, 2019; originally announced October 2019.

    Journal ref: Logical Methods in Computer Science, Volume 18, Issue 1 (January 7, 2022) lmcs:7065

  21. Uniform Reliability of Self-Join-Free Conjunctive Queries

    Authors: Antoine Amarilli, Benny Kimelfeld

    Abstract: The reliability of a Boolean Conjunctive Query (CQ) over a tuple-independent probabilistic database is the probability that the CQ is satisfied when the tuples of the database are sampled one by one, independently, with their associated probability. For queries without self-joins (repeated relation symbols), the data complexity of this problem is fully characterized by a known dichotomy: reliabili…

    Submitted 8 November, 2022; v1 submitted 19 August, 2019; originally announced August 2019.

    Comments: Extended version of the ICDT'21 paper

    Journal ref: Logical Methods in Computer Science, Volume 18, Issue 4 (November 9, 2022) lmcs:10088

  22. arXiv:1906.00311  [pdf, other]

    cs.AI cs.LG

    Smoothing Structured Decomposable Circuits

    Authors: Andy Shih, Guy Van den Broeck, Paul Beame, Antoine Amarilli

    Abstract: We study the task of smoothing a circuit, i.e., ensuring that all children of a plus-gate mention the same variables. Circuits serve as the building blocks of state-of-the-art inference algorithms on discrete probabilistic graphical models and probabilistic programs. They are also important for discrete density estimation algorithms. Many of these tasks require the input circuit to be smooth. Howe…

    Submitted 28 October, 2019; v1 submitted 1 June, 2019; originally announced June 2019.

    Journal ref: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  23. arXiv:1812.09519  [pdf, ps, other]

    cs.DB cs.DS cs.FL cs.LO

    Enumeration on Trees with Tractable Combined Complexity and Efficient Updates

    Authors: Antoine Amarilli, Pierre Bourhis, Stefan Mengel, Matthias Niewerth

    Abstract: We give an algorithm to enumerate the results on trees of monadic second-order (MSO) queries represented by nondeterministic tree automata. After linear time preprocessing (in the input tree), we can enumerate answers with linear delay (in each answer). We allow updates on the tree to take place at any time, and we can then restart the enumeration after logarithmic time in the tree. Further, all o…

    Submitted 27 August, 2019; v1 submitted 22 December, 2018; originally announced December 2018.

    Comments: 16 pages of main material, 37 references, 11 pages of appendix. This is the extended version with proofs of the PODS'19 paper. Except for minor rephrasings and formatting differences, the contents are exactly the same as the version published in the PODS'19 proceedings

  24. Connecting Knowledge Compilation Classes and Width Parameters

    Authors: Antoine Amarilli, Florent Capelli, Mikaël Monet, Pierre Senellart

    Abstract: The field of knowledge compilation establishes the tractability of many tasks by studying how to compile them to Boolean circuit classes obeying some requirements such as structuredness, decomposability, and determinism. However, in other settings such as intensional query evaluation on databases, we obtain Boolean circuits that satisfy some width bounds, e.g., they have bounded treewidth or pathw…

    Submitted 20 July, 2019; v1 submitted 7 November, 2018; originally announced November 2018.

    Comments: 46 pages. Extended version of arXiv:1709.06188. Up to the stylesheet, page/environment numbering, minor formatting, and publisher-induced changes, this is the exact content of the paper in Theory of Computing Systems <https://link.springer.com/article/10.1007%2Fs00224-019-09930-2>. The difference in the titles (missing "and") is an error introduced by the publisher

    ACM Class: H.2

  25. When Can We Answer Queries Using Result-Bounded Data Interfaces?

    Authors: Antoine Amarilli, Michael Benedikt

    Abstract: We consider answering queries on data available through access methods, that provide lookup access to the tuples matching a given binding. Such interfaces are common on the Web; further, they often have bounds on how many results they can return, e.g., because of pagination or rate limits. We thus study result-bounded methods, which may return only a limited number of tuples. We study how to decid…

    Submitted 1 June, 2022; v1 submitted 17 October, 2018; originally announced October 2018.

    Comments: Journal version of the PODS'18 paper arXiv:1706.07936

    Journal ref: Logical Methods in Computer Science, Volume 18, Issue 2 (June 2, 2022) lmcs:4903

  26. Evaluating Datalog via Tree Automata and Cycluits

    Authors: Antoine Amarilli, Pierre Bourhis, Mikaël Monet, Pierre Senellart

    Abstract: We investigate parameterizations of both database instances and queries that make query evaluation fixed-parameter tractable in combined complexity. We show that clique-frontier-guarded Datalog with stratified negation (CFG-Datalog) enjoys bilinear-time evaluation on structures of bounded treewidth for programs of bounded rule size. Such programs capture in particular conjunctive queries with simp…

    Submitted 29 May, 2019; v1 submitted 14 August, 2018; originally announced August 2018.

    Comments: 56 pages, 63 references. Journal version of "Combined Tractability of Query Evaluation via Tree Automata and Cycluits (Extended Version)" at arXiv:1612.04203. Up to the stylesheet, page/environment numbering, and possible minor publisher-induced changes, this is the exact content of the journal paper that will appear in Theory of Computing Systems. Update wrt version 1: latest reviewer feedback

  27. Constant-Delay Enumeration for Nondeterministic Document Spanners

    Authors: Antoine Amarilli, Pierre Bourhis, Stefan Mengel, Matthias Niewerth

    Abstract: We consider the information extraction framework known as document spanners, and study the problem of efficiently computing the results of the extraction from an input document, where the extraction task is described as a sequential variable-set automaton (VA). We pose this problem in the setting of enumeration algorithms, where we can first run a preprocessing phase and must then produce the resu…

    Submitted 7 December, 2020; v1 submitted 24 July, 2018; originally announced July 2018.

    Comments: 25 pages including 17 pages of main material. Integrates all reviewer feedback. This paper is exactly the same as the ICDT'19 paper except that it contains 6 pages of technical appendix, and except that we corrected some additional minor mistakes following reviews of the journal version (arXiv:2003.02576). We recommend reading the journal version instead of this paper

  28. arXiv:1801.06396  [pdf, ps, other]

    cs.DB

    Computing Possible and Certain Answers over Order-Incomplete Data

    Authors: Antoine Amarilli, Mouhamadou Lamine Ba, Daniel Deutch, Pierre Senellart

    Abstract: This paper studies the complexity of query evaluation for databases whose relations are partially ordered; the problem commonly arises when combining or transforming ordered data from multiple sources. We focus on queries in a useful fragment of SQL, namely positive relational algebra with aggregates, whose bag semantics we extend to the partially ordered setting. Our semantics leads to the study…

    Submitted 29 May, 2019; v1 submitted 19 January, 2018; originally announced January 2018.

    Comments: 55 pages, 56 references. Extended journal version of arXiv:1707.07222. Up to the stylesheet, page/environment numbering, and possible minor publisher-induced changes, this is the exact content of the journal paper that will appear in Theoretical Computer Science

  29. Connecting Width and Structure in Knowledge Compilation (Extended Version)

    Authors: Antoine Amarilli, Mikaël Monet, Pierre Senellart

    Abstract: Several query evaluation tasks can be done via knowledge compilation: the query result is compiled as a lineage circuit from which the answer can be determined. For such tasks, it is important to leverage some width parameters of the circuit, such as bounded treewidth or pathwidth, to convert the circuit to structured classes, e.g., deterministic structured NNFs (d-SDNNFs) or OBDDs. In this work,…

    Submitted 15 December, 2022; v1 submitted 18 September, 2017; originally announced September 2017.

    Comments: 33 pages, no figures, 40 references. This is the full version with proofs of the corresponding ICDT'18 publication, and it integrates all reviewer feedback. Except for the additional appendices, and except for formatting differences and inessential changes, the contents are the same as in the conference version. Fixed in version 4 a minor omission in the proof of Theorem 33 and small typos

  30. Enumeration on Trees under Relabelings

    Authors: Antoine Amarilli, Pierre Bourhis, Stefan Mengel

    Abstract: We study how to evaluate MSO queries with free variables on trees, within the framework of enumeration algorithms. Previous work has shown how to enumerate answers with linear-time preprocessing and delay linear in the size of each output, i.e., constant-delay for free first-order variables. We extend this result to support relabelings, a restricted kind of update operations on trees which allows…

    Submitted 31 May, 2018; v1 submitted 18 September, 2017; originally announced September 2017.

    Comments: 37 pages including appendix, 31 references. This is the full version with proofs of the corresponding ICDT'18 publication, and it integrates all reviewer feedback. Except for the additional appendices, the contents are exactly the same as in the conference version

  31. Possible and Certain Answers for Queries over Order-Incomplete Data

    Authors: Antoine Amarilli, Mouhamadou Lamine Ba, Daniel Deutch, Pierre Senellart

    Abstract: To combine and query ordered data from multiple sources, one needs to handle uncertainty about the possible orderings. Examples of such "order-incomplete" data include integrated event sequences such as log entries, lists of properties (e.g., hotels and restaurants) ranked by an unknown function reflecting relevance or customer ratings, and documents edited concurrently with an uncertain order on…

    Submitted 26 January, 2018; v1 submitted 22 July, 2017; originally announced July 2017.

    Comments: This paper is the full version with appendices of the TIME'17 article. See also the upcoming journal version: arXiv:1801.06396. Important note: This version (version 2) removes some results because we found a bug in their proofs. See Appendix G for detailed explanations. The journal version also omits the affected results (and does not contain Appendix G)

  32. Topological Sorting under Regular Constraints

    Authors: Antoine Amarilli, Charles Paperman

    Abstract: We introduce the constrained topological sorting problem (CTS): given a regular language K and a directed acyclic graph G with labeled vertices, determine if G has a topological sort that forms a word in K. This natural problem applies to several settings, e.g., scheduling with costs or verifying concurrent programs. We consider the problem CTS[K] where the target language K is fixed, and study it…

    Submitted 30 April, 2018; v1 submitted 13 July, 2017; originally announced July 2017.

    Comments: 45 pages, 31 references in the main text. This is the full version with proofs of the ICALP'18 paper, and is the same as the ICALP proceedings version up to minor publisher-dependent changes. Several important changes with respect to version 1, including fixing some errors. Title changed with respect to version 2

  33. When Can We Answer Queries Using Result-Bounded Data Interfaces?

    Authors: Antoine Amarilli, Michael Benedikt

    Abstract: We consider answering queries where the underlying data is available only over limited interfaces which provide lookup access to the tuples matching a given binding, but possibly restricting the number of output tuples returned. Interfaces imposing such "result bounds" are common in accessing data via the web. Given a query over a set of relations as well as some integrity constraints that relate…

    Submitted 31 August, 2018; v1 submitted 24 June, 2017; originally announced June 2017.

    Comments: 45 pages, 2 tables, 43 references. Complete version with proofs of the PODS'18 paper. The main text of this paper is almost identical to the PODS'18 except that we have fixed some small mistakes. Relative to the earlier arXiv version, many errors were corrected, and some terminology has changed

  34. Conjunctive Queries on Probabilistic Graphs: Combined Complexity

    Authors: Antoine Amarilli, Mikaël Monet, Pierre Senellart

    Abstract: Query evaluation over probabilistic databases is known to be intractable in many cases, even in data complexity, i.e., when the query is fixed. Although some restrictions of the queries [19] and instances [4] have been proposed to lower the complexity, these known tractable cases usually do not apply to combined complexity, i.e., when the query is not fixed. This leaves open the question of which…

    Submitted 27 August, 2019; v1 submitted 9 March, 2017; originally announced March 2017.

    Comments: 36 pages including 4 appendix sections. This is the PODS'17 article with all proofs and all reviewer feedback. Relative to the previous version and to the PODS version, this version adds details about a subtle point in Appendix D, and fixes some minor formatting issues

  35. A Circuit-Based Approach to Efficient Enumeration

    Authors: Antoine Amarilli, Pierre Bourhis, Louis Jachiet, Stefan Mengel

    Abstract: We study the problem of enumerating the satisfying valuations of a circuit while bounding the delay, i.e., the time needed to compute each successive valuation. We focus on the class of structured d-DNNF circuits originally introduced in knowledge compilation, a sub-area of artificial intelligence. We propose an algorithm for these circuits that enumerates valuations with linear preprocessing and…

    Submitted 5 May, 2017; v1 submitted 18 February, 2017; originally announced February 2017.

    Comments: 45 pages, 1 figure, 36 references. Accepted at ICALP'17. This paper is the full version with appendices of the article in the ICALP proceedings. The main text of this full version is the same as the ICALP proceedings version, except some superficial changes (to fit the proceedings version to 12 pages, and to obey LIPIcs-specific formatting requirements)

  36. Top-k Querying of Unknown Values under Order Constraints (Extended Version)

    Authors: Antoine Amarilli, Yael Amsterdamer, Tova Milo, Pierre Senellart

    Abstract: Many practical scenarios make it necessary to evaluate top-k queries over data items with partially unknown values. This paper considers a setting where the values are taken from a numerical domain, and where some partial order constraints are given over known and unknown values: under these constraints, we assume that all possible worlds are equally likely. Our work is the first to propose a prin…

    Submitted 10 January, 2017; originally announced January 2017.

    Comments: 32 pages, 1 figure, 1 algorithm, 51 references. Extended version of paper at ICDT'17

  37. Predicting Completeness in Knowledge Bases

    Authors: Luis Galárraga, Simon Razniewski, Antoine Amarilli, Fabian M. Suchanek

    Abstract: Knowledge bases such as Wikidata, DBpedia, or YAGO contain millions of entities and facts. In some knowledge bases, the correctness of these facts has been evaluated. However, much less is known about their completeness, i.e., the proportion of real facts that the knowledge bases cover. In this work, we investigate different signals to identify the areas where a knowledge base is complete. We show…

    Submitted 17 December, 2016; originally announced December 2016.

    Comments: 21 pages, 19 references, 1 figure, 5 tables. Complete version of the article accepted at WSDM'17

  38. Combined Tractability of Query Evaluation via Tree Automata and Cycluits (Extended Version)

    Authors: Antoine Amarilli, Pierre Bourhis, Mikaël Monet, Pierre Senellart

    Abstract: We investigate parameterizations of both database instances and queries that make query evaluation fixed-parameter tractable in combined complexity. We introduce a new Datalog fragment with stratified negation, intensional-clique-guarded Datalog (ICG-Datalog), with linear-time evaluation on structures of bounded treewidth for programs of bounded rule size. Such programs capture in particular conju…

    Submitted 15 January, 2017; v1 submitted 13 December, 2016; originally announced December 2016.

    Comments: 69 pages, accepted at ICDT'17. Appendix F contains results from an independent upcoming journal paper by Michael Benedikt, Pierre Bourhis, Georg Gottlob, and Pierre Senellart

  39. Challenges for Efficient Query Evaluation on Structured Probabilistic Data

    Authors: Antoine Amarilli, Silviu Maniu, Mikaël Monet

    Abstract: Query answering over probabilistic data is an important task but is generally intractable. However, a new approach for this problem has recently been proposed, based on structural decompositions of input databases, following, e.g., tree decompositions. This paper presents a vision for a database management system for probabilistic data built following this structural approach. We review our existi…

    Submitted 19 July, 2016; originally announced July 2016.

    Comments: 9 pages, 1 figure, 23 references. Accepted for publication at SUM 2016

  40. arXiv:1607.00813  [pdf, other]

    cs.DB cs.LO

    Query Answering with Transitive and Linear-Ordered Data

    Authors: Antoine Amarilli, Michael Benedikt, Pierre Bourhis, Michael Vanden Boom

    Abstract: We consider entailment problems involving powerful constraint languages such as guarded existential rules, in which additional semantic restrictions are put on a set of distinguished relations. We consider restricting a relation to be transitive, restricting a relation to be the transitive closure of another relation, and restricting a relation to be a linear order. We give some natural generaliza…

    Submitted 4 July, 2016; originally announced July 2016.

    Comments: 36 pages. To appear in IJCAI 2016. Extended version with proofs

    Journal ref: A journal version of this conference article was published in JAIR (Volume 63, 2018): https://www.jair.org/index.php/jair/article/view/11240

  41. Tractable Lineages on Treelike Instances: Limits and Extensions

    Authors: Antoine Amarilli, Pierre Bourhis, Pierre Senellart

    Abstract: Query evaluation on probabilistic databases is generally intractable (#P-hard). Existing dichotomy results have identified which queries are tractable (or safe), and connected them to tractable lineages. In our previous work, using different tools, we showed that query evaluation is linear-time on probabilistic databases for arbitrary monadic second-order queries, if we bound the treewidth of the…

    Submitted 12 April, 2023; v1 submitted 10 April, 2016; originally announced April 2016.

    Comments: 36 pages, 2 tables. Version with proofs of the PODS'16 article. Some omitted proofs are available in the thesis of the first author. Includes a corrected proof of Theorem 5.5

  42. Provenance Circuits for Trees and Treelike Instances (Extended Version)

    Authors: Antoine Amarilli, Pierre Bourhis, Pierre Senellart

    Abstract: Query evaluation in monadic second-order logic (MSO) is tractable on trees and treelike instances, even though it is hard for arbitrary instances. This tractability result has been extended to several tasks related to query evaluation, such as counting query results [3] or performing query evaluation on probabilistic trees [10]. These are two examples of the more general problem of computing augme…

    Submitted 27 November, 2015; originally announced November 2015.

    Comments: 48 pages. Presented at ICALP'15

  43. Structurally Tractable Uncertain Data

    Authors: Antoine Amarilli

    Abstract: Many data management applications must deal with data which is uncertain, incomplete, or noisy. However, on existing uncertain data representations, we cannot tractably perform the important query evaluation tasks of determining query possibility, certainty, or probability: these problems are hard on arbitrary uncertain input instances. We thus ask whether we could restrict the structure of uncert…

    Submitted 17 July, 2015; originally announced July 2015.

    Comments: 11 pages, 1 figure, 1 table. To appear in SIGMOD/PODS PhD Symposium 2015

    ACM Class: H.2.1

  44. Finite Open-World Query Answering with Number Restrictions (Extended Version)

    Authors: Antoine Amarilli, Michael Benedikt

    Abstract: Open-world query answering is the problem of deciding, given a set of facts, conjunction of constraints, and query, whether the facts and constraints imply the query. This amounts to reasoning over all instances that include the facts and satisfy the constraints. We study finite open-world query answering (FQA), which assumes that the underlying world is finite and thus only considers the finite c…

    Submitted 15 May, 2015; originally announced May 2015.

    Comments: 59 pages. To appear in LICS 2015. Extended version including proofs

  45. Harvesting Entities from the Web Using Unique Identifiers -- IBEX

    Authors: Aliaksandr Talaika, Joanna Biega, Antoine Amarilli, Fabian M. Suchanek

    Abstract: In this paper we study the prevalence of unique entity identifiers on the Web. These are, e.g., ISBNs (for books), GTINs (for commercial products), DOIs (for documents), email addresses, and others. We show how these identifiers can be harvested systematically from Web pages, and how they can be associated with human-readable names for the entities at large scale. Starting with a simple extracti…

    Submitted 4 May, 2015; originally announced May 2015.

    Comments: 30 pages, 5 figures, 9 tables. Complete technical report for A. Talaika, J. A. Biega, A. Amarilli, and F. M. Suchanek. IBEX: Harvesting Entities from the Web Using Unique Identifiers. WebDB workshop, 2015

  46. arXiv:1505.00326  [pdf, ps, other]

    cs.DB

    Combining Existential Rules and Description Logics (Extended Version)

    Authors: Antoine Amarilli, Michael Benedikt

    Abstract: Query answering under existential rules -- implications with existential quantifiers in the head -- is known to be decidable when imposing restrictions on the rule bodies such as frontier-guardedness [BLM10, BLMS11]. Query answering is also decidable for description logics [Baa03], which further allow disjunction and functionality constraints (assert that certain relations are functions), however,…

    Submitted 2 May, 2015; originally announced May 2015.

    Comments: 32 pages. To appear in IJCAI 2015. Extended version including proofs

    Journal ref: Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI), 2015, pages 2691-2697

  47. arXiv:1404.3131  [pdf, ps, other]

    cs.DB cs.CC cs.LO

    The Possibility Problem for Probabilistic XML (Extended Version)

    Authors: Antoine Amarilli

    Abstract: We consider the possibility problem of determining if a document is a possible world of a probabilistic document, in the setting of probabilistic XML. This basic question is a special case of query answering or tree automata evaluation, but it has specific practical uses, such as checking whether a user-provided probabilistic document outcome is possible or sufficiently likely. In this paper, we…

    Submitted 22 July, 2014; v1 submitted 11 April, 2014; originally announced April 2014.

    Comments: 20 pages, 1 table, 2 figures. This is the complete version (including proofs) of work initially submitted as an extended abstract (without proofs) at the AMW 2014 workshop and subsequently submitted (with proofs) at the BDA 2014 conference (no formal proceedings). This version integrates the feedback from both rounds of reviews

    ACM Class: H.2.3; E.1

  48. Uncertainty in Crowd Data Sourcing under Structural Constraints

    Authors: Antoine Amarilli, Yael Amsterdamer, Tova Milo

    Abstract: Applications extracting data from crowdsourcing platforms must deal with the uncertainty of crowd answers in two different ways: first, by deriving estimates of the correct value from the answers; second, by choosing crowd questions whose answers are expected to minimize this uncertainty relative to the overall data collection goal. Such problems are already challenging when we assume that questio…

    Submitted 4 March, 2014; originally announced March 2014.

    Comments: 8 pages, vision paper. To appear at UnCrowd 2014

    ACM Class: H.2.8

  49. arXiv:1312.3248  [pdf, other]

    cs.DB cs.CC cs.IR

    On the Complexity of Mining Itemsets from the Crowd Using Taxonomies

    Authors: Antoine Amarilli, Yael Amsterdamer, Tova Milo

    Abstract: We study the problem of frequent itemset mining in domains where data is not recorded in a conventional database but only exists in human knowledge. We provide examples of such scenarios, and present a crowdsourcing model for them. The model uses the crowd as an oracle to find out whether an itemset is frequent or not, and relies on a known taxonomy of the item domain to guide the search for frequ…

    Submitted 16 December, 2013; v1 submitted 11 December, 2013; originally announced December 2013.

    Comments: 18 pages, 2 figures. To be published to ICDT'13. Added missing acknowledgement

    ACM Class: H.2.8

  50. arXiv:1207.2819  [pdf, other]

    cs.FL

    A Proof of the Pumping Lemma for Context-Free Languages Through Pushdown Automata

    Authors: Antoine Amarilli, Marc Jeanmougin

    Abstract: The pumping lemma for context-free languages is a result about pushdown automata which is strikingly similar to the well-known pumping lemma for regular languages. However, though the lemma for regular languages is simply proved by using the pigeonhole principle on deterministic automata, the lemma for pushdown automata is proven through an equivalence with context-free languages and through the m…

    Submitted 7 July, 2013; v1 submitted 11 July, 2012; originally announced July 2012.

    Comments: Corrected a typo in a definition, added related work, added acknowledgement, added note about proving Ogden's lemma