Showing 1–18 of 18 results for author: Gilad, A

Searching in archive cs.
  1. arXiv:2404.04352

    cs.DB

    Qr-Hint: Actionable Hints Towards Correcting Wrong SQL Queries

    Authors: Yihao Hu, Amir Gilad, Kristin Stephens-Martinez, Sudeepa Roy, Jun Yang

    Abstract: We describe a system called Qr-Hint that, given a (correct) target query Q* and a (wrong) working query Q, both expressed in SQL, provides actionable hints for the user to fix the working query so that it becomes semantically equivalent to the target. It is particularly useful in an educational setting, where novices can receive help from Qr-Hint without requiring extensive personal tutoring. Sinc…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: SIGMOD 2024

  2. arXiv:2310.04424

    cs.NE cs.AI cs.LG q-bio.MN

    Stability Analysis of Non-Linear Classifiers using Gene Regulatory Neural Network for Biological AI

    Authors: Adrian Ratwatte, Samitha Somathilaka, Sasitharan Balasubramaniam, Assaf A. Gilad

    Abstract: The Gene Regulatory Network (GRN) of biological cells governs a number of key functionalities that enables them to adapt and survive through different environmental conditions. Close observation of the GRN shows that the structure and operational principles resembles an Artificial Neural Network (ANN), which can pave the way for the development of Biological Artificial Intelligence. In particular,…

    Submitted 14 September, 2023; originally announced October 2023.

  3. arXiv:2309.08574

    cs.DB cs.CR

    DP-PQD: Privately Detecting Per-Query Gaps In Synthetic Data Generated By Black-Box Mechanisms

    Authors: Shweta Patwa, Danyu Sun, Amir Gilad, Ashwin Machanavajjhala, Sudeepa Roy

    Abstract: Synthetic data generation methods, and in particular, private synthetic data generation methods, are gaining popularity as a means to make copies of sensitive databases that can be shared widely for research and data analysis. Some of the fundamental operations in data analysis include analyzing aggregated statistics, e.g., count, sum, or median, on a subset of data satisfying some conditions. Whe…

    Submitted 15 September, 2023; originally announced September 2023.

  4. arXiv:2212.12104

    cs.DB

    The Consistency of Probabilistic Databases with Independent Cells

    Authors: Amir Gilad, Aviram Imber, Benny Kimelfeld

    Abstract: A probabilistic database with attribute-level uncertainty consists of relations where cells of some attributes may hold probability distributions rather than deterministic content. Such databases arise, implicitly or explicitly, in the context of noisy operations such as missing data imputation, where we automatically fill in missing values, column prediction, where we predict unknown attributes,…

    Submitted 22 December, 2022; originally announced December 2022.

    Comments: Full version of the ICDT 2023 paper with the same title

  5. arXiv:2212.10310

    cs.CR cs.CY cs.DB

    PreFair: Privately Generating Justifiably Fair Synthetic Data

    Authors: David Pujol, Amir Gilad, Ashwin Machanavajjhala

    Abstract: When a database is protected by Differential Privacy (DP), its usability is limited in scope. In this scenario, generating a synthetic version of the data that mimics the properties of the private data allows users to perform any operation on the synthetic data, while maintaining the privacy of the original data. Therefore, multiple works have been devoted to devising systems for DP synthetic data…

    Submitted 27 March, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: 15 pages, 11 figures

  6. arXiv:2209.06260

    cs.DB

    FEDEX: An Explainability Framework for Data Exploration Steps

    Authors: Daniel Deutch, Amir Gilad, Tova Milo, Amit Mualem, Amit Somech

    Abstract: When exploring a new dataset, Data Scientists often apply analysis queries, look for insights in the resulting dataframe, and repeat to apply further queries. We propose in this paper a novel solution that assists data scientists in this laborious process. In a nutshell, our solution pinpoints the most interesting (sets of) rows in each obtained dataframe. Uniquely, our definition of interest is b…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: Full version of the VLDB paper with the same title

  7. arXiv:2209.01286

    cs.DB

    DPXPlain: Privately Explaining Aggregate Query Answers

    Authors: Yuchao Tao, Amir Gilad, Ashwin Machanavajjhala, Sudeepa Roy

    Abstract: Differential privacy (DP) is the state-of-the-art and rigorous notion of privacy for answering aggregate database queries while preserving the privacy of sensitive information in the data. In today's era of data analysis, however, it poses new challenges for users to understand the trends and anomalies observed in the query results: Is the unexpected answer due to the data itself, or is it due to…

    Submitted 2 September, 2022; originally announced September 2022.

  8. arXiv:2203.14692

    cs.DB

    HypeR: Hypothetical Reasoning With What-If and How-To Queries Using a Probabilistic Causal Approach

    Authors: Sainyam Galhotra, Amir Gilad, Sudeepa Roy, Babak Salimi

    Abstract: What-if (provisioning for an update to a database) and how-to (how to modify the database to achieve a goal) analyses provide insights to users who wish to examine hypothetical scenarios without making actual changes to a database and thereby help plan strategies in their fields. Typically, such analyses are done by testing the effect of an update in the existing database on a specific view create…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Full version of the SIGMOD 2022 paper with the same title

  9. arXiv:2202.11160

    cs.DB

    Understanding Queries by Conditional Instances

    Authors: Amir Gilad, Zhengjie Miao, Sudeepa Roy, Jun Yang

    Abstract: A powerful way to understand a complex query is by observing how it operates on data instances. However, specific database instances are not ideal for such observations: they often include large amounts of superfluous details that are not only irrelevant to understanding the query but also cause cognitive overload; and one specific database may not be enough. Given a relational query, is it possib…

    Submitted 22 February, 2022; originally announced February 2022.

  10. arXiv:2202.04039

    cs.NE q-bio.BM

    Using Genetic Programming to Predict and Optimize Protein Function

    Authors: Iliya Miralavy, Alexander Bricco, Assaf Gilad, Wolfgang Banzhaf

    Abstract: Protein engineers conventionally use tools such as Directed Evolution to find new proteins with better functionalities and traits. More recently, computational techniques and especially machine learning approaches have been recruited to assist Directed Evolution, showing promising results. In this paper, we propose POET, a computational Genetic Programming tool based on evolutionary computation me…

    Submitted 22 February, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

    Comments: 23 pages, 8 figures and 4 tables

  11. arXiv:2105.10591

    cs.SI

    Heterogeneous Treatment Effects in Social Networks

    Authors: Amir Gilad, Harsh Parikh, Sudeepa Roy, Babak Salimi

    Abstract: We study treatment effect modifiers for causal analysis in a social network, where neighbors' characteristics or network structure may affect the outcome of a unit, and the goal is to identify sub-populations with varying treatment effects using such network properties. We propose a novel framework for this purpose that facilitates data-driven decision making by testing hypotheses about complex ef…

    Submitted 7 November, 2021; v1 submitted 21 May, 2021; originally announced May 2021.

  12. arXiv:2103.14435

    cs.DB

    Synthesizing Linked Data Under Cardinality and Integrity Constraints

    Authors: Amir Gilad, Shweta Patwa, Ashwin Machanavajjhala

    Abstract: The generation of synthetic data is useful in multiple aspects, from testing applications to benchmarking to privacy preservation. Generating the links between relations, subject to cardinality constraints (CCs) and integrity constraints (ICs) is an important aspect of this problem. Given instances of two relations, where one has a foreign key dependence on the other and is missing its foreign key…

    Submitted 26 March, 2021; originally announced March 2021.

  13. arXiv:2103.00288

    cs.DB

    On Optimizing the Trade-off between Privacy and Utility in Data Provenance

    Authors: Daniel Deutch, Ariel Frankenthal, Amir Gilad, Yuval Moskovitch

    Abstract: Organizations that collect and analyze data may wish or be mandated by regulation to justify and explain their analysis results. At the same time, the logic that they have followed to analyze the data, i.e., their queries, may be proprietary and confidential. Data provenance, a record of the transformations that data underwent, was extensively studied as means of explanations. In contrast, only a…

    Submitted 27 February, 2021; originally announced March 2021.

  14. Towards Inferring Queries from Simple and Partial Provenance Examples

    Authors: Amir Gilad, Yuval Moskovitch

    Abstract: The field of query-by-example aims at inferring queries from output examples given by non-expert users, by finding the underlying logic that binds the examples. However, for a very small set of examples, it is difficult to correctly infer such logic. To bridge this gap, previous work suggested attaching explanations to each output example, modeled as provenance, allowing users to explain the reaso…

    Submitted 20 August, 2020; originally announced August 2020.

  15. Explaining Natural Language Query Results

    Authors: Daniel Deutch, Nave Frost, Amir Gilad

    Abstract: Multiple lines of research have developed Natural Language (NL) interfaces for formulating database queries. We build upon this work, but focus on presenting a highly detailed form of the answers in NL. The answers that we present are importantly based on the provenance of tuples in the query result, detailing not only the results but also their explanations. We develop a novel method for transfor…

    Submitted 8 July, 2020; originally announced July 2020.

    Journal ref: The VLDB Journal 29, pp. 485–508 (2020)

  16. T-REx: Table Repair Explanations

    Authors: Daniel Deutch, Nave Frost, Amir Gilad, Oren Sheffer

    Abstract: Data repair is a common and crucial step in many frameworks today, as applications may use data from different sources and of different levels of credibility. Thus, this step has been the focus of many works, proposing diverse approaches. To assist users in understanding the output of such data repair algorithms, we propose T-REx, a system for providing data repair explanations through Shapley val…

    Submitted 8 July, 2020; originally announced July 2020.

    Journal ref: In Proceedings of the 2020 ACM SIGMOD. Association for Computing Machinery, New York, NY, USA, pages: 2765 to 2768 (2020)

  17. On Multiple Semantics for Declarative Database Repairs

    Authors: Amir Gilad, Daniel Deutch, Sudeepa Roy

    Abstract: We study the problem of database repairs through a rule-based framework that we refer to as Delta Rules. Delta Rules are highly expressive and allow specifying complex, cross-relations repair logic associated with Denial Constraints, Causal Rules, and allowing to capture Database Triggers of interest. We show that there are no one-size-fits-all semantics for repairs in this inclusive setting, and…

    Submitted 12 April, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

    Journal ref: SIGMOD 2020

  18. arXiv:1602.03819

    cs.DB

    Query By Provenance

    Authors: Daniel Deutch, Amir Gilad

    Abstract: To assist non-specialists in formulating database queries, multiple frameworks that automatically infer queries from a set of examples have been proposed. While highly useful, a shortcoming of the approach is that if users can only provide a small set of examples, many inherently different queries may qualify, and only some of these actually match the user intentions. Our main observation is that…

    Submitted 16 May, 2016; v1 submitted 11 February, 2016; originally announced February 2016.