Showing 1–12 of 12 results for author: Haas, P J

  1. arXiv:2307.02860  [pdf, other]

    cs.DB

    Scaling Package Queries to a Billion Tuples via Hierarchical Partitioning and Customized Optimization

    Authors: Anh L. Mai, Pengyu Wang, Azza Abouzied, Matteo Brucato, Peter J. Haas, Alexandra Meliou

    Abstract: A package query returns a package - a multiset of tuples - that maximizes or minimizes a linear objective function subject to linear constraints, thereby enabling in-database decision support. Prior work has established the equivalence of package queries to Integer Linear Programs (ILPs) and developed the SketchRefine algorithm for package query processing. While this algorithm was an important fi…

    Submitted 14 November, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

  2. arXiv:2212.13643  [pdf, other]

    cs.HC cs.DB

    Understanding Business Users' Data-Driven Decision-Making: Practices, Challenges, and Opportunities

    Authors: Sneha Gathani, Zhicheng Liu, Peter J. Haas, Çağatay Demiralp

    Abstract: Business users perform data analysis to inform decisions for improving business processes and outcomes despite having limited formal technical training. While earlier work has focused on data analysts' and data scientists' practices and challenges, little is known about business users' decision-making practices and how they incorporate data and visual analytics into their workflows. To address thi…

    Submitted 17 October, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

    Comments: Submitted to IEEE TVCG

  3. arXiv:2109.06160  [pdf, other]

    cs.DB cs.HC cs.LG

    Augmenting Decision Making via Interactive What-If Analysis

    Authors: Sneha Gathani, Madelon Hulsebos, James Gale, Peter J. Haas, Çağatay Demiralp

    Abstract: The fundamental goal of business data analysis is to improve business decisions using data. Business users often make decisions to achieve key performance indicators (KPIs) such as increasing customer retention or sales, or decreasing costs. To discover the relationship between data attributes hypothesized to be drivers and those corresponding to KPIs of interest, business users currently need to…

    Submitted 8 February, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: CIDR'22

  4. arXiv:2105.10809  [pdf, ps, other]

    stat.ME

    Exact PPS Sampling with Bounded Sample Size

    Authors: Brian Hentschel, Peter J. Haas, Yuanyuan Tian

    Abstract: Probability proportional to size (PPS) sampling schemes with a target sample size aim to produce a sample comprising a specified number $n$ of items while ensuring that each item in the population appears in the sample with a probability proportional to its specified "weight" (also called its "size"). These two objectives, however, cannot always be achieved simultaneously. Existing PPS schemes pri…

    Submitted 22 May, 2021; originally announced May 2021.

  5. Stochastic Package Queries in Probabilistic Databases

    Authors: Matteo Brucato, Nishant Yadav, Azza Abouzied, Peter J. Haas, Alexandra Meliou

    Abstract: We provide methods for in-database support of decision making under uncertainty. Many important decision problems correspond to selecting a package (bag of tuples in a relational database) that jointly satisfy a set of constraints while minimizing some overall cost function; in most real-world problems, the data is uncertain. We provide methods for specifying -- via a SQL extension -- and processi…

    Submitted 11 March, 2021; originally announced March 2021.

    Journal ref: SIGMOD 2020

  6. arXiv:1906.05677  [pdf, other]

    cs.DB cs.DC cs.LG

    Temporally-Biased Sampling Schemes for Online Model Management

    Authors: Brian Hentschel, Peter J. Haas, Yuanyuan Tian

    Abstract: To maintain the accuracy of supervised learning models in the presence of evolving data streams, we provide temporally-biased sampling schemes that weight recent data most heavily, with inclusion probabilities for a given data item decaying over time according to a specified "decay function". We then periodically retrain the models on the current sample. This approach speeds up the training proces…

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: 49 pages, 18 figures. arXiv admin note: substantial text overlap with arXiv:1801.09709

  7. arXiv:1808.08294  [pdf, other]

    cs.LG stat.ML

    Unknown Examples & Machine Learning Model Generalization

    Authors: Yeounoh Chung, Peter J. Haas, Eli Upfal, Tim Kraska

    Abstract: Over the past decades, researchers and ML practitioners have come up with better and better ways to build, understand and improve the quality of ML models, but mostly under the key assumption that the training data is distributed identically to the testing data. In many real-world applications, however, some potential training examples are unknown to the modeler, due to sample selection bias or, m…

    Submitted 11 October, 2019; v1 submitted 24 August, 2018; originally announced August 2018.

  8. arXiv:1801.09709  [pdf, other]

    cs.DB

    Temporally-Biased Sampling for Online Model Management

    Authors: Brian Hentschel, Peter J. Haas, Yuanyuan Tian

    Abstract: To maintain the accuracy of supervised learning models in the presence of evolving data streams, we provide temporally-biased sampling schemes that weight recent data most heavily, with inclusion probabilities for a given data item decaying exponentially over time. We then periodically retrain the models on the current sample. This approach speeds up the training process relative to training on al…

    Submitted 29 January, 2018; originally announced January 2018.

    Comments: 17 pages, 14 figures, extended version of an EDBT'18 paper

  9. arXiv:1709.10513  [pdf, other]

    cs.HC

    Foresight: Rapid Data Exploration Through Guideposts

    Authors: Çağatay Demiralp, Peter J. Haas, Srinivasan Parthasarathy, Tejaswini Pedapati

    Abstract: Current tools for exploratory data analysis (EDA) require users to manually select data attributes, statistical computations and visual encodings. This can be daunting for large-scale, complex data. We introduce Foresight, a visualization recommender system that helps the user rapidly explore large high-dimensional datasets through "guideposts." A guidepost is a visualization corresponding to a pr…

    Submitted 29 September, 2017; originally announced September 2017.

    Comments: IEEE VIS'17 Data Systems and Interactive Analysis (DSIA) Workshop

  10. arXiv:1707.03877  [pdf, other]

    cs.DB

    Foresight: Recommending Visual Insights

    Authors: Çağatay Demiralp, Peter J. Haas, Srinivasan Parthasarathy, Tejaswini Pedapati

    Abstract: Current tools for exploratory data analysis (EDA) require users to manually select data attributes, statistical computations and visual encodings. This can be daunting for large-scale, complex data. We introduce Foresight, a system that helps the user rapidly discover visual insights from large high-dimensional datasets. Formally, an "insight" is a strong manifestation of a statistical property of…

    Submitted 12 July, 2017; originally announced July 2017.

  11. arXiv:1401.4470  [pdf, ps, other]

    physics.gen-ph

    Biquaternion formulation of relativistic tensor dynamics

    Authors: E. P. J. de Haas

    Abstract: In this paper we show how relativistic tensor dynamics and relativistic electrodynamics can be formulated in a biquaternion tensor language. The treatment is restricted to mathematical physics, known facts as the Lorentz Force Law and the Lagrange Equation are presented in a relatively new formalism. The goal is to fuse anti-symmetric tensor dynamics, as used for example in relativistic electrodyn…

    Submitted 15 October, 2013; originally announced January 2014.

    Comments: 18 pages

    Journal ref: Apeiron, Vol. 15, No. 4, October 2008, 358-381

  12. The combination of de Broglie's Harmony of the Phases and Mie's theory of gravity results in a Principle of Equivalence for Quantum Gravity

    Authors: E. P. J. de Haas

    Abstract: Under a Lorentz-transformation, Mie's 1912 gravitational mass behaves identical as de Broglie's 1923 clock-like frequency. The same goes for Mie's inertial mass and de Broglie's wave-like frequency. This allows the interpretation of de Broglie's "Harmony of the Phases" as a "Principle of Equivalence" for Quantum Gravity. Thus, the particle-wave duality can be given a realist interpretation. The…

    Submitted 21 July, 2005; originally announced July 2005.

    Comments: 8 pages, 2 figures

    Journal ref: Annales Fond.Broglie 29 (2004) 707-726