-
Compressing Tabular Data via Latent Variable Estimation
Authors:
Andrea Montanari,
Eric Weiner
Abstract:
Data used for analytics and machine learning often take the form of tables with categorical entries. We introduce a family of lossless compression algorithms for such data that proceed in four steps: $(i)$ Estimate latent variables associated with rows and columns; $(ii)$ Partition the table into blocks according to the row/column latents; $(iii)$ Apply a sequential (e.g. Lempel-Ziv) coder to each of the blocks; $(iv)$ Append a compressed encoding of the latents.
We evaluate these algorithms on several benchmark datasets, and study optimal compression in a probabilistic model for tabular data, whereby latent values are independent and table entries are conditionally independent given the latent values. We prove that the model has a well-defined entropy rate and satisfies an asymptotic equipartition property. We also prove that classical compression schemes such as Lempel-Ziv and finite-state encoders do not achieve this rate. On the other hand, the latent estimation strategy outlined above achieves the optimal rate.
Submitted 20 February, 2023;
originally announced February 2023.
-
Explaining Reinforcement Learning Policies through Counterfactual Trajectories
Authors:
Julius Frost,
Olivia Watkins,
Eric Weiner,
Pieter Abbeel,
Trevor Darrell,
Bryan Plummer,
Kate Saenko
Abstract:
In order for humans to confidently decide where to employ RL agents for real-world tasks, a human developer must validate that the agent will perform well at test-time. Some policy interpretability methods facilitate this by capturing the policy's decision making in a set of agent rollouts. However, even the most informative trajectories of training time behavior may give little insight into the agent's behavior out of distribution. In contrast, our method conveys how the agent performs under distribution shifts by showing the agent's behavior across a wider trajectory distribution. We generate these trajectories by guiding the agent to more diverse unseen states and showing the agent's behavior there. In a user study, we demonstrate that our method enables users to score better than baseline methods on one of two agent validation tasks.
Submitted 18 March, 2022; v1 submitted 28 January, 2022;
originally announced January 2022.
-
MCTDH-X: The multiconfigurational time-dependent Hartree method for indistinguishable particles software
Authors:
Rui Lin,
Paolo Molignini,
Luca Papariello,
Marios C. Tsatsos,
Camille Lévêque,
Storm E. Weiner,
Elke Fasshauer,
R. Chitra,
Axel U. J. Lode
Abstract:
We introduce and describe the multiconfigurational time-dependent Hartree for indistinguishable particles (MCTDH-X) software. This powerful tool allows the investigation of ground state properties and dynamics of interacting quantum many-body systems in different spatial dimensions. The MCTDH-X software is a set of programs and scripts to compute, analyze, and visualize solutions for the time-dependent and time-independent many-body Schrödinger equation for indistinguishable quantum particles. As the MCTDH-X software represents a general solver for the Schrödinger equation, it is applicable to a wide range of problems in the fields of atomic, optical, molecular physics as well as condensed matter systems. In particular, it can be used to study light-matter interactions, correlated dynamics of electrons, as well as some aspects related to quantum information and computing. The MCTDH-X software solves a set of non-linear coupled working equations based on the application of the variational principle to the Schrödinger equation. These equations are obtained by using an ansatz for the many-body wavefunction that is a time-dependent expansion in a set of time-dependent many-body basis states. The time-dependence of the basis set enables MCTDH-X to deal with quantum dynamics at a superior accuracy as compared to, for instance, exact diagonalization approaches. Herein, we give an introduction to the MCTDH-X software via an easy-to-follow tutorial with a focus on accessibility. We use the double well to illustrate the fermionization of bosonic particles, the crystallization of fermionic particles, characteristics of the superfluid and Mott-insulator quantum phases in Hubbard models, and even dynamical quantum phase transitions. Our tutorial guides the potential user to apply the MCTDH-X software also to more complex systems.
Submitted 16 January, 2020; v1 submitted 1 November, 2019;
originally announced November 2019.
-
Program Language Translation Using a Grammar-Driven Tree-to-Tree Model
Authors:
Mehdi Drissi,
Olivia Watkins,
Aditya Khant,
Vivaswat Ojha,
Pedro Sandoval,
Rakia Segev,
Eric Weiner,
Robert Keller
Abstract:
The task of translating between programming languages differs from the challenge of translating natural languages in that programming languages are designed with a far more rigid set of structural and grammatical rules. Previous work has used a tree-to-tree encoder/decoder model to take advantage of the inherent tree structure of programs during translation. Neural decoders, however, by default do not exploit known grammar rules of the target language. In this paper, we describe a tree decoder that leverages knowledge of a language's grammar rules to exclusively generate syntactically correct programs. We find that this grammar-based tree-to-tree model outperforms the state-of-the-art tree-to-tree model in translating between two programming languages on a previously used synthetic task.
Submitted 4 July, 2018;
originally announced July 2018.
-
Angular momentum in interacting many-body systems hides in phantom vortices
Authors:
Storm E. Weiner,
Marios C. Tsatsos,
Lorenz S. Cederbaum,
Axel U. J. Lode
Abstract:
Vortices are essential to angular momentum in quantum systems such as ultracold atomic gases. The existence of quantized vorticity in bosonic systems stimulated the development of the Gross-Pitaevskii mean-field approximation. However, the true dynamics of angular momentum in finite, interacting many-body systems like trapped Bose-Einstein condensates is enriched by the emergence of quantum correlations whose description demands more elaborate methods. Herein we theoretically investigate the full many-body dynamics of the acquisition of angular momentum by a gas of ultracold bosons in two dimensions using a standard rotation procedure. We demonstrate the existence of a novel mode of quantized vorticity, which we term the $\textit{phantom vortex}$ that, contrary to the conventional mean-field vortex, can be detected as a topological defect of spatial coherence, but $\textit{not}$ of the density. We describe previously unknown many-body mechanisms of vortex nucleation and show that angular momentum is hidden in phantom vortex modes which so far seem to have evaded experimental detection.
Submitted 6 December, 2016; v1 submitted 26 September, 2014;
originally announced September 2014.