
Showing 1–50 of 661 results for author: Wilson, A

  1. arXiv:2406.11463

    cs.LG stat.ML

    Just How Flexible are Neural Networks in Practice?

    Authors: Ravid Shwartz-Ziv, Micah Goldblum, Arpit Bansal, C. Bayan Bruss, Yann LeCun, Andrew Gordon Wilson

    Abstract: It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters, underpinning notions of overparameterized and underparameterized models. In practice, however, we only find solutions accessible via our training procedure, including the optimizer and regularizers, limiting flexibility. Moreover, the exact parameterization of the function c…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2406.09177

    stat.ML cs.LG

    Scalable and Flexible Causal Discovery with an Efficient Test for Adjacency

    Authors: Alan Nawzad Amin, Andrew Gordon Wilson

    Abstract: To make accurate predictions, understand mechanisms, and design interventions in systems of many variables, we wish to learn causal graphs from large scale data. Unfortunately the space of all possible causal graphs is enormous so scalably and accurately searching for the best fit to the data is a challenge. In principle we could substantially decrease the search space, or learn the graph entirely…

    Submitted 18 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: ICML 2024; Code at https://github.com/AlanNawzadAmin/DAT-graph

  3. arXiv:2406.08391

    cs.LG cs.AI cs.CL stat.ML

    Large Language Models Must Be Taught to Know What They Don't Know

    Authors: Sanyam Kapoor, Nate Gruver, Manley Roberts, Katherine Collins, Arka Pal, Umang Bhatt, Adrian Weller, Samuel Dooley, Micah Goldblum, Andrew Gordon Wilson

    Abstract: When using large language models (LLMs) in high-stakes applications, we need to know when we can trust their predictions. Some works argue that prompting high-performance LLMs is sufficient to produce calibrated uncertainties, while others introduce sampling methods that can be prohibitively expensive. In this work, we first argue that prompting on its own is insufficient to achieve good calibrati…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Code available at: https://github.com/activatedgeek/calibration-tuning

  4. arXiv:2406.07337

    cs.LG

    Transferring Knowledge from Large Foundation Models to Small Downstream Models

    Authors: Shikai Qiu, Boran Han, Danielle C. Maddix, Shuai Zhang, Yuyang Wang, Andrew Gordon Wilson

    Abstract: How do we transfer the relevant knowledge from ever larger foundation models into small, task-specific downstream models that can run at much lower costs? Standard transfer learning using pre-trained weights as the initialization transfers limited information and commits us to often massive pre-trained architectures. This procedure also precludes combining multiple pre-trained models that learn co…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ICML 2024. Code available at https://github.com/amazon-science/adaptive-feature-transfer

  5. arXiv:2406.06740

    astro-ph.GA

    All-sky three-dimensional dust density and extinction Maps of the Milky Way out to 2.8 kpc

    Authors: T. E. Dharmawardena, C. A. L. Bailer-Jones, M. Fouesneau, D. Foreman-Mackey, P. Coronica, T. Colnaghi, T. Müller, A. G. Wilson

    Abstract: Three-dimensional dust density maps are crucial for understanding the structure of the interstellar medium of the Milky Way and the processes that shape it. However, constructing these maps requires large datasets and the methods used to analyse them are computationally expensive and difficult to scale up. As a result it has only recently become possible to map kiloparsec-scale regions of our G…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted for publication in MNRAS; 13 pages in main document and 7 pages in appendix

  6. arXiv:2406.06248

    cs.LG

    Compute Better Spent: Replacing Dense Layers with Structured Matrices

    Authors: Shikai Qiu, Andres Potapczynski, Marc Finzi, Micah Goldblum, Andrew Gordon Wilson

    Abstract: Dense linear layers are the dominant computational bottleneck in foundation models. Identifying more efficient alternatives to dense matrices has enormous potential for building more compute-efficient models, as exemplified by the success of convolutional networks in the image domain. In this work, we systematically explore structured matrices as replacements for dense matrices. We show that diffe…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: ICML 24. Code available at https://github.com/shikaiqiu/compute-better-spent

  7. arXiv:2406.02522

    q-bio.CB astro-ph.EP astro-ph.IM physics.pop-ph

    Lichen-Mediated Self-Growing Construction Materials for Habitat Outfitting on Mars

    Authors: Nisha Rokaya, Erin C. Carr, Richard A. Wilson, Congrui Jin

    Abstract: As its next step in space exploration, the National Aeronautics and Space Administration (NASA) revealed plans to establish a permanent human presence on Mars. Habitat outfitting, i.e., the technology to provide the crew with the necessary equipment to perform mission tasks as well as a comfortable, safe, and livable habitable volume, has not been fully explored yet. This study proposes that, rath…

    Submitted 13 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  8. arXiv:2405.14812

    cs.CY

    As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making

    Authors: Shomik Jain, D Calacci, Ashia Wilson

    Abstract: We investigate the phenomenon of norm inconsistency: where LLMs apply different norms in similar situations. Specifically, we focus on the high-risk application of deciding whether to call the police in Amazon Ring home surveillance videos. We evaluate the decisions of three state-of-the-art LLMs -- GPT-4, Gemini 1.0, and Claude 3 Sonnet -- in relation to the activities portrayed in the videos, th…

    Submitted 23 May, 2024; originally announced May 2024.

  9. arXiv:2405.00812

    physics.app-ph physics.flu-dyn

    Explosively driven Richtmyer--Meshkov instability jet suppression and enhancement via coupling machine learning and additive manufacturing

    Authors: Dane M. Sterbentz, Dylan J. Kline, Daniel A. White, Charles F. Jekel, Michael P. Hennessey, David K. Amondson, Abigail J. Wilson, Max J. Sevcik, Matthew F. L. Villena, Steve S. Lin, Michael D. Grapes, Kyle T. Sullivan, Jonathan L. Belof

    Abstract: The ability to control the behavior of fluid instabilities at material interfaces, such as the shock-driven Richtmyer--Meshkov instability, is a grand technological challenge with a broad number of applications ranging from inertial confinement fusion experiments to explosively driven shaped charges. In this work, we use a linear-geometry shaped charge as a means of studying methods for controllin…

    Submitted 1 May, 2024; originally announced May 2024.

    Report number: LLNL-JRNL-862011

  10. arXiv:2405.00740

    cs.CV cs.AI cs.CL cs.LG

    Modeling Caption Diversity in Contrastive Vision-Language Pretraining

    Authors: Samuel Lavoie, Polina Kirichenko, Mark Ibrahim, Mahmoud Assran, Andrew Gordon Wilson, Aaron Courville, Nicolas Ballas

    Abstract: There are a thousand ways to caption an image. Contrastive Language Pretraining (CLIP) on the other hand, works by mapping an image and its caption to a single vector -- limiting how well CLIP-like models can represent the diverse ways to describe an image. In this work, we introduce Llip, Latent Language Image Pretraining, which models the diversity of captions that could match an image. Llip's v…

    Submitted 14 May, 2024; v1 submitted 29 April, 2024; originally announced May 2024.

    Comments: 14 pages, 8 figures, 7 tables, to be published at ICML2024

  11. arXiv:2404.18938

    physics.gen-ph hep-ph

    On possible embeddings of the standard model of particle physics and gravity in $E_8$

    Authors: Robert A. Wilson

    Abstract: I investigate the structure of $E_8$ under the action of the subalgebra/subgroup $A_1+G_2+C_3$, as a potential route to unification of the fundamental forces of nature into a single algebraic structure. The particular real form $E_{8(-24)}$ supports a decomposition into compact $G_2$ plus split $A_1+C_3$, which allows a restriction from $G_2$ to $SU(3)$ for the strong force, together with split…

    Submitted 6 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 18 pages. v2: extended the discussion of gravity, revised the title and abstract accordingly

  12. arXiv:2404.14952

    cs.CV cs.AI

    Leveraging Speech for Gesture Detection in Multimodal Communication

    Authors: Esam Ghaleb, Ilya Burenko, Marlou Rasenberg, Wim Pouw, Ivan Toni, Peter Uhrig, Anna Wilson, Judith Holler, Aslı Özyürek, Raquel Fernández

    Abstract: Gestures are inherent to human interaction and often complement speech in face-to-face communication, forming a multimodal communication system. An important task in gesture analysis is detecting a gesture's beginning and end. Research on automatic gesture detection has primarily focused on visual and kinematic information to detect a limited set of isolated or silent gestures with low variability…

    Submitted 23 April, 2024; originally announced April 2024.

  13. arXiv:2404.10214

    quant-ph

    Simulating Chemistry on Bosonic Quantum Devices

    Authors: Rishab Dutta, Delmar G. A. Cabral, Ningyi Lyu, Nam P. Vu, Yuchen Wang, Brandon Allen, Xiaohan Dan, Rodrigo G. Cortiñas, Pouya Khazaei, Max Schäfer, Alejandro C. C. d. Albornoz, Scott E. Smart, Scott Nie, Michel H. Devoret, David A. Mazziotti, Prineha Narang, Chen Wang, James D. Whitfield, Angela K. Wilson, Heidi P. Hendrickson, Daniel A. Lidar, Francisco Pérez-Bernal, Lea F. Santos, Sabre Kais, Eitan Geva, et al. (1 additional author not shown)

    Abstract: Bosonic quantum devices offer a novel approach to realize quantum computations, where the quantum two-level system (qubit) is replaced with the quantum (an)harmonic oscillator (qumode) as the fundamental building block of the quantum simulator. The simulation of chemical structure and dynamics can then be achieved by representing or mapping the system Hamiltonians in terms of bosonic operators. In…

    Submitted 12 June, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: 40 pages including references, 13 figures

  14. arXiv:2404.08592

    cs.CY

    Scarce Resource Allocations That Rely On Machine Learning Should Be Randomized

    Authors: Shomik Jain, Kathleen Creel, Ashia Wilson

    Abstract: Contrary to traditional deterministic notions of algorithmic fairness, this paper argues that fairly allocating scarce resources using machine learning often requires randomness. We address why, when, and how to randomize by proposing stochastic procedures that more adequately account for all of the claims that individuals have to allocations of social goods or opportunities.

    Submitted 19 June, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: To appear in the proceedings of the International Conference on Machine Learning (ICML 2024)

    ACM Class: K.4.0

  15. arXiv:2404.04278

    physics.gen-ph

    A Clifford algebra model in phase space

    Authors: Robert A. Wilson

    Abstract: I show how the isomorphism between the Lie groups of types $B_2$ and $C_2$ leads to a faithful action of the Clifford algebra $\mathcal C\ell(3,2)$ on the phase space of 2-dimensional dynamics, and hence to a mapping from Dirac spinors modulo scalars into this same phase space. Extending to the phase space of 3-dimensional dynamics allows one to embed all the gauge groups of the Standard Model as…

    Submitted 1 April, 2024; originally announced April 2024.

  16. arXiv:2404.02778

    stat.AP

    Chain event graphs for assessing activity-level propositions in forensic science in relation to drug traces on banknotes

    Authors: Gail Robertson, Amy L Wilson, Jim Q Smith

    Abstract: Graphical models and likelihood ratios can be used by forensic scientists to compare support given by evidence to propositions put forward by competing parties during court proceedings. Such models can also be used to evaluate support for activity-level propositions, i.e. propositions that refer to the nature of activities associated with evidence and how this evidence came to be at a crime scene.…

    Submitted 3 April, 2024; originally announced April 2024.

  17. arXiv:2404.01450

    math.CO

    Tutte polynomials in superspace

    Authors: Brendon Rhoades, Vasu Tewari, Andy Wilson

    Abstract: We associate a quotient of superspace to any hyperplane arrangement by considering the differential closure of an ideal generated by powers of certain homogeneous linear forms. This quotient is a superspace analogue of the external zonotopal algebra, and it further contains the central zonotopal algebra in the appropriate grading. We show that an evaluation of the bivariate Tutte polynomial is the…

    Submitted 1 April, 2024; originally announced April 2024.

  18. arXiv:2403.16628

    stat.AP

    A comparison of graphical methods in the case of the murder of Meredith Kercher

    Authors: A. Philip Dawid, Francesco Dotto, Maxine Graves, Jay B. Kadane, Julia Mortera, Gail Robertson, Jim Q. Smith, Amy L. Wilson

    Abstract: We compare three graphical methods for displaying evidence in a legal case: Wigmore Charts, Bayesian Networks and Chain Event Graphs. We find that these methods are aimed at three distinct audiences, respectively lawyers, forensic scientists and the police. The methods are illustrated using part of the evidence in the case of the murder of Meredith Kercher. More specifically, we focus on represent…

    Submitted 25 March, 2024; originally announced March 2024.

  19. arXiv:2403.16365

    cs.LG cs.CR cs.CV

    Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion

    Authors: Hossein Souri, Arpit Bansal, Hamid Kazemi, Liam Fowl, Aniruddha Saha, Jonas Geiping, Andrew Gordon Wilson, Rama Chellappa, Tom Goldstein, Micah Goldblum

    Abstract: Modern neural networks are often trained on massive datasets that are web scraped with minimal human inspection. As a result of this insecure curation pipeline, an adversary can poison or backdoor the resulting model by uploading malicious data to the internet and waiting for a victim to scrape and train on it. Existing approaches for creating poisons and backdoors start with randomly sampled clea…

    Submitted 24 March, 2024; originally announced March 2024.

  20. arXiv:2403.14029

    cs.RO

    Quadcopter Team Configurable Motion Guided by a Quadruped

    Authors: Mohammad Ghufran, Sourish Tetakayala, Jack Hughes, Aron Wilson, Hossein Rastgoftar

    Abstract: The paper focuses on modeling and experimental evaluation of a quadcopter team configurable coordination guided by a single quadruped robot. We consider the quadcopter team as particles of a two-dimensional deformable body and propose a two-dimensional affine transformation model for safe and collision-free configurable coordination of this heterogeneous robotic system. The proposed affine transfo…

    Submitted 20 March, 2024; originally announced March 2024.

  21. arXiv:2403.13947

    cs.HC cs.AI

    BlendScape: Enabling Unified and Personalized Video-Conferencing Environments through Generative AI

    Authors: Shwetha Rajaram, Nels Numan, Balasaravanan Thoravi Kumaravel, Nicolai Marquardt, Andrew D. Wilson

    Abstract: Today's video-conferencing tools support a rich range of professional and social activities, but their generic, grid-based environments cannot be easily adapted to meet the varying needs of distributed collaborators. To enable end-user customization, we developed BlendScape, a system for meeting participants to compose video-conferencing environments tailored to their collaboration context by leve…

    Submitted 20 March, 2024; originally announced March 2024.

  22. arXiv:2403.09869

    stat.ML cs.AI cs.LG stat.ME

    Mind the GAP: Improving Robustness to Subpopulation Shifts with Group-Aware Priors

    Authors: Tim G. J. Rudner, Ya Shi Zhang, Andrew Gordon Wilson, Julia Kempe

    Abstract: Machine learning models often perform poorly under subpopulation shifts in the data distribution. Developing methods that allow machine learning models to better generalize to such shifts is crucial for safe deployment in real-world settings. In this paper, we develop a family of group-aware prior (GAP) distributions over neural network parameters that explicitly favor models that generalize well…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Published in Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

  23. arXiv:2403.07815

    cs.LG cs.AI

    Chronos: Learning the Language of Time Series

    Authors: Abdul Fatir Ansari, Lorenzo Stella, Caner Turkmen, Xiyuan Zhang, Pedro Mercado, Huibin Shen, Oleksandr Shchur, Syama Sundar Rangapuram, Sebastian Pineda Arango, Shubham Kapoor, Jasper Zschiegner, Danielle C. Maddix, Hao Wang, Michael W. Mahoney, Kari Torkkola, Andrew Gordon Wilson, Michael Bohlke-Schneider, Yuyang Wang

    Abstract: We introduce Chronos, a simple yet effective framework for pretrained probabilistic time series models. Chronos tokenizes time series values using scaling and quantization into a fixed vocabulary and trains existing transformer-based language model architectures on these tokenized time series via the cross-entropy loss. We pretrained Chronos models based on the T5 family (ranging from 20M to 710M…

    Submitted 2 May, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Code and model checkpoints available at https://github.com/amazon-science/chronos-forecasting

  24. arXiv:2403.02695

    cs.LG

    Controllable Prompt Tuning For Balancing Group Distributional Robustness

    Authors: Hoang Phan, Andrew Gordon Wilson, Qi Lei

    Abstract: Models trained on data composed of different groups or domains can suffer from severe performance degradation under distribution shifts. While recent methods have largely focused on optimizing the worst-group objective, this often comes at the expense of good performance on other groups. To address this problem, we introduce an optimization scheme to achieve good performance across groups and find…

    Submitted 4 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning

  25. arXiv:2402.05980

    cs.SE cs.AI cs.LG cs.PL

    Do Large Code Models Understand Programming Concepts? A Black-box Approach

    Authors: Ashish Hooda, Mihai Christodorescu, Miltiadis Allamanis, Aaron Wilson, Kassem Fawaz, Somesh Jha

    Abstract: Large Language Models' success on text generation has also made them better at code generation and coding tasks. While a lot of work has demonstrated their remarkable performance on tasks such as code completion and editing, it is still unclear as to why. We help bridge this gap by exploring to what degree auto-regressive models understand the logical constructs of the underlying programs. We prop…

    Submitted 23 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  26. arXiv:2402.05857

    physics.atom-ph quant-ph

    Individual addressing and state readout of trapped ions utilizing rf micromotion

    Authors: Nathan K Lysne, Justin F Niedermeyer, Andrew C Wilson, Daniel H Slichter, Dietrich Leibfried

    Abstract: Excess "micromotion" of trapped ions due to the residual radio frequency (rf) trapping field at their location is often undesirable and is usually carefully minimized. Here, we induce precise amounts of excess micromotion on individual ions by adjusting the local static electric field they experience. Micromotion modulates the coupling of an ion to laser fields, ideally tuning it from its maximum…

    Submitted 8 February, 2024; originally announced February 2024.

  27. arXiv:2402.04379

    cs.LG cond-mat.mtrl-sci

    Fine-Tuned Language Models Generate Stable Inorganic Materials as Text

    Authors: Nate Gruver, Anuroop Sriram, Andrea Madotto, Andrew Gordon Wilson, C. Lawrence Zitnick, Zachary Ulissi

    Abstract: We propose fine-tuning large language models for generation of stable materials. While unorthodox, fine-tuning large language models on text-encoded atomistic data is simple to implement yet reliable, with around 90% of sampled structures obeying physical constraints on atom positions and charges. Using energy above hull calculations from both learned ML potentials and gold-standard DFT calculatio…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: ICLR 2024. Code available at: https://github.com/facebookresearch/crystal-llm

  28. arXiv:2402.00809

    cs.LG stat.ML

    Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

    Authors: Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

    Abstract: In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learni…

    Submitted 2 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  29. arXiv:2401.13000

    math.GR hep-ph

    A discrete model for Gell-Mann matrices

    Authors: Robert A. Wilson

    Abstract: I propose a discrete model for the Gell-Mann matrices, which allows them to participate in discrete symmetries of three generations of four types of elementary fermions, in addition to their usual role in describing a continuous group $SU(3)$ of colour symmetries. This model sheds new light on the mathematical (rather than physical) necessity for `mixing' between the various gauge groups $SU(3)$,…

    Submitted 19 February, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: New sections on action of Gell-Mann matrices on Dirac matrices, and potential implications for quantum gravity. Now 35 pages

  30. arXiv:2401.10534

    math.GR hep-th math-ph

    A New Division Algebra Representation of $E_7$

    Authors: Tevian Dray, Corinne A. Manogue, Robert A. Wilson

    Abstract: We decompose the Lie algebra $\mathfrak{e}_{8(-24)}$ into representations of $\mathfrak{e}_{7(-25)}\oplus\mathfrak{sl}(2,\mathbb{R})$ using our recent description of $\mathfrak{e}_8$ in terms of (generalized) $3\times3$ matrices over pairs of division algebras. Freudenthal's description of both $\mathfrak{e}_7$ and its minimal representation are therefore realized explicitly within…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: 11 pages

    Journal ref: J. Math. Phys. 65, 031703 (2024)

  31. arXiv:2401.10149

    cs.LG cs.CR cs.MA

    Multi-Agent Reinforcement Learning for Maritime Operational Technology Cyber Security

    Authors: Alec Wilson, Ryan Menzies, Neela Morarji, David Foster, Marco Casassa Mont, Esin Turkbeyler, Lisa Gralewski

    Abstract: This paper demonstrates the potential for autonomous cyber defence to be applied on industrial control systems and provides a baseline environment to further explore Multi-Agent Reinforcement Learning's (MARL) application to this problem domain. It introduces a simulation environment, IPMSRL, of a generic Integrated Platform Management System (IPMS) and explores the use of MARL for autonomous cybe…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: 13 pages, 7 figures, Proceedings of the Conference on Applied Machine Learning in Information Security 2023 (CAMLIS)

    Journal ref: Proceedings of the Conference on Applied Machine Learning in Information Security 2023 (CAMLIS), Arlington VA, USA, October 19-20, 2023, CEUR-WS.org, online CEUR-WS.org/Vol-3652/paper3.pdf

  32. arXiv:2401.04349

    cs.CR cs.AR

    WebGPU-SPY: Finding Fingerprints in the Sandbox through GPU Cache Attacks

    Authors: Ethan Ferguson, Adam Wilson, Hoda Naghibijouybari

    Abstract: Microarchitectural attacks on CPU structures have been studied in native applications, as well as in web browsers. These attacks continue to be a substantial threat to computing systems at all scales. With the proliferation of heterogeneous systems and integration of hardware accelerators in every computing system, modern web browsers provide the support of GPU-based acceleration for the graphic…

    Submitted 8 January, 2024; originally announced January 2024.

  33. Penalized Distributed Lag Interaction Model: Air Pollution, Birth Weight and Neighborhood Vulnerability

    Authors: Danielle Demateis, Kayleigh P. Keller, David Rojas-Rueda, Marianthi-Anna Kioumourtzoglou, Ander Wilson

    Abstract: Maternal exposure to air pollution during pregnancy has a substantial public health impact. Epidemiological evidence supports an association between maternal exposure to air pollution and low birth weight. A popular method to estimate this association while identifying windows of susceptibility is a distributed lag model (DLM), which regresses an outcome onto exposure history observed at multiple…

    Submitted 21 February, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

    Comments: 41 pages, 4 figures, 2 tables

  34. arXiv:2401.01764

    cs.CV cs.LG

    Understanding the Detrimental Class-level Effects of Data Augmentation

    Authors: Polina Kirichenko, Mark Ibrahim, Randall Balestriero, Diane Bouchacourt, Ramakrishna Vedantam, Hamed Firooz, Andrew Gordon Wilson

    Abstract: Data augmentation (DA) encodes invariance and provides implicit regularization critical to a model's performance in image classification tasks. However, while DA improves average accuracy, recent studies have shown that its impact can be highly class dependent: achieving optimal average accuracy comes at the cost of significantly hurting individual class accuracy by as much as 20% on ImageNet. The…

    Submitted 7 December, 2023; originally announced January 2024.

    Comments: Neural Information Processing Systems (NeurIPS), 2023

  35. arXiv:2312.17252

    math.GR

    On subgroups of the Monster isomorphic to $PSL_2(8)$

    Authors: Robert A. Wilson

    Abstract: We describe computer calculations that were used in 2016 to classify subgroups of the Monster isomorphic to $PSL_2(8)$, containing $7B$-elements. It turns out that there is no such $PSL_2(8)$ in the Monster. These calculations confirm earlier unpublished calculations by P. E. Holmes that obtained the same result. The result has also been confirmed in independent calculations by H. Dietrich, M. Lee…

    Submitted 4 December, 2023; originally announced December 2023.

  36. arXiv:2312.17174

    cs.CV cs.AI cs.LG

    Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution

    Authors: Ying Wang, Tim G. J. Rudner, Andrew Gordon Wilson

    Abstract: Vision-language pretrained models have seen remarkable success, but their application to safety-critical settings is limited by their lack of interpretability. To improve the interpretability of vision-language models such as CLIP, we propose a multi-modal information bottleneck (M2IB) approach that learns latent representations that compress irrelevant information while preserving relevant visual…

    Submitted 22 June, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Advances in Neural Information Processing Systems 36 (NeurIPS 2023)

  37. arXiv:2312.17173

    stat.ML cs.LG

    Non-Vacuous Generalization Bounds for Large Language Models

    Authors: Sanae Lotfi, Marc Finzi, Yilun Kuang, Tim G. J. Rudner, Micah Goldblum, Andrew Gordon Wilson

    Abstract: Modern language models can contain billions of parameters, raising the question of whether they can generalize beyond the training data or simply regurgitate their training corpora. We provide the first non-vacuous generalization bounds for pretrained large language models (LLMs), indicating that language models are capable of discovering regularities that generalize to unseen data. In particular,…

    Submitted 12 February, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  38. arXiv:2312.17162

    stat.ML cs.AI cs.LG

    Function-Space Regularization in Neural Networks: A Probabilistic Perspective

    Authors: Tim G. J. Rudner, Sanyam Kapoor, Shikai Qiu, Andrew Gordon Wilson

    Abstract: Parameter-space regularization in neural network optimization is a fundamental tool for improving generalization. However, standard parameter-space regularization methods make it challenging to encode explicit preferences about desired predictive functions into neural network training. In this work, we approach regularization in neural networks from a probabilistic perspective and show that by vie…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Published in Proceedings of the 40th International Conference on Machine Learning (ICML 2023)

  39. arXiv:2312.16360

    stat.CO math.OC math.ST stat.ML

    Mean-field underdamped Langevin dynamics and its spacetime discretization

    Authors: Qiang Fu, Ashia Wilson

    Abstract: We propose a new method called the N-particle underdamped Langevin algorithm for optimizing a special class of non-linear functionals defined over the space of probability measures. Examples of problems with this formulation include training mean-field neural networks, maximum mean discrepancy minimization and kernel Stein discrepancy minimization. Our algorithm is based on a novel spacetime discr…

    Submitted 6 February, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

    Comments: 40 pages, 5 figures, 2 tables

  40. arXiv:2312.09323

    cs.AI cs.LG

    Perspectives on the State and Future of Deep Learning - 2023

    Authors: Micah Goldblum, Anima Anandkumar, Richard Baraniuk, Tom Goldstein, Kyunghyun Cho, Zachary C Lipton, Melanie Mitchell, Preetum Nakkiran, Max Welling, Andrew Gordon Wilson

    Abstract: The goal of this series is to chronicle opinions and issues in the field of machine learning as they stand today and as they change over time. The plan is to host this survey periodically until the AI singularity paperclip-frenzy-driven doomsday, keeping an updated list of topical questions and interviewing new community members for each edition. In this issue, we probed people's opinions on inter… ▽ More

    Submitted 18 December, 2023; v1 submitted 7 December, 2023; originally announced December 2023.

  41. arXiv:2312.08823  [pdf, other

    stat.CO cs.DS cs.LG math.ST stat.ML

    Fast sampling from constrained spaces using the Metropolis-adjusted Mirror Langevin algorithm

    Authors: Vishwak Srinivasan, Andre Wibisono, Ashia Wilson

    Abstract: We propose a new method called the Metropolis-adjusted Mirror Langevin algorithm for approximate sampling from distributions whose support is a compact and convex set. This algorithm adds an accept-reject filter to the Markov chain induced by a single step of the Mirror Langevin algorithm (Zhang et al., 2020), which is a basic discretisation of the Mirror Langevin dynamics. Due to the inclusion of… ▽ More

    Submitted 21 June, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: 49 pages, 6 figures, 2 tables. Shorter version without experiments accepted to COLT 2024

  42. arXiv:2312.05879  [pdf, other

    cs.CV

    Wild Motion Unleashed: Markerless 3D Kinematics and Force Estimation in Cheetahs

    Authors: Zico da Silva, Stacy Shield, Penny E. Hudson, Alan M. Wilson, Fred Nicolls, Amir Patel

    Abstract: The complex dynamics of animal manoeuvrability in the wild is extremely challenging to study. The cheetah ($\textit{Acinonyx jubatus}$) is a perfect example: despite great interest in its unmatched speed and manoeuvrability, obtaining complete whole-body motion data from these animals remains an unsolved problem. This is especially difficult in wild cheetahs, where it is essential that the methods… ▽ More

    Submitted 10 December, 2023; originally announced December 2023.

  43. arXiv:2312.03092  [pdf, ps, other

    math.CO math.GR

    Coloring Groups

    Authors: Ben Adenbaum, Alexander Wilson

    Abstract: We introduce coloring groups, which are permutation groups obtained from a proper edge coloring of a graph. These groups generalize the generalized toggle groups of Striker (which themselves generalize the toggle groups introduced by Cameron and Fon-der-Flaass). We present some general results connecting the structure of a coloring group to the structure of its graph coloring, providing graph-theo… ▽ More

    Submitted 3 July, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    MSC Class: 05E18; 06A75; 05C25

  44. arXiv:2312.02796  [pdf, other

    cond-mat.mtrl-sci cond-mat.str-el cs.LG physics.data-an

    Materials Expert-Artificial Intelligence for Materials Discovery

    Authors: Yanjun Liu, Milena Jovanovic, Krishnanand Mallayya, Wesley J. Maddox, Andrew Gordon Wilson, Sebastian Klemenz, Leslie M. Schoop, Eun-Ah Kim

    Abstract: The advent of material databases provides an unprecedented opportunity to uncover predictive descriptors for emergent material properties from vast data space. However, common reliance on high-throughput ab initio data necessarily inherits limitations of such data: mismatch with experiments. On the other hand, experimental decisions are often guided by an expert's intuition honed from experiences… ▽ More

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 8 pages main text, 4 figs, 8 pages Supplementary material

  45. arXiv:2312.02517  [pdf, other

    cs.LG cs.AI

    Simplifying Neural Network Training Under Class Imbalance

    Authors: Ravid Shwartz-Ziv, Micah Goldblum, Yucen Lily Li, C. Bayan Bruss, Andrew Gordon Wilson

    Abstract: Real-world datasets are often highly class-imbalanced, which can adversely impact the performance of deep learning models. The majority of research on training neural networks under class imbalance has focused on specialized loss functions, sampling techniques, or two-stage training procedures. Notably, we demonstrate that simply tuning existing components of standard deep learning pipelines, such… ▽ More

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023. Code available at https://github.com/ravidziv/SimplifyingImbalancedTraining

  46. arXiv:2311.15990  [pdf, other

    cs.LG stat.ML

    Should We Learn Most Likely Functions or Parameters?

    Authors: Shikai Qiu, Tim G. J. Rudner, Sanyam Kapoor, Andrew Gordon Wilson

    Abstract: Standard regularized training procedures correspond to maximizing a posterior distribution over parameters, known as maximum a posteriori (MAP) estimation. However, model parameters are of interest only insomuch as they combine with the functional form of a model to provide a function that can make good predictions. Moreover, the most likely parameters under the parameter posterior do not generall… ▽ More

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023. Code available at https://github.com/activatedgeek/function-space-map

  47. arXiv:2311.05877  [pdf, other

    cs.LG cs.AI

    A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning

    Authors: Valeriia Cherepanova, Roman Levin, Gowthami Somepalli, Jonas Geiping, C. Bayan Bruss, Andrew Gordon Wilson, Tom Goldstein, Micah Goldblum

    Abstract: Academic tabular benchmarks often contain small sets of curated features. In contrast, data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones. To prevent overfitting in subsequent downstream modeling, practitioners commonly use automated feature selection methods that identify a reduced subset of informative features. E… ▽ More

    Submitted 10 November, 2023; originally announced November 2023.

    Journal ref: Conference on Neural Information Processing Systems 2023

  48. arXiv:2311.03324  [pdf

    eess.SY

    Estimating Primary Substation Boundaries and the Value of Mapping the Electrical Network Infrastructure of Great Britain

    Authors: Joseph Day, I. A. Grant Wilson, Daniel L. Donaldson, Edward Barbour, Bruno Cárdenas, Christopher R. Jones, Andrew J. Urquhart, Seamus D. Garvey

    Abstract: Localised data aggregation in many countries including Great Britain (GB) is typically done to a geographical level with polygon boundaries that have a robust and trusted governance system in place. At a minimum this will mean there is confidence in a process to create a set of polygons that have unique identifiers coupled to geographical areas, and the ability to have these updated through a defi… ▽ More

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 58 pages, 26 figures

  49. arXiv:2311.02477  [pdf, other

    astro-ph.EP

    On the likely magnesium-iron silicate dusty tails of catastrophically evaporating rocky planets

    Authors: Beatriz Campos Estrada, James E. Owen, Marija R. Jankovic, Anna Wilson, Christiane Helling

    Abstract: Catastrophically evaporating rocky planets provide a unique opportunity to study the composition of small planets. The surface composition of these planets can be constrained via modelling their comet-like tails of dust. In this work, we present a new self-consistent model of the dusty tails: we physically model the trajectory of the dust grains after they have left the gaseous outflow, including… ▽ More

    Submitted 9 January, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: Accepted for publication in MNRAS

  50. arXiv:2310.19909  [pdf, other

    cs.CV cs.LG

    Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks

    Authors: Micah Goldblum, Hossein Souri, Renkun Ni, Manli Shu, Viraj Prabhu, Gowthami Somepalli, Prithvijit Chattopadhyay, Mark Ibrahim, Adrien Bardes, Judy Hoffman, Rama Chellappa, Andrew Gordon Wilson, Tom Goldstein

    Abstract: Neural network based computer vision systems are typically built on a backbone, a pretrained or randomly initialized feature extractor. Several years ago, the default option was an ImageNet-trained convolutional neural network. However, the recent past has seen the emergence of countless backbones pretrained using various algorithms and datasets. While this abundance of choice has led to performan… ▽ More

    Submitted 19 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023