
Showing 1–50 of 294 results for author: Tegmark, M

  1. arXiv:2406.19384

    cs.LG cs.AI cs.CL

    The Remarkable Robustness of LLMs: Stages of Inference?

    Authors: Vedang Lad, Wes Gurnee, Max Tegmark

    Abstract: We demonstrate and investigate the remarkable robustness of Large Language Models by deleting and swapping adjacent layers. We find that deleting and swapping interventions retain 72-95% of the original model's prediction accuracy without fine-tuning, whereas models with more layers exhibit more robustness. Based on the results of the layer-wise intervention and further experiments, we hypothesiz…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2406.08467

    cs.SE cs.AI cs.LG cs.PL

    DafnyBench: A Benchmark for Formal Software Verification

    Authors: Chloe Loughridge, Qinyi Sun, Seth Ahrenbach, Federico Cassano, Chuyue Sun, Ying Sheng, Anish Mudide, Md Rakib Hossain Misu, Nada Amin, Max Tegmark

    Abstract: We introduce DafnyBench, the largest benchmark of its kind for training and evaluating machine learning systems for formal software verification. We test the ability of LLMs such as GPT-4 and Claude 3 to auto-generate enough hints for the Dafny formal verification engine to successfully verify over 750 programs with about 53,000 lines of code. The best model and prompting scheme achieved 68% succe…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Code & dataset available at: https://github.com/sun-wendy/DafnyBench

  3. arXiv:2405.17420

    cs.LG

    Survival of the Fittest Representation: A Case Study with Modular Addition

    Authors: Xiaoman Delores Ding, Zifan Carl Guo, Eric J. Michaud, Ziming Liu, Max Tegmark

    Abstract: When a neural network can learn multiple distinct algorithms to solve a task, how does it "choose" between them during training? To approach this question, we take inspiration from ecology: when multiple species coexist, they eventually reach an equilibrium where some survive while others die out. Analogously, we suggest that a neural network at initialization contains many solutions (representati…

    Submitted 27 May, 2024; originally announced May 2024.

  4. arXiv:2405.17209

    cs.LG cond-mat.dis-nn cs.AI

    How Do Transformers "Do" Physics? Investigating the Simple Harmonic Oscillator

    Authors: Subhash Kantamneni, Ziming Liu, Max Tegmark

    Abstract: How do transformers model physics? Do transformers model systems with interpretable analytical solutions, or do they create "alien physics" that are difficult for humans to decipher? We take a step in demystifying this larger puzzle by investigating the simple harmonic oscillator (SHO), $\ddot{x}+2\gamma\dot{x}+\omega_0^2 x=0$, one of the most fundamental systems in physics. Our goal is to identify the metho…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 9 pages, 9 figures

  5. arXiv:2405.14860

    cs.LG

    Not All Language Model Features Are Linear

    Authors: Joshua Engels, Isaac Liao, Eric J. Michaud, Wes Gurnee, Max Tegmark

    Abstract: Recent work has proposed the linear representation hypothesis: that language models perform computation by manipulating one-dimensional representations of concepts ("features") in activation space. In contrast, we explore whether some language model representations may be inherently multi-dimensional. We begin by developing a rigorous definition of irreducible multi-dimensional features based on w…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Code and data at https://github.com/JoshEngels/MultiDimensionalFeatures

  6. arXiv:2405.06624

    cs.AI

    Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

    Authors: David "davidad" Dalrymple, Joar Skalse, Yoshua Bengio, Stuart Russell, Max Tegmark, Sanjit Seshia, Steve Omohundro, Christian Szegedy, Ben Goldhaber, Nora Ammann, Alessandro Abate, Joe Halpern, Clark Barrett, Ding Zhao, Tan Zhi-Xuan, Jeannette Wing, Joshua Tenenbaum

    Abstract: Ensuring that AI systems reliably and robustly avoid harmful or dangerous behaviours is a crucial challenge, especially for AI systems with a high degree of autonomy and general intelligence, or systems used in safety-critical contexts. In this paper, we will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI. The core feature of these appro…

    Submitted 17 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

  7. arXiv:2405.04484

    cs.LG physics.comp-ph

    OptPDE: Discovering Novel Integrable Systems via AI-Human Collaboration

    Authors: Subhash Kantamneni, Ziming Liu, Max Tegmark

    Abstract: Integrable partial differential equation (PDE) systems are of great interest in natural science, but are exceedingly rare and difficult to discover. To solve this, we introduce OptPDE, a first-of-its-kind machine learning approach that Optimizes PDEs' coefficients to maximize their number of conserved quantities, $n_{\rm CQ}$, and thus discover new integrable systems. We discover four families of…

    Submitted 7 May, 2024; originally announced May 2024.

  8. arXiv:2404.19756

    cs.LG cond-mat.dis-nn cs.AI stat.ML

    KAN: Kolmogorov-Arnold Networks

    Authors: Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y. Hou, Max Tegmark

    Abstract: Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all -- every weight parameter is replaced by a univariate function parametriz…

    Submitted 16 June, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 48 pages, 20 figures. Codes are available at https://github.com/KindXiaoming/pykan

  9. arXiv:2402.05916

    cs.LG

    GenEFT: Understanding Statics and Dynamics of Model Generalization via Effective Theory

    Authors: David D. Baek, Ziming Liu, Max Tegmark

    Abstract: We present GenEFT: an effective theory framework for shedding light on the statics and dynamics of neural network generalization, and illustrate it with graph learning examples. We first investigate the generalization phase transition as data size increases, comparing experimental results with information-theory-based approximations. We find generalization in a Goldilocks zone where the decoder is…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 12 pages, 6 figures

  10. arXiv:2402.05164

    cs.LG cs.AI cs.NE

    A Resource Model For Neural Scaling Law

    Authors: Jinyeop Song, Ziming Liu, Max Tegmark, Jeff Gore

    Abstract: Neural scaling laws characterize how model performance improves as the model size scales up. Inspired by empirical observations, we introduce a resource model of neural scaling. A task is usually composite and hence can be decomposed into many subtasks, which compete for resources (measured by the number of neurons allocated to subtasks). On toy problems, we empirically find that: (1) The loss of a su…

    Submitted 15 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: 10 pages, 8 figures, Published as a workshop paper at ICLR 2024

  11. arXiv:2402.05110

    cs.LG

    Opening the AI black box: program synthesis via mechanistic interpretability

    Authors: Eric J. Michaud, Isaac Liao, Vedang Lad, Ziming Liu, Anish Mudide, Chloe Loughridge, Zifan Carl Guo, Tara Rezaei Kheirkhah, Mateja Vukelić, Max Tegmark

    Abstract: We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code. We test MIPS on a benchmark of 62 algorithmic tasks that can be learned by an RNN and find it highly complementary to GPT-4: MIPS solves 32 of them, including 13 that are not solved by G…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 24 pages

  12. arXiv:2401.14446

    cs.CY cs.AI cs.CR

    Black-Box Access is Insufficient for Rigorous AI Audits

    Authors: Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, Benjamin Bucknall, Andreas Haupt, Kevin Wei, Jérémy Scheurer, Marius Hobbhahn, Lee Sharkey, Satyapriya Krishna, Marvin Von Hagen, Silas Alberti, Alan Chan, Qinyi Sun, Michael Gerovitch, David Bau, Max Tegmark, David Krueger, Dylan Hadfield-Menell

    Abstract: External audits of AI systems are increasingly recognized as a key mechanism for AI governance. The effectiveness of an audit, however, depends on the degree of access granted to auditors. Recent audits of state-of-the-art AI systems have primarily relied on black-box access, in which auditors can only query the system and observe its outputs. However, white-box access to the system's inner workin…

    Submitted 29 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: FAccT 2024

    Journal ref: The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT '24), June 3-6, 2024, Rio de Janeiro, Brazil

  13. arXiv:2312.03051

    cs.LG cs.AI cs.NE

    Generating Interpretable Networks using Hypernetworks

    Authors: Isaac Liao, Ziming Liu, Max Tegmark

    Abstract: An essential goal in mechanistic interpretability is to decode a network, i.e., to convert a neural network's raw weights to an interpretable algorithm. Given the difficulty of the decoding problem, progress has been made to understand the easier encoding problem, i.e., to convert an interpretable algorithm into network weights. Previous works focus on encoding existing algorithms into networks, whic…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 15 pages, 7 figures

    MSC Class: 68T07 ACM Class: I.2.6

  14. arXiv:2310.07711

    q-bio.NC cs.AI cs.LG cs.NE

    Growing Brains: Co-emergence of Anatomical and Functional Modularity in Recurrent Neural Networks

    Authors: Ziming Liu, Mikail Khona, Ila R. Fiete, Max Tegmark

    Abstract: Recurrent neural networks (RNNs) trained on compositional tasks can exhibit functional modularity, in which neurons can be clustered by activity similarity and participation in shared computational subtasks. Unlike brains, these RNNs do not exhibit anatomical modularity, in which functional clustering is correlated with strong recurrent coupling and spatial localization of functional clusters. Con…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 8 pages, 6 figures

  15. arXiv:2310.06824

    cs.AI

    The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets

    Authors: Samuel Marks, Max Tegmark

    Abstract: Large Language Models (LLMs) have impressive capabilities, but are also prone to outputting falsehoods. Recent work has developed techniques for inferring whether an LLM is telling the truth by training probes on the LLM's internal activations. However, this line of work is controversial, with some authors pointing out failures of these probes to generalize in basic ways, among other conceptual iss…

    Submitted 8 December, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

  16. arXiv:2310.06009

    cs.CY cs.AI cs.LG

    Divide-and-Conquer Dynamics in AI-Driven Disempowerment

    Authors: Peter S. Park, Max Tegmark

    Abstract: AI companies are attempting to create AI systems that outperform humans at most economically valuable work. Current AI models are already automating away the livelihoods of some artists, actors, and writers. But there is infighting between those who prioritize current harms and future harms. We construct a game-theoretic model of conflict to study the causes and consequences of this disunity. Our…

    Submitted 18 December, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 28 pages, nine visualizations (seven figures and two tables)

  17. arXiv:2310.05918

    cs.LG cs.AI stat.ML

    Grokking as Compression: A Nonlinear Complexity Perspective

    Authors: Ziming Liu, Ziqian Zhong, Max Tegmark

    Abstract: We attribute grokking, the phenomenon where generalization is much delayed after memorization, to compression. To do so, we define linear mapping number (LMN) to measure network complexity, which is a generalized version of linear region number for ReLU networks. LMN can nicely characterize neural network compression before generalization. Although the $L_2$ norm has been a popular choice for char…

    Submitted 9 October, 2023; originally announced October 2023.

  18. arXiv:2310.02258

    cs.LG cs.AI physics.data-an stat.ML

    A Neural Scaling Law from Lottery Ticket Ensembling

    Authors: Ziming Liu, Max Tegmark

    Abstract: Neural scaling laws (NSL) refer to the phenomenon where model performance improves with scale. Sharma & Kaplan analyzed NSL using approximation theory and predict that MSE losses decay as $N^{-\alpha}$, $\alpha=4/d$, where $N$ is the number of model parameters, and $d$ is the intrinsic input dimension. Although their theory works well for some cases (e.g., ReLU networks), we surprisingly find that a simple…

    Submitted 1 February, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: 14 pages, 13 figures. Note from authors: the theory in this paper is questionable; we are trying our best to fix it. Empirical results still stand

  19. arXiv:2310.02207

    cs.LG cs.AI cs.CL

    Language Models Represent Space and Time

    Authors: Wes Gurnee, Max Tegmark

    Abstract: The capabilities of large language models (LLMs) have sparked debate over whether such systems just learn an enormous collection of superficial statistics or a set of more coherent and grounded representations that reflect the real world. We find evidence for the latter by analyzing the learned representations of three spatial datasets (world, US, NYC places) and three temporal datasets (historica…

    Submitted 4 March, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

  20. arXiv:2309.01933

    cs.CY cs.AI cs.LG

    Provably safe systems: the only path to controllable AGI

    Authors: Max Tegmark, Steve Omohundro

    Abstract: We describe a path to humanity safely thriving with powerful Artificial General Intelligences (AGIs) by building them to provably satisfy human-specified requirements. We argue that this will soon be technically feasible using advanced AI for formal verification and mechanistic interpretability. We further argue that it is the only path which guarantees safe controlled AGI. We end with a list of c…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: 17 pages

  21. arXiv:2306.17844

    cs.LG

    The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks

    Authors: Ziqian Zhong, Ziming Liu, Max Tegmark, Jacob Andreas

    Abstract: Do neural networks, trained on well-understood algorithmic tasks, reliably rediscover known algorithms for solving those tasks? Several recent studies, on tasks ranging from group arithmetic to in-context linear regression, have suggested that the answer is yes. Using modular addition as a prototypical problem, we show that algorithm discovery in neural networks is sometimes more complex. Small ch…

    Submitted 21 November, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: Accepted by NeurIPS 2023

  22. arXiv:2305.19525

    math.DS cs.LG nlin.SI physics.class-ph physics.flu-dyn

    Discovering New Interpretable Conservation Laws as Sparse Invariants

    Authors: Ziming Liu, Patrick Obin Sturm, Saketh Bharadwaj, Sam Silva, Max Tegmark

    Abstract: Discovering conservation laws for a given dynamical system is important but challenging. In a theorist setup (differential equations and basis functions are both known), we propose the Sparse Invariant Detector (SID), an algorithm that auto-discovers conservation laws from differential equations. Its algorithmic simplicity allows robustness and interpretability of the discovered conserved quantiti…

    Submitted 4 July, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: The codes are available here: https://github.com/KindXiaoming/sid

  23. arXiv:2305.08746

    cs.NE cond-mat.dis-nn cs.AI cs.LG math.RT q-bio.NC

    Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability

    Authors: Ziming Liu, Eric Gan, Max Tegmark

    Abstract: We introduce Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable. Inspired by brains, BIMT embeds neurons in a geometric space and augments the loss function with a cost proportional to the length of each neuron connection. We demonstrate that BIMT discovers useful modular neural networks for many simple tasks, revealing compositional structur…

    Submitted 6 June, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: Codes are available here: https://github.com/KindXiaoming/BIMT

  24. arXiv:2304.02637

    cs.LG cs.AI physics.comp-ph physics.data-an quant-ph

    GenPhys: From Physical Processes to Generative Models

    Authors: Ziming Liu, Di Luo, Yilun Xu, Tommi Jaakkola, Max Tegmark

    Abstract: Since diffusion models (DM) and the more recent Poisson flow generative models (PFGM) are inspired by physical processes, it is reasonable to ask: Can physical processes offer additional new generative models? We show that the answer is yes. We introduce a general family, Generative Models from Physical Processes (GenPhys), where we translate partial differential equations (PDEs) describing physic…

    Submitted 5 April, 2023; originally announced April 2023.

    Report number: MIT-CTP/5548

  25. arXiv:2303.13506

    cs.LG cond-mat.dis-nn

    The Quantization Model of Neural Scaling

    Authors: Eric J. Michaud, Ziming Liu, Uzay Girit, Max Tegmark

    Abstract: We propose the Quantization Model of neural scaling laws, explaining both the observed power law dropoff of loss with model and data size, and also the sudden emergence of new capabilities with scale. We derive this model from what we call the Quantization Hypothesis, where network knowledge and skills are "quantized" into discrete chunks ($\textbf{quanta}$). We show that when quanta are learned i…

    Submitted 13 January, 2024; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: 24 pages, 18 figures, NeurIPS 2023

  26. arXiv:2302.04265

    cs.LG cs.CV

    PFGM++: Unlocking the Potential of Physics-Inspired Generative Models

    Authors: Yilun Xu, Ziming Liu, Yonglong Tian, Shangyuan Tong, Max Tegmark, Tommi Jaakkola

    Abstract: We introduce a new family of physics-inspired generative models termed PFGM++ that unifies diffusion models and Poisson Flow Generative Models (PFGM). These models realize generative trajectories for $N$ dimensional data by embedding paths in $N{+}D$ dimensional space while still controlling the progression with a simple scalar norm of the $D$ additional variables. The new models reduce to PFGM wh…

    Submitted 10 February, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

    Comments: Code is available at https://github.com/Newbeeer/pfgmpp

  27. arXiv:2210.13447

    cs.LG physics.comp-ph

    Precision Machine Learning

    Authors: Eric J. Michaud, Ziming Liu, Max Tegmark

    Abstract: We explore unique considerations involved in fitting ML models to data with very high precision, as is often required for science applications. We empirically compare various function approximation methods and study how they scale with increasing parameters and data. We find that neural networks can often outperform classical approximation methods on high-dimensional examples, by auto-discovering…

    Submitted 24 October, 2022; originally announced October 2022.

  28. arXiv:2210.01117

    cs.LG cs.AI physics.data-an stat.ME stat.ML

    Omnigrok: Grokking Beyond Algorithmic Data

    Authors: Ziming Liu, Eric J. Michaud, Max Tegmark

    Abstract: Grokking, the unusual phenomenon for algorithmic datasets where generalization happens long after overfitting the training data, has remained elusive. We aim to understand grokking by analyzing the loss landscapes of neural networks, identifying the mismatch between training and test losses as the cause for grokking. We refer to this as the "LU mechanism" because training and test losses (against…

    Submitted 23 March, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

  29. arXiv:2209.11178

    cs.LG cs.CV

    Poisson Flow Generative Models

    Authors: Yilun Xu, Ziming Liu, Max Tegmark, Tommi Jaakkola

    Abstract: We propose a new "Poisson flow" generative model (PFGM) that maps a uniform distribution on a high-dimensional hemisphere into any data distribution. We interpret the data points as electrical charges on the $z=0$ hyperplane in a space augmented with an additional dimension $z$, generating a high-dimensional electric field (the gradient of the solution to Poisson equation). We prove that if these…

    Submitted 19 October, 2022; v1 submitted 22 September, 2022; originally announced September 2022.

    Comments: Accepted by NeurIPS 2022

  30. arXiv:2205.10343

    cs.LG cond-mat.dis-nn cond-mat.stat-mech cs.AI physics.class-ph

    Towards Understanding Grokking: An Effective Theory of Representation Learning

    Authors: Ziming Liu, Ouail Kitouni, Niklas Nolte, Eric J. Michaud, Max Tegmark, Mike Williams

    Abstract: We aim to understand grokking, a phenomenon where models generalize long after overfitting their training set. We present both a microscopic analysis anchored by an effective theory and a macroscopic analysis of phase diagrams describing learning performance across hyperparameters. We find that generalization originates from structured representations whose training dynamics and dependence on trai…

    Submitted 14 October, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: Accepted by NeurIPS 2022

  31. arXiv:2204.02489

    cs.LG cs.IT stat.ML

    Pareto-optimal clustering with the primal deterministic information bottleneck

    Authors: Andrew K. Tan, Max Tegmark, Isaac L. Chuang

    Abstract: At the heart of both lossy compression and clustering is a trade-off between the fidelity and size of the learned representation. Our goal is to map out and study the Pareto frontier that quantifies this trade-off. We focus on the optimization of the Deterministic Information Bottleneck (DIB) objective over the space of hard clusterings. To this end, we introduce the primal DIB problem, which we s…

    Submitted 27 July, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: 28 pages, 12 figures

  32. arXiv:2203.12610

    cs.LG astro-ph.EP nlin.SI physics.class-ph physics.flu-dyn

    AI Poincaré 2.0: Machine Learning Conservation Laws from Differential Equations

    Authors: Ziming Liu, Varun Madhavan, Max Tegmark

    Abstract: We present a machine learning algorithm that discovers conservation laws from differential equations, both numerically (parametrized as neural networks) and symbolically, ensuring their functional independence (a non-linear generalization of linear independence). Our independence module can be viewed as a nonlinear generalization of singular value decomposition. Our method can readily handle induc…

    Submitted 30 October, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: 15 pages, 12 figures

    Journal ref: Phys. Rev. E 106, 045307, 2022

  33. arXiv:2202.12887

    cs.LG cs.NE q-bio.NC stat.ML

    Fault-Tolerant Neural Networks from Biological Error Correction Codes

    Authors: Alexander Zlokapa, Andrew K. Tan, John M. Martyn, Ila R. Fiete, Max Tegmark, Isaac L. Chuang

    Abstract: It has been an open question in deep learning if fault-tolerant computation is possible: can arbitrarily reliable computation be achieved using only unreliable neurons? In the grid cells of the mammalian cortex, analog error correction codes have been observed to protect states against neural spiking noise, but their role in information processing is unclear. Here, we use these biological error co…

    Submitted 9 February, 2024; v1 submitted 25 February, 2022; originally announced February 2022.

    Report number: MIT-CTP/5395

  34. arXiv:2109.13901

    cs.LG physics.class-ph physics.comp-ph physics.data-an

    Physics-Augmented Learning: A New Paradigm Beyond Physics-Informed Learning

    Authors: Ziming Liu, Yunyue Chen, Yuanqi Du, Max Tegmark

    Abstract: Integrating physical inductive biases into machine learning can improve model generalizability. We generalize the successful paradigm of physics-informed learning (PIL) into a more general framework that also includes what we term physics-augmented learning (PAL). PIL and PAL complement each other by handling discriminative and generative properties, respectively. In numerical experiments, we show…

    Submitted 28 September, 2021; originally announced September 2021.

    Comments: 10 pages, 3 figures, 4 tables

  35. arXiv:2109.09721

    cs.LG gr-qc physics.class-ph

    Machine-learning hidden symmetries

    Authors: Ziming Liu, Max Tegmark

    Abstract: We present an automated method for finding hidden symmetries, defined as symmetries that become manifest only in a new coordinate system that must be discovered. Its core idea is to quantify asymmetry as violation of certain partial differential equations, and to numerically minimize such violation over the space of all invertible transformations, parametrized as invertible neural networks. For ex…

    Submitted 6 May, 2022; v1 submitted 20 September, 2021; originally announced September 2021.

    Comments: Replaced to match accepted PRL version. Improved training, discussion & noise modeling. 14 pages & 4 figs including supplementary material

    Journal ref: Phys. Rev. Lett. 128, 180201 (2022)

  36. Machine-Learning media bias

    Authors: Samantha D'Alonzo, Max Tegmark

    Abstract: We present an automated method for measuring media bias. Inferring which newspaper published a given article, based only on the frequencies with which it uses different phrases, leads to a conditional probability distribution whose analysis lets us automatically map newspapers and phrases into a bias space. By analyzing roughly a million articles from roughly a hundred newspapers for bias in dozen…

    Submitted 31 August, 2021; originally announced September 2021.

    Comments: 29 pages, 23 figs; data available at https://space.mit.edu/home/tegmark/phrasebias.html

  37. arXiv:2106.00026

    cs.LG astro-ph.IM gr-qc physics.comp-ph

    Machine-Learning Non-Conservative Dynamics for New-Physics Detection

    Authors: Ziming Liu, Bohan Wang, Qi Meng, Wei Chen, Max Tegmark, Tie-Yan Liu

    Abstract: Energy conservation is a basic physics principle, the breakdown of which often implies new physics. This paper presents a method for data-driven "new physics" discovery. Specifically, given a trajectory governed by unknown forces, our Neural New-Physics Detector (NNPhD) aims to detect new physics by decomposing the force field into conservative and non-conservative components, which are represente…

    Submitted 1 June, 2021; v1 submitted 31 May, 2021; originally announced June 2021.

    Comments: 17 pages, 7 figs, 2 tables; typo correction

  38. arXiv:2104.12240

    astro-ph.CO astro-ph.IM

    Effects of model incompleteness on the drift-scan calibration of radio telescopes

    Authors: Bharat K. Gehlot, Daniel C. Jacobs, Judd D. Bowman, Nivedita Mahesh, Steven G. Murray, Matthew Kolopanis, Adam P. Beardsley, Zara Abdurashidova, James E. Aguirre, Paul Alexander, Zaki S. Ali, Yanga Balfour, Gianni Bernardi, Tashalee S. Billings, Richard F. Bradley, Phil Bull, Jacob Burba, Steve Carey, Chris L. Carilli, Carina Cheng, David R. DeBoer, Matt Dexter, Eloy de Lera Acedo, Joshua S. Dillon, John Ely, et al. (54 additional authors not shown)

    Abstract: Precision calibration poses challenges to experiments probing the redshifted 21-cm signal of neutral hydrogen from the Cosmic Dawn and Epoch of Reionization (z~30-6). In both interferometric and global signal experiments, systematic calibration is the leading source of error. Though many aspects of calibration have been studied, the overlap between the two types of instruments has received less at…

    Submitted 15 July, 2021; v1 submitted 25 April, 2021; originally announced April 2021.

    Comments: 16 pages, 13 figures, 1 table; accepted for publication in MNRAS main journal

  39. arXiv:2011.04698

    cs.LG astro-ph.EP nlin.SI physics.class-ph

    AI Poincaré: Machine Learning Conservation Laws from Trajectories

    Authors: Ziming Liu, Max Tegmark

    Abstract: We present AI Poincaré, a machine learning algorithm for auto-discovering conserved quantities using trajectory data from unknown dynamical systems. We test it on five Hamiltonian systems, including the gravitational 3-body problem, and find that it discovers not only all exactly conserved quantities, but also periodic orbits, phase transitions and breakdown timescales for approximate conservation…

    Submitted 26 April, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: Replaced by accepted PRL version; expanded validation, improved presentation, more legible figs

    Journal ref: Phys. Rev. Lett. 126, 180604 (2021)

  40. arXiv:2006.10782

    cs.LG cs.AI cs.IT physics.comp-ph stat.ML

    AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity

    Authors: Silviu-Marian Udrescu, Andrew Tan, Jiahai Feng, Orisvaldo Neto, Tailin Wu, Max Tegmark

    Abstract: We present an improved method for symbolic regression that seeks to fit data to formulas that are Pareto-optimal, in the sense of having the best accuracy for a given complexity. It improves on the previous state-of-the-art by typically being orders of magnitude more robust toward noise and bad data, and also by discovering many formulas that stumped previous methods. We develop a method for disco…

    Submitted 16 December, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: 17 pages, 6 figs, replaced to match accepted NeurIPS version

    Journal ref: 34th Conference on Neural Information Processing Systems (Neurips 2020), Vancouver, Canada

  41. arXiv:2005.11212

    cs.CV cs.AI cs.LG physics.comp-ph stat.ML

    Symbolic Pregression: Discovering Physical Laws from Distorted Video

    Authors: Silviu-Marian Udrescu, Max Tegmark

    Abstract: We present a method for unsupervised learning of equations of motion for objects in raw and optionally distorted unlabeled video. We first train an autoencoder that maps each video frame into a low-dimensional latent space where the laws of motion are as simple as possible, by minimizing a combination of non-linearity, acceleration and prediction error. Differential equations describing the motion…

    Submitted 11 September, 2020; v1 submitted 19 May, 2020; originally announced May 2020.

    Comments: Expanded and improved physics discussion, additional method details. 9 pages, 7 figs

    Journal ref: Phys. Rev. E 103, 043307 (2021)

  42. Foreground modelling via Gaussian process regression: an application to HERA data

    Authors: Abhik Ghosh, Florent Mertens, Gianni Bernardi, Mário G. Santos, Nicholas S. Kern, Christopher L. Carilli, Trienko L. Grobler, Léon V. E. Koopmans, Daniel C. Jacobs, Adrian Liu, Aaron R. Parsons, Miguel F. Morales, James E. Aguirre, Joshua S. Dillon, Bryna J. Hazelton, Oleg M. Smirnov, Bharat K. Gehlot, Siyanda Matika, Paul Alexander, Zaki S. Ali, Adam P. Beardsley, Roshan K. Benefo, Tashalee S. Billings, Judd D. Bowman, Richard F. Bradley, et al. (48 additional authors not shown)

    Abstract: The key challenge in the observation of the redshifted 21-cm signal from cosmic reionization is its separation from the much brighter foreground emission. Such separation relies on the different spectral properties of the two components, although, in real life, the foreground intrinsic spectrum is often corrupted by the instrumental response, inducing systematic effects that can further jeopardize…

    Submitted 12 May, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

    Comments: 15 pages, 15 figures, 1 table, Accepted to MNRAS

  43. arXiv:2003.08399  [pdf, other]

    astro-ph.IM astro-ph.CO

    Redundant-Baseline Calibration of the Hydrogen Epoch of Reionization Array

    Authors: Joshua S. Dillon, Max Lee, Zaki S. Ali, Aaron R. Parsons, Naomi Orosz, Chuneeta Devi Nunhokee, Paul La Plante, Adam P. Beardsley, Nicholas S. Kern, Zara Abdurashidova, James E. Aguirre, Paul Alexander, Yanga Balfour, Gianni Bernardi, Tashalee S. Billings, Judd D. Bowman, Richard F. Bradley, Phil Bull, Jacob Burba, Steve Carey, Chris L. Carilli, Carina Cheng, David R. DeBoer, Matt Dexter, Eloy de Lera Acedo, et al. (54 additional authors not shown)

    Abstract: In 21 cm cosmology, precision calibration is key to the separation of the neutral hydrogen signal from very bright but spectrally-smooth astrophysical foregrounds. The Hydrogen Epoch of Reionization Array (HERA), an interferometer specialized for 21 cm cosmology and now under construction in South Africa, was designed to be largely calibrated using the self-consistency of repeated measurements of…

    Submitted 3 November, 2020; v1 submitted 18 March, 2020; originally announced March 2020.

    Comments: 24 Pages, 19 Figures. Updated to match the accepted MNRAS version

  44. arXiv:1908.08961  [pdf, other]

    cs.LG cs.CV cs.IT stat.ML

    Pareto-optimal data compression for binary classification tasks

    Authors: Max Tegmark, Tailin Wu

    Abstract: The goal of lossy data compression is to reduce the storage cost of a data set $X$ while retaining as much information as possible about something ($Y$) that you care about. For example, what aspects of an image $X$ contain the most information about whether it depicts a cat? Mathematically, this corresponds to finding a mapping $X\to Z\equiv f(X)$ that maximizes the mutual information $I(Z,Y)$ wh…

    Submitted 15 January, 2020; v1 submitted 23 August, 2019; originally announced August 2019.

    Comments: Replaced to match version published in Entropy. 17 pages, 9 figs; improved discussion, comparison with Blahut-Arimoto method

    Journal ref: Entropy (2020), 22, 7

  45. arXiv:1907.07331  [pdf, other]

    cs.LG cs.IT stat.ML

    Learnability for the Information Bottleneck

    Authors: Tailin Wu, Ian Fischer, Isaac L. Chuang, Max Tegmark

    Abstract: The Information Bottleneck (IB) method (Tishby et al., 2000) provides an insightful and principled approach for balancing compression and prediction for representation learning. The IB objective $I(X;Z)-βI(Y;Z)$ employs a Lagrange multiplier $β$ to tune this trade-off. However, in practice, not only is $β$ chosen empirically without theoretical guidance, there is also a lack of theoretica…

    Submitted 17 July, 2019; originally announced July 2019.

    Comments: Accepted at UAI 2019

  46. arXiv:1905.11481  [pdf, other]

    physics.comp-ph cs.AI cs.LG hep-th

    AI Feynman: a Physics-Inspired Method for Symbolic Regression

    Authors: Silviu-Marian Udrescu, Max Tegmark

    Abstract: A core challenge for both physics and artificial intelligence (AI) is symbolic regression: finding a symbolic expression that matches data from an unknown function. Although this problem is likely to be NP-hard in principle, functions of practical interest often exhibit symmetries, separability, compositionality and other simplifying properties. In this spirit, we develop a recursive multidimensio…

    Submitted 15 April, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: 15 pages, 2 figs. Our code is available at https://github.com/SJ001/AI-Feynman and our Feynman Symbolic Regression Database for benchmarking can be downloaded at https://space.mit.edu/home/tegmark/aifeynman.html

    Journal ref: Science Advances, 6:eaay2631, April 15, 2020

  47. The role of artificial intelligence in achieving the Sustainable Development Goals

    Authors: Ricardo Vinuesa, Hossein Azizpour, Iolanda Leite, Madeline Balaam, Virginia Dignum, Sami Domisch, Anna Felländer, Simone Langhans, Max Tegmark, Francesco Fuso Nerini

    Abstract: The emergence of artificial intelligence (AI) and its progressively wider impact on many sectors across the society requires an assessment of its effect on sustainable development. Here we analyze published evidence of positive or negative impacts of AI on the achievement of each of the 17 goals and 169 targets of the 2030 Agenda for Sustainable Development. We find that AI can support the achieve…

    Submitted 30 April, 2019; originally announced May 2019.

  48. arXiv:1902.03364  [pdf, other]

    physics.data-an stat.ML

    Latent Representations of Dynamical Systems: When Two is Better Than One

    Authors: Max Tegmark

    Abstract: A popular approach for predicting the future of dynamical systems involves mapping them into a lower-dimensional "latent space" where prediction is easier. We show that the information-theoretically optimal approach uses different mappings for present and future, in contrast to state-of-the-art machine-learning approaches where both mappings are the same. We illustrate this dichotomy by predicting…

    Submitted 20 February, 2019; v1 submitted 8 February, 2019; originally announced February 2019.

    Comments: Improved references and explanation of why two representations generally outperform one for time-irreversible processes. 6 pages, 4 figs

  49. arXiv:1810.10525  [pdf, other]

    physics.comp-ph cond-mat.dis-nn cs.LG

    Toward an AI Physicist for Unsupervised Learning

    Authors: Tailin Wu, Max Tegmark

    Abstract: We investigate opportunities and challenges for improving unsupervised machine learning using four common strategies with a long history in physics: divide-and-conquer, Occam's razor, unification and lifelong learning. Instead of using one model to learn everything, we propose a novel paradigm centered around the learning and manipulation of *theories*, which parsimoniously predict both aspects of…

    Submitted 1 September, 2019; v1 submitted 24 October, 2018; originally announced October 2018.

    Comments: Replaced to match accepted PRE version. Added references, improved discussion. 22 pages, 7 figs

    Journal ref: Phys. Rev. E 100, 033311 (2019)

  50. arXiv:1810.07253  [pdf, other]

    cond-mat.dis-nn q-bio.NC q-bio.QM

    Ensemble Inhibition and Excitation in the Human Cortex: an Ising Model Analysis with Uncertainties

    Authors: Cristian Zanoci, Nima Dehghani, Max Tegmark

    Abstract: The pairwise maximum entropy model, also known as the Ising model, has been widely used to analyze the collective activity of neurons. However, controversy persists in the literature about seemingly inconsistent findings, whose significance is unclear due to lack of reliable error estimates. We therefore develop a method for accurately estimating parameter uncertainty based on random walks in para…

    Submitted 16 October, 2018; originally announced October 2018.

    Comments: 17 pages, 8 figures

    Journal ref: Phys. Rev. E 99, 032408 (2019)