Showing 1–20 of 20 results for author: Freeman, C D

  1. arXiv:2311.07587  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

    Authors: C. Daniel Freeman, Laura Culp, Aaron Parisi, Maxwell L Bileschi, Gamaleldin F Elsayed, Alex Rizkowsky, Isabelle Simpson, Alex Alemi, Azade Nova, Ben Adlam, Bernd Bohnet, Gaurav Mishra, Hanie Sedghi, Igor Mordatch, Izzeddin Gur, Jaehoon Lee, JD Co-Reyes, Jeffrey Pennington, Kelvin Xu, Kevin Swersky, Kshiteej Mahajan, Lechao Xiao, Rosanne Liu, Simon Kornblith, Noah Constant, et al. (5 additional authors not shown)

    Abstract: We introduce and study the problem of adversarial arithmetic, which provides a simple yet challenging testbed for language model alignment. This problem is comprised of arithmetic questions posed in natural language, with an arbitrary adversarial string inserted before the question is complete. Even in the simple setting of 1-digit addition problems, it is easy to find adversarial prompts that mak…

    Submitted 15 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

  2. arXiv:2310.10047  [pdf, other]

    cs.CL

    Improving Large Language Model Fine-tuning for Solving Math Problems

    Authors: Yixin Liu, Avi Singh, C. Daniel Freeman, John D. Co-Reyes, Peter J. Liu

    Abstract: Despite their success in many natural language tasks, solving math problems remains a significant challenge for large language models (LLMs). A large gap exists between LLMs' pass-at-one and pass-at-N performance in solving math problems, suggesting LLMs might be close to finding correct solutions, motivating our exploration of fine-tuning methods to unlock LLMs' performance. Using the challenging…

    Submitted 16 October, 2023; originally announced October 2023.

  3. arXiv:2212.01055  [pdf, other]

    cs.CV

    Transformer-Based Learned Optimization

    Authors: Erik Gärtner, Luke Metz, Mykhaylo Andriluka, C. Daniel Freeman, Cristian Sminchisescu

    Abstract: We propose a new approach to learned optimization where we represent the computation of an optimizer's update step using a neural network. The parameters of the optimizer are then learned by training on a set of optimization tasks with the objective to perform minimization efficiently. Our innovation is a new neural network architecture, Optimus, for the learned optimizer inspired by the classic B…

    Submitted 28 June, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: Accepted to the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (CVPR) in Vancouver, Canada

  4. arXiv:2211.09760  [pdf, other]

    cs.LG math.OC stat.ML

    VeLO: Training Versatile Learned Optimizers by Scaling Up

    Authors: Luke Metz, James Harrison, C. Daniel Freeman, Amil Merchant, Lucas Beyer, James Bradbury, Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Sohl-Dickstein

    Abstract: While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers. In this work, we leverage the same scaling approach behind the success of deep learning to learn versatile optimizers. We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates. M…

    Submitted 17 November, 2022; originally announced November 2022.

  5. arXiv:2211.08199  [pdf, other]

    cs.RO

    Allowing Safe Contact in Robotic Goal-Reaching: Planning and Tracking in Operational and Null Spaces

    Authors: Xinghao Zhu, Wenzhao Lian, Bodi Yuan, C. Daniel Freeman, Masayoshi Tomizuka

    Abstract: In recent years, impressive results have been achieved in robotic manipulation. While many efforts focus on generating collision-free reference signals, few allow safe contact between the robot bodies and the environment. However, in human's daily manipulation, contact between arms and obstacles is prevalent and even necessary. This paper investigates the benefit of allowing safe contact during ro…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: 7 pages, 5 figures, submitted to ICRA 2023

  6. arXiv:2203.11860  [pdf, other]

    cs.LG cs.NE math.OC stat.ML

    Practical tradeoffs between memory, compute, and performance in learned optimizers

    Authors: Luke Metz, C. Daniel Freeman, James Harrison, Niru Maheswaranathan, Jascha Sohl-Dickstein

    Abstract: Optimization plays a costly and crucial role in developing machine learning systems. In learned optimizers, the few hyperparameters of commonly used hand-designed optimizers, e.g. Adam or SGD, are replaced with flexible parametric functions. The parameters of these functions are then optimized so that the resulting learned optimizer minimizes a target loss on a chosen class of models. Learned opti…

    Submitted 16 July, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

  7. arXiv:2111.05803  [pdf, other]

    cs.LG stat.ML

    Gradients are Not All You Need

    Authors: Luke Metz, C. Daniel Freeman, Samuel S. Schoenholz, Tal Kachman

    Abstract: Differentiable programming techniques are widely used in the community and are responsible for the machine learning renaissance of the past several decades. While these methods are powerful, they have limits. In this short report, we discuss a common chaos based failure mode which appears in a variety of differentiable circumstances, ranging from recurrent neural networks and numerical physics sim…

    Submitted 20 January, 2022; v1 submitted 10 November, 2021; originally announced November 2021.

  8. arXiv:2106.13281  [pdf, other]

    cs.RO cs.AI

    Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation

    Authors: C. Daniel Freeman, Erik Frey, Anton Raichuk, Sertan Girgin, Igor Mordatch, Olivier Bachem

    Abstract: We present Brax, an open source library for rigid body simulation with a focus on performance and parallelism on accelerators, written in JAX. We present results on a suite of tasks inspired by the existing reinforcement learning literature, but remade in our engine. Additionally, we provide reimplementations of PPO, SAC, ES, and direct policy optimization in JAX that compile alongside our environ…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: 9 pages + 12 pages of appendices and references. In submission at NeurIPS 2021 Datasets and Benchmarks Track

  9. arXiv:2101.07367  [pdf, other]

    cs.LG cs.NE

    Training Learned Optimizers with Randomly Initialized Learned Optimizers

    Authors: Luke Metz, C. Daniel Freeman, Niru Maheswaranathan, Jascha Sohl-Dickstein

    Abstract: Learned optimizers are increasingly effective, with performance exceeding that of hand designed optimizers such as Adam (Kingma and Ba, 2014) on specific tasks (Metz et al., 2019). Despite the potential gains available, in current work the meta-training (or 'outer-training') of the learned optimizer is performed by a hand-designed optimizer, or by an optimizer trained by a hand-designed…

    Submitted 14 January, 2021; originally announced January 2021.

  10. arXiv:2009.11243  [pdf, other]

    cs.LG cs.NE stat.ML

    Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves

    Authors: Luke Metz, Niru Maheswaranathan, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein

    Abstract: Much as replacing hand-designed features with learned functions has revolutionized how we solve perceptual tasks, we believe learned algorithms will transform how we train models. In this work we focus on general-purpose learned optimizers capable of training a wide variety of problems with no user-specified hyperparameters. We introduce a new, neural network parameterized, hierarchical optimizer…

    Submitted 23 September, 2020; originally announced September 2020.

  11. arXiv:2002.11887  [pdf, other]

    cs.LG stat.ML

    Using a thousand optimization tasks to learn hyperparameter search strategies

    Authors: Luke Metz, Niru Maheswaranathan, Ruoxi Sun, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein

    Abstract: We present TaskSet, a dataset of tasks for use in training and evaluating optimizers. TaskSet is unique in its size and diversity, containing over a thousand tasks ranging from image classification with fully connected or convolutional neural networks, to variational autoencoders, to non-volume preserving flows on a variety of datasets. As an example application of such a dataset we explore meta-l…

    Submitted 31 March, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

  12. arXiv:1910.13038  [pdf, other]

    cs.NE cs.LG

    Learning to Predict Without Looking Ahead: World Models Without Forward Prediction

    Authors: C. Daniel Freeman, Luke Metz, David Ha

    Abstract: Much of model-based reinforcement learning involves learning a model of an agent's world, and training an agent to leverage this model to perform a task more efficiently. While these models are demonstrably useful for agents, every naturally occurring model of the world of which we are aware---e.g., a brain---arose as the byproduct of competing evolutionary pressures for survival, not minimization…

    Submitted 30 October, 2019; v1 submitted 28 October, 2019; originally announced October 2019.

    Comments: To appear at the Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019)

  13. arXiv:1810.10180  [pdf, other]

    cs.NE stat.ML

    Understanding and correcting pathologies in the training of learned optimizers

    Authors: Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Daniel Freeman, Jascha Sohl-Dickstein

    Abstract: Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks. Analogously, this suggests that learned optimizers may similarly outperform current hand-designed optimizers, especially for specific problems. However, learned optimizers are notoriously difficult to train and have yet to demonstrate wall-clock speedups over hand-designed optimi…

    Submitted 7 June, 2019; v1 submitted 24 October, 2018; originally announced October 2018.

  14. arXiv:1807.00821  [pdf, other]

    physics.comp-ph cond-mat.mtrl-sci physics.chem-ph quant-ph

    Modern Approaches to Exact Diagonalization and Selected Configuration Interaction with the Adaptive Sampling CI Method

    Authors: Norm M. Tubman, C. Daniel Freeman, Daniel S. Levine, Diptarka Hait, Martin Head-Gordon, K. Birgitta Whaley

    Abstract: Recent advances in selected CI, including the adaptive sampling configuration interaction (ASCI) algorithm and its heat bath extension, have made the ASCI approach competitive with the most accurate techniques available, and hence an increasingly powerful tool in solving quantum Hamiltonians. In this work, we show that a useful paradigm for generating efficient selected CI/exact diagonalization al…

    Submitted 28 December, 2019; v1 submitted 2 July, 2018; originally announced July 2018.

    Comments: 22 pages, 8 figures, 15 tables (added supplemental information on Cr2 in the svp basis)

    Journal ref: J. Chem. Theory Comput. 2020, 16, 4, 2139-2159

  15. arXiv:1710.03757  [pdf, other]

    cond-mat.str-el cond-mat.stat-mech quant-ph

    Monte Carlo Tensor Network Renormalization

    Authors: William Huggins, C. Daniel Freeman, Miles Stoudenmire, Norm M. Tubman, K. Birgitta Whaley

    Abstract: Techniques for approximately contracting tensor networks are limited in how efficiently they can make use of parallel computing resources. In this work we demonstrate and characterize a Monte Carlo approach to the tensor network renormalization group method which can be used straightforwardly on modern computing architectures. We demonstrate the efficiency of the technique and show that Monte Carl…

    Submitted 10 October, 2017; originally announced October 2017.

    Comments: 8 pages, 3 figures

  16. arXiv:1708.02260  [pdf, other]

    quant-ph cond-mat.stat-mech

    Stable quantum memories with limited measurement

    Authors: C. Daniel Freeman, Mohan Sarovar, C. M. Herdman, K. B. Whaley

    Abstract: We demonstrate the existence of a finite temperature threshold for a 1D stabilizer code under an error correcting protocol that requires only a fraction of the syndrome measurements. Below the threshold temperature, encoded states have exponentially long lifetimes, as demonstrated by numerical and analytical arguments. We sketch how this algorithm generalizes to higher dimensional stabilizer codes…

    Submitted 7 August, 2017; originally announced August 2017.

    Comments: 11 Pages, 7 Figures

    Journal ref: Phys. Rev. A 98, 032322 (2018)

  17. arXiv:1611.01540  [pdf, other]

    stat.ML cs.LG

    Topology and Geometry of Half-Rectified Network Optimization

    Authors: C. Daniel Freeman, Joan Bruna

    Abstract: The loss surface of deep neural networks has recently attracted interest in the optimization and machine learning communities as a prime example of high-dimensional non-convex problem. Some insights were recently gained using spin glass models and mean-field approximations, but at the expense of strongly simplifying the nonlinear nature of the model. In this work, we do not make any such assumpt…

    Submitted 1 June, 2017; v1 submitted 4 November, 2016; originally announced November 2016.

    Comments: 22 Pages (10 main + Appendices), 4 Figures, 1 Table, Published as a conference paper at ICLR 2017

  18. arXiv:1608.05074  [pdf, other]

    cond-mat.str-el hep-th quant-ph

    Entanglement structure of non-equilibrium steady states

    Authors: Raghu Mahajan, C. Daniel Freeman, Sam Mumford, Norm Tubman, Brian Swingle

    Abstract: We study the problem of calculating transport properties of interacting quantum systems, specifically electrical and thermal conductivities, by computing the non-equilibrium steady state (NESS) of the system biased by contacts. Our approach is based on the structure of entanglement in the NESS. With reasonable physical assumptions, we show that a NESS close to local equilibrium is lightly entangle…

    Submitted 17 August, 2016; originally announced August 2016.

    Comments: 10 pages + appendices, 10 figures

  19. arXiv:1603.05005  [pdf, other]

    quant-ph cond-mat.stat-mech

    Engineering autonomous error correction in stabilizer codes at finite temperature

    Authors: C. Daniel Freeman, C. M. Herdman, K. B. Whaley

    Abstract: We present an error correcting protocol that enhances the lifetime of stabilizer code based qubits which are susceptible to the creation of pairs of localized defects (due to string-like error operators) at finite temperature, such as the toric code. The primary tool employed is dynamic application of a local, unitary operator which exchanges defects and thereby translates localized excitations. C…

    Submitted 5 May, 2016; v1 submitted 16 March, 2016; originally announced March 2016.

    Comments: 14 pages, 13 figures. Comments welcome, APS March Meeting session K44.00007

    Journal ref: Phys. Rev. A 96, 012311 (2017)

  20. arXiv:1405.2315  [pdf, other]

    cond-mat.stat-mech quant-ph

    Relaxation dynamics of the toric code in contact with a thermal reservoir: Finite-size scaling in a low temperature regime

    Authors: C. Daniel Freeman, C. M. Herdman, Dylan J Gorman, K. B. Whaley

    Abstract: We present an analysis of the relaxation dynamics of finite-size topological qubits in contact with a thermal bath. Using a continuous-time Monte Carlo method, we explicitly compute the low-temperature nonequilibrium dynamics of the toric code on finite lattices. In contrast to the size-independent bound predicted for the toric code in the thermodynamic limit, we identify a low-temperature regime…

    Submitted 4 December, 2014; v1 submitted 9 May, 2014; originally announced May 2014.

    Comments: 14 Pages, 13 figures

    Journal ref: Phys. Rev. B 90, 134302 (2014)