-
Nature's Cost Function: Simulating Physics by Minimizing the Action
Authors:
Tim Strang,
Isabella Caruso,
Sam Greydanus
Abstract:
In physics, there is a scalar function called the action which behaves like a cost function. When minimized, it yields the "path of least action" which represents the path a physical system will take through space and time. This function is crucial in theoretical physics and is usually minimized analytically to obtain equations of motion for various problems. In this paper, we propose a different approach: instead of minimizing the action analytically, we discretize it and then minimize it directly with gradient descent. We use this approach to obtain dynamics for six different physical systems and show that they are nearly identical to ground-truth dynamics. We discuss failure modes such as the unconstrained energy effect and show how to address them. Finally, we use the discretized action to construct a simple but novel quantum simulation.
Submitted 3 March, 2023;
originally announced March 2023.
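The idea in this abstract (discretize the action, then minimize it directly with gradient descent) can be illustrated with a minimal NumPy sketch. This is not the authors' code. The free-fall system, the midpoint rule for the potential term, the hand-derived gradient, the step size, and every name used here (`action`, `action_grad`, `minimize_action`, `n_segments`, `steps`) are assumptions chosen for illustration.

```python
import numpy as np

def action(x, dt, m=1.0, g=9.8):
    """Discretized action S = sum((T - V) * dt) for a particle in uniform gravity."""
    v = np.diff(x) / dt                          # velocity on each segment
    kinetic = 0.5 * m * v**2
    potential = m * g * 0.5 * (x[:-1] + x[1:])   # midpoint height on each segment
    return np.sum((kinetic - potential) * dt)

def action_grad(x, dt, m=1.0, g=9.8):
    """Gradient of the discretized action w.r.t. the interior points of the path."""
    grad = np.zeros_like(x)                      # endpoints stay fixed (zero gradient)
    grad[1:-1] = -m * (x[2:] - 2 * x[1:-1] + x[:-2]) / dt - m * g * dt
    return grad

def minimize_action(x_start, x_end, T=1.0, n_segments=20, steps=5000, m=1.0, g=9.8):
    """Plain gradient descent on the path, starting from a straight line."""
    t = np.linspace(0.0, T, n_segments + 1)
    dt = t[1] - t[0]
    x = np.linspace(x_start, x_end, n_segments + 1)
    lr = dt / (4.0 * m)                          # small enough for stable descent here
    for _ in range(steps):
        x -= lr * action_grad(x, dt, m, g)
    return t, x

t, x = minimize_action(0.0, 0.0)
x_true = 0.5 * 9.8 * t * (1.0 - t)               # analytic parabola with the same endpoints
print("max deviation from analytic path:", np.max(np.abs(x - x_true)))
```

For this particular system the stationarity condition of the discretized action is the second-difference form of x'' = -g, so the minimizing path coincides with the analytic parabola at the grid points. Harder systems would generally need an automatic-differentiation library and a more careful optimizer.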
-
A Tutorial on Structural Optimization
Authors:
Sam Greydanus
Abstract:
Structural optimization is a useful and interesting tool. Unfortunately, it can be hard for new researchers to get started on the topic because existing tutorials assume the reader has substantial domain knowledge. They obscure the fact that structural optimization is really quite simple, elegant, and easy to implement. With that in mind, let's write our own structural optimization code, from scratch, in 180 lines.
Submitted 9 May, 2022;
originally announced May 2022.
-
Dissipative Hamiltonian Neural Networks: Learning Dissipative and Conservative Dynamics Separately
Authors:
Andrew Sosanya,
Sam Greydanus
Abstract:
Understanding natural symmetries is key to making sense of our complex and ever-changing world. Recent work has shown that neural networks can learn such symmetries directly from data using Hamiltonian Neural Networks (HNNs). But HNNs struggle when trained on datasets where energy is not conserved. In this paper, we ask whether it is possible to identify and decompose conservative and dissipative dynamics simultaneously. We propose Dissipative Hamiltonian Neural Networks (D-HNNs), which parameterize both a Hamiltonian and a Rayleigh dissipation function. Taken together, they represent an implicit Helmholtz decomposition which can separate dissipative effects such as friction from symmetries such as conservation of energy. We train our model to decompose a damped mass-spring system into its friction and inertial terms and then show that this decomposition can be used to predict dynamics for unseen friction coefficients. Then we apply our model to real world data including a large, noisy ocean current dataset where decomposing the velocity field yields useful scientific insights.
Submitted 25 January, 2022; v1 submitted 24 January, 2022;
originally announced January 2022.
-
Piecewise-constant Neural ODEs
Authors:
Sam Greydanus,
Stefan Lee,
Alan Fern
Abstract:
Neural networks are a popular tool for modeling sequential data but they generally do not treat time as a continuous variable. Neural ODEs represent an important exception: they parameterize the time derivative of a hidden state with a neural network and then integrate over arbitrary amounts of time. But these parameterizations, which have arbitrary curvature, can be hard to integrate and thus train and evaluate. In this paper, we propose making a piecewise-constant approximation to Neural ODEs to mitigate these issues. Our model can be integrated exactly via Euler integration and can generate autoregressive samples in 3-20 times fewer steps than comparable RNN and ODE-RNN models. We evaluate our model on several synthetic physics tasks and a planning task inspired by the game of billiards. We find that it matches the performance of baseline approaches while requiring less time to train and evaluate.
Submitted 11 June, 2021;
originally announced June 2021.
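The claim that a piecewise-constant model "can be integrated exactly via Euler integration" follows from a simple fact: when the time derivative is constant over a segment, one Euler step of any length gives the exact end state. The sketch below illustrates only that fact. It contains no neural network, and the function names, the hand-picked velocities, and the segment durations are hypothetical stand-ins for whatever a trained model would predict.

```python
import numpy as np

def integrate_piecewise_constant(h0, velocities, durations):
    """One Euler step per segment; exact because dh/dt is constant within each segment."""
    h = np.asarray(h0, dtype=float)
    states = [h.copy()]
    for v, dt in zip(velocities, durations):
        h = h + dt * np.asarray(v, dtype=float)
        states.append(h.copy())
    return np.stack(states)

def integrate_fine(h0, velocities, durations, substeps=1000):
    """Reference integration: many small Euler steps over the same segments."""
    h = np.asarray(h0, dtype=float)
    for v, dt in zip(velocities, durations):
        for _ in range(substeps):
            h = h + (dt / substeps) * np.asarray(v, dtype=float)
    return h

# A billiards-like 2D path: the velocity changes only at a few "bounce" times.
velocities = [(1.0, 0.5), (1.0, -0.5), (-1.0, -0.5)]
durations = [2.0, 1.5, 0.7]

coarse = integrate_piecewise_constant((0.0, 0.0), velocities, durations)
fine = integrate_fine((0.0, 0.0), velocities, durations)
print("final state, 3 steps:   ", coarse[-1])
print("final state, 3000 steps:", fine)
```

Both integrations reach the same final state up to floating-point rounding, while the piecewise-constant version takes one step per segment.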
-
Scaling Down Deep Learning with MNIST-1D
Authors:
Sam Greydanus,
Dmitry Kobak
Abstract:
Although deep learning models have taken on commercial and political relevance, key aspects of their training and operation remain poorly understood. This has sparked interest in science of deep learning projects, many of which require large amounts of time, money, and electricity. But how much of this research really needs to occur at scale? In this paper, we introduce MNIST-1D: a minimalist, procedurally generated, low-memory, and low-compute alternative to classic deep learning benchmarks. Although the dimensionality of MNIST-1D is only 40 and its default training set size only 4000, MNIST-1D can be used to study inductive biases of different deep architectures, find lottery tickets, observe deep double descent, metalearn an activation function, and demonstrate guillotine regularization in self-supervised learning. All these experiments can be conducted on a GPU or often even on a CPU within minutes, allowing for fast prototyping, educational use cases, and cutting-edge research on a low budget.
Submitted 3 June, 2024; v1 submitted 29 November, 2020;
originally announced November 2020.
-
The Story of Airplane Wings
Authors:
Sam Greydanus
Abstract:
The purpose of this work is to explain how wings work and how they were invented. We use the lens of history, looking at the individual people who wanted to fly, the lens of technology, looking at the key inventions leading up to modern airplanes, and the lens of physics, looking at the equations of airflow that made it all possible. Finally, we derive our own wing from scratch. We do this by simulating a wind tunnel, placing a rectangular occlusion in it, and then using gradient ascent to turn it into a wing.
Submitted 14 October, 2020;
originally announced October 2020.
-
Lagrangian Neural Networks
Authors:
Miles Cranmer,
Sam Greydanus,
Stephan Hoyer,
Peter Battaglia,
David Spergel,
Shirley Ho
Abstract:
Accurate models of the world are built upon notions of its underlying symmetries. In physics, these symmetries correspond to conservation laws, such as for energy and momentum. Yet even though neural network models see increasing use in the physical sciences, they struggle to learn these symmetries. In this paper, we propose Lagrangian Neural Networks (LNNs), which can parameterize arbitrary Lagrangians using neural networks. In contrast to models that learn Hamiltonians, LNNs do not require canonical coordinates, and thus perform well in situations where canonical momenta are unknown or difficult to compute. Unlike previous approaches, our method does not restrict the functional form of learned energies and will produce energy-conserving models for a variety of tasks. We test our approach on a double pendulum and a relativistic particle, demonstrating energy conservation where a baseline approach incurs dissipation and modeling relativity without canonical coordinates where a Hamiltonian approach fails. Finally, we show how this model can be applied to graphs and continuous systems using a Lagrangian Graph Network, and demonstrate it on the 1D wave equation.
Submitted 30 July, 2020; v1 submitted 10 March, 2020;
originally announced March 2020.
-
Neural reparameterization improves structural optimization
Authors:
Stephan Hoyer,
Jascha Sohl-Dickstein,
Sam Greydanus
Abstract:
Structural optimization is a popular method for designing objects such as bridge trusses, airplane wings, and optical devices. Unfortunately, the quality of solutions depends heavily on how the problem is parameterized. In this paper, we propose using the implicit bias over functions induced by neural networks to improve the parameterization of structural optimization. Rather than directly optimizing densities on a grid, we instead optimize the parameters of a neural network which outputs those densities. This reparameterization leads to different and often better solutions. On a selection of 116 structural optimization tasks, our approach produces the best design 50% more often than the best baseline method.
Submitted 13 September, 2019; v1 submitted 9 September, 2019;
originally announced September 2019.
-
Hamiltonian Neural Networks
Authors:
Sam Greydanus,
Misko Dzamba,
Jason Yosinski
Abstract:
Even though neural networks enjoy widespread use, they still struggle to learn the basic laws of physics. How might we endow them with better inductive biases? In this paper, we draw inspiration from Hamiltonian mechanics to train models that learn and respect exact conservation laws in an unsupervised manner. We evaluate our models on problems where conservation of energy is important, including the two-body problem and pixel observations of a pendulum. Our model trains faster and generalizes better than a regular neural network. An interesting side effect is that our model is perfectly reversible in time.
Submitted 5 September, 2019; v1 submitted 4 June, 2019;
originally announced June 2019.
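The Hamiltonian mechanics this abstract draws on can be summarized in two equations: dq/dt = ∂H/∂p and dp/dt = -∂H/∂q. The sketch below applies them to a hand-written Hamiltonian rather than a learned one, so it shows only the structure that yields energy conservation, not the training procedure. The unit mass-spring system, the finite-difference derivatives, the RK4 integrator, and all names (`hamiltonian`, `symplectic_gradient`, `rollout`) are assumptions made for illustration.

```python
import numpy as np

def hamiltonian(q, p):
    """Total energy of a unit mass on a unit spring (stand-in for a learned H)."""
    return 0.5 * p**2 + 0.5 * q**2

def symplectic_gradient(H, q, p, eps=1e-5):
    """Hamilton's equations via central differences: (dq/dt, dp/dt) = (dH/dp, -dH/dq)."""
    dH_dq = (H(q + eps, p) - H(q - eps, p)) / (2 * eps)
    dH_dp = (H(q, p + eps) - H(q, p - eps)) / (2 * eps)
    return dH_dp, -dH_dq

def rollout(H, q, p, dt=0.01, steps=1000):
    """Integrate the vector field with classical fourth-order Runge-Kutta."""
    def f(state):
        dq, dp = symplectic_gradient(H, state[0], state[1])
        return np.array([dq, dp])

    state = np.array([q, p], dtype=float)
    for _ in range(steps):
        k1 = f(state)
        k2 = f(state + 0.5 * dt * k1)
        k3 = f(state + 0.5 * dt * k2)
        k4 = f(state + dt * k3)
        state = state + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return state[0], state[1]

q0, p0 = 1.0, 0.0
q1, p1 = rollout(hamiltonian, q0, p0)
print("energy before:", hamiltonian(q0, p0))
print("energy after: ", hamiltonian(q1, p1))
```

Because the vector field is derived from a single scalar function H, the energy is conserved along exact trajectories of that field, and the integrated rollout preserves it up to numerical error. In a Hamiltonian Neural Network the scalar H would be the output of a network and the derivatives would typically come from automatic differentiation.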
-
Learning Finite State Representations of Recurrent Policy Networks
Authors:
Anurag Koul,
Sam Greydanus,
Alan Fern
Abstract:
Recurrent neural networks (RNNs) are an effective representation of control policies for a wide range of reinforcement and imitation learning problems. RNN policies, however, are particularly difficult to explain, understand, and analyze due to their use of continuous-valued memory vectors and observation features. In this paper, we introduce a new technique, Quantized Bottleneck Insertion, to learn finite representations of these vectors and features. The result is a quantized representation of the RNN that can be analyzed to improve our understanding of memory use and general behavior. We present results of this approach on synthetic environments and six Atari games. The resulting finite representations are surprisingly small in some cases, using as few as 3 discrete memory states and 10 observations for a perfect Pong policy. We also show that these finite policy representations lead to improved interpretability.
Submitted 29 November, 2018;
originally announced November 2018.
-
Visualizing and Understanding Atari Agents
Authors:
Sam Greydanus,
Anurag Koul,
Jonathan Dodge,
Alan Fern
Abstract:
While deep reinforcement learning (deep RL) agents are effective at maximizing rewards, it is often unclear what strategies they use to do so. In this paper, we take a step toward explaining deep RL agents through a case study using Atari 2600 environments. In particular, we focus on using saliency maps to understand how an agent learns and executes a policy. We introduce a method for generating useful saliency maps and use it to show 1) what strong agents attend to, 2) whether agents are making decisions for the right or wrong reasons, and 3) how agents evolve during learning. We also test our method on non-expert human subjects and find that it improves their ability to reason about these agents. Overall, our results show that saliency information can provide significant insight into an RL agent's decisions and learning behavior.
Submitted 10 September, 2018; v1 submitted 31 October, 2017;
originally announced November 2017.
-
Learning the Enigma with Recurrent Neural Networks
Authors:
Sam Greydanus
Abstract:
Recurrent neural networks (RNNs) represent the state of the art in translation, image captioning, and speech recognition. They are also capable of learning algorithmic tasks such as long addition, copying, and sorting from a set of training examples. We demonstrate that RNNs can learn decryption algorithms -- the mappings from plaintext to ciphertext -- for three polyalphabetic ciphers (Vigenère, Autokey, and Enigma). Most notably, we demonstrate that an RNN with a 3000-unit Long Short-Term Memory (LSTM) cell can learn the decryption function of the Enigma machine. We argue that our model learns efficient internal representations of these ciphers 1) by exploring activations of individual memory neurons and 2) by comparing memory usage across the three ciphers. To be clear, our work is not aimed at 'cracking' the Enigma cipher. However, we do show that our model can perform elementary cryptanalysis by running known-plaintext attacks on the Vigenère and Autokey ciphers. Our results indicate that RNNs can learn algorithmic representations of black box polyalphabetic ciphers and that these representations are useful for cryptanalysis.
Submitted 7 September, 2017; v1 submitted 24 August, 2017;
originally announced August 2017.