-
Consistent, multidimensional differential histogramming and summary statistics with YODA 2
Authors:
Andy Buckley,
Louie Corpe,
Matthew Filipovich,
Christian Gutschow,
Nick Rozinsky,
Simon Thor,
Yoran Yeh,
Jamie Yellen
Abstract:
Histogramming is often taken for granted, but the power and compactness of partially aggregated, multidimensional summary statistics, and their fundamental connection to differential and integral calculus, make them formidable statistical objects, especially when very large data volumes are involved. But expressing these concepts robustly and efficiently in high-dimensional parameter spaces and for large data samples is a highly non-trivial challenge, doubly so if the resulting library is to remain usable by scientists as opposed to software engineers. In this paper we summarise the core principles required for consistent generalised histogramming, and use them to motivate the design principles and implementation mechanics of the re-engineered YODA histogramming library, a key component of physics data-model comparison and statistical interpretation in collider physics.
Submitted 25 April, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
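As a rough illustration of the "partially aggregated summary statistics" mentioned in the YODA 2 abstract, the Python sketch below keeps a few weighted running sums per bin instead of the raw data. It is not the YODA API: the names `BinStats` and `SketchHisto1D`, their methods, and the choice of which sums to store are all assumptions made for this example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class BinStats:
    """Running weighted sums for one bin (hypothetical, not a YODA class)."""
    sumw: float = 0.0    # sum of weights
    sumw2: float = 0.0   # sum of squared weights (for statistical errors)
    sumwx: float = 0.0   # sum of weight * x (for the in-bin mean)

    def fill(self, x: float, w: float = 1.0) -> None:
        self.sumw += w
        self.sumw2 += w * w
        self.sumwx += w * x


class SketchHisto1D:
    """1D histogram with half-open bins [edge_i, edge_{i+1}) (illustrative only)."""

    def __init__(self, edges):
        self.edges = list(edges)
        self.bins = [BinStats() for _ in range(len(self.edges) - 1)]

    def fill(self, x: float, w: float = 1.0) -> None:
        i = bisect_right(self.edges, x) - 1
        if 0 <= i < len(self.bins):  # silently drop out-of-range fills
            self.bins[i].fill(x, w)

    def width(self, i: int) -> float:
        return self.edges[i + 1] - self.edges[i]

    def density(self, i: int) -> float:
        """Differential estimate: summed weight per unit x in bin i."""
        return self.bins[i].sumw / self.width(i)

    def mean(self, i: int) -> float:
        """Weighted mean of x inside bin i (the bin must be non-empty)."""
        return self.bins[i].sumwx / self.bins[i].sumw

    def integral(self) -> float:
        """Total in-range weight, equal to the sum of density * width."""
        return sum(b.sumw for b in self.bins)
```

Dividing the summed weight by the bin width gives the differential estimate, and summing density times width recovers the integral, which is one way to read the connection to calculus that the abstract describes.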
-
Role of Spatial Coherence in Diffractive Optical Neural Networks
Authors:
Matthew J. Filipovich,
Aleksei Malyshev,
A. I. Lvovsky
Abstract:
Diffractive optical neural networks (DONNs) have emerged as a promising optical hardware platform for ultra-fast and energy-efficient signal processing for machine learning tasks, particularly in computer vision. Previous experimental demonstrations of DONNs have only been performed using coherent light. However, many real-world DONN applications require consideration of the spatial coherence properties of the optical signals. Here, we study the role of spatial coherence in DONN operation and performance. We propose a numerical approach to efficiently simulate DONNs under incoherent and partially coherent input illumination and discuss the corresponding computational complexity. As a demonstration, we train and evaluate simulated DONNs on the MNIST dataset of handwritten digits to process light with varying spatial coherence.
Submitted 23 May, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Scaling Laws Beyond Backpropagation
Authors:
Matthew J. Filipovich,
Alessandro Cappelli,
Daniel Hesslow,
Julien Launay
Abstract:
Alternatives to backpropagation have long been studied to better understand how biological brains may learn. Recently, they have also garnered interest as a way to train neural networks more efficiently. By relaxing constraints inherent to backpropagation (e.g., symmetric feedforward and feedback weights, sequential updates), these methods enable promising prospects, such as local learning. However, the tradeoffs between different methods in terms of final task performance, convergence speed, and ultimately compute and data requirements are rarely outlined. In this work, we use scaling laws to study the ability of Direct Feedback Alignment (DFA) to train causal decoder-only Transformers efficiently. Scaling laws provide an overview of the tradeoffs implied by a modeling decision, up to extrapolating how it might transfer to increasingly large models. We find that DFA fails to offer more efficient scaling than backpropagation: there is never a regime for which the degradation in loss incurred by using DFA is worth the potential reduction in compute budget. Our finding is at variance with previous beliefs in the alternative training methods community, and highlights the need for holistic empirical approaches to better understand modeling decisions.
Submitted 26 October, 2022;
originally announced October 2022.
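For readers unfamiliar with Direct Feedback Alignment, the NumPy sketch below shows the core idea on a small fully connected network: each hidden layer receives the output error through its own fixed random feedback matrix instead of through the transposed forward weights. The paper applies DFA to Transformers; the tanh MLP, the squared-error loss, and the names `dfa_gradients`, `weights`, and `feedback` are assumptions chosen only to keep the example short.

```python
import numpy as np


def dfa_gradients(x, y, weights, feedback):
    """Return (weight updates, network output) for one DFA step.

    x: (batch, d_in) inputs, y: (batch, d_out) targets.
    weights: list of W_l with shape (d_{l-1}, d_l); the last layer is linear.
    feedback: one fixed random B_l of shape (d_out, d_l) per hidden layer.
    Loss assumed here: 0.5 * mean over the batch of squared output error.
    """
    # Forward pass, keeping layer inputs and pre-activations.
    inputs, pre_acts = [x], []
    h = x
    for W in weights[:-1]:
        a = h @ W
        pre_acts.append(a)
        h = np.tanh(a)
        inputs.append(h)
    out = h @ weights[-1]

    # Output error: gradient of the assumed loss with respect to `out`.
    e = (out - y) / x.shape[0]

    grads = []
    for l, B in enumerate(feedback):
        # DFA: project the output error straight to layer l with fixed B.
        delta = (e @ B) * (1.0 - np.tanh(pre_acts[l]) ** 2)
        grads.append(inputs[l].T @ delta)
    # The output layer uses its true gradient, as in backpropagation.
    grads.append(inputs[-1].T @ e)
    return grads, out
```

Because no hidden-layer update depends on another layer's backward pass, the hidden updates could in principle be computed in parallel, which is the relaxation of sequential updates that the abstract mentions.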
-
Silicon Photonic Architecture for Training Deep Neural Networks with Direct Feedback Alignment
Authors:
Matthew J. Filipovich,
Zhimu Guo,
Mohammed Al-Qadasi,
Bicky A. Marquez,
Hugh D. Morison,
Volker J. Sorger,
Paul R. Prucnal,
Sudip Shekhar,
Bhavin J. Shastri
Abstract:
There has been growing interest in using photonic processors for performing neural network inference operations; however, these networks are currently trained using standard digital electronics. Here, we propose on-chip training of neural networks enabled by a CMOS-compatible silicon photonic architecture to harness the potential for massively parallel, efficient, and fast data operations. Our scheme employs the direct feedback alignment training algorithm, which trains neural networks using error feedback rather than error backpropagation, and can operate at speeds of trillions of multiply-accumulate (MAC) operations per second while consuming less than one picojoule per MAC operation. The photonic architecture exploits parallelized matrix-vector multiplications using arrays of microring resonators for processing multi-channel analog signals along single waveguide buses to calculate the gradient vector for each neural network layer in situ. We also experimentally demonstrate training deep neural networks with the MNIST dataset using on-chip MAC operation results. Our novel approach for efficient, ultra-fast neural network training showcases photonics as a promising platform for executing AI applications.
Submitted 18 August, 2022; v1 submitted 12 November, 2021;
originally announced November 2021.
-
PyCharge: An open-source Python package for self-consistent electrodynamics simulations of Lorentz oscillators and moving point charges
Authors:
Matthew J. Filipovich,
Stephen Hughes
Abstract:
PyCharge is a computational electrodynamics Python simulator that can calculate the electromagnetic fields and potentials generated by moving point charges and can self-consistently simulate dipoles modeled as Lorentz oscillators. To calculate the total fields and potentials along a discretized spatial grid at a specified time, PyCharge computes the retarded times of the point charges at each grid point, which are subsequently used to compute the analytical solutions to Maxwell's equations for each point charge. The Lorentz oscillators are driven by the electric field in the system and PyCharge self-consistently determines the reaction of the radiation on the dipole moment at each time step. PyCharge treats the two opposite charges in the dipole as separate point charge sources and calculates their individual contributions to the total electromagnetic fields and potentials. The expected coupling that arises between dipoles is captured in the PyCharge simulation, and the modified radiative properties of the dipoles (radiative decay rate and frequency shift) can be extracted using the dipole's energy at each time step throughout the simulation. The modified radiative properties of two dipoles separated in the near field, which require a full dipole response to yield the correct physics, are calculated by PyCharge and shown to be in excellent agreement with the analytical Green's function results ($<0.2\%$ relative error over a wide range of spatial separations). Moving dipoles can also be modeled by specifying the dipole's origin position as a function of time. PyCharge includes a parallelized version of the dipole simulation method to enable the parallel execution of computationally demanding simulations on high-performance computing environments to significantly improve run time.
Submitted 6 December, 2021; v1 submitted 26 July, 2021;
originally announced July 2021.
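The retarded time mentioned in the PyCharge abstract is the emission time t_r that satisfies c (t - t_r) = |r - r_s(t_r)| for an observation point r at time t and a source trajectory r_s. The sketch below solves this condition by bisection for a single point and a single charge. It does not use the PyCharge API and makes no claim about which root-finding method PyCharge uses; the function name `retarded_time`, its parameters, and the choice of bisection are assumptions for illustration, and the charge is assumed to move slower than light so that the root is unique.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def retarded_time(t, r, trajectory, iterations=200):
    """Solve C * (t - tr) = |r - trajectory(tr)| for tr by bisection.

    t: observation time in seconds.
    r: observation point as an (x, y, z) tuple in metres.
    trajectory: callable returning the charge position (x, y, z) at a time.
    Assumes subluminal motion, so the residual decreases monotonically in tr.
    """
    def residual(tr):
        return C * (t - tr) - math.dist(r, trajectory(tr))

    d_now = math.dist(r, trajectory(t))
    if d_now == 0.0:
        return t  # observer sits on the charge: no propagation delay

    # Bracket the root: residual(t) < 0, so step back until it is positive.
    span = d_now / C
    lo, hi = t - span, t
    while residual(lo) <= 0.0:
        span *= 2.0
        lo = t - span

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid  # signal emitted at `mid` arrives early: root is later
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a stationary charge the result reduces to t - |r| / c, which gives a quick sanity check; a grid-based calculation would repeat this solve at every grid point and for every charge.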
-
Space-Time Computation and Visualization of the Electromagnetic Fields and Potentials Generated by Moving Point Charges
Authors:
Matthew J. Filipovich,
Stephen Hughes
Abstract:
We present a computational methodology to directly calculate and visualize the directional components of the Coulomb, radiation, and total electromagnetic fields, as well as the scalar and vector potentials, generated by moving point charges in arbitrary motion with varying speeds. Our method explicitly calculates the retarded time of the point charge along a discretized grid which is then used to determine the fields and potentials. The computational approach, implemented in Python, provides an intuitive understanding of the electromagnetic waves generated by moving point charges and can be used as a pedagogical tool for undergraduate and graduate-level electromagnetic theory courses. Our computer code, freely available for download, can also approximate complicated time-varying continuous charge and current densities, and can be used in conjunction with grid-based numerical modeling methods to solve real-world computational electromagnetics problems, such as experiments with high-energy electron sources. We simulate and discuss several interesting example applications and lab experiments including electric and magnetic dipoles, oscillating and linear accelerating point charges, synchrotron radiation, and Bremsstrahlung.
Submitted 2 December, 2020; v1 submitted 4 October, 2020;
originally announced October 2020.