Showing 1–50 of 58 results for author: Baldi, P

Searching in archive cs.
  1. arXiv:2405.09688

    cs.NE cs.AI cs.LG

    From Local to Global Order: A Theory of Neural Synaptic Balance

    Authors: Pierre Baldi, Alireza Rahmansetayesh

    Abstract: We develop a theory of neural synaptic balance and how it can emerge or be enforced in neural networks. For a given additive cost function $R$ (regularizer), a neuron is said to be in balance if the total cost of its input weights is equal to the total cost of its output weights. The basic example is provided by feedforward networks of ReLU units trained with $L_2$ regularizers, which exhibit bala…

    Submitted 25 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  2. arXiv:2404.14332

    hep-ex cs.AI cs.LG hep-ph

    Full Event Particle-Level Unfolding with Variable-Length Latent Variational Diffusion

    Authors: Alexander Shmakov, Kevin Greif, Michael James Fenton, Aishik Ghosh, Pierre Baldi, Daniel Whiteson

    Abstract: The measurements performed by particle physics experiments must account for the imperfect response of the detectors used to observe the interactions. One approach, unfolding, statistically adjusts the experimental data for detector effects. Recently, generative machine learning models have shown promise for performing unbinned unfolding in a high number of dimensions. However, all current generati…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Submission to SciPost

  3. arXiv:2403.10596

    cs.CL cs.AI q-bio.NC

    Neural Erosion: Emulating Controlled Neurodegeneration and Aging in AI Systems

    Authors: Antonios Alexos, Yu-Dai Tsai, Ian Domingo, Maryam Pishgar, Pierre Baldi

    Abstract: Creating controlled methods to simulate neurodegeneration in artificial intelligence (AI) is crucial for applications that emulate brain function decline and cognitive disorders. We use IQ tests performed by Large Language Models (LLMs) and, more specifically, the LLaMA 2 to introduce the concept of "neural erosion." This deliberate erosion involves ablating synapses or neurons, or adding Gaussia…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 19 pages, 6 figures in the main text, 5 figures in the Appendix

  4. arXiv:2403.05129

    cond-mat.soft cs.AI

    Unraveling the Molecular Magic: AI Insights on the Formation of Extraordinarily Stretchable Hydrogels

    Authors: Shahriar Hojjati Emmami, Ali Pilehvar Meibody, Lobat Tayebi, Mohammadamin Tavakoli, Pierre Baldi

    Abstract: The deliberate manipulation of ammonium persulfate, methylenebisacrylamide, dimethylacrylamide, and polyethylene oxide concentrations resulted in the development of a hydrogel with an exceptional stretchability, capable of extending up to 260 times its original length. This study aims to elucidate the molecular architecture underlying this unique phenomenon by exploring potential reaction mechani…

    Submitted 8 March, 2024; originally announced March 2024.

  5. arXiv:2403.04804

    eess.AS cs.CL cs.LG cs.MM

    AttentionStitch: How Attention Solves the Speech Editing Problem

    Authors: Antonios Alexos, Pierre Baldi

    Abstract: The generation of natural and high-quality speech from text is a challenging problem in the field of natural language processing. In addition to speech generation, speech editing is also a crucial task, which requires the seamless and unnoticeable integration of edited speech into synthesized speech. We propose a novel approach to speech editing by leveraging a pre-trained text-to-speech (TTS) mod…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted at the Machine Learning for Audio workshop at NeurIPS 2023

  6. arXiv:2312.16845

    cs.CL cs.AI

    Evaluating the Performance of Large Language Models for Spanish Language in Undergraduate Admissions Exams

    Authors: Sabino Miranda, Obdulia Pichardo-Lagunas, Bella Martínez-Seis, Pierre Baldi

    Abstract: This study evaluates the performance of large language models, specifically GPT-3.5 and BARD (supported by Gemini Pro model), in undergraduate admissions exams proposed by the National Polytechnic Institute in Mexico. The exams cover Engineering/Mathematical and Physical Sciences, Biological and Medical Sciences, and Social and Administrative Sciences. Both models demonstrated proficiency, exceedi…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: 11 pages, 1 figure. Submitted to a journal

    MSC Class: 68T50; ACM Class: I.2.7

  7. arXiv:2312.10553

    cs.LG

    Machine Learning-Enhanced Prediction of Surface Smoothness for Inertial Confinement Fusion Target Polishing Using Limited Data

    Authors: Antonios Alexos, Junze Liu, Akash Tiwari, Kshitij Bhardwaj, Sean Hayes, Pierre Baldi, Satish Bukkapatnam, Suhas Bhandarkar

    Abstract: In the Inertial Confinement Fusion (ICF) process, a roughly 2 mm spherical shell made of high-density carbon is used as a target for laser beams, which compress and heat it to the energy levels needed for high fusion yield. These shells are polished meticulously to meet the standards for a fusion shot. However, the polishing of these shells involves multiple stages, with each stage taking several hours. To m…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: Accepted as Extended Abstract in AIM 2024

  8. arXiv:2311.01118

    cs.LG physics.chem-ph

    AI for Interpretable Chemistry: Predicting Radical Mechanistic Pathways via Contrastive Learning

    Authors: Mohammadamin Tavakoli, Yin Ting T. Chiu, Alexander Shmakov, Ann Marie Carlton, David Van Vranken, Pierre Baldi

    Abstract: Deep learning-based reaction predictors have undergone significant architectural evolution. However, their reliance on reactions from the US Patent Office results in a lack of interpretable predictions and limited generalization capability to other chemistry domains, such as radical and atmospheric chemistry. To address these challenges, we introduce a new reaction predictor system, RMechRP, that…

    Submitted 2 November, 2023; originally announced November 2023.

  9. arXiv:2309.01886

    hep-ex cs.LG hep-ph

    Reconstruction of Unstable Heavy Particles Using Deep Symmetry-Preserving Attention Networks

    Authors: Michael James Fenton, Alexander Shmakov, Hideki Okawa, Yuji Li, Ko-Yang Hsiao, Shih-Chieh Hsu, Daniel Whiteson, Pierre Baldi

    Abstract: Reconstructing unstable heavy particles requires sophisticated techniques to sift through the large number of possible permutations for assignment of detector objects to the underlying partons. An approach based on a generalized attention mechanism, symmetry preserving attention networks (SPA-NET), has been previously applied to top quark pair decays at the Large Hadron Collider which produce only…

    Submitted 30 April, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: Accepted by Nature Communications Physics, replaced with published version

    Journal ref: Commun Phys 7, 139 (2024)

  10. arXiv:2307.11922

    cs.LG cs.AI cs.CL

    Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors

    Authors: Kolby Nottingham, Yasaman Razeghi, Kyungmin Kim, JB Lanier, Pierre Baldi, Roy Fox, Sameer Singh

    Abstract: Large language models (LLMs) are being applied as actors for sequential decision making tasks in domains such as robotics and games, utilizing their general world knowledge and planning abilities. However, previous work does little to explore what environment state information is provided to LLM actors via language. Exhaustively describing high-dimensional states can impair performance and raise i…

    Submitted 21 July, 2023; originally announced July 2023.

  11. arXiv:2306.08754

    cs.LG physics.ao-ph

    ClimSim: A large multi-scale dataset for hybrid physics-ML climate emulation

    Authors: Sungduk Yu, Walter Hannah, Liran Peng, Jerry Lin, Mohamed Aziz Bhouri, Ritwik Gupta, Björn Lütjens, Justus Christopher Will, Gunnar Behrens, Julius Busecke, Nora Loose, Charles I Stern, Tom Beucler, Bryce Harrop, Benjamin R Hillman, Andrea Jenney, Savannah Ferretti, Nana Liu, Anima Anandkumar, Noah D Brenowitz, Veronika Eyring, Nicholas Geneva, Pierre Gentine, Stephan Mandt, Jaideep Pathak, et al. (31 additional authors not shown)

    Abstract: Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short,…

    Submitted 6 February, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 Outstanding Datasets and Benchmarks Track Paper

  12. arXiv:2305.11531

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Generalizing to new geometries with Geometry-Aware Autoregressive Models (GAAMs) for fast calorimeter simulation

    Authors: Junze Liu, Aishik Ghosh, Dylan Smith, Pierre Baldi, Daniel Whiteson

    Abstract: Generation of simulated detector response to collision products is crucial to data analysis in particle physics, but computationally very expensive. One subdetector, the calorimeter, dominates the computational time due to the high granularity of its cells and complexity of the interactions. Generative models can provide more rapid sample production, but currently require significant effort to opt…

    Submitted 14 November, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

  13. arXiv:2305.10399

    hep-ex cs.LG

    End-To-End Latent Variational Diffusion Models for Inverse Problems in High Energy Physics

    Authors: Alexander Shmakov, Kevin Greif, Michael Fenton, Aishik Ghosh, Pierre Baldi, Daniel Whiteson

    Abstract: High-energy collisions at the Large Hadron Collider (LHC) provide valuable insights into open questions in particle physics. However, detector effects must be corrected before measurements can be compared to certain theoretical predictions or measurements from other detectors. Methods to solve this inverse problem of mapping detector observations to theoretical quantities of the underlyin…

    Submitted 17 May, 2023; originally announced May 2023.

  14. arXiv:2303.17491

    cs.CL cs.AI cs.HC cs.LG

    Language Models can Solve Computer Tasks

    Authors: Geunwoo Kim, Pierre Baldi, Stephen McAleer

    Abstract: Agents capable of carrying out general tasks on a computer can improve efficiency and productivity by automating repetitive tasks and assisting in complex problem-solving. Ideally, such agents should be able to solve new computer tasks presented to them through natural language commands. However, previous approaches to this problem require large amounts of expert demonstrations and task-specific r…

    Submitted 16 November, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

  15. arXiv:2303.06201

    cs.LG

    Interpretable Joint Event-Particle Reconstruction for Neutrino Physics at NOvA with Sparse CNNs and Transformers

    Authors: Alexander Shmakov, Alejandro Yankelevich, Jianming Bian, Pierre Baldi

    Abstract: The complex events observed at the NOvA long-baseline neutrino oscillation experiment contain vital information for understanding the most elusive particles in the standard model. The NOvA detectors observe interactions of neutrinos from the NuMI beam at Fermilab. Associating the particles produced in these interaction events to their source particles, a process known as reconstruction, is critica…

    Submitted 10 March, 2023; originally announced March 2023.

  16. arXiv:2212.08233

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    Geometry-aware Autoregressive Models for Calorimeter Shower Simulations

    Authors: Junze Liu, Aishik Ghosh, Dylan Smith, Pierre Baldi, Daniel Whiteson

    Abstract: Calorimeter shower simulations are often the bottleneck in simulation time for particle physics detectors. A lot of effort is currently spent on optimizing generative architectures for specific detector geometries, which generalize poorly. We develop a geometry-aware autoregressive model on a range of calorimeter geometries such that the model learns to adapt its energy deposition depending on the…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: This paper was submitted to NeurIPS Machine Learning and the Physical Sciences Workshop 2022

  17. arXiv:2207.09597

    cs.LG cs.AI cs.GT

    Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments

    Authors: JB Lanier, Stephen McAleer, Pierre Baldi, Roy Fox

    Abstract: Robust reinforcement learning (RL) considers the problem of learning policies that perform well in the worst case among a set of possible environment parameter values. In real-world environments, choosing the set of possible values for robust RL can be a difficult task. When that set is specified too narrowly, the agent will be left vulnerable to reasonable parameter values unaccounted for. When s…

    Submitted 3 October, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

    Comments: Added new theory sections. Added comparison to self-play. Added adversary mixed-strategy analysis

  18. arXiv:2207.06541

    cs.GT cs.LG cs.MA

    Self-Play PSRO: Toward Optimal Populations in Two-Player Zero-Sum Games

    Authors: Stephen McAleer, JB Lanier, Kevin Wang, Pierre Baldi, Roy Fox, Tuomas Sandholm

    Abstract: In competitive two-agent environments, deep reinforcement learning (RL) methods based on the Double Oracle (DO) algorithm, such as Policy Space Response Oracles (PSRO) and Anytime PSRO (APSRO), iteratively add RL best response policies to a population. Eventually, an optimal mixture of these population policies will approximate a Nash equilibrium. However, these methods might…

    Submitted 13 July, 2022; originally announced July 2022.

  19. arXiv:2206.02819

    astro-ph.HE astro-ph.GA cs.LG hep-ph

    Deep Learning Models of the Discrete Component of the Galactic Interstellar Gamma-Ray Emission

    Authors: Alexander Shmakov, Mohammadamin Tavakoli, Pierre Baldi, Christopher M. Karwin, Alex Broughton, Simona Murgia

    Abstract: A significant point-like component from the small scale (or discrete) structure in the H2 interstellar gas might be present in the Fermi-LAT data, but modeling this emission relies on observations of rare gas tracers only available in limited regions of the sky. Identifying this contribution is important to discriminate gamma-ray point sources from interstellar gas, and to better characterize exte…

    Submitted 6 June, 2022; originally announced June 2022.

    Comments: Submitted. Companion paper to "Improved modeling of the discrete component of the galactic interstellar gamma-ray emission and implications for the Fermi-LAT galactic center excess"

  20. arXiv:2205.03198

    cs.LO cs.AI

    A Logic-based Tractable Approximation of Probability

    Authors: Paolo Baldi, Hykel Hosni

    Abstract: We provide a logical framework in which a resource-bounded agent can be seen to perform approximations of probabilistic reasoning. Our main results read as follows. First we identify the conditions under which propositional probability functions can be approximated by a hierarchy of depth-bounded Belief functions. Second we show that under rather palatable restrictions, our approximations of proba…

    Submitted 6 May, 2022; originally announced May 2022.

  21. arXiv:2202.08371

    cs.LG cs.AI stat.ML

    The Quarks of Attention

    Authors: Pierre Baldi, Roman Vershynin

    Abstract: Attention plays a fundamental role in both natural and artificial intelligence systems. In deep learning, attention-based neural architectures, such as transformer architectures, are widely used to tackle problems in natural language processing and beyond. Here we investigate the fundamental building blocks of attention and their computational properties. Within the standard model of deep learning…

    Submitted 15 February, 2022; originally announced February 2022.

  22. arXiv:2201.07700

    cs.GT cs.LG cs.MA

    Anytime PSRO for Two-Player Zero-Sum Games

    Authors: Stephen McAleer, Kevin Wang, John Lanier, Marc Lanctot, Pierre Baldi, Tuomas Sandholm, Roy Fox

    Abstract: Policy space response oracles (PSRO) is a multi-agent reinforcement learning algorithm that has achieved state-of-the-art performance in very large two-player zero-sum games. PSRO is based on the tabular double oracle (DO) method, an algorithm that is guaranteed to converge to a Nash equilibrium, but may increase exploitability from one iteration to the next. We propose anytime double oracle (ADO)…

    Submitted 28 January, 2022; v1 submitted 19 January, 2022; originally announced January 2022.

    Comments: Published in AAAI Reinforcement Learning in Games Workshop

  23. arXiv:2201.01196

    cs.LG cs.AI physics.chem-ph

    Rxn Hypergraph: a Hypergraph Attention Model for Chemical Reaction Representation

    Authors: Mohammadamin Tavakoli, Alexander Shmakov, Francesco Ceccarelli, Pierre Baldi

    Abstract: It is fundamental for science and technology to be able to predict chemical reactions and their properties. To achieve such skills, it is important to develop good representations of chemical reactions, or good deep learning architectures that can learn such representations automatically from the data. There is currently no universal and widely adopted method for robustly representing chemical rea…

    Submitted 2 January, 2022; originally announced January 2022.

  24. arXiv:2107.06424

    cs.LG cs.AI

    Tourbillon: a Physically Plausible Neural Architecture

    Authors: Mohammadamin Tavakoli, Peter Sadowski, Pierre Baldi

    Abstract: In a physical neural system, backpropagation is faced with a number of obstacles including: the need for labeled data, the violation of the locality learning principle, the need for symmetric connections, and the lack of modularity. Tourbillon is a new architecture that addresses all these limitations. At its core, it consists of a stack of circular autoencoders followed by an output layer. The ci…

    Submitted 22 July, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

  25. arXiv:2106.03927

    cs.GT cs.AI cs.LG cs.MA

    Improving Social Welfare While Preserving Autonomy via a Pareto Mediator

    Authors: Stephen McAleer, John Lanier, Michael Dennis, Pierre Baldi, Roy Fox

    Abstract: Machine learning algorithms often make decisions on behalf of agents with varied and sometimes conflicting interests. In domains where agents can choose to take their own action or delegate their action to a central mediator, an open question is how mediators should take actions on behalf of delegating agents. The main existing approach uses delegating agents to punish non-delegating agents in an…

    Submitted 7 June, 2021; originally announced June 2021.

  26. SPANet: Generalized Permutationless Set Assignment for Particle Physics using Symmetry Preserving Attention

    Authors: Alexander Shmakov, Michael James Fenton, Ta-Wei Ho, Shih-Chieh Hsu, Daniel Whiteson, Pierre Baldi

    Abstract: The creation of unstable heavy particles at the Large Hadron Collider is the most direct way to address some of the deepest open questions in physics. Collisions typically produce variable-size sets of observed particles which have inherent ambiguities complicating the assignment of observed particles to the decay products of the heavy particles. Current strategies for tackling these challenges in…

    Submitted 22 July, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: published in SciPost

    Journal ref: SciPost Phys. 12, 178 (2022)

  27. arXiv:2103.14536

    physics.comp-ph cs.LG

    Quantum Mechanics and Machine Learning Synergies: Graph Attention Neural Networks to Predict Chemical Reactivity

    Authors: Mohammadamin Tavakoli, Aaron Mood, David Van Vranken, Pierre Baldi

    Abstract: There is a lack of scalable quantitative measures of reactivity for functional groups in organic chemistry. Measuring reactivity experimentally is costly and time-consuming and does not scale to the astronomical size of chemical space. In previous quantum chemistry studies, we have introduced Methyl Cation Affinities (MCA*) and Methyl Anion Affinities (MAA*), using a solvation model, as quantitati…

    Submitted 24 March, 2021; originally announced March 2021.

  28. arXiv:2103.06426

    cs.GT cs.AI cs.LG cs.MA

    XDO: A Double Oracle Algorithm for Extensive-Form Games

    Authors: Stephen McAleer, John Lanier, Kevin Wang, Pierre Baldi, Roy Fox

    Abstract: Policy Space Response Oracles (PSRO) is a reinforcement learning (RL) algorithm for two-player zero-sum games that has been empirically shown to find approximate Nash equilibria in large games. Although PSRO is guaranteed to converge to an approximate Nash equilibrium and can handle continuous actions, it may take an exponential number of iterations as the number of information states (infostates)…

    Submitted 28 January, 2022; v1 submitted 10 March, 2021; originally announced March 2021.

    Comments: 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  29. arXiv:2102.10148

    cs.LG cs.NE stat.ML

    A theory of capacity and sparse neural encoding

    Authors: Pierre Baldi, Roman Vershynin

    Abstract: Motivated by biological considerations, we study sparse neural maps from an input layer to a target layer with sparse activity, and specifically the problem of storing $K$ input-target associations $(x,y)$, or memories, when the target vectors $y$ are sparse. We mathematically prove that $K$ undergoes a phase transition and that in general, and somewhat paradoxically, sparsity in the target layers…

    Submitted 19 February, 2021; originally announced February 2021.

    Comments: 31 pages

    MSC Class: 68Q32; 06E30; 92B20

  30. arXiv:2102.04518

    cs.AI cs.LG

    A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks

    Authors: Forest Agostinelli, Alexander Shmakov, Stephen McAleer, Roy Fox, Pierre Baldi

    Abstract: Efficiently solving problems with large action spaces using A* search has been of importance to the artificial intelligence community for decades. This is because the computation and memory requirements of A* search grow linearly with the size of the action space. This burden becomes even more apparent when A* search uses a heuristic function learned by computationally expensive function approxima…

    Submitted 23 March, 2023; v1 submitted 8 February, 2021; originally announced February 2021.

    Comments: Added theoretical results to show that Q* search is an admissible search algorithm. Added comparisons to deferred heuristic evaluation. Added experiments with Lights Out and the 35-Pancake puzzle

  31. arXiv:2102.00280

    eess.IV cs.CV cs.LG

    Detecting Pulmonary Coccidioidomycosis (Valley fever) with Deep Convolutional Neural Networks

    Authors: Jordan Ott, David Bruyette, Cody Arbuckle, Dylan Balsz, Silke Hecht, Lisa Shubitz, Pierre Baldi

    Abstract: Coccidioidomycosis is the most common systemic mycosis in dogs in the southwestern United States. With warming climates, affected areas and number of cases are expected to increase in the coming years, escalating also the chances of transmission to humans. As a result, developing methods for automating the detection of the disease is important, as this will help doctors and veterinarians more easi…

    Submitted 30 January, 2021; originally announced February 2021.

  32. arXiv:2012.06181

    physics.ins-det cs.CV hep-ex

    Deep-Learning-Based Kinematic Reconstruction for DUNE

    Authors: Junze Liu, Jordan Ott, Julian Collado, Benjamin Jargowsky, Wenjie Wu, Jianming Bian, Pierre Baldi

    Abstract: In the framework of three-active-neutrino mixing, the charge parity phase, the neutrino mass ordering, and the octant of $θ_{23}$ remain unknown. The Deep Underground Neutrino Experiment (DUNE) is a next-generation long-baseline neutrino oscillation experiment, which aims to address these questions by measuring the oscillation patterns of $ν_μ/ν_e$ and $\bar{ν}_μ/\bar{ν}_e$ over a range of energies sp…

    Submitted 13 December, 2020; v1 submitted 11 December, 2020; originally announced December 2020.

    Comments: NeurIPS workshop proceedings

  33. arXiv:2011.06408

    eess.IV cs.LG

    Deep machine learning-assisted multiphoton microscopy to reduce light exposure and expedite imaging

    Authors: Stephen McAleer, Alex Fast, Yuntian Xue, Magdalene Seiler, William Tang, Mihaela Balu, Pierre Baldi, Andrew W. Browne

    Abstract: Two-photon excitation fluorescence (2PEF) allows imaging of tissue up to about one millimeter in thickness. Typically, reducing fluorescence excitation exposure reduces the quality of the image. However, using deep learning super resolution techniques, these low-resolution images can be converted to high-resolution images. This work explores improving human tissue imaging by applying deep learning…

    Submitted 10 November, 2020; originally announced November 2020.

  34. arXiv:2010.09206

    hep-ex cs.LG hep-ph

    Permutationless Many-Jet Event Reconstruction with Symmetry Preserving Attention Networks

    Authors: Michael James Fenton, Alexander Shmakov, Ta-Wei Ho, Shih-Chieh Hsu, Daniel Whiteson, Pierre Baldi

    Abstract: Top quarks, produced in large numbers at the Large Hadron Collider, have a complex detector signature and require special reconstruction techniques. The most common decay mode, the "all-jet" channel, results in a 6-jet final state which is particularly difficult to reconstruct in $pp$ collisions due to the large number of permutations possible. We present a novel approach to this class of problem,…

    Submitted 14 July, 2022; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: replaced with final published version

    Journal ref: Phys. Rev. D 105, 11200 Published 15 June 2022

  35. arXiv:2007.14604

    cs.LG cs.AI stat.ML

    Quantity vs. Quality: On Hyperparameter Optimization for Deep Reinforcement Learning

    Authors: Lars Hertel, Pierre Baldi, Daniel L. Gillen

    Abstract: Reinforcement learning algorithms can show strong variation in performance between training runs with different random seeds. In this paper we explore how this affects hyperparameter optimization when the goal is to find hyperparameter settings that perform well across random seeds. In particular, we benchmark whether it is better to explore a large quantity of hyperparameter settings via pruning…

    Submitted 30 July, 2020; v1 submitted 29 July, 2020; originally announced July 2020.

  36. arXiv:2006.08947

    cs.LG cs.NE stat.ML

    SPLASH: Learnable Activation Functions for Improving Accuracy and Adversarial Robustness

    Authors: Mohammadamin Tavakoli, Forest Agostinelli, Pierre Baldi

    Abstract: We introduce SPLASH units, a class of learnable activation functions shown to simultaneously improve the accuracy of deep neural networks while also improving their robustness to adversarial attacks. SPLASH units have both a simple parameterization and maintain the ability to approximate a wide range of non-linear functions. SPLASH units are: 1) continuous; 2) grounded (f(0) = 0); 3) use symmetric…

    Submitted 16 June, 2020; originally announced June 2020.

  37. arXiv:2006.08555

    cs.GT cs.AI cs.LG cs.MA

    Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

    Authors: Stephen McAleer, John Lanier, Roy Fox, Pierre Baldi

    Abstract: Finding approximate Nash equilibria in zero-sum imperfect-information games is challenging when the number of information states is large. Policy Space Response Oracles (PSRO) is a deep reinforcement learning algorithm grounded in game theory that is guaranteed to converge to an approximate Nash equilibrium. However, PSRO requires training a reinforcement learning policy at each iteration, making…

    Submitted 18 February, 2021; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: SM and JL contributed equally

  38. arXiv:2005.04048

    cs.LG stat.ML

    Sherpa: Robust Hyperparameter Optimization for Machine Learning

    Authors: Lars Hertel, Julian Collado, Peter Sadowski, Jordan Ott, Pierre Baldi

    Abstract: Sherpa is a hyperparameter optimization library for machine learning models. It is specifically designed for problems with computationally expensive, iterative function evaluations, such as the hyperparameter tuning of deep neural networks. With Sherpa, scientists can quickly optimize hyperparameters using a variety of powerful and interchangeable algorithms. Sherpa can be run on either a single m…

    Submitted 8 May, 2020; originally announced May 2020.

  39. arXiv:2004.10652

    cs.LG cs.PL

    A Fortran-Keras Deep Learning Bridge for Scientific Computing

    Authors: Jordan Ott, Mike Pritchard, Natalie Best, Erik Linstead, Milan Curcic, Pierre Baldi

    Abstract: Implementing artificial neural networks is commonly achieved via high-level programming languages like Python and easy-to-use deep learning libraries like Keras. These software libraries come pre-loaded with a variety of network architectures, provide autodifferentiation, and support GPUs for fast and efficient computation. As a result, a deep learning practitioner will favor training a neural net…

    Submitted 3 August, 2020; v1 submitted 14 April, 2020; originally announced April 2020.

  40. arXiv:2004.08152

    cs.LG

    Continuous Representation of Molecules Using Graph Variational Autoencoder

    Authors: Mohammadamin Tavakoli, Pierre Baldi

    Abstract: In order to continuously represent molecules, we propose a generative model in the form of a VAE which is operating on the 2D-graph structure of molecules. A side predictor is employed to prune the latent space and help the decoder in generating meaningful adjacency tensor of molecules. Other than the potential applicability in drug design and property prediction, we show the superior performance…

    Submitted 17 April, 2020; originally announced April 2020.

  41. arXiv:1912.04451

    cs.MA

    ColosseumRL: A Framework for Multiagent Reinforcement Learning in $N$-Player Games

    Authors: Alexander Shmakov, John Lanier, Stephen McAleer, Rohan Achar, Cristina Lopes, Pierre Baldi

    Abstract: Much of recent success in multiagent reinforcement learning has been in two-player zero-sum games. In these games, algorithms such as fictitious self-play and minimax tree search can converge to an approximate Nash equilibrium. While playing a Nash equilibrium strategy in a two-player zero-sum game is optimal, in an $n$-player general sum game, it becomes a much less informative solution concept.…

    Submitted 9 December, 2019; originally announced December 2019.

    Comments: Accepted for the 2020 AAAI Spring Symposium, Challenges and Opportunities for Multi-Agent Reinforcement Learning. Source code available at https://github.com/colosseumrl/colosseumrl/

  42. arXiv:1909.11483  [pdf, other]

    cs.LG cs.NE q-bio.NC

    Learning in the Machine: To Share or Not to Share?

    Authors: Jordan Ott, Erik Linstead, Nicholas LaHaye, Pierre Baldi

    Abstract: Weight-sharing is one of the pillars behind Convolutional Neural Networks and their successes. However, in physical neural systems such as the brain, weight-sharing is implausible. This discrepancy raises the fundamental question of whether weight-sharing is necessary. If so, to which degree of precision? If not, what are the alternatives? The goal of this study is to investigate these questions,…

    Submitted 4 October, 2019; v1 submitted 23 September, 2019; originally announced September 2019.

  43. arXiv:1906.03710  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Curiosity-Driven Multi-Criteria Hindsight Experience Replay

    Authors: John B. Lanier, Stephen McAleer, Pierre Baldi

    Abstract: Dealing with sparse rewards is a longstanding challenge in reinforcement learning. The recent use of hindsight methods has achieved success on a variety of sparse-reward tasks, but they fail on complex tasks such as stacking multiple blocks with a robot arm in simulation. Curiosity-driven exploration using the prediction error of a learned dynamics model as an intrinsic reward has been shown to b…

    Submitted 9 June, 2019; originally announced June 2019.

    Comments: 14 pages

  44. arXiv:1901.00434  [pdf, ps, other]

    cs.LG cs.NE math.CO stat.ML

    The capacity of feedforward neural networks

    Authors: Pierre Baldi, Roman Vershynin

    Abstract: A long standing open problem in the theory of neural networks is the development of quantitative methods to estimate and compare the capabilities of different architectures. Here we define the capacity of an architecture by the binary logarithm of the number of functions it can compute, as the synaptic weights are varied. The capacity provides an upper bound on the number of bits that can be extra…

    Submitted 27 March, 2019; v1 submitted 2 January, 2019; originally announced January 2019.

    Comments: 49 pages. Introduction is expanded and conclusion is added

    MSC Class: 68Q32; 06E30; 92B20

  45. Towards Automating Precision Studies of Clone Detectors

    Authors: Vaibhav Saini, Farima Farmahinifarahani, Yadong Lu, Di Yang, Pedro Martins, Hitesh Sajnani, Pierre Baldi, Cristina Lopes

    Abstract: Current research in clone detection suffers from poor ecosystems for evaluating precision of clone detection tools. Corpora of labeled clones are scarce and incomplete, making evaluation labor intensive and idiosyncratic, and limiting inter tool comparison. Precision-assessment tools are simply lacking. We present a semi-automated approach to facilitate precision studies of clone detection tools.…

    Submitted 13 December, 2018; v1 submitted 12 December, 2018; originally announced December 2018.

    Comments: Accepted to be published in the 41st ACM/IEEE International Conference on Software Engineering

    ACM Class: D.2.13

    Journal ref: Proceeding 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)

  46. Oreo: Detection of Clones in the Twilight Zone

    Authors: Vaibhav Saini, Farima Farmahinifarahani, Yadong Lu, Pierre Baldi, Cristina Lopes

    Abstract: Source code clones are categorized into four types of increasing difficulty of detection, ranging from purely textual (Type-1) to purely semantic (Type-4). Most clone detectors reported in the literature work well up to Type-3, which accounts for syntactic differences. In between Type-3 and Type-4, however, there lies a spectrum of clones that, although still exhibiting some syntactic similarities…

    Submitted 15 June, 2018; originally announced June 2018.

    ACM Class: D.2.13

    Journal ref: Proceedings of the 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (2018) 354-365

  47. arXiv:1805.07470  [pdf, other]

    cs.AI

    Solving the Rubik's Cube Without Human Knowledge

    Authors: Stephen McAleer, Forest Agostinelli, Alexander Shmakov, Pierre Baldi

    Abstract: A generally intelligent agent must be able to teach itself how to solve problems in complex domains with minimal human supervision. Recently, deep reinforcement learning algorithms combined with self-play have achieved superhuman proficiency in Go, Chess, and Shogi without human data or domain knowledge. In these environments, a reward is always received at the end of the game, however, for many c…

    Submitted 18 May, 2018; originally announced May 2018.

    Comments: First three authors contributed equally. Submitted to NIPS 2018

  48. arXiv:1712.08608  [pdf, ps, other]

    cs.NE cs.LG

    Learning in the Machine: the Symmetries of the Deep Learning Channel

    Authors: Pierre Baldi, Peter Sadowski, Zhiqin Lu

    Abstract: In a physical neural system, learning rules must be local both in space and time. In order for learning to occur, non-local information must be communicated to the deep synapses through a communication channel, the deep learning channel. We identify several possible architectures for this learning channel (Bidirectional, Conjoined, Twin, Distinct) and six symmetry challenges: 1) symmetry of archit…

    Submitted 22 December, 2017; originally announced December 2017.

  49. arXiv:1706.01826  [pdf, other]

    physics.ins-det cs.LG hep-ex

    Efficient Antihydrogen Detection in Antimatter Physics by Deep Learning

    Authors: Peter Sadowski, Balint Radics, Ananya, Yasunori Yamazaki, Pierre Baldi

    Abstract: Antihydrogen is at the forefront of antimatter research at the CERN Antiproton Decelerator. Experiments aiming to test the fundamental CPT symmetry and antigravity effects require the efficient detection of antihydrogen annihilation events, which is performed using highly granular tracking detectors installed around an antimatter trap. Improving the efficiency of the antihydrogen annihilation dete…

    Submitted 6 June, 2017; originally announced June 2017.

  50. arXiv:1612.02734  [pdf, ps, other]

    cs.LG cs.AI cs.NE

    Learning in the Machine: Random Backpropagation and the Deep Learning Channel

    Authors: Pierre Baldi, Peter Sadowski, Zhiqin Lu

    Abstract: Random backpropagation (RBP) is a variant of the backpropagation algorithm for training neural networks, where the transpose of the forward matrices are replaced by fixed random matrices in the calculation of the weight updates. It is remarkable both because of its effectiveness, in spite of using random matrices to communicate error information, and because it completely removes the taxing requir…

    Submitted 22 December, 2017; v1 submitted 8 December, 2016; originally announced December 2016.