Showing 1–28 of 28 results for author: Schaeffer, H

  1. arXiv:2404.12355  [pdf, other]

    cs.LG math.NA

    Towards a Foundation Model for Partial Differential Equations: Multi-Operator Learning and Extrapolation

    Authors: Jingmin Sun, Yuxuan Liu, Zecheng Zhang, Hayden Schaeffer

    Abstract: Foundation models, such as large language models, have demonstrated success in addressing various language and image processing tasks. In this work, we introduce a multi-modal foundation model for scientific problems, named PROSE-PDE. Our model, designed for bi-modality to bi-modality learning, is a multi-operator learning approach which can predict future states of spatiotemporal systems while co…

    Submitted 19 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  2. arXiv:2310.18888  [pdf, other]

    math.NA cs.LG

    D2NO: Efficient Handling of Heterogeneous Input Function Spaces with Distributed Deep Neural Operators

    Authors: Zecheng Zhang, Christian Moya, Lu Lu, Guang Lin, Hayden Schaeffer

    Abstract: Neural operators have been applied in various scientific fields, such as solving parametric partial differential equations, dynamical systems with control, and inverse problems. However, challenges arise when dealing with input functions that exhibit heterogeneous properties, requiring multiple sensors to handle functions with minimal regularity. To address this issue, discretization-invariant neu…

    Submitted 28 October, 2023; originally announced October 2023.

  3. arXiv:2309.16816  [pdf, other]

    cs.LG

    PROSE: Predicting Operators and Symbolic Expressions using Multimodal Transformers

    Authors: Yuxuan Liu, Zecheng Zhang, Hayden Schaeffer

    Abstract: Approximating nonlinear differential equations using a neural network provides a robust and efficient tool for various scientific computing tasks, including real-time predictions, inverse problems, optimal controls, and surrogate modeling. Previous works have focused on embedding dynamical systems into networks through two approaches: learning a single solution operator (i.e., the mapping from inp…

    Submitted 28 September, 2023; originally announced September 2023.

  4. arXiv:2308.14188  [pdf, other]

    math.NA

    Bayesian deep operator learning for homogenized to fine-scale maps for multiscale PDE

    Authors: Zecheng Zhang, Christian Moya, Wing Tat Leung, Guang Lin, Hayden Schaeffer

    Abstract: We present a new framework for computing fine-scale solutions of multiscale Partial Differential Equations (PDEs) using operator learning tools. Obtaining fine-scale solutions of multiscale PDEs can be challenging, but there are many inexpensive computational methods for obtaining coarse-scale solutions. Additionally, in many real-world applications, fine-scale solutions can only be observed at a…

    Submitted 27 August, 2023; originally announced August 2023.

  5. arXiv:2307.09738  [pdf, other]

    math.NA

    A discretization-invariant extension and analysis of some deep operator networks

    Authors: Zecheng Zhang, Wing Tat Leung, Hayden Schaeffer

    Abstract: We present a generalized version of the discretization-invariant neural operator and prove that the network is a universal approximation in the operator sense. Moreover, by incorporating additional terms in the architecture, we establish a connection between this discretization-invariant neural operator network and those discussed before. The discretization-invariance property of the operator netw…

    Submitted 18 July, 2023; originally announced July 2023.

  6. arXiv:2212.07336  [pdf, other]

    math.NA

    BelNet: Basis enhanced learning, a mesh-free neural operator

    Authors: Zecheng Zhang, Wing Tat Leung, Hayden Schaeffer

    Abstract: Operator learning trains a neural network to map functions to functions. An ideal operator learning framework should be mesh-free in the sense that the training does not require a particular choice of discretization for the input functions, allows for the input and output functions to be on different domains, and is able to have different grids between samples. We propose a mesh-free neural operat…

    Submitted 14 December, 2022; originally announced December 2022.

  7. arXiv:2212.05591  [pdf, other]

    cs.LG math.NA stat.ML

    Random Feature Models for Learning Interacting Dynamical Systems

    Authors: Yuxuan Liu, Scott G. McCalla, Hayden Schaeffer

    Abstract: Particle dynamics and multi-agent systems provide accurate dynamical models for studying and forecasting the behavior of complex interacting systems. They often take the form of a high-dimensional system of differential equations parameterized by an interaction kernel that models the underlying attractive or repulsive forces between agents. We consider the problem of constructing a data-based appr…

    Submitted 11 December, 2022; originally announced December 2022.

  8. arXiv:2204.06935  [pdf, other]

    stat.ML cs.LG math.NA math.PR

    Concentration of Random Feature Matrices in High-Dimensions

    Authors: Zhijun Chen, Hayden Schaeffer, Rachel Ward

    Abstract: The spectra of random feature matrices provide essential information on the conditioning of the linear system used in random feature regression problems and are thus connected to the consistency and generalization of random feature models. Random feature matrices are asymmetric rectangular nonlinear matrices depending on two input variables, the data and the weights, which can make their character…

    Submitted 11 December, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

  9. arXiv:2204.06108  [pdf, other]

    eess.SP cs.CV stat.ML

    SRMD: Sparse Random Mode Decomposition

    Authors: Nicholas Richardson, Hayden Schaeffer, Giang Tran

    Abstract: Signal decomposition and multiscale signal analysis provide many useful tools for time-frequency analysis. We proposed a random feature method for analyzing time-series data by constructing a sparse approximation to the spectrogram. The randomization is both in the time window locations and the frequency sampling, which lowers the overall sampling and computational cost. The sparsification of the…

    Submitted 15 March, 2023; v1 submitted 12 April, 2022; originally announced April 2022.

  10. arXiv:2202.02877  [pdf, other]

    stat.ML cs.LG math.OC

    HARFE: Hard-Ridge Random Feature Expansion

    Authors: Esha Saha, Hayden Schaeffer, Giang Tran

    Abstract: We propose a random feature model for approximating high-dimensional sparse additive functions called the hard-ridge random feature expansion method (HARFE). This method utilizes a hard-thresholding pursuit-based algorithm applied to the sparse ridge regression (SRR) problem to approximate the coefficients with respect to the random feature matrix. The SRR formulation balances between obtaining sp…

    Submitted 2 May, 2023; v1 submitted 6 February, 2022; originally announced February 2022.

    Journal ref: Sampling Theory, Signal Processing, and Data Analysis 21.2 (2023) 1-24

  11. arXiv:2112.04002  [pdf, other]

    cs.LG math.OC stat.ML

    SHRIMP: Sparser Random Feature Models via Iterative Magnitude Pruning

    Authors: Yuege Xie, Bobby Shi, Hayden Schaeffer, Rachel Ward

    Abstract: Sparse shrunk additive models and sparse random feature models have been developed separately as methods to learn low-order functions, where there are few interactions between variables, but neither offers computational efficiency. On the other hand, $\ell_2$-based shrunk additive models are efficient but do not offer feature selection as the resulting coefficient vectors are dense. Inspired by th…

    Submitted 7 December, 2021; originally announced December 2021.

  12. arXiv:2110.11477  [pdf, other]

    stat.ML cs.LG math.OC math.PR

    Conditioning of Random Feature Matrices: Double Descent and Generalization Error

    Authors: Zhijun Chen, Hayden Schaeffer

    Abstract: We provide (high probability) bounds on the condition number of random feature matrices. In particular, we show that if the complexity ratio $\frac{N}{m}$ where $N$ is the number of neurons and $m$ is the number of data samples scales like $\log^{-1}(N)$ or $\log(m)$, then the random feature matrix is well-conditioned. This result holds without the need of regularization and relies on establishing…

    Submitted 4 November, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

  13. arXiv:2103.03191  [pdf, other]

    stat.ML cs.LG math.NA math.OC math.PR

    Generalization Bounds for Sparse Random Feature Expansions

    Authors: Abolfazl Hashemi, Hayden Schaeffer, Robert Shi, Ufuk Topcu, Giang Tran, Rachel Ward

    Abstract: Random feature methods have been successful in various machine learning tasks, are easy to compute, and come with theoretical accuracy bounds. They serve as an alternative approach to standard neural networks since they can represent similar function spaces without a costly training phase. However, for accuracy, random feature methods require more measurements than trainable parameters, limiting t…

    Submitted 20 August, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

  14. arXiv:2012.09940  [pdf, ps, other]

    cs.LG math.NA

    Reduced Order Modeling using Shallow ReLU Networks with Grassmann Layers

    Authors: Kayla Bollinger, Hayden Schaeffer

    Abstract: This paper presents a nonlinear model reduction method for systems of equations using a structured neural network. The neural network takes the form of a "three-layer" network with the first layer constrained to lie on the Grassmann manifold and the first activation function set to identity, while the remaining network is a standard two-layer ReLU neural network. The Grassmann layer determines the…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: 18 pages, 2 Figures

  15. arXiv:1908.03190  [pdf, other]

    cs.LG stat.ML

    NeuPDE: Neural Network Based Ordinary and Partial Differential Equations for Modeling Time-Dependent Data

    Authors: Yifan Sun, Linan Zhang, Hayden Schaeffer

    Abstract: We propose a neural network based approach for extracting models from dynamic data using ordinary and partial differential equations. In particular, given a time-series or spatio-temporal dataset, we seek to identify an accurate governing system which respects the intrinsic differential structure. The unknown governing model is parameterized by using both (shallow) multilayer perceptrons and nonli…

    Submitted 8 August, 2019; originally announced August 2019.

  16. arXiv:1908.01753  [pdf, ps, other]

    stat.ML cs.LG

    Extending the step-size restriction for gradient descent to avoid strict saddle points

    Authors: Hayden Schaeffer, Scott G. McCalla

    Abstract: We provide larger step-size restrictions for which gradient descent based algorithms (almost surely) avoid strict saddle points. In particular, consider a twice differentiable (non-convex) objective function whose gradient has Lipschitz constant L and whose Hessian is well-behaved. We prove that the probability of initial conditions for gradient descent with step-size up to 2/L converging to a str…

    Submitted 5 August, 2019; originally announced August 2019.

  17. arXiv:1811.10115  [pdf, other]

    cs.IT cs.LG stat.ML

    Recovery guarantees for polynomial approximation from dependent data with outliers

    Authors: Lam Si Tung Ho, Hayden Schaeffer, Giang Tran, Rachel Ward

    Abstract: Learning non-linear systems from noisy, limited, and/or dependent data is an important task across various scientific fields including statistics, engineering, computer science, mathematics, and many more. In general, this learning task is ill-posed; however, additional information about the data's structure or on the behavior of the unknown function can make the task well-posed. In this work, we…

    Submitted 25 November, 2018; originally announced November 2018.

    Comments: 17 pages, 1 figure

    MSC Class: 68T05; 41A10; 60F05; 68Q32; 62G08; 94A15; 65K10

  18. arXiv:1811.09885  [pdf, other]

    cs.CV math.DS

    Forward Stability of ResNet and Its Variants

    Authors: Linan Zhang, Hayden Schaeffer

    Abstract: The residual neural network (ResNet) is a popular deep network architecture which has the ability to obtain high-accuracy results on several image processing problems. In order to analyze the behavior and structure of ResNet, recent work has been on establishing connections between ResNets and continuous-time optimal control problems. In this work, we show that the post-activation ResNet is relate…

    Submitted 24 November, 2018; originally announced November 2018.

    Comments: 35 pages, 8 figures, 5 tables

  19. Stability and Error Estimates of BV Solutions to the Abel Inverse Problem

    Authors: Linan Zhang, Hayden Schaeffer

    Abstract: Reconstructing images from ill-posed inverse problems often utilizes total variation regularization in order to recover discontinuities in the data while also removing noise and other artifacts. Total variation regularization has been successful in recovering images for (noisy) Abel transformed data, where object boundaries and data support will lead to sharp edges in the reconstructed image. In t…

    Submitted 18 June, 2018; originally announced June 2018.

    Comments: 40 pages, 3 figures, 2 tables

  20. arXiv:1805.06445  [pdf, other]

    math.OC cs.IT

    On the Convergence of the SINDy Algorithm

    Authors: Linan Zhang, Hayden Schaeffer

    Abstract: One way to understand time-series data is to identify the underlying dynamical system which generates it. This task can be done by selecting an appropriate model and a set of parameters which best fits the dynamics while providing the simplest representation (i.e. the smallest amount of terms). One such approach is the sparse identification of nonlinear dynamics framework [6] which uses a sparsity…

    Submitted 16 May, 2018; originally announced May 2018.

    Comments: 24 pages, 4 figures, 3 tables

  21. arXiv:1805.04158  [pdf, other]

    cs.IT math.NA

    Extracting structured dynamical systems using sparse optimization with very few samples

    Authors: Hayden Schaeffer, Giang Tran, Rachel Ward, Linan Zhang

    Abstract: Learning governing equations allows for deeper understanding of the structure and dynamics of data. We present a random sampling method for learning structured dynamical systems from under-sampled and possibly noisy state-space measurements. The learning problem takes the form of a sparse least-squares fitting over a large set of candidate functions. Based on a Bernstein-like inequality for partly…

    Submitted 10 May, 2018; originally announced May 2018.

    Comments: 37 pages, 6 figures, 6 tables

  22. arXiv:1709.01558  [pdf, other]

    math.NA

    Learning Dynamical Systems and Bifurcation via Group Sparsity

    Authors: Hayden Schaeffer, Giang Tran, Rachel Ward

    Abstract: Learning governing equations from a family of data sets which share the same physical laws but differ in bifurcation parameters is challenging. This is due, in part, to the wide range of phenomena that could be represented in the data sets as well as the range of parameter values. On the other hand, it is common to assume only a small number of candidate functions contribute to the observed dynami…

    Submitted 5 September, 2017; originally announced September 2017.

    Comments: 16 pages, 18 figures

    MSC Class: 65P30; 65K10; 37N30; 15A12; 65L09; 65L99

  23. arXiv:1707.08528  [pdf, ps, other]

    math.OC

    Extracting Sparse High-Dimensional Dynamics from Limited Data

    Authors: Hayden Schaeffer, Giang Tran, Rachel Ward

    Abstract: Extracting governing equations from dynamic data is an essential task in model selection and parameter estimation. The form of the governing equation is rarely known a priori; however, based on the sparsity-of-effect principle one may assume that the number of candidate functions needed to represent the dynamics is very small. In this work, we leverage the sparse structure of the governing equatio…

    Submitted 18 October, 2018; v1 submitted 26 July, 2017; originally announced July 2017.

    Comments: 22 pages, 2 figures, 4 tables

    MSC Class: 34F05; 37H99; 65P99; 65L09; 65L99; 37N30

  24. arXiv:1505.00552  [pdf, ps, other]

    math.CO

    Lexicographic Generation of Projective Spaces

    Authors: Christoph Hering, Hans-Jörg Schaeffer

    Abstract: Lexicographic or first choice constructions of geometric objects sometimes lead to amazingly good results. Usually it is difficult to determine the precise identity of these geometries. Here we find infinitely many cases where the identification actually can be accomplished.

    Submitted 4 May, 2015; originally announced May 2015.

    Comments: 3 pages

    MSC Class: 51E15; 05B15

  25. The ALMA Band 9 receiver - Design, construction, characterization, and first light

    Authors: A. M. Baryshev, R. Hesper, F. P. Mena, T. M. Klapwijk, T. A. van Kempen, M. R. Hogerheijde, B. D. Jackson, J. Adema, G. J. Gerlofsma, M. E. Bekema, J. Barkhof, L. H. R. de Haan-Stijkel, M. van den Bemt, A. Koops, K. Keizer, C. Pieters, J. Koops van het Jagt, H. H. A. Schaeffer, T. Zijlstra, M. Kroug, C. F. J. Lodewijk, K. Wielinga, W. Boland, M. W. M. de Graauw, E. F. van Dishoeck, et al. (2 additional authors not shown)

    Abstract: We describe the design, construction, and characterization of the Band 9 heterodyne receivers (600-720 GHz) for the Atacama Large Millimeter / submillimeter Array (ALMA). The ALMA Band 9 receiver units ("cartridges"), which are installed in the telescope's front end, have been designed to detect and down-convert two orthogonal linear polarization components of the light collected by the ALMA anten…

    Submitted 6 March, 2015; originally announced March 2015.

    Journal ref: A&A 577, A129 (2015)

  26. arXiv:1404.1370  [pdf, ps, other]

    math.NA

    An L1 Penalty Method for General Obstacle Problems

    Authors: Giang Tran, Hayden Schaeffer, William M. Feldman, Stanley J. Osher

    Abstract: We construct an efficient numerical scheme for solving obstacle problems in divergence form. The numerical method is based on a reformulation of the obstacle in terms of an L1-like penalty on the variational problem. The reformulation is an exact regularizer in the sense that for large (but finite) penalty parameter, we recover the exact solution. Our formulation is applied to classical elliptic o…

    Submitted 4 April, 2014; originally announced April 2014.

    Comments: 20 pages, 18 figures

  27. arXiv:1311.5850  [pdf, ps, other]

    math.AP

    PDEs with Compressed Solutions

    Authors: Russel E. Caflisch, Stanley J. Osher, Hayden Schaeffer, Giang Tran

    Abstract: Sparsity plays a central role in recent developments in signal processing, linear algebra, statistics, optimization, and other fields. In these developments, sparsity is promoted through the addition of an $L^1$ norm (or related quantity) as a constraint or penalty in a variational principle. We apply this approach to partial differential equations that come from a variational quantity, either by…

    Submitted 1 August, 2014; v1 submitted 22 November, 2013; originally announced November 2013.

    Comments: 21 pages, 15 figures

  28. Sparse Dynamics for Partial Differential Equations

    Authors: Hayden Schaeffer, Stanley Osher, Russel Caflisch, Cory Hauck

    Abstract: We investigate the approximate dynamics of several differential equations when the solutions are restricted to a sparse subset of a given basis. The restriction is enforced at every time step by simply applying soft thresholding to the coefficients of the basis approximation. By reducing or compressing the information needed to represent the solution at every step, only the essential dynamics are…

    Submitted 17 December, 2012; originally announced December 2012.