Showing 1–12 of 12 results for author: Maddix, D C

Searching in archive cs.
  1. arXiv:2406.07337  [pdf, other]

    cs.LG

    Transferring Knowledge from Large Foundation Models to Small Downstream Models

    Authors: Shikai Qiu, Boran Han, Danielle C. Maddix, Shuai Zhang, Yuyang Wang, Andrew Gordon Wilson

    Abstract: How do we transfer the relevant knowledge from ever larger foundation models into small, task-specific downstream models that can run at much lower costs? Standard transfer learning using pre-trained weights as the initialization transfers limited information and commits us to often massive pre-trained architectures. This procedure also precludes combining multiple pre-trained models that learn co…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ICML 2024. Code available at https://github.com/amazon-science/adaptive-feature-transfer

  2. arXiv:2403.10642  [pdf, other]

    cs.LG math.NA

    Using Uncertainty Quantification to Characterize and Improve Out-of-Domain Learning for PDEs

    Authors: S. Chandra Mouli, Danielle C. Maddix, Shima Alizadeh, Gaurav Gupta, Andrew Stuart, Michael W. Mahoney, Yuyang Wang

    Abstract: Existing work in scientific machine learning (SciML) has shown that data-driven learning of solution operators can provide a fast approximate alternative to classical numerical partial differential equation (PDE) solvers. Of these, Neural Operators (NOs) have emerged as particularly promising. We observe that several uncertainty quantification (UQ) methods for NOs fail for test inputs that are eve…

    Submitted 12 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: ICML 2024

  3. arXiv:2403.07815  [pdf, other]

    cs.LG cs.AI

    Chronos: Learning the Language of Time Series

    Authors: Abdul Fatir Ansari, Lorenzo Stella, Caner Turkmen, Xiyuan Zhang, Pedro Mercado, Huibin Shen, Oleksandr Shchur, Syama Sundar Rangapuram, Sebastian Pineda Arango, Shubham Kapoor, Jasper Zschiegner, Danielle C. Maddix, Hao Wang, Michael W. Mahoney, Kari Torkkola, Andrew Gordon Wilson, Michael Bohlke-Schneider, Yuyang Wang

    Abstract: We introduce Chronos, a simple yet effective framework for pretrained probabilistic time series models. Chronos tokenizes time series values using scaling and quantization into a fixed vocabulary and trains existing transformer-based language model architectures on these tokenized time series via the cross-entropy loss. We pretrained Chronos models based on the T5 family (ranging from 20M to 710M…

    Submitted 2 May, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Code and model checkpoints available at https://github.com/amazon-science/chronos-forecasting

  4. arXiv:2305.15786  [pdf, other]

    cs.LG math.ST stat.ML

    Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting

    Authors: Hilaf Hasson, Danielle C. Maddix, Yuyang Wang, Gaurav Gupta, Youngsuk Park

    Abstract: Ensembling is among the most popular tools in machine learning (ML) due to its effectiveness in minimizing variance and thus improving generalization. Most ensembling methods for black-box base learners fall under the umbrella of "stacked generalization," namely training an ML algorithm that takes the inferences from the base learners as input. While stacking has been widely applied in practice, i…

    Submitted 28 August, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: ICML 2023

  5. arXiv:2302.11002  [pdf, other]

    cs.LG math.AP math.NA

    Learning Physical Models that Can Respect Conservation Laws

    Authors: Derek Hansen, Danielle C. Maddix, Shima Alizadeh, Gaurav Gupta, Michael W. Mahoney

    Abstract: Recent work in scientific machine learning (SciML) has focused on incorporating partial differential equation (PDE) information into the learning process. Much of this work has focused on relatively "easy" PDE operators (e.g., elliptic and parabolic), with less emphasis on relatively "hard" PDE operators (e.g., hyperbolic). Within numerical PDEs, the latter problem class requires control of a type…

    Submitted 10 October, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: ICML 2023, Physica D: Nonlinear Phenomena, Accepted

    Journal ref: Physica D: Nonlinear Phenomena, 457 (2024) 133952

  6. arXiv:2302.02077  [pdf, other]

    cs.LG

    Cross-Frequency Time Series Meta-Forecasting

    Authors: Mike Van Ness, Huibin Shen, Hao Wang, Xiaoyong Jin, Danielle C. Maddix, Karthick Gopalswamy

    Abstract: Meta-forecasting is a newly emerging field which combines meta-learning and time series forecasting. The goal of meta-forecasting is to train over a collection of source time series and generalize to new time series one-at-a-time. Previous approaches in meta-forecasting achieve competitive performance, but with the restriction of training a separate model for each sampling frequency. In this work,…

    Submitted 3 February, 2023; originally announced February 2023.

  7. arXiv:2212.08151  [pdf, other]

    cs.LG

    First De-Trend then Attend: Rethinking Attention for Time-Series Forecasting

    Authors: Xiyuan Zhang, Xiaoyong Jin, Karthick Gopalswamy, Gaurav Gupta, Youngsuk Park, Xingjian Shi, Hao Wang, Danielle C. Maddix, Yuyang Wang

    Abstract: Transformer-based models have gained large popularity and demonstrated promising results in long-term time-series forecasting in recent years. In addition to learning attention in time domain, recent works also explore learning attention in frequency domains (e.g., Fourier domain, wavelet domain), given that seasonal patterns can be better captured in these domains. In this work, we seek to unders…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: NeurIPS 2022 All Things Attention Workshop

  8. arXiv:2212.07477  [pdf, other]

    cs.LG math.AP math.OA

    Guiding continuous operator learning through Physics-based boundary constraints

    Authors: Nadim Saad, Gaurav Gupta, Shima Alizadeh, Danielle C. Maddix

    Abstract: Boundary conditions (BCs) are important groups of physics-enforced constraints that are necessary for solutions of Partial Differential Equations (PDEs) to satisfy at specific spatial locations. These constraints carry important physical meaning, and guarantee the existence and the uniqueness of the PDE solution. Current neural-network based approaches that aim to solve PDEs rely only on training…

    Submitted 2 March, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: Nadim and Gaurav contributed equally to this work. 31 pages, 7 figures, 16 tables

    Journal ref: ICLR 2023

  9. arXiv:2102.06828  [pdf, other]

    cs.LG stat.ML

    Domain Adaptation for Time Series Forecasting via Attention Sharing

    Authors: Xiaoyong Jin, Youngsuk Park, Danielle C. Maddix, Hao Wang, Yuyang Wang

    Abstract: Recently, deep neural networks have gained increasing popularity in the field of time series forecasting. A primary reason for their success is their ability to effectively capture complex temporal dynamics across multiple related time series. The advantages of these deep forecasters only start to emerge in the presence of a sufficient amount of data. This poses a challenge for typical forecasting…

    Submitted 21 June, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: ICML 2022

  10. arXiv:1906.05264  [pdf, other]

    cs.LG stat.ML

    GluonTS: Probabilistic Time Series Models in Python

    Authors: Alexander Alexandrov, Konstantinos Benidis, Michael Bohlke-Schneider, Valentin Flunkert, Jan Gasthaus, Tim Januschowski, Danielle C. Maddix, Syama Rangapuram, David Salinas, Jasper Schulz, Lorenzo Stella, Ali Caner Türkmen, Yuyang Wang

    Abstract: We introduce Gluon Time Series (GluonTS, available at https://gluon-ts.mxnet.io), a library for deep-learning-based time series modeling. GluonTS simplifies the development of and experimentation with time series models for common tasks such as forecasting or anomaly detection. It provides all necessary components and tools that scientists need for quickly building new models, for efficiently runn…

    Submitted 14 June, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

    Comments: ICML Time Series Workshop 2019

  11. arXiv:1905.12417  [pdf, other]

    stat.ML cs.LG

    Deep Factors for Forecasting

    Authors: Yuyang Wang, Alex Smola, Danielle C. Maddix, Jan Gasthaus, Dean Foster, Tim Januschowski

    Abstract: Producing probabilistic forecasts for large collections of similar and/or dependent time series is a practically relevant and challenging task. Classical time series models fail to capture complex patterns in the data, and multivariate techniques struggle to scale to large problem sizes. Their reliance on strong structural assumptions makes them data-efficient, and allows them to provide uncertain…

    Submitted 28 May, 2019; originally announced May 2019.

    Comments: http://proceedings.mlr.press/v97/wang19k/wang19k.pdf. arXiv admin note: substantial text overlap with arXiv:1812.00098

    Journal ref: Proceedings of Machine Learning Research, Volume 97: International Conference on Machine Learning, 2019

  12. arXiv:1812.00098  [pdf, other]

    stat.ML cs.LG

    Deep Factors with Gaussian Processes for Forecasting

    Authors: Danielle C. Maddix, Yuyang Wang, Alex Smola

    Abstract: A large collection of time series poses significant challenges for classical and neural forecasting approaches. Classical time series models fail to fit data well and to scale to large problems, but succeed at providing uncertainty estimates. The converse is true for deep neural networks. In this paper, we propose a hybrid model that incorporates the benefits of both approaches. Our new method is…

    Submitted 30 November, 2018; originally announced December 2018.

    Comments: Third workshop on Bayesian Deep Learning (NeurIPS 2018), Montreal, Canada