-
TimeGPT-1
Authors:
Azul Garza,
Cristian Challu,
Max Mergenthaler-Canseco
Abstract:
In this paper, we introduce TimeGPT, the first foundation model for time series, capable of generating accurate predictions for diverse datasets not seen during training. We evaluate our pre-trained model against established statistical, machine learning, and deep learning methods, demonstrating that TimeGPT zero-shot inference excels in performance, efficiency, and simplicity. Our study provides…
▽ More
In this paper, we introduce TimeGPT, the first foundation model for time series, capable of generating accurate predictions for diverse datasets not seen during training. We evaluate our pre-trained model against established statistical, machine learning, and deep learning methods, demonstrating that TimeGPT zero-shot inference excels in performance, efficiency, and simplicity. Our study provides compelling evidence that insights from other domains of artificial intelligence can be effectively applied to time series analysis. We conclude that large-scale time series models offer an exciting opportunity to democratize access to precise predictions and reduce uncertainty by leveraging the capabilities of contemporary advancements in deep learning.
△ Less
Submitted 27 May, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Forecasting Response to Treatment with Global Deep Learning and Patient-Specific Pharmacokinetic Priors
Authors:
Willa Potosnak,
Cristian Challu,
Kin G. Olivares,
Artur Dubrawski
Abstract:
Forecasting healthcare time series is crucial for early detection of adverse outcomes and for patient monitoring. Forecasting, however, can be difficult in practice due to noisy and intermittent data. The challenges are often exacerbated by change points induced via extrinsic factors, such as the administration of medication. To address these challenges, we propose a novel hybrid global-local arch…
▽ More
Forecasting healthcare time series is crucial for early detection of adverse outcomes and for patient monitoring. Forecasting, however, can be difficult in practice due to noisy and intermittent data. The challenges are often exacerbated by change points induced via extrinsic factors, such as the administration of medication. To address these challenges, we propose a novel hybrid global-local architecture and a pharmacokinetic encoder that informs deep learning models of patient-specific treatment effects. We showcase the efficacy of our approach in achieving significant accuracy gains for a blood glucose forecasting task using both realistically simulated and real-world data. Our global-local architecture improves over patient-specific models by 9.2-14.6%. Additionally, our pharmacokinetic encoder improves over alternative encoding techniques by 4.4% on simulated data and 2.1% on real-world data. The proposed approach can have multiple beneficial applications in clinical practice, such as issuing early warnings about unexpected treatment responses, or hel** to characterize patient-specific treatment effects in terms of drug absorption and elimination characteristics.
△ Less
Submitted 15 February, 2024; v1 submitted 22 September, 2023;
originally announced September 2023.
-
Hierarchically Coherent Multivariate Mixture Networks
Authors:
Kin G. Olivares,
David Luo,
Cristian Challu,
Stefania La Vattiata,
Max Mergenthaler,
Artur Dubrawski
Abstract:
Large collections of time series data are often organized into hierarchies with different levels of aggregation; examples include product and geographical grou**s. Probabilistic coherent forecasting is tasked to produce forecasts consistent across levels of aggregation. In this study, we propose to augment neural forecasting architectures with a coherent multivariate mixture output. We optimize…
▽ More
Large collections of time series data are often organized into hierarchies with different levels of aggregation; examples include product and geographical grou**s. Probabilistic coherent forecasting is tasked to produce forecasts consistent across levels of aggregation. In this study, we propose to augment neural forecasting architectures with a coherent multivariate mixture output. We optimize the networks with a composite likelihood objective, allowing us to capture time series' relationships while maintaining high computational efficiency. Our approach demonstrates 13.2% average accuracy improvements on most datasets compared to state-of-the-art baselines. We conduct ablation studies of the framework components and provide theoretical foundations for them. To assist related work, the code is available at this https://github.com/Nixtla/neuralforecast.
△ Less
Submitted 16 October, 2023; v1 submitted 11 May, 2023;
originally announced May 2023.
-
SpectraNet: Multivariate Forecasting and Imputation under Distribution Shifts and Missing Data
Authors:
Cristian Challu,
Peihong Jiang,
Ying Nian Wu,
Laurent Callot
Abstract:
In this work, we tackle two widespread challenges in real applications for time-series forecasting that have been largely understudied: distribution shifts and missing data. We propose SpectraNet, a novel multivariate time-series forecasting model that dynamically infers a latent space spectral decomposition to capture current temporal dynamics and correlations on the recent observed history. A Co…
▽ More
In this work, we tackle two widespread challenges in real applications for time-series forecasting that have been largely understudied: distribution shifts and missing data. We propose SpectraNet, a novel multivariate time-series forecasting model that dynamically infers a latent space spectral decomposition to capture current temporal dynamics and correlations on the recent observed history. A Convolution Neural Network maps the learned representation by sequentially mixing its components and refining the output. Our proposed approach can simultaneously produce forecasts and interpolate past observations and can, therefore, greatly simplify production systems by unifying imputation and forecasting tasks into a single model. SpectraNet achieves SoTA performance simultaneously on both tasks on five benchmark datasets, compared to forecasting and imputation models, with up to 92% fewer parameters and comparable training times. On settings with up to 80% missing data, SpectraNet has average performance improvements of almost 50% over the second-best alternative. Our code is available at https://github.com/cchallu/spectranet.
△ Less
Submitted 25 October, 2022; v1 submitted 22 October, 2022;
originally announced October 2022.
-
Unsupervised Model Selection for Time-series Anomaly Detection
Authors:
Mononito Goswami,
Cristian Challu,
Laurent Callot,
Lenon Minorics,
Andrey Kan
Abstract:
Anomaly detection in time-series has a wide range of practical applications. While numerous anomaly detection methods have been proposed in the literature, a recent survey concluded that no single method is the most accurate across various datasets. To make matters worse, anomaly labels are scarce and rarely available in practice. The practical problem of selecting the most accurate model for a gi…
▽ More
Anomaly detection in time-series has a wide range of practical applications. While numerous anomaly detection methods have been proposed in the literature, a recent survey concluded that no single method is the most accurate across various datasets. To make matters worse, anomaly labels are scarce and rarely available in practice. The practical problem of selecting the most accurate model for a given dataset without labels has received little attention in the literature. This paper answers this question i.e. Given an unlabeled dataset and a set of candidate anomaly detectors, how can we select the most accurate model? To this end, we identify three classes of surrogate (unsupervised) metrics, namely, prediction error, model centrality, and performance on injected synthetic anomalies, and show that some metrics are highly correlated with standard supervised anomaly detection performance metrics such as the $F_1$ score, but to varying degrees. We formulate metric combination with multiple imperfect surrogate metrics as a robust rank aggregation problem. We then provide theoretical justification behind the proposed approach. Large-scale experiments on multiple real-world datasets demonstrate that our proposed unsupervised approach is as effective as selecting the most accurate model based on partially labeled data.
△ Less
Submitted 24 January, 2023; v1 submitted 3 October, 2022;
originally announced October 2022.
-
HierarchicalForecast: A Reference Framework for Hierarchical Forecasting in Python
Authors:
Kin G. Olivares,
Federico Garza,
David Luo,
Cristian Challú,
Max Mergenthaler,
Souhaib Ben Taieb,
Shanika L. Wickramasuriya,
Artur Dubrawski
Abstract:
Large collections of time series data are commonly organized into structures with different levels of aggregation; examples include product and geographical grou**s. It is often important to ensure that the forecasts are coherent so that the predicted values at disaggregate levels add up to the aggregate forecast. The growing interest of the Machine Learning community in hierarchical forecasting…
▽ More
Large collections of time series data are commonly organized into structures with different levels of aggregation; examples include product and geographical grou**s. It is often important to ensure that the forecasts are coherent so that the predicted values at disaggregate levels add up to the aggregate forecast. The growing interest of the Machine Learning community in hierarchical forecasting systems indicates that we are in a propitious moment to ensure that scientific endeavors are grounded on sound baselines. For this reason, we put forward the HierarchicalForecast library, which contains preprocessed publicly available datasets, evaluation metrics, and a compiled set of statistical baseline models. Our Python-based reference framework aims to bridge the gap between statistical and econometric modeling, and Machine Learning forecasting research. Code and documentation are available in https://github.com/Nixtla/hierarchicalforecast.
△ Less
Submitted 24 January, 2023; v1 submitted 7 July, 2022;
originally announced July 2022.
-
Deep Generative model with Hierarchical Latent Factors for Time Series Anomaly Detection
Authors:
Cristian Challu,
Peihong Jiang,
Ying Nian Wu,
Laurent Callot
Abstract:
Multivariate time series anomaly detection has become an active area of research in recent years, with Deep Learning models outperforming previous approaches on benchmark datasets. Among reconstruction-based models, most previous work has focused on Variational Autoencoders and Generative Adversarial Networks. This work presents DGHL, a new family of generative models for time series anomaly detec…
▽ More
Multivariate time series anomaly detection has become an active area of research in recent years, with Deep Learning models outperforming previous approaches on benchmark datasets. Among reconstruction-based models, most previous work has focused on Variational Autoencoders and Generative Adversarial Networks. This work presents DGHL, a new family of generative models for time series anomaly detection, trained by maximizing the observed likelihood by posterior sampling and alternating back-propagation. A top-down Convolution Network maps a novel hierarchical latent space to time series windows, exploiting temporal dynamics to encode information efficiently. Despite relying on posterior sampling, it is computationally more efficient than current approaches, with up to 10x shorter training times than RNN based models. Our method outperformed current state-of-the-art models on four popular benchmark datasets. Finally, DGHL is robust to variable features between entities and accurate even with large proportions of missing values, settings with increasing relevance with the advent of IoT. We demonstrate the superior robustness of DGHL with novel occlusion experiments in this literature. Our code is available at https://github.com/cchallu/dghl.
△ Less
Submitted 25 February, 2022; v1 submitted 15 February, 2022;
originally announced February 2022.
-
N-HiTS: Neural Hierarchical Interpolation for Time Series Forecasting
Authors:
Cristian Challu,
Kin G. Olivares,
Boris N. Oreshkin,
Federico Garza,
Max Mergenthaler-Canseco,
Artur Dubrawski
Abstract:
Recent progress in neural forecasting accelerated improvements in the performance of large-scale forecasting systems. Yet, long-horizon forecasting remains a very difficult task. Two common challenges afflicting the task are the volatility of the predictions and their computational complexity. We introduce N-HiTS, a model which addresses both challenges by incorporating novel hierarchical interpol…
▽ More
Recent progress in neural forecasting accelerated improvements in the performance of large-scale forecasting systems. Yet, long-horizon forecasting remains a very difficult task. Two common challenges afflicting the task are the volatility of the predictions and their computational complexity. We introduce N-HiTS, a model which addresses both challenges by incorporating novel hierarchical interpolation and multi-rate data sampling techniques. These techniques enable the proposed method to assemble its predictions sequentially, emphasizing components with different frequencies and scales while decomposing the input signal and synthesizing the forecast. We prove that the hierarchical interpolation technique can efficiently approximate arbitrarily long horizons in the presence of smoothness. Additionally, we conduct extensive large-scale dataset experiments from the long-horizon forecasting literature, demonstrating the advantages of our method over the state-of-the-art methods, where N-HiTS provides an average accuracy improvement of almost 20% over the latest Transformer architectures while reducing the computation time by an order of magnitude (50 times). Our code is available at bit.ly/3VA5DoT
△ Less
Submitted 29 November, 2022; v1 submitted 30 January, 2022;
originally announced January 2022.
-
DMIDAS: Deep Mixed Data Sampling Regression for Long Multi-Horizon Time Series Forecasting
Authors:
Cristian Challu,
Kin G. Olivares,
Gus Welter,
Artur Dubrawski
Abstract:
Neural forecasting has shown significant improvements in the accuracy of large-scale systems, yet predicting extremely long horizons remains a challenging task. Two common problems are the volatility of the predictions and their computational complexity; we addressed them by incorporating smoothness regularization and mixed data sampling techniques to a well-performing multi-layer perceptron based…
▽ More
Neural forecasting has shown significant improvements in the accuracy of large-scale systems, yet predicting extremely long horizons remains a challenging task. Two common problems are the volatility of the predictions and their computational complexity; we addressed them by incorporating smoothness regularization and mixed data sampling techniques to a well-performing multi-layer perceptron based architecture (NBEATS). We validate our proposed method, DMIDAS, on high-frequency healthcare and electricity price data with long forecasting horizons (~1000 timestamps) where we improve the prediction accuracy by 5% over state-of-the-art models, reducing the number of parameters of NBEATS by nearly 70%.
△ Less
Submitted 7 June, 2021;
originally announced June 2021.
-
Neural basis expansion analysis with exogenous variables: Forecasting electricity prices with NBEATSx
Authors:
Kin G. Olivares,
Cristian Challu,
Grzegorz Marcjasz,
Rafał Weron,
Artur Dubrawski
Abstract:
We extend the neural basis expansion analysis (NBEATS) to incorporate exogenous factors. The resulting method, called NBEATSx, improves on a well performing deep learning model, extending its capabilities by including exogenous variables and allowing it to integrate multiple sources of useful information. To showcase the utility of the NBEATSx model, we conduct a comprehensive study of its applica…
▽ More
We extend the neural basis expansion analysis (NBEATS) to incorporate exogenous factors. The resulting method, called NBEATSx, improves on a well performing deep learning model, extending its capabilities by including exogenous variables and allowing it to integrate multiple sources of useful information. To showcase the utility of the NBEATSx model, we conduct a comprehensive study of its application to electricity price forecasting (EPF) tasks across a broad range of years and markets. We observe state-of-the-art performance, significantly improving the forecast accuracy by nearly 20% over the original NBEATS model, and by up to 5% over other well established statistical and machine learning methods specialized for these tasks. Additionally, the proposed neural network has an interpretable configuration that can structurally decompose time series, visualizing the relative impact of trend and seasonal components and revealing the modeled processes' interactions with exogenous factors. To assist related work we made the code available in https://github.com/cchallu/nbeatsx.
△ Less
Submitted 4 April, 2022; v1 submitted 12 April, 2021;
originally announced April 2021.
-
Double Adaptive Stochastic Gradient Optimization
Authors:
Kin Gutierrez,
** Li,
Cristian Challu,
Artur Dubrawski
Abstract:
Adaptive moment methods have been remarkably successful in deep learning optimization, particularly in the presence of noisy and/or sparse gradients. We further the advantages of adaptive moment techniques by proposing a family of double adaptive stochastic gradient methods~\textsc{DASGrad}. They leverage the complementary ideas of the adaptive moment algorithms widely used by deep learning commun…
▽ More
Adaptive moment methods have been remarkably successful in deep learning optimization, particularly in the presence of noisy and/or sparse gradients. We further the advantages of adaptive moment techniques by proposing a family of double adaptive stochastic gradient methods~\textsc{DASGrad}. They leverage the complementary ideas of the adaptive moment algorithms widely used by deep learning community, and recent advances in adaptive probabilistic algorithms.We analyze the theoretical convergence improvements of our approach in a stochastic convex optimization setting, and provide empirical validation of our findings with convex and non convex objectives. We observe that the benefits of~\textsc{DASGrad} increase with the model complexity and variability of the gradients, and we explore the resulting utility in extensions of distribution-matching multitask learning.
△ Less
Submitted 6 November, 2018;
originally announced November 2018.