Search | arXiv e-print repository

Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series

Authors: Vijay Ekambaram, Arindam Jati, Pankaj Dayama, Sumanta Mukherjee, Nam H. Nguyen, Wesley M. Gifford, Chandra Reddy, Jayant Kalagnanam

Abstract: Large pre-trained models excel in zero/few-shot learning for language and vision tasks but face challenges in multivariate time series (TS) forecasting due to diverse data characteristics. Consequently, recent research efforts have focused on developing pre-trained TS forecasting models. These models, whether built from scratch or adapted from large language models (LLMs), excel in zero/few-shot forecasting tasks. However, they are limited by slow performance, high computational demands, and neglect of cross-channel and exogenous correlations. To address this, we introduce Tiny Time Mixers (TTM), a compact model (starting from 1M parameters) with effective transfer learning capabilities, trained exclusively on public TS datasets. TTM, based on the light-weight TSMixer architecture, incorporates innovations like adaptive patching, diverse resolution sampling, and resolution prefix tuning to handle pre-training on varied dataset resolutions with minimal model capacity. Additionally, it employs multi-level modeling to capture channel correlations and infuse exogenous signals during fine-tuning. TTM outperforms existing popular benchmarks in zero/few-shot forecasting by 4-40%, while reducing computational requirements significantly. Moreover, TTMs are lightweight and can be executed even on CPU-only machines, enhancing usability and fostering wider adoption in resource-constrained environments. Model weights for our initial variant (TTM-Q) are available at https://huggingface.co/ibm-granite/granite-timeseries-ttm-v1. Model weights for more sophisticated variants (TTM-B, TTM-E, and TTM-A) will be shared soon. The source code for TTM can be accessed at https://github.com/ibm-granite/granite-tsfm/tree/main/tsfm_public/models/tinytimemixer.

Submitted 5 June, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

arXiv:2310.20280 [pdf, other]

AutoMixer for Improved Multivariate Time-Series Forecasting on Business and IT Observability Data

Authors: Santosh Palaskar, Vijay Ekambaram, Arindam Jati, Neelamadhav Gantayat, Avirup Saha, Seema Nagar, Nam H. Nguyen, Pankaj Dayama, Renuka Sindhgatta, Prateeti Mohapatra, Harshit Kumar, Jayant Kalagnanam, Nandyala Hemachandra, Narayan Rangaraj

Abstract: The efficiency of business processes relies on business key performance indicators (Biz-KPIs), which can be negatively impacted by IT failures. Business and IT Observability (BizITObs) data fuses both Biz-KPIs and IT event channels together as multivariate time series data. Forecasting Biz-KPIs in advance can enhance efficiency and revenue through proactive corrective measures. However, BizITObs data generally exhibit both useful and noisy inter-channel interactions between Biz-KPIs and IT events that need to be effectively decoupled. This leads to suboptimal forecasting performance when existing multivariate forecasting models are employed. To address this, we introduce AutoMixer, a time-series Foundation Model (FM) approach, grounded on the novel technique of channel-compressed pretrain and finetune workflows. AutoMixer leverages an AutoEncoder for channel-compressed pretraining and integrates it with the advanced TSMixer model for multivariate time series forecasting. This fusion greatly enhances the potency of TSMixer for accurate forecasts and also generalizes well across several downstream tasks. Through detailed experiments and dashboard analytics, we show AutoMixer's capability to consistently improve the Biz-KPI's forecasting accuracy (by 11-15%), which directly translates to actionable business insights.

Submitted 2 November, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

Comments: Accepted in the Thirty-Sixth Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-24)

arXiv:2306.09364 [pdf, other]

doi 10.1145/3580305.3599533

TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting

Authors: Vijay Ekambaram, Arindam Jati, Nam Nguyen, Phanwadee Sinthong, Jayant Kalagnanam

Abstract: Transformers have gained popularity in time series forecasting for their ability to capture long-sequence interactions. However, their high memory and computing requirements pose a critical bottleneck for long-term forecasting. To address this, we propose TSMixer, a lightweight neural architecture exclusively composed of multi-layer perceptron (MLP) modules for multivariate forecasting and representation learning on patched time series. Inspired by MLP-Mixer's success in computer vision, we adapt it for time series, addressing challenges and introducing validated components for enhanced accuracy. This includes a novel design paradigm of attaching online reconciliation heads to the MLP-Mixer backbone, for explicitly modeling the time-series properties such as hierarchy and channel-correlations. We also propose a novel Hybrid channel modeling and infusion of a simple gating approach to effectively handle noisy channel interactions and generalization across diverse datasets. By incorporating these lightweight components, we significantly enhance the learning capability of simple MLP structures, outperforming complex Transformer models with minimal computing usage. Moreover, TSMixer's modular design enables compatibility with both supervised and masked self-supervised learning methods, making it a promising building block for time-series Foundation Models. TSMixer outperforms state-of-the-art MLP and Transformer models in forecasting by a considerable margin of 8-60%. It also outperforms the latest strong benchmarks of Patch-Transformer models (by 1-2%) with a significant reduction in memory and runtime (2-3X). The source code of our model is officially released as PatchTSMixer in HuggingFace. Model: https://huggingface.co/docs/transformers/main/en/model_doc/patchtsmixer Examples: https://github.com/ibm/tsfm/#notebooks-links

Submitted 11 December, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

Comments: Accepted in the Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 23), Research Track. Delayed release in arXiv to comply with the conference policies on the double-blind review process. This paper has been submitted to the KDD peer-review process on Feb 02, 2023

ACM Class: I.2

arXiv:2306.00778 [pdf, other]

An End-to-End Time Series Model for Simultaneous Imputation and Forecast

Authors: Trang H. Tran, Lam M. Nguyen, Kyongmin Yeo, Nam Nguyen, Dzung Phan, Roman Vaculin, Jayant Kalagnanam

Abstract: Time series forecasting using historical data has been an interesting and challenging topic, especially when the data is corrupted by missing values. In many industrial problems, it is important to learn the inference function between the auxiliary observations and target variables, as it provides additional knowledge when the data is not fully observed. We develop an end-to-end time series model that aims to learn such an inference relation and make a multiple-step ahead forecast. Our framework jointly trains two neural networks, one to learn the feature-wise correlations and the other to model the temporal behaviors. Our model is capable of simultaneously imputing the missing entries and making a multiple-step ahead prediction. The experiments show good overall performance of our framework over existing methods in both imputation and forecasting tasks.

Submitted 1 June, 2023; originally announced June 2023.

arXiv:2211.14730 [pdf, other]

A Time Series is Worth 64 Words: Long-term Forecasting with Transformers

Authors: Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, Jayant Kalagnanam

Abstract: We propose an efficient design of Transformer-based models for multivariate time series forecasting and self-supervised representation learning. It is based on two key components: (i) segmentation of time series into subseries-level patches which serve as input tokens to the Transformer; (ii) channel-independence, where each channel contains a single univariate time series that shares the same embedding and Transformer weights across all the series. The patching design naturally has a three-fold benefit: local semantic information is retained in the embedding; computation and memory usage of the attention maps are quadratically reduced given the same look-back window; and the model can attend to a longer history. Our channel-independent patch time series Transformer (PatchTST) can improve the long-term forecasting accuracy significantly when compared with that of SOTA Transformer-based models. We also apply our model to self-supervised pre-training tasks and attain excellent fine-tuning performance, which outperforms supervised training on large datasets. Transferring masked pre-trained representations from one dataset to others also produces SOTA forecasting accuracy. Code is available at: https://github.com/yuqinie98/PatchTST.

Submitted 5 March, 2023; v1 submitted 27 November, 2022; originally announced November 2022.

Comments: Accepted by ICLR 2023
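
The patching idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation (that lives in the linked repository): the helper name `make_patches`, the look-back length of 512, and the patch length and stride values are illustrative assumptions, and no end-padding is applied here.

```python
def make_patches(series, patch_len, stride):
    """Split one univariate series into (possibly overlapping) patches.

    Hypothetical helper for illustration only; no padding is applied.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(series) - patch_len
    return [series[s:s + patch_len] for s in range(0, last_start + 1, stride)]


# Assumed look-back window of 512 time steps for a single channel.
look_back = list(range(512))
patches = make_patches(look_back, patch_len=16, stride=8)

# Each patch becomes one input token, so the attention map shrinks from
# (window length)^2 entries to (number of patches)^2 entries.
num_tokens = len(patches)
attention_entries_pointwise = len(look_back) ** 2
attention_entries_patched = num_tokens ** 2
```

Under channel-independence, the same function would be applied to each channel separately, with every channel sharing the same downstream weights.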

arXiv:2112.05653 [pdf, other]

Interpretable Clustering via Multi-Polytope Machines

Authors: Connor Lawless, Jayant Kalagnanam, Lam M. Nguyen, Dzung Phan, Chandra Reddy

Abstract: Clustering is a popular unsupervised learning tool often used to discover groups within a larger population such as customer segments, or patient subtypes. However, despite its use as a tool for subgroup discovery and description - few state-of-the-art algorithms provide any rationale or description behind the clusters found. We propose a novel approach for interpretable clustering that both clusters data points and constructs polytopes around the discovered clusters to explain them. Our framework allows for additional constraints on the polytopes - including ensuring that the hyperplanes constructing the polytope are axis-parallel or sparse with integer coefficients. We formulate the problem of constructing clusters via polytopes as a Mixed-Integer Non-Linear Program (MINLP). To solve our formulation we propose a two phase approach where we first initialize clusters and polytopes using alternating minimization, and then use coordinate descent to boost clustering performance. We benchmark our approach on a suite of synthetic and real world clustering problems, where our algorithm outperforms state of the art interpretable and non-interpretable clustering algorithms.

Submitted 10 December, 2021; originally announced December 2021.

Comments: Accepted to the 36th AAAI Conference on Artificial Intelligence (AAAI 2022)

arXiv:2112.02215 [pdf, other]

Deep Policy Iteration with Integer Programming for Inventory Management

Authors: Pavithra Harsha, Ashish Jagmohan, Jayant R. Kalagnanam, Brian Quanz, Divya Singhvi

Abstract: We present a Reinforcement Learning (RL) based framework for optimizing long-term discounted reward problems with large combinatorial action space and state dependent constraints. These characteristics are common to many operations management problems, e.g., network inventory replenishment, where managers have to deal with uncertain demand, lost sales, and capacity constraints that result in more complex feasible action spaces. Our proposed Programmable Actor Reinforcement Learning (PARL) uses a deep-policy iteration method that leverages neural networks (NNs) to approximate the value function and combines it with mathematical programming (MP) and sample average approximation (SAA) to solve the per-step-action optimally while accounting for combinatorial action spaces and state-dependent constraint sets. We show how the proposed methodology can be applied to complex inventory replenishment problems where analytical solutions are intractable. We also benchmark the proposed algorithm against state-of-the-art RL algorithms and commonly used replenishment heuristics and find that the proposed algorithm considerably outperforms existing methods by as much as 14.7% on average in various supply chain settings. This improvement in performance of PARL over benchmark algorithms can be attributed to better inventory cost management, especially in inventory constrained settings. Furthermore, in a simpler back order setting where the optimal solution is tractable, we find that the RL based policy also converges to the optimal policy. Finally, to make RL algorithms more accessible for inventory management researchers, we also discuss a modular Python library developed that can be used to test the performance of RL algorithms with various supply chain structures. This library can spur future research in developing practical and near-optimal algorithms for inventory management problems.

Submitted 14 October, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

Comments: Prior shorter version accepted to NeurIPS 2021 Deep RL Workshop. Authors are listed in alphabetical order

ACM Class: I.2.6; I.2.1; I.2.8; J.7; I.5.1; G.3

arXiv:2011.03375 [pdf, other]

A Scalable MIP-based Method for Learning Optimal Multivariate Decision Trees

Authors: Haoran Zhu, Pavankumar Murali, Dzung T. Phan, Lam M. Nguyen, Jayant R. Kalagnanam

Abstract: Several recent publications report advances in training optimal decision trees (ODT) using mixed-integer programs (MIP), due to algorithmic advances in integer programming and a growing interest in addressing the inherent suboptimality of heuristic approaches such as CART. In this paper, we propose a novel MIP formulation, based on a 1-norm support vector machine model, to train a multivariate ODT for classification problems. We provide cutting plane techniques that tighten the linear relaxation of the MIP formulation, in order to improve run times to reach optimality. Using 36 data-sets from the University of California Irvine Machine Learning Repository, we demonstrate that our formulation outperforms its counterparts in the literature by an average of about 10% in terms of mean out-of-sample testing accuracy across the data-sets. We provide a scalable framework to train multivariate ODT on large data-sets by introducing a novel linear programming (LP) based data selection method to choose a subset of the data for training. Our method is able to routinely handle large data-sets with more than 7,000 sample points and outperform heuristic methods and other MIP based techniques. We present results on data-sets containing up to 245,000 samples. Existing MIP-based methods do not scale well on training data-sets beyond 5,500 samples.

Submitted 6 November, 2020; originally announced November 2020.

arXiv:2003.01184 [pdf, other]

Variational inference formulation for a model-free simulation of a dynamical system with unknown parameters by a recurrent neural network

Authors: Kyongmin Yeo, Dylan E. C. Grullon, Fan-Keng Sun, Duane S. Boning, Jayant R. Kalagnanam

Abstract: We propose a recurrent neural network for a "model-free" simulation of a dynamical system with unknown parameters without prior knowledge. The deep learning model aims to jointly learn the nonlinear time marching operator and the effects of the unknown parameters from a time series dataset. We assume that the time series data set consists of an ensemble of trajectories for a range of the parameters. The learning task is formulated as a statistical inference problem by considering the unknown parameters as random variables. A latent variable is introduced to model the effects of the unknown parameters, and a variational inference method is employed to simultaneously train probabilistic models for the time marching operator and an approximate posterior distribution for the latent variable. Unlike the classical variational inference, where a factorized distribution is used to approximate the posterior, we employ a feedforward neural network supplemented by an encoder recurrent neural network to develop a more flexible probabilistic model. The approximate posterior distribution makes an inference on a trajectory to identify the effects of the unknown parameters. The time marching operator is approximated by a recurrent neural network, which takes a latent state sampled from the approximate posterior distribution as one of the input variables, to compute the time evolution of the probability distribution conditioned on the latent variable. In the numerical experiments, it is shown that the proposed variational inference model makes a more accurate simulation compared to the standard recurrent neural networks. It is found that the proposed deep learning model is capable of correctly identifying the dimensions of the random parameters and learning a representation of complex time series data.

Submitted 26 February, 2021; v1 submitted 2 March, 2020; originally announced March 2020.

arXiv:1901.07648 [pdf, other]

Finite-Sum Smooth Optimization with SARAH

Authors: Lam M. Nguyen, Marten van Dijk, Dzung T. Phan, Phuong Ha Nguyen, Tsui-Wei Weng, Jayant R. Kalagnanam

Abstract: The total complexity (measured as the total number of gradient computations) of a stochastic first-order optimization algorithm that finds a first-order stationary point of a finite-sum smooth nonconvex objective function $F(w)=\frac{1}{n} \sum_{i=1}^n f_i(w)$ has been proven to be at least $\Omega(\sqrt{n}/\epsilon)$ for $n \leq \mathcal{O}(\epsilon^{-2})$ where $\epsilon$ denotes the attained accuracy $\mathbb{E}[ \|\nabla F(\tilde{w})\|^2] \leq \epsilon$ for the outputted approximation $\tilde{w}$ (Fang et al., 2018). In this paper, we provide a convergence analysis for a slightly modified version of the SARAH algorithm (Nguyen et al., 2017a;b) and achieve total complexity that matches the lower-bound worst case complexity in (Fang et al., 2018) up to a constant factor when $n \leq \mathcal{O}(\epsilon^{-2})$ for nonconvex problems. For convex optimization, we propose SARAH++ with sublinear convergence for general convex and linear convergence for strongly convex problems; and we provide a practical version for which numerical experiments on various datasets show an improved performance.

Submitted 22 April, 2019; v1 submitted 22 January, 2019; originally announced January 2019.
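
For orientation, here is a minimal sketch of the basic SARAH recursive gradient estimator as it is commonly stated, $v_t = \nabla f_i(w_t) - \nabla f_i(w_{t-1}) + v_{t-1}$, applied to the finite sum $F(w)=\frac{1}{n}\sum_{i=1}^n f_i(w)$. It is neither the modified variant analyzed in this paper nor SARAH++. The function name `sarah_epoch`, the toy quadratic components, the step size, and the choice to return the last iterate are all simplifying assumptions.

```python
import random


def sarah_epoch(grad_i, n, w0, step, inner_steps, rng):
    """One outer iteration of a basic SARAH-style update (illustrative).

    grad_i(i, w) returns the gradient of the i-th component f_i at w.
    """
    # v_0 is the full gradient at the starting point.
    v = sum(grad_i(i, w0) for i in range(n)) / n
    w_prev, w = w0, w0 - step * v
    for _ in range(inner_steps - 1):
        i = rng.randrange(n)
        # Recursive estimator: correct the previous estimate with the change
        # in one sampled component gradient between consecutive iterates.
        v = grad_i(i, w) - grad_i(i, w_prev) + v
        w_prev, w = w, w - step * v
    return w


# Toy problem (an assumption): f_i(w) = 0.5 * (w - a_i)^2, minimized at mean(a).
data = [1.0, 2.0, 4.0, 9.0]


def grad_i(i, w):
    return w - data[i]


w_final = sarah_epoch(grad_i, len(data), w0=0.0, step=0.5,
                      inner_steps=20, rng=random.Random(0))
```

On this toy problem every component has the same curvature, so the recursive estimate stays equal to the exact gradient; on general problems it is only an approximation that is refreshed by the full gradient at the start of each outer iteration.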

arXiv:1901.07634

DTN: A Learning Rate Scheme with Convergence Rate of $\mathcal{O}(1/t)$ for SGD

Authors: Lam M. Nguyen, Phuong Ha Nguyen, Dzung T. Phan, Jayant R. Kalagnanam, Marten van Dijk

Abstract: This paper has some inconsistent results, i.e., we made some failed claims because we made some mistakes in using the test criterion for a series. Precisely, our claims on the convergence rate of $\mathcal{O}(1/t)$ of SGD presented in Theorem 1, Corollary 1, Theorem 2 and Corollary 2 are wrongly derived because they are based on Lemma 5. In Lemma 5, we do not correctly use the test criterion for a series. Hence, the result of Lemma 5 is not valid. We would like to thank the community for pointing out this mistake!

Submitted 27 February, 2019; v1 submitted 22 January, 2019; originally announced January 2019.

Comments: This paper has inconsistent results, i.e., we made some failed claims because we made some mistakes in using the test criterion for a series

arXiv:1801.06159 [pdf, other]

When Does Stochastic Gradient Algorithm Work Well?

Authors: Lam M. Nguyen, Nam H. Nguyen, Dzung T. Phan, Jayant R. Kalagnanam, Katya Scheinberg

Abstract: In this paper, we consider a general stochastic optimization problem which is often at the core of supervised learning, such as deep learning and linear classification. We consider a standard stochastic gradient descent (SGD) method with a fixed, large step size and propose a novel assumption on the objective function, under which this method has improved convergence rates (to a neighborhood of the optimal solutions). We then empirically demonstrate that these assumptions hold for logistic regression and standard deep neural networks on classical data sets. Thus our analysis helps to explain when efficient behavior can be expected from the SGD method in training classification models and deep neural networks.

Submitted 25 December, 2018; v1 submitted 18 January, 2018; originally announced January 2018.

arXiv:1801.03009 [pdf, other]

doi 10.1016/j.cma.2018.12.022

Development of hp-inverse model by using generalized polynomial chaos

Authors: Kyongmin Yeo, Youngdeok Hwang, Xiao Liu, Jayant Kalagnanam

Abstract: We present an $hp$-inverse model to estimate a smooth, non-negative source function from a limited number of observations for a two-dimensional linear source inversion problem. A standard least-square inverse model is formulated by using a set of Gaussian radial basis functions (GRBF) on a rectangular mesh system with a uniform grid space. Here, the choice of the mesh system is modeled as a random variable and the generalized polynomial chaos (gPC) expansion is used to represent the random mesh system. It is shown that the convolution of gPC and GRBF provides hierarchical basis functions for the linear source inverse model with the $hp$-refinement capability. We propose a mixed $\ell_1$ and $\ell_2$ regularization to exploit the hierarchical nature of the basis functions to find a sparse solution. The $hp$-inverse model has an advantage over the standard least-square inverse model when the number of data is limited. It is shown that the $hp$-inverse model provides a good estimate of the source function even when the number of unknown parameters ($m$) is much larger than the number of data ($n$), e.g., $m/n > 40$.

Submitted 14 December, 2018; v1 submitted 9 January, 2018; originally announced January 2018.

arXiv:1612.03225 [pdf, ps, other]

Optimal Generalized Decision Trees via Integer Programming

Authors: Oktay Gunluk, Jayant Kalagnanam, Minhan Li, Matt Menickelly, Katya Scheinberg

Abstract: Decision trees have been a very popular class of predictive models for decades due to their interpretability and good performance on categorical features. However, they are not always robust and tend to overfit the data. Additionally, if allowed to grow large, they lose interpretability. In this paper, we present a mixed integer programming formulation to construct optimal decision trees of a prespecified size. We take the special structure of categorical features into account and allow combinatorial decisions (based on subsets of values of features) at each node. Our approach can also handle numerical features via thresholding. We show that very good accuracy can be achieved with small trees using moderately-sized training sets. The optimization problems we solve are tractable with modern solvers.

Submitted 13 August, 2019; v1 submitted 9 December, 2016; originally announced December 2016.

MSC Class: 90C10

arXiv:1609.09816 [pdf, other]

A Spatio-Temporal Modeling Approach for Weather Radar Reflectivity Data and Its Applications in Tropical Southeast Asia

Authors: Xiao Liu, Viknesswaran Gopal, Jayant Kalagnanam

Abstract: Weather radar echoes, correlated in both space and time, are the most important input data for short-term precipitation forecast. Motivated by real datasets, this paper is concerned with the spatio-temporal modeling of two-dimensional radar reflectivity fields from a sequence of radar images. Under a Lagrangian integration scheme, we model the radar reflectivity data by a spatio-temporal conditional autoregressive process which is driven by two hidden sub-processes. The first sub-process is the dynamic velocity field which determines the motion of the weather system, while the second sub-process governs the growth or decay of the strength of radar reflectivity. The proposed method is demonstrated, and compared with existing methods, using real radar data collected from tropical Southeast Asia. Note that, since tropical storms are known to be highly chaotic and extremely difficult to predict, we only focus on the modeling of reflectivity data within a short period of time and consider the short-term prediction problem based on the proposed model. This is often referred to as the nowcasting issue in the meteorology community.

Submitted 30 September, 2016; originally announced September 2016.

Comments: 31 pages, 9 figures

arXiv:1609.07217 [pdf, other]

Statistical Modeling for Spatio-Temporal Degradation Data

Authors: Xiao Liu, Kyongmin Yeo, Jayant Kalagnanam

Abstract: This paper investigates the modeling of an important class of degradation data, which are collected from a spatial domain over time; for example, the surface quality degradation. Like many existing time-dependent stochastic degradation models, a special random field is constructed for modeling the spatio-temporal degradation process. In particular, we express the degradation at any spatial location and time as an additive superposition of two stochastic components: a dynamic spatial degradation generation process, and a spatio-temporal degradation propagation process. Some unique challenges are addressed, including the spatial heterogeneity of the degradation process, the spatial propagation of degradation to neighboring areas, the anisotropic and space-time non-separable covariance structure often associated with a complex spatio-temporal degradation process, and the computational issue related to parameter estimation. When the spatial dependence is ignored, we show that the proposed spatio-temporal degradation model incorporates some existing pure time-dependent degradation processes as its special cases. We also show the connection, under special conditions, between the proposed model and general physical degradation processes which are often defined by stochastic partial differential equations. A numerical example is presented to illustrate the modeling approach and model validation.

Submitted 27 December, 2017; v1 submitted 22 September, 2016; originally announced September 2016.

Comments: 30 pages, 7 figures. Manuscript prepared for submission

arXiv:1304.2362 [pdf]

A Comparison of Decision Analysis and Expert Rules for Sequential Diagnosis

Authors: Jayant Kalagnanam, Max Henrion

Abstract: There has long been debate about the relative merits of decision theoretic methods and heuristic rule-based approaches for reasoning under uncertainty. We report an experimental comparison of the performance of the two approaches to troubleshooting, specifically to test selection for fault diagnosis. We use as experimental testbed the problem of diagnosing motorcycle engines. The first approach employs heuristic test selection rules obtained from expert mechanics. We compare it with the optimal decision analytic algorithm for test selection which employs estimated component failure probabilities and test costs. The decision analytic algorithm was found to reduce the expected cost (i.e. time) to arrive at a diagnosis by an average of 14% relative to the expert rules. Sensitivity analysis shows the results are quite robust to inaccuracy in the probability and cost estimates. This difference suggests some interesting implications for knowledge acquisition.

Submitted 27 March, 2013; originally announced April 2013.

Comments: Appears in Proceedings of the Fourth Conference on Uncertainty in Artificial Intelligence (UAI1988)

Report number: UAI-P-1988-PG-205-212
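
The abstract above does not spell out its algorithm, so the following is only a textbook-style sketch of how failure probabilities and test costs can drive test selection. It assumes exactly one faulty component, one perfect test per component, and that testing continues up to and including the faulty component; under those assumptions, testing in decreasing order of probability-to-cost ratio minimizes expected cost. The component names and numbers are hypothetical, not data from the paper.

```python
from itertools import permutations


def expected_cost(order, prob, cost):
    """Expected cost of testing components in `order` until the fault is found."""
    total, spent = 0.0, 0.0
    for name in order:
        spent += cost[name]          # cumulative cost once this test is done
        total += prob[name] * spent  # paid if this component is the faulty one
    return total


def ratio_order(prob, cost):
    """Order components by failure probability per unit test cost, highest first."""
    return sorted(prob, key=lambda name: prob[name] / cost[name], reverse=True)


# Hypothetical failure probabilities (summing to 1) and test costs (e.g. minutes).
prob = {"spark_plug": 0.5, "fuel_line": 0.3, "battery": 0.2}
cost = {"spark_plug": 10.0, "fuel_line": 2.0, "battery": 1.0}

best = ratio_order(prob, cost)
# Brute-force check over all orderings, feasible only for tiny examples.
brute = min(permutations(prob), key=lambda o: expected_cost(o, prob, cost))
```

Here the most likely fault (`spark_plug`) is tested last because its test is expensive relative to its probability, which is the kind of trade-off a fixed expert rule may not capture.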

Showing 1–17 of 17 results for author: Kalagnanam, J