Search | arXiv e-print repository

Bayesian Kernelized Tensor Factorization as Surrogate for Bayesian Optimization

Abstract: Bayesian optimization (BO) primarily uses Gaussian processes (GP) as the key surrogate model, mostly with a simple stationary and separable kernel function such as the squared-exponential kernel with automatic relevance determination (SE-ARD). However, such simple kernel specifications are deficient in learning functions with complex features, such as being nonstationary, nonseparable, and multimo… ▽ More Bayesian optimization (BO) primarily uses Gaussian processes (GP) as the key surrogate model, mostly with a simple stationary and separable kernel function such as the squared-exponential kernel with automatic relevance determination (SE-ARD). However, such simple kernel specifications are deficient in learning functions with complex features, such as being nonstationary, nonseparable, and multimodal. Approximating such functions using a local GP, even in a low-dimensional space, requires a large number of samples, not to mention in a high-dimensional setting. In this paper, we propose to use Bayesian Kernelized Tensor Factorization (BKTF) -- as a new surrogate model -- for BO in a $D$-dimensional Cartesian product space. Our key idea is to approximate the underlying $D$-dimensional solid with a fully Bayesian low-rank tensor CP decomposition, in which we place GP priors on the latent basis functions for each dimension to encode local consistency and smoothness. With this formulation, information from each sample can be shared not only with neighbors but also across dimensions. Although BKTF no longer has an analytical posterior, we can still efficiently approximate the posterior distribution through Markov chain Monte Carlo (MCMC) and obtain prediction and full uncertainty quantification (UQ). We conduct numerical experiments on both standard BO test functions and machine learning hyperparameter tuning problems, and our results show that BKTF offers a flexible and highly effective approach for characterizing complex functions with UQ, especially in cases where the initial sample size and budget are severely limited. △ Less

Submitted 26 May, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

arXiv:2208.09978 [pdf, other]

Bayesian Complementary Kernelized Learning for Multidimensional Spatiotemporal Data

Authors: Mengying Lei, Aurelie Labbe, Lijun Sun

Abstract: Probabilistic modeling of multidimensional spatiotemporal data is critical to many real-world applications. As real-world spatiotemporal data often exhibits complex dependencies that are nonstationary and nonseparable, develo** effective and computationally efficient statistical models to accommodate nonstationary/nonseparable processes containing both long-range and short-scale variations becom… ▽ More Probabilistic modeling of multidimensional spatiotemporal data is critical to many real-world applications. As real-world spatiotemporal data often exhibits complex dependencies that are nonstationary and nonseparable, develo** effective and computationally efficient statistical models to accommodate nonstationary/nonseparable processes containing both long-range and short-scale variations becomes a challenging task, in particular for large-scale datasets with various corruption/missing structures. In this paper, we propose a new statistical framework -- Bayesian Complementary Kernelized Learning (BCKL) -- to achieve scalable probabilistic modeling for multidimensional spatiotemporal data. To effectively characterize complex dependencies, BCKL integrates two complementary approaches -- kernelized low-rank tensor factorization and short-range spatiotemporal Gaussian Processes. Specifically, we use a multi-linear low-rank factorization component to capture the global/long-range correlations in the data and introduce an additive short-scale GP based on compactly supported kernel functions to characterize the remaining local variabilities. We develop an efficient Markov chain Monte Carlo (MCMC) algorithm for model inference and evaluate the proposed BCKL framework on both synthetic and real-world spatiotemporal datasets. Our experiment results show that BCKL offers superior performance in providing accurate posterior mean and high-quality uncertainty estimates, confirming the importance of both global and local components in modeling spatiotemporal data. △ Less

Submitted 30 May, 2023; v1 submitted 21 August, 2022; originally announced August 2022.

arXiv:2109.00046 [pdf, other]

doi 10.1214/24-BA1428

Scalable Spatiotemporally Varying Coefficient Modelling with Bayesian Kernelized Tensor Regression

Authors: Mengying Lei, Aurelie Labbe, Lijun Sun

Abstract: As a regression technique in spatial statistics, the spatiotemporally varying coefficient model (STVC) is an important tool for discovering nonstationary and interpretable response-covariate associations over both space and time. However, it is difficult to apply STVC for large-scale spatiotemporal analyses due to its high computational cost. To address this challenge, we summarize the spatiotempo… ▽ More As a regression technique in spatial statistics, the spatiotemporally varying coefficient model (STVC) is an important tool for discovering nonstationary and interpretable response-covariate associations over both space and time. However, it is difficult to apply STVC for large-scale spatiotemporal analyses due to its high computational cost. To address this challenge, we summarize the spatiotemporally varying coefficients using a third-order tensor structure and propose to reformulate the spatiotemporally varying coefficient model as a special low-rank tensor regression problem. The low-rank decomposition can effectively model the global patterns of large data sets with a substantially reduced number of parameters. To further incorporate the local spatiotemporal dependencies, we use Gaussian process (GP) priors on the spatial and temporal factor matrices. We refer to the overall framework as Bayesian Kernelized Tensor Regression (BKTR), and kernelized tensor factorization can be considered a new and scalable approach to modeling multivariate spatiotemporal processes with a low-rank covariance structure. For model inference, we develop an efficient Markov chain Monte Carlo (MCMC) algorithm, which uses Gibbs sampling to update factor matrices and slice sampling to update kernel hyperparameters. We conduct extensive experiments on both synthetic and real-world data sets, and our results confirm the superior performance and efficiency of BKTR for model estimation and parameter inference. △ Less

Submitted 13 April, 2024; v1 submitted 31 August, 2021; originally announced September 2021.

Journal ref: Bayesian Analysis (2024)

arXiv:2104.14936 [pdf, other]

doi 10.1109/TITS.2021.3113608

Low-Rank Autoregressive Tensor Completion for Spatiotemporal Traffic Data Imputation

Authors: Xinyu Chen, Mengying Lei, Nicolas Saunier, Lijun Sun

Abstract: Spatiotemporal traffic time series (e.g., traffic volume/speed) collected from sensing systems are often incomplete with considerable corruption and large amounts of missing values, preventing users from harnessing the full power of the data. Missing data imputation has been a long-standing research topic and critical application for real-world intelligent transportation systems. A widely applied… ▽ More Spatiotemporal traffic time series (e.g., traffic volume/speed) collected from sensing systems are often incomplete with considerable corruption and large amounts of missing values, preventing users from harnessing the full power of the data. Missing data imputation has been a long-standing research topic and critical application for real-world intelligent transportation systems. A widely applied imputation method is low-rank matrix/tensor completion; however, the low-rank assumption only preserves the global structure while ignores the strong local consistency in spatiotemporal data. In this paper, we propose a low-rank autoregressive tensor completion (LATC) framework by introducing \textit{temporal variation} as a new regularization term into the completion of a third-order (sensor $\times$ time of day $\times$ day) tensor. The third-order tensor structure allows us to better capture the global consistency of traffic data, such as the inherent seasonality and day-to-day similarity. To achieve local consistency, we design the temporal variation by imposing an AR($p$) model for each time series with coefficients as learnable parameters. Different from previous spatial and temporal regularization schemes, the minimization of temporal variation can better characterize temporal generative mechanisms beyond local smoothness, allowing us to deal with more challenging scenarios such "blackout" missing. To solve the optimization problem in LATC, we introduce an alternating minimization scheme that estimates the low-rank tensor and autoregressive coefficients iteratively. We conduct extensive numerical experiments on several real-world traffic data sets, and our results demonstrate the effectiveness of LATC in diverse missing scenarios. △ Less

Submitted 30 April, 2021; originally announced April 2021.

Journal ref: IEEE Transactions on Intelligent Transportation Systems (2022)

arXiv:2002.07601 [pdf, other]

ADMM-based Decoder for Binary Linear Codes Aided by Deep Learning

Authors: Yi Wei, Ming-Min Zhao, Min-Jian Zhao, Ming Lei

Abstract: Inspired by the recent advances in deep learning (DL), this work presents a deep neural network aided decoding algorithm for binary linear codes. Based on the concept of deep unfolding, we design a decoding network by unfolding the alternating direction method of multipliers (ADMM)-penalized decoder. In addition, we propose two improved versions of the proposed network. The first one transforms th… ▽ More Inspired by the recent advances in deep learning (DL), this work presents a deep neural network aided decoding algorithm for binary linear codes. Based on the concept of deep unfolding, we design a decoding network by unfolding the alternating direction method of multipliers (ADMM)-penalized decoder. In addition, we propose two improved versions of the proposed network. The first one transforms the penalty parameter into a set of iteration-dependent ones, and the second one adopts a specially designed penalty function, which is based on a piecewise linear function with adjustable slopes. Numerical results show that the resulting DL-aided decoders outperform the original ADMM-penalized decoder for various low density parity check (LDPC) codes with similar computational complexity. △ Less

Submitted 13 February, 2020; originally announced February 2020.

Comments: 5 pages, 4 figures, accepted for publication in IEEE communications letters

arXiv:1906.03814 [pdf, other]

doi 10.1109/TSP.2020.3035832

Learned Conjugate Gradient Descent Network for Massive MIMO Detection

Authors: Yi Wei, Ming-Min Zhao, Mingyi Hong, Min-jian Zhao, Ming Lei

Abstract: In this work, we consider the use of model-driven deep learning techniques for massive multiple-input multiple-output (MIMO) detection. Compared with conventional MIMO systems, massive MIMO promises improved spectral efficiency, coverage and range. Unfortunately, these benefits are coming at the cost of significantly increased computational complexity. To reduce the complexity of signal detection… ▽ More In this work, we consider the use of model-driven deep learning techniques for massive multiple-input multiple-output (MIMO) detection. Compared with conventional MIMO systems, massive MIMO promises improved spectral efficiency, coverage and range. Unfortunately, these benefits are coming at the cost of significantly increased computational complexity. To reduce the complexity of signal detection and guarantee the performance, we present a learned conjugate gradient descent network (LcgNet), which is constructed by unfolding the iterative conjugate gradient descent (CG) detector. In the proposed network, instead of calculating the exact values of the scalar step-sizes, we explicitly learn their universal values. Also, we can enhance the proposed network by augmenting the dimensions of these step-sizes. Furthermore, in order to reduce the memory costs, a novel quantized LcgNet is proposed, where a low-resolution nonuniform quantizer is integrated into the LcgNet to smartly quantize the aforementioned step-sizes. The quantizer is based on a specially designed soft staircase function with learnable parameters to adjust its shape. Meanwhile, due to fact that the number of learnable parameters is limited, the proposed networks are easy and fast to train. Numerical results demonstrate that the proposed network can achieve promising performance with much lower complexity. △ Less

Submitted 1 June, 2020; v1 submitted 10 June, 2019; originally announced June 2019.

Comments: Part of this work has been accepted by IEEE ICC 2020

arXiv:1805.03504 [pdf, other]

Diffusion Based Network Embedding

Authors: Yong Shi, Minglong Lei, Peng Zhang, Lingfeng Niu

Abstract: In network embedding, random walks play a fundamental role in preserving network structures. However, random walk based embedding methods have two limitations. First, random walk methods are fragile when the sampling frequency or the number of node sequences changes. Second, in disequilibrium networks such as highly biases networks, random walk methods often perform poorly due to the lack of globa… ▽ More In network embedding, random walks play a fundamental role in preserving network structures. However, random walk based embedding methods have two limitations. First, random walk methods are fragile when the sampling frequency or the number of node sequences changes. Second, in disequilibrium networks such as highly biases networks, random walk methods often perform poorly due to the lack of global network information. In order to solve the limitations, we propose in this paper a network diffusion based embedding method. To solve the first limitation, our method employs a diffusion driven process to capture both depth information and breadth information. The time dimension is also attached to node sequences that can strengthen information preserving. To solve the second limitation, our method uses the network inference technique based on cascades to capture the global network information. To verify the performance, we conduct experiments on node classification tasks using the learned representations. Results show that compared with random walk based methods, diffusion based models are more robust when samplings under each node is rare. We also conduct experiments on a highly imbalanced network. Results shows that the proposed model are more robust under the biased network structure. △ Less

Submitted 11 May, 2018; v1 submitted 9 May, 2018; originally announced May 2018.

arXiv:1005.5348 [pdf, ps, other]

Error Analysis of Approximated PCRLBs for Nonlinear Dynamics

Authors: Ming Lei, Pierre Del Moral, Christophe Baehr

Abstract: In practical nonlinear filtering, the assessment of achievable filtering performance is important. In this paper, we focus on the problem of efficiently approximate the posterior Cramer-Rao lower bound (CRLB) in a recursive manner. By using Gaussian assumptions, two types of approximations for calculating the CRLB are proposed: An exact model using the state estimate as well as a Taylor-series-exp… ▽ More In practical nonlinear filtering, the assessment of achievable filtering performance is important. In this paper, we focus on the problem of efficiently approximate the posterior Cramer-Rao lower bound (CRLB) in a recursive manner. By using Gaussian assumptions, two types of approximations for calculating the CRLB are proposed: An exact model using the state estimate as well as a Taylor-series-expanded model using both of the state estimate and its error covariance, are derived. Moreover, the difference between the two approximated CRLBs is also formulated analytically. By employing the particle filter (PF) and the unscented Kalman filter (UKF) to compute, simulation results reveal that the approximated CRLB using mean-covariance-based model outperforms that using the mean-based exact model. It is also shown that the theoretical difference between the estimated CRLBs can be improved through an improved filtering method. △ Less

Submitted 28 May, 2010; originally announced May 2010.

Comments: 5 pages, 4 figures;

Showing 1–8 of 8 results for author: Lei, M