-
Variational Nonparametric Inference in Functional Stochastic Block Model
Authors:
Zuofeng Shang,
Peijun Sang,
Yang Feng,
Chong Jin
Abstract:
We propose a functional stochastic block model whose vertices involve functional data information. This new model extends the classic stochastic block model with vector-valued nodal information, and finds applications in real-world networks whose nodal information could be functional curves. Examples include international trade data in which a network vertex (country) is associated with the annual or quarterly GDP over a certain time period, and MyFitnessPal data in which a network vertex (MyFitnessPal user) is associated with daily calorie information measured over a certain time period. Two statistical tasks will be jointly executed. First, we will detect community structures of the network vertices assisted by the functional nodal information. Second, we propose a computationally efficient variational test to examine the significance of the functional nodal information. We show that the community detection algorithms achieve weak and strong consistency, and the variational test is asymptotically chi-square with diverging degrees of freedom. As a byproduct, we propose pointwise confidence intervals for the slope function of the functional nodal information. Our methods are examined through both simulated and real datasets.
Submitted 29 June, 2024;
originally announced July 2024.
-
Circuit-theoretic Joint Parameter-State Estimation -- Balancing Optimality and AC Feasibility
Authors:
Peng Sang,
Amritanshu Pandey
Abstract:
AC State Estimation (ACSE) is widely recognized as a practical approach for determining the grid states in steady-state conditions. It serves as a fundamental analysis to ensure grid security and is a reference for market dispatch. As grid complexity increases with rapid electrification and decarbonization, there is a growing need for more accurate knowledge of the grid operating state. However, existing ACSE algorithms have technical gaps. Critically, current ACSE algorithms are susceptible to erroneous system parameters, which are assumed to be fixed in traditional approaches. In this paper, we build a novel circuit-theoretic joint parameter-state estimation algorithm to address this limitation. The innovative algorithm builds an analogous equivalent circuit of the grid with states and certain parameters unknown. It solves a circuit-constrained optimization to estimate the most likely grid states and parameters given a set of measurements. Further, it quantifies the quality of the estimates by formulating tight convex envelopes around the original non-convex problem. We compare the various proposed approaches on systems with up to 2869 nodes while demonstrating a tradeoff between solution optimality and model fidelity.
Submitted 16 April, 2024;
originally announced April 2024.
-
Can an LLM-Powered Socially Assistive Robot Effectively and Safely Deliver Cognitive Behavioral Therapy? A Study With University Students
Authors:
Mina J. Kian,
Mingyu Zong,
Katrin Fischer,
Abhyuday Singh,
Anna-Maria Velentza,
Pau Sang,
Shriya Upadhyay,
Anika Gupta,
Misha A. Faruki,
Wallace Browning,
Sebastien M. R. Arnold,
Bhaskar Krishnamachari,
Maja J. Mataric
Abstract:
Cognitive behavioral therapy (CBT) is a widely used therapeutic method for guiding individuals toward restructuring their thinking patterns as a means of addressing anxiety, depression, and other challenges. We developed a large language model (LLM)-powered prompt-engineered socially assistive robot (SAR) that guides participants through interactive CBT at-home exercises. We evaluated the performance of the SAR through a 15-day study with 38 university students randomly assigned to interact daily with the robot or a chatbot (using the same LLM), or complete traditional CBT worksheets throughout the duration of the study. We measured weekly therapeutic outcomes, changes in pre-/post-session anxiety measures, and adherence to completing CBT exercises. We found that self-reported measures of general psychological distress significantly decreased over the study period in the robot and worksheet conditions but not the chatbot condition. Furthermore, the SAR enabled significant single-session improvements for more sessions than the other two conditions combined. Our findings suggest that SAR-guided LLM-powered CBT may be as effective as traditional worksheet methods in supporting therapeutic progress from the beginning to the end of the study and superior in decreasing user anxiety immediately after completing the CBT exercise.
Submitted 27 February, 2024;
originally announced February 2024.
-
Two-sample inference for sparse functional data
Authors:
Chi Zhang,
Peijun Sang,
Yingli Qin
Abstract:
We propose a novel test procedure for comparing mean functions across two groups within the reproducing kernel Hilbert space (RKHS) framework. Our proposed method is adept at handling sparsely and irregularly sampled functional data when observation times are random for each subject. Conventional approaches, which are built upon functional principal components analysis, usually assume a homogeneous covariance structure across groups. Nonetheless, justifying this assumption in real-world scenarios can be challenging. To eliminate the need for a homogeneous covariance structure, we first develop the functional Bahadur representation for the mean estimator under the RKHS framework; this representation naturally leads to the desirable pointwise limiting distributions. Moreover, we establish weak convergence for the mean estimator, allowing us to construct a test statistic for the mean difference. Our method is easily implementable and outperforms some conventional tests in controlling type I errors across various settings. We demonstrate the finite sample performance of our approach through extensive simulations and two real-world applications.
Submitted 29 December, 2023; v1 submitted 12 December, 2023;
originally announced December 2023.
-
Restricted Tweedie Stochastic Block Models
Authors:
Jie Jian,
Mu Zhu,
Peijun Sang
Abstract:
The stochastic block model (SBM) is a widely used framework for community detection in networks, where the network structure is typically represented by an adjacency matrix. However, conventional SBMs are not directly applicable to an adjacency matrix that consists of non-negative zero-inflated continuous edge weights. To model the international trading network, where edge weights represent trading values between countries, we propose an innovative SBM based on a restricted Tweedie distribution. Additionally, we incorporate nodal information, such as the geographical distance between countries, and account for its dynamic effect on edge weights. Notably, we show that given a sufficiently large number of nodes, estimating this covariate effect becomes independent of community labels of each node when computing the maximum likelihood estimator of parameters in our model. This result enables the development of an efficient two-step algorithm that separates the estimation of covariate effects from other parameters. We demonstrate the effectiveness of our proposed method through extensive simulation studies and an application to real-world international trading data.
Submitted 16 October, 2023;
originally announced October 2023.
-
A Bayesian Collocation Integral Method for Parameter Estimation in Ordinary Differential Equations
Authors:
Mingwei Xu,
Samuel W. K. Wong,
Peijun Sang
Abstract:
Inferring the parameters of ordinary differential equations (ODEs) from noisy observations is an important problem in many scientific fields. Currently, most parameter estimation methods that bypass numerical integration tend to rely on basis functions or Gaussian processes to approximate the ODE solution and its derivatives. Due to the sensitivity of the ODE solution to its derivatives, these methods can be hindered by estimation error, especially when only sparse time-course observations are available. We present a Bayesian collocation framework that operates on the integrated form of the ODEs and also avoids the expensive use of numerical solvers. Our methodology has the capability to handle general nonlinear ODE systems. We demonstrate the accuracy of the proposed method through simulation studies, where the estimated parameters and recovered system trajectories are compared with other recent methods. A real data example is also provided.
Submitted 23 October, 2023; v1 submitted 4 April, 2023;
originally announced April 2023.
-
Scalable inference in functional linear regression with streaming data
Authors:
Jinhan Xie,
Enze Shi,
Peijun Sang,
Zuofeng Shang,
Bei Jiang,
Linglong Kong
Abstract:
Traditional static functional data analysis is facing new challenges due to streaming data, where data constantly flow in. A major challenge is that storing such an ever-increasing amount of data in memory is nearly impossible. In addition, existing inferential tools in online learning are mainly developed for finite-dimensional problems, while inference methods for functional data are focused on the batch learning setting. In this paper, we tackle these issues by developing functional stochastic gradient descent algorithms and proposing an online bootstrap resampling procedure to systematically study the inference problem for functional linear regression. In particular, the proposed estimation and inference procedures use only one pass over the data; thus they are easy to implement and suitable to the situation where data arrive in a streaming manner. Furthermore, we establish the convergence rate as well as the asymptotic distribution of the proposed estimator. Meanwhile, the proposed perturbed estimator from the bootstrap procedure is shown to enjoy the same theoretical properties, which provide the theoretical justification for our online inference tool. As far as we know, this is the first inference result on the functional linear regression model with streaming data. Simulation studies are conducted to investigate the finite-sample performance of the proposed procedure. An application is illustrated with the Beijing multi-site air-quality data.
Submitted 10 October, 2023; v1 submitted 5 February, 2023;
originally announced February 2023.
-
Order Statistics Approaches to Unobserved Heterogeneity in Auctions
Authors:
Yao Luo,
Peijun Sang,
Ruli Xiao
Abstract:
We establish nonparametric identification of auction models with continuous and nonseparable unobserved heterogeneity using three consecutive order statistics of bids. We then propose sieve maximum likelihood estimators for the joint distribution of unobserved heterogeneity and the private value, as well as their conditional and marginal distributions. Lastly, we apply our methodology to a novel dataset from judicial auctions in China. Our estimates suggest substantial gains from accounting for unobserved heterogeneity when setting reserve prices. We propose a simple scheme that achieves nearly optimal revenue by using the appraisal value as the reserve price.
Submitted 7 October, 2022;
originally announced October 2022.
-
Optimality conditions and Lipschitz stability for non-smooth semilinear elliptic optimal control problems with sparse controls
Authors:
Vu Huu Nhu,
Phan Quang Sang
Abstract:
This paper is concerned with first- and second-order optimality conditions as well as the stability for non-smooth semilinear optimal control problems involving the $L^1$-norm of the control in the cost functional.
In addition to the appearance of the $L^1$-norm leading to the non-differentiability of the objective and promoting the sparsity of the optimal controls, the non-smoothness of the nonlinear coefficient in the state equation causes the same property of the control-to-state operator. Exploiting a regularization scheme, we derive $C$-stationarity conditions for any local optimal control. Under a structural assumption on the associated state, we define the curvature functional for the part not including the $L^1$-norm of controls of the objective for which the second-order necessary and sufficient optimality conditions are shown. Furthermore, under a more restrictive structural assumption imposed on the mentioned state, an explicit formulation of the curvature is established and thus the explicit second-order optimality conditions are stated. Finally, the Lipschitz stability of local solutions with respect to the sparsity parameter is shown.
Submitted 16 October, 2023; v1 submitted 9 September, 2022;
originally announced September 2022.
-
Nonlinear function-on-function regression by RKHS
Authors:
Peijun Sang,
Bing Li
Abstract:
We propose a nonlinear function-on-function regression model where both the covariate and the response are random functions. The nonlinear regression is carried out in two steps: we first construct Hilbert spaces to accommodate the functional covariate and the functional response, and then build a second-layer Hilbert space for the covariate to capture nonlinearity. The second-layer space is assumed to be a reproducing kernel Hilbert space, which is generated by a positive definite kernel determined by the inner product of the first-layer Hilbert space for $X$; this structure is known as nested Hilbert spaces. We develop estimation procedures to implement the proposed method, which allows the functional data to be observed at different time points for different subjects. Furthermore, we establish the convergence rate of our estimator as well as the weak convergence of the predicted response in the Hilbert space. Numerical studies including both simulations and a data application are conducted to investigate the performance of our estimator in finite samples.
Submitted 17 July, 2022;
originally announced July 2022.
-
Penalized Sieve Estimation of Structural Models
Authors:
Yao Luo,
Peijun Sang
Abstract:
Estimating structural models is an essential tool for economists. However, existing methods are often inefficient either computationally or statistically, depending on how equilibrium conditions are imposed. We propose a class of penalized sieve estimators that are consistent, asymptotically normal, and asymptotically efficient. Instead of solving the model repeatedly, we approximate the solution with a linear combination of basis functions and impose equilibrium conditions as a penalty in searching for the best fitting coefficients. We apply our method to an entry game between Walmart and Kmart.
Submitted 28 April, 2022;
originally announced April 2022.
-
Stopping time detection of wood panel compression: A functional time series approach
Authors:
H. L. Shang,
J. Cao,
P. Sang
Abstract:
We consider determining the optimal stopping time for the glue curing of wood panels in an automatic process environment. Using the near-infrared spectroscopy technology to monitor the manufacturing process ensures substantial savings in energy and time. We collect a time series of curves from a near-infrared spectrum probe consisting of 72 spectra and aim to detect an optimal stopping time. We propose an estimation procedure to determine the optimal stopping time of wood panel compression and the estimation uncertainty associated with the estimated stopping time. Our method first divides the entire data set into a training sample and a testing sample, then iteratively computes integrated squared forecast errors based on the testing sample. We then apply a structural break detection method with one breakpoint to determine an estimated optimal stopping time from a univariate time series of the integrated squared forecast errors. We also investigate the finite-sample performance of the proposed method via a series of simulation studies.
Submitted 27 April, 2022;
originally announced April 2022.
-
Functional principal component analysis for longitudinal observations with sampling at random
Authors:
Peijun Sang,
Dehan Kong,
Shu Yang
Abstract:
Functional principal component analysis has been shown to be invaluable for revealing variation modes of longitudinal outcomes, which serve as important building blocks for forecasting and model building. Decades of research have advanced methods for functional principal component analysis, often assuming independence between the observation times and longitudinal outcomes. Yet such assumptions are fragile in real-world settings where observation times may be driven by outcome-related reasons. Rather than ignoring the informative observation time process, we explicitly model the observation times by a counting process dependent on time-varying prognostic factors. Identification of the mean, covariance function, and functional principal components ensues via inverse intensity weighting. We propose using weighted penalized splines for estimation and establish consistency and convergence rates for the weighted estimators. Simulation studies demonstrate that the proposed estimators are substantially more accurate than the existing ones in the presence of a correlation between the observation time process and the longitudinal outcome process. We further examine the finite-sample performance of the proposed method using the Acute Infection and Early Disease Research Program study.
Submitted 28 March, 2022;
originally announced March 2022.
-
Statistical Inference for Functional Linear Quantile Regression
Authors:
Peijun Sang,
Zuofeng Shang,
Pang Du
Abstract:
We propose inferential tools for functional linear quantile regression where the conditional quantile of a scalar response is assumed to be a linear functional of a functional covariate. In contrast to conventional approaches, we employ kernel convolution to smooth the original loss function. The coefficient function is estimated under a reproducing kernel Hilbert space framework. A gradient descent algorithm is designed to minimize the smoothed loss function with a roughness penalty. With the aid of the Banach fixed-point theorem, we show the existence and uniqueness of our proposed estimator as the minimizer of the regularized loss function in an appropriate Hilbert space. Furthermore, we establish the convergence rate as well as the weak convergence of our estimator. As far as we know, this is the first weak convergence result for a functional quantile regression model. Pointwise confidence intervals and a simultaneous confidence band for the true coefficient function are then developed based on these theoretical properties. Numerical studies including both simulations and a data application are conducted to investigate the performance of our estimator and inference tools in finite samples.
Submitted 23 February, 2022;
originally announced February 2022.
-
Two Gaussian regularization methods for time-varying networks
Authors:
Jie Jian,
Peijun Sang,
Mu Zhu
Abstract:
We model time-varying network data as realizations from multivariate Gaussian distributions with precision matrices that change over time. To facilitate parameter estimation, we require not only that each precision matrix at any given time point be sparse, but also that precision matrices at neighboring time points be similar. We accomplish this with two different algorithms, by generalizing the elastic net and the fused LASSO, respectively. Our main focuses are efficient computational algorithms and convenient degree-of-freedom formulae for choosing tuning parameters. We illustrate our methods with two simulation studies. By applying them to an fMRI data set, we also detect some interesting differences in brain connectivity between healthy individuals and ADHD patients.
Submitted 9 March, 2022; v1 submitted 14 February, 2022;
originally announced February 2022.
-
A reproducing kernel Hilbert space framework for functional data classification
Authors:
Peijun Sang,
Adam B Kashlak,
Linglong Kong
Abstract:
We encounter a bottleneck when we try to borrow the strength of classical classifiers to classify functional data. The major issue is that functional data are intrinsically infinite dimensional, thus classical classifiers cannot be applied directly or have poor performance due to the curse of dimensionality. To address this concern, we propose to project functional data onto one specific direction, and then a distance-weighted discrimination (DWD) classifier is built upon the projection score. The projection direction is identified through minimizing an empirical risk function that contains the particular loss function in a DWD classifier, over a reproducing kernel Hilbert space. Hence our proposed classifier can avoid overfitting and enjoy appealing properties of DWD classifiers. This framework is further extended to accommodate functional data classification problems where scalar covariates are involved. In contrast to previous work, we establish a non-asymptotic estimation error bound on the relative misclassification rate. In the finite-sample case, we demonstrate that the proposed classifiers compare favorably with some commonly used functional classifiers in terms of prediction accuracy through simulation studies and a real-world application.
Submitted 7 March, 2021;
originally announced March 2021.
-
Continuum centroid classifier for functional data
Authors:
Zhiyang Zhou,
Peijun Sang
Abstract:
Aiming at the binary classification of functional data, we propose the continuum centroid classifier (CCC) built upon projections of functional data onto one specific direction. This direction is obtained via bridging the regression and classification. Controlling the extent of supervision, our technique is neither unsupervised nor fully supervised. Thanks to the intrinsic infinite dimension of functional data, one of two subtypes of CCC enjoys the (asymptotic) zero misclassification rate. Our proposal includes an effective algorithm that yields a consistent empirical counterpart of CCC. Simulation studies demonstrate the performance of CCC in different scenarios. Finally, we apply CCC to two real examples.
Submitted 11 February, 2021;
originally announced February 2021.
-
Density dependencies of interaction strengths and their influence on nuclear matter and neutron star in relativistic mean field theory
Authors:
S. F. Ban,
J. Li,
S. Q. Zhang,
H. Y. Jia,
J. P. Sang,
J. Meng
Abstract:
The density dependencies of various effective interaction strengths in the relativistic mean field are studied and carefully compared for nuclear matter and neutron stars. The influences of different density dependencies are presented and discussed on mean field potentials, saturation properties for nuclear matter, equations of state, maximum masses and corresponding radii for neutron stars. Though the interaction strengths and the potentials given by various interactions are quite different in nuclear matter, the differences of saturation properties are subtle, except for NL2 and TM2, which are mainly used for light nuclei, while the properties by various interactions for pure neutron matter are quite different. To get an equation of state for neutron matter without any ambiguity, it is necessary to constrain the effective interactions either by microscopic many-body calculations for the neutron matter data or the data of nuclei with extreme isospin. For neutron stars, interactions with large interaction strengths give strong potentials and large Oppenheimer-Volkoff (OV) mass limits. The density-dependent interactions DD-ME1 and TW-99 favor a large neutron population due to their weak $\rho$-meson field at high densities. The OV mass limits calculated from different equations of state are 2.02 $\sim$ 2.81 $M_\odot$, and the corresponding radii are 10.78 $\sim$ 13.27 km. After the inclusion of the hyperons, the corresponding values become 1.52 $\sim$ 2.06 $M_\odot$ and 10.24 $\sim$ 11.38 km.
Submitted 12 February, 2004;
originally announced February 2004.