-
Fuglede-Kadison determinants of matrix-valued semicircular elements and capacity estimates
Authors:
Tobias Mai,
Roland Speicher
Abstract:
We calculate the Fuglede-Kadison determinant of arbitrary matrix-valued semicircular operators in terms of the capacity of the corresponding covariance mapping. We also improve a lower bound by Garg, Gurvits, Oliveira, and Wigderson on this capacity, by making it dimension-independent.
Submitted 22 June, 2024;
originally announced June 2024.
-
Concentration of a sparse Bayesian model with Horseshoe prior in estimating high-dimensional precision matrix
Authors:
The Tien Mai
Abstract:
Precision matrices are crucial in many fields such as social networks, neuroscience, and economics, representing the edge structure of Gaussian graphical models (GGMs), where a zero in an off-diagonal position of the precision matrix indicates conditional independence between nodes. In high-dimensional settings where the dimension of the precision matrix $p$ exceeds the sample size $n$ and the matrix is sparse, methods like graphical Lasso, graphical SCAD, and CLIME are popular for estimating GGMs. While frequentist methods are well-studied, Bayesian approaches for (unstructured) sparse precision matrices are less explored. The graphical horseshoe estimate by Li et al. (2019), applying the global-local horseshoe prior, shows superior empirical performance, but theoretical work for sparse precision matrix estimation using shrinkage priors is limited. This paper addresses these gaps by providing concentration results for the tempered posterior with the fully specified horseshoe prior in high-dimensional settings. Moreover, we also provide novel theoretical results for model misspecification, offering a general oracle inequality for the posterior.
Submitted 20 June, 2024;
originally announced June 2024.
-
Adaptive posterior concentration rates for sparse high-dimensional linear regression with random design and unknown error variance
Authors:
The Tien Mai
Abstract:
This paper investigates sparse high-dimensional linear regression, particularly examining the properties of the posterior under conditions of random design and unknown error variance. We provide consistency results for the posterior and analyze its concentration rates, demonstrating adaptiveness to the unknown sparsity level of the regression coefficient vector. Furthermore, we extend our investigation to establish concentration outcomes for parameter estimation using specific distance measures. These findings are in line with recent discoveries in frequentist studies. Additionally, by employing techniques to address model misspecification through a fractional posterior, we broaden our analysis through oracle inequalities to encompass the critical aspect of model misspecification for the regular posterior. Our novel findings are demonstrated using two different types of sparsity priors: a shrinkage prior and a spike-and-slab prior.
Submitted 29 May, 2024;
originally announced May 2024.
-
Misclassification bounds for PAC-Bayesian sparse deep learning
Authors:
The Tien Mai
Abstract:
Recently, there has been a significant focus on exploring the theoretical aspects of deep learning, especially regarding its performance in classification tasks. Bayesian deep learning has emerged as a unified probabilistic framework, seeking to integrate deep learning with Bayesian methodologies seamlessly. However, there exists a gap in the theoretical understanding of Bayesian approaches in deep learning for classification. This study presents an attempt to bridge that gap. By leveraging PAC-Bayes bounds techniques, we present theoretical results on the prediction or misclassification error of a probabilistic approach utilizing Spike-and-Slab priors for sparse deep learning in classification. We establish non-asymptotic results for the prediction error. Additionally, we demonstrate that, by considering different architectures, our results can achieve minimax optimal rates in both low and high-dimensional settings, up to a logarithmic factor. Moreover, our additional logarithmic term yields slight improvements over previous works. Additionally, we propose and analyze an automated model selection approach aimed at optimally choosing a network architecture with guaranteed optimality.
Submitted 2 May, 2024;
originally announced May 2024.
-
Tracking and classifying objects with DAS data along railway
Authors:
Simon L. B. Fredriksen,
The Tien Mai,
Kevin Growe,
Jo Eidsvik
Abstract:
Distributed acoustic sensing through fiber-optical cables can contribute to traffic monitoring systems. Using data from a day of field testing on a 50 km long fiber-optic cable along a railroad track in Norway, we detect and track cars and trains along a segment of the fiber-optic cable where the road runs parallel to the railroad tracks. We develop a method for automatic detection of events and then use these in a Kalman filter variant known as joint probabilistic data association for object tracking and classification. Model parameters are specified using in-situ log data along with the fiber-optic signals. Running the algorithm over an entire day, we highlight results of counting cars and trains over time and their estimated velocities.
Submitted 2 May, 2024;
originally announced May 2024.
-
Selective Parallel Loading of Large-Scale Compressed Graphs with ParaGrapher
Authors:
Mohsen Koohi Esfahani,
Marco D'Antonio,
Syed Ibtisam Tauhidi,
Thai Son Mai,
Hans Vandierendonck
Abstract:
Comprehensive evaluation is one of the bases of experimental science. In High-Performance Graph Processing, a thorough evaluation of contributions becomes more achievable by supporting common input formats over different frameworks. However, each framework creates its specific format, which may not support reading large-scale real-world graph datasets. This shows a demand for high-performance libraries capable of loading graphs to (i) accelerate the design of new graph algorithms, (ii) evaluate the contributions on a wide range of graph algorithms, and (iii) facilitate easy and fast comparison over different graph frameworks.
To that end, we present ParaGrapher, a high-performance API and library for loading large-scale and compressed graphs. ParaGrapher supports different types of requests for accessing graphs in shared- and distributed-memory and out-of-core graph processing. We explain the design of ParaGrapher and present a performance model of graph decompression, which is used for evaluation of ParaGrapher over three storage types. Our evaluation shows that by decompressing compressed graphs in WebGraph format, ParaGrapher delivers up to 3.2 times speedup in loading and up to 5.2 times speedup in end-to-end execution in comparison to the binary and textual formats.
ParaGrapher is available online at https://blogs.qub.ac.uk/DIPSA/ParaGrapher/.
Submitted 17 June, 2024; v1 submitted 30 April, 2024;
originally announced April 2024.
-
On properties of fractional posterior in generalized reduced-rank regression
Authors:
The Tien Mai
Abstract:
Reduced rank regression (RRR) is a widely employed model for investigating the linear association between multiple response variables and a set of predictors. While RRR has been extensively explored in various works, the focus has predominantly been on continuous response variables, overlooking other types of outcomes. This study shifts its attention to the Bayesian perspective of generalized linear models (GLM) within the RRR framework. In this work, we relax the requirement for the link function of the generalized linear model to be canonical. We examine the properties of fractional posteriors in GLM within the RRR context, where a fractional power of the likelihood is utilized. By employing a spectral scaled Student prior distribution, we establish consistency and concentration results for the fractional posterior. Our results highlight adaptability, as they do not necessitate prior knowledge of the rank of the parameter matrix. These results are in line with those found in frequentist literature. Additionally, an examination of model mis-specification is undertaken, underscoring the effectiveness of our approach in such scenarios.
Submitted 27 April, 2024;
originally announced April 2024.
-
Concentration properties of fractional posterior in 1-bit matrix completion
Authors:
The Tien Mai
Abstract:
The problem of estimating a matrix based on a set of its observed entries is commonly referred to as the matrix completion problem. In this work, we specifically address the scenario of binary observations, often termed as 1-bit matrix completion. While numerous studies have explored Bayesian and frequentist methods for real-value matrix completion, there has been a lack of theoretical exploration regarding Bayesian approaches in 1-bit matrix completion. We tackle this gap by considering a general, non-uniform sampling scheme and providing theoretical assurances on the efficacy of the fractional posterior. Our contributions include obtaining concentration results for the fractional posterior and demonstrating its effectiveness in recovering the underlying parameter matrix. We accomplish this using two distinct types of prior distributions: low-rank factorization priors and a spectral scaled Student prior, with the latter requiring fewer assumptions. Importantly, our results exhibit an adaptive nature by not mandating prior knowledge of the rank of the parameter matrix. Our findings are comparable to those found in the frequentist literature, yet demand fewer restrictive assumptions.
Submitted 13 April, 2024;
originally announced April 2024.
-
Hallucination Diversity-Aware Active Learning for Text Summarization
Authors:
Yu Xia,
Xu Liu,
Tong Yu,
Sungchul Kim,
Ryan A. Rossi,
Anup Rao,
Tung Mai,
Shuai Li
Abstract:
Large Language Models (LLMs) have shown a propensity to generate hallucinated outputs, i.e., texts that are factually incorrect or unsupported. Existing methods for alleviating hallucinations typically require costly human annotations to identify and correct hallucinations in LLM outputs. Moreover, most of these methods focus on a specific type of hallucination, e.g., entity or token errors, which limits their effectiveness in addressing various types of hallucinations exhibited in LLM outputs. To the best of our knowledge, in this paper we propose the first active learning framework to alleviate LLM hallucinations, reducing the costly human annotation of hallucinations that is needed. By measuring fine-grained hallucinations from errors in semantic frame, discourse and content verifiability in text summarization, we propose HAllucination Diversity-Aware Sampling (HADAS) to select diverse hallucinations for annotations in active learning for LLM finetuning. Extensive experiments on three datasets and different backbone models demonstrate advantages of our method in effectively and efficiently mitigating LLM hallucinations.
Submitted 1 April, 2024;
originally announced April 2024.
-
Generalized multiscale finite element method for a nonlinear elastic strain-limiting Cosserat model
Authors:
Dmitry Ammosov,
Tina Mai,
Juan Galvis
Abstract:
For nonlinear Cosserat elasticity, we consider multiscale methods in this paper. In particular, we explore the generalized multiscale finite element method (GMsFEM) to solve an isotropic Cosserat problem with strain-limiting property (ensuring bounded linearized strains even under high stresses). Such strain-limiting Cosserat model can find potential applications in solids and biological fibers. However, Cosserat media with naturally rotational degrees of freedom, nonlinear constitutive relations, high contrast, and heterogeneities may produce challenging multiscale characteristics in the solution, and upscaling by multiscale methods is necessary. Therefore, we utilize the offline and residual-based online (adaptive or uniform) GMsFEM in this context while handling the nonlinearity by Picard iteration. Through various two-dimensional experiments (for perforated, composite, and stochastically heterogeneous media with small and big strain-limiting parameters), our numerical results show the approaches' convergence, efficiency, and robustness. In addition, these results demonstrate that such approaches provide good accuracy, the online GMsFEM gives more accurate solutions than the offline one, and the online adaptive strategy has similar accuracy to the uniform one but with fewer degrees of freedom.
Submitted 21 March, 2024;
originally announced March 2024.
-
Prediction of discretization of online GMsFEM using deep learning for Richards equation
Authors:
Denis Spiridonov,
Sergei Stepanov,
Tina Mai
Abstract:
We develop a new coarse-scale approximation strategy for the nonlinear single-continuum Richards equation as an unsaturated flow over heterogeneous non-periodic media, using the online generalized multiscale finite element method (online GMsFEM) together with deep learning. A novelty of this approach is that local online multiscale basis functions are computed rapidly and frequently by utilizing deep neural networks (DNNs). More precisely, we employ the training set of stochastic permeability realizations and the corresponding computed online multiscale basis functions to train neural networks. The nonlinear map between such permeability fields and online multiscale basis functions is developed by our proposed deep learning algorithm. That is, in a new way, the predicted online multiscale basis functions incorporate the nonlinearity treatment of the Richards equation and reflect any time-dependent changes in the problem's properties. Multiple numerical experiments in two-dimensional model problems show the good performance of this technique, in terms of predictions of the online multiscale basis functions and thus finding solutions.
Submitted 21 March, 2024;
originally announced March 2024.
-
On high-dimensional classification by sparse generalized Bayesian logistic regression
Authors:
The Tien Mai
Abstract:
This work addresses the problem of high-dimensional classification by exploring the generalized Bayesian logistic regression method under a sparsity-inducing prior distribution. The method involves utilizing a fractional power of the likelihood, resulting in the fractional posterior. Our study yields concentration results for the fractional posterior, not only on the joint distribution of the predictor and response variable but also for the regression coefficients. Significantly, we derive novel findings concerning misclassification excess risk bounds using sparse generalized Bayesian logistic regression. These results parallel recent findings for penalized methods in the frequentist literature. Furthermore, we extend our results to the scenario of model misspecification, which is of critical importance.
Submitted 19 March, 2024;
originally announced March 2024.
-
Competitive Facility Location under Random Utilities and Routing Constraints
Authors:
Hoang Giang Pham,
Tien Thanh Dam,
Ngan Ha Duong,
Tien Mai,
Minh Hoang Ha
Abstract:
In this paper, we study a facility location problem within a competitive market context, where customer demand is predicted by a random utility choice model. Unlike prior research, which primarily focuses on simple constraints such as a cardinality constraint on the number of selected locations, we introduce routing constraints that necessitate the selection of locations in a manner that guarantees the existence of a tour visiting all chosen locations while adhering to a specified tour length upper bound. Such routing constraints find crucial applications in various real-world scenarios. The problem at hand features a non-linear objective function, resulting from the utilization of random utilities, together with complex routing constraints, making it computationally challenging. To tackle this problem, we explore three types of valid cuts, namely, outer-approximation and submodular cuts to handle the nonlinear objective function, as well as sub-tour elimination cuts to address the complex routing constraints. These lead to the development of two exact solution methods: a nested cutting plane and nested branch-and-cut algorithms, where these valid cuts are iteratively added to a master problem through two nested loops. We also prove that our nested cutting plane method always converges to optimality after a finite number of iterations. Furthermore, we develop a local search-based metaheuristic tailored for solving large-scale instances and show its pros and cons compared to exact methods. Extensive experiments are conducted on problem instances of varying sizes, demonstrating that our approach excels in terms of solution quality and computation time when compared to other baseline approaches.
Submitted 9 March, 2024; v1 submitted 7 March, 2024;
originally announced March 2024.
-
A practical and efficient approach for Bayesian reservoir inversion: Insights from the Alvheim field data
Authors:
Karen S Auestad,
The Tien Mai,
Mina Spremic,
Jo Eidsvik
Abstract:
Stochastic reservoir characterization, a critical aspect of subsurface exploration for oil and gas reservoirs, relies on stochastic methods to model and understand subsurface properties using seismic data. This paper addresses the computational challenges associated with Bayesian reservoir inversion methods, focusing on two key obstacles: the demanding forward model and the high dimensionality of Gaussian random fields. Leveraging the generalized Bayesian approach, we replace the intricate forward function with a computationally efficient multivariate adaptive regression splines method, resulting in a 34 acceleration in computational efficiency. For handling high-dimensional Gaussian random fields, we employ a fast Fourier transform (FFT) technique. Additionally, we explore the preconditioned Crank-Nicolson method for sampling, providing a more efficient exploration of high-dimensional parameter spaces. The practicality and efficacy of our approach are tested extensively in simulations and its validity is demonstrated in application to the Alvheim field data.
Submitted 6 March, 2024;
originally announced March 2024.
-
Quantum Hall Transport Measurements of Lateral p-n Junctions Formed via Precise Spatial Photodoping of Graphene/hBN Heterostructures
Authors:
Son T. Le,
Thuc T. Mai,
Maria F. Munoz,
Angela R. Hight Walker,
Curt A. Richter,
Aubrey T. Hanbicki,
Adam L. Friedman
Abstract:
Doped semiconductors are a central and crucial component of all integrated circuits. By using a combination of white light and a focused laser beam, and exploiting hBN defect states, heterostructures of hBN/Graphene/hBN are photodoped in-operando, reproducibly and reversibly. We demonstrate device geometries with spatially-defined doping type and magnitude. After each optical doping procedure, magnetotransport measurements including quantum Hall measurements are performed to characterize the device performance. In the unipolar (p+-p-p+ and n-n+-n) configurations, we observe quantization of the longitudinal resistance, proving well-defined doped regions and interfaces that are further analyzed by Landauer-Buttiker modeling. Our unique measurements and modeling of these optically doped devices reveal a complete separation of the p- and n-Landau level edge states. The non-interaction of the edge states results in an observed "insulating" state in devices with a bi-polar p-n-p configuration that is uncommon and has not been measured previously in graphene devices. This insulating state could be utilized in high-performance graphene electrical switches. These quantitative magnetotransport measurements confirm that these doping techniques can be applied to any 2D materials encapsulated within hBN layers, enabling versatile, rewritable circuit elements for future computing and memory applications.
Submitted 3 June, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
SPRINQL: Sub-optimal Demonstrations driven Offline Imitation Learning
Authors:
Huy Hoang,
Tien Mai,
Pradeep Varakantham
Abstract:
We focus on offline imitation learning (IL), which aims to mimic an expert's behavior using demonstrations without any interaction with the environment. One of the main challenges in offline IL is the limited support of expert demonstrations, which typically cover only a small fraction of the state-action space. While it may not be feasible to obtain numerous expert demonstrations, it is often possible to gather a larger set of sub-optimal demonstrations. For example, in treatment optimization problems, there are varying levels of doctor treatments available for different chronic conditions. These range from treatment specialists and experienced general practitioners to less experienced general practitioners. Similarly, when robots are trained to imitate humans in routine tasks, they might learn from individuals with different levels of expertise and efficiency.
In this paper, we propose an offline IL approach that leverages the larger set of sub-optimal demonstrations while effectively mimicking expert trajectories. Existing offline IL methods based on behavior cloning or distribution matching often face issues such as overfitting to the limited set of expert demonstrations or inadvertently imitating sub-optimal trajectories from the larger dataset. Our approach, which is based on inverse soft-Q learning, learns from both expert and sub-optimal demonstrations. It assigns higher importance (through learned weights) to aligning with expert demonstrations and lower importance to aligning with sub-optimal ones. A key contribution of our approach, called SPRINQL, is transforming the offline IL problem into a convex optimization over the space of Q functions. Through comprehensive experimental evaluations, we demonstrate that the SPRINQL algorithm achieves state-of-the-art (SOTA) performance on offline IL benchmarks. Code is available at https://github.com/hmhuy2000/SPRINQL.
Submitted 23 May, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Super-droplet-repellent carbon-based printable perovskite solar cells
Authors:
Cuc Thi Kim Mai,
Janne Halme,
Heikki A. Nurmi,
Aldeliane M. da Silva,
Gabriela S. Lorite,
David Martineau,
Stéphanie Narbey,
Naeimeh Mozaffari,
Robin H. A. Ras,
Syed Ghufran Hashmi,
Maja Vuckovac
Abstract:
Despite attractive cost-effectiveness, scalability, and superior stability, carbon-based printable perovskite solar cells (CPSCs) still face moisture-induced degradation that limits their lifespan and commercial potential. Here, we investigate the moisture-preventing mechanisms of a thin nanostructured super-repellent coating (advancing contact angle $>$167$^{\circ}$ and contact angle hysteresis 7$^{\circ}$) integrated into CPSCs for different moisture forms (falling water droplets vs water vapor vs condensed water droplets). We show that unencapsulated super-repellent CPSCs have superior performance under continuous droplet impact for 12 h (rain simulation experiments) compared to unencapsulated pristine (uncoated) CPSCs that degrade within seconds. Contrary to falling water droplets, where the super-repellent coating serves as a shield, we found water vapor to physisorb through the porous super-repellent coating (room temperature and relative humidity, RH 65\% and 85\%), which increased the CPSCs' performance by 21\% during ~43 days, similarly to pristine CPSCs. We further showed that water condensation forms within or below the super-repellent coating (40$^{\circ}$C and RH 85\%), followed by chemisorption and degradation of CPSCs. Because different forms of water have distinct effects on CPSCs, we suggest that future standard tests for repellent CPSCs should include rain simulation and condensation tests. Our findings will thus inspire the development of super-repellent coatings for moisture prevention.
Submitted 15 January, 2024;
originally announced January 2024.
-
High-dimensional sparse classification using exponential weighting with empirical hinge loss
Authors:
The Tien Mai
Abstract:
In this study, we address the problem of high-dimensional binary classification. Our proposed solution involves employing an aggregation technique founded on exponential weights and empirical hinge loss. Through the employment of a suitable sparsity-inducing prior distribution, we demonstrate that our method yields favorable theoretical results on prediction error. The efficiency of our procedure is achieved through the utilization of Langevin Monte Carlo, a gradient-based sampling approach. To illustrate the effectiveness of our approach, we conduct comparisons with the logistic Lasso on simulated data and a real dataset. Our method frequently demonstrates superior performance compared to the logistic Lasso.
Submitted 6 March, 2024; v1 submitted 20 December, 2023;
originally announced December 2023.
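The sampling step described in the abstract above can be illustrated with a minimal sketch of the unadjusted Langevin algorithm targeting a pseudo-posterior proportional to exp(-lam * hinge loss) times a prior. This is not the authors' implementation: the Gaussian prior below is a placeholder assumption standing in for the sparsity-inducing prior, and the names `hinge_grad`, `langevin_hinge`, `lam`, `tau`, and `step` are hypothetical.

```python
import numpy as np

def hinge_grad(theta, X, y):
    """Subgradient of sum_i max(0, 1 - y_i * <x_i, theta>) with respect to theta."""
    margins = y * (X @ theta)
    active = margins < 1.0  # only points violating the margin contribute
    return -(X[active] * y[active, None]).sum(axis=0)

def langevin_hinge(X, y, lam=1.0, tau=1.0, step=1e-3, n_iter=2000, seed=0):
    """Unadjusted Langevin iterations; returns the running mean of the iterates.

    Potential (assumed for illustration):
        U(theta) = lam * hinge(theta) + ||theta||^2 / (2 * tau^2)
    The Gaussian prior term is a placeholder, not the prior used in the paper.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    running_mean = np.zeros_like(theta)
    for k in range(n_iter):
        grad_u = lam * hinge_grad(theta, X, y) + theta / tau**2
        noise = rng.standard_normal(theta.shape)
        # Langevin update: gradient step on U plus Gaussian noise of scale sqrt(2 * step)
        theta = theta - step * grad_u + np.sqrt(2.0 * step) * noise
        running_mean += (theta - running_mean) / (k + 1)
    return running_mean
```

Labels are assumed to be coded as -1/+1. A new point `x` would then be classified with `np.sign(x @ theta_hat)`, where `theta_hat` is the returned mean; the step size and iteration count here are arbitrary and would need tuning in practice.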
-
Misclassification excess risk bounds for 1-bit matrix completion
Authors:
The Tien Mai
Abstract:
This study investigates the misclassification excess risk bound in the context of 1-bit matrix completion, a significant problem in machine learning involving the recovery of an unknown matrix from a limited subset of its entries. Matrix completion has garnered considerable attention in the last two decades due to its diverse applications across various fields. Unlike conventional approaches that deal with real-valued samples, 1-bit matrix completion is concerned with binary observations. While prior research has predominantly focused on the estimation error of proposed estimators, our study shifts attention to the prediction error. This paper offers theoretical analysis regarding the prediction errors of two previous works utilizing the logistic regression model: one employing a max-norm constrained minimization and the other employing nuclear-norm penalization. Significantly, our findings demonstrate that the latter achieves the minimax-optimal rate without the need for an additional logarithmic term. These novel results contribute to a deeper understanding of 1-bit matrix completion by shedding light on the predictive performance of specific methodologies.
Submitted 20 December, 2023;
originally announced December 2023.
-
Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning
Authors:
Huy Hoang,
Tien Mai,
Pradeep Varakantham
Abstract:
A popular framework for enforcing safe actions in Reinforcement Learning (RL) is Constrained RL, where trajectory based constraints on expected cost (or other cost measures) are employed to enforce safety and more importantly these constraints are enforced while maximizing expected reward. Most recent approaches for solving Constrained RL convert the trajectory based cost constraint into a surrogate problem that can be solved using minor modifications to RL methods. A key drawback with such approaches is an over or underestimation of the cost constraint at each state. Therefore, we provide an approach that does not modify the trajectory based cost constraint and instead imitates ``good'' trajectories and avoids ``bad'' trajectories generated from incrementally improving policies. We employ an oracle that utilizes a reward threshold (which is varied with learning) and the overall cost constraint to label trajectories as ``good'' or ``bad''. A key advantage of our approach is that we are able to work from any starting policy or set of trajectories and improve on it. In an exhaustive set of experiments, we demonstrate that our approach is able to outperform top benchmark approaches for solving Constrained RL problems, with respect to expected cost, CVaR cost, or even unknown cost constraints.
Submitted 13 March, 2024; v1 submitted 16 December, 2023;
originally announced December 2023.
-
Raman fingerprints of spin-phonon coupling and magnetic transition in an organic molecule intercalated Cr2Ge2Te6
Authors:
Sudeshna Samanta,
Hector Iturriaga,
Thuc T. Mai,
Adam J. Biacchi,
Rajbul Islam,
Angela R. Hight Walker,
Mohamed Fathi Sanad,
Charudatta Phatak,
Ryan Siebenaller,
Emmanuel Rowe,
Michael A. Susner,
Fei Xue,
Srinivasa R. Singamaneni
Abstract:
The manipulation of spin-phonon coupling in both formations and explorations of magnetism in two-dimensional van der Waals ferromagnetic semiconductors facilitates unprecedented prospects for spintronics devices. The interlayer engineering tunes spin-phonon coupling significantly and holds the promise for controllable magnetism via organic cation intercalation. Here, we present spectroscopic evidence to reveal the intercalation effect on intrinsic magnetic and electronic transitions in quasi-two-dimensional Cr2Ge2Te6 using tetrabutyl ammonium as the intercalant. The temperature-evolution of Raman modes E_g^3 and A_g^1, along with the magnetization measurements, unambiguously captures the enhancement of the ferromagnetic Curie temperature in the intercalated heterostructure. Moreover, the E_g^4 mode highlighted the increased effect of spin-phonon interaction in magnetic order-induced lattice distortion. Combined with the first-principles calculations, we observed a substantial number of electrons transferred from TBA+ to Cr through the interface. These results reveal the interplay between spin-phonon coupling and magnetic ordering in van der Waals magnets, where Raman fingerprints would be highly beneficial for further understanding the manipulation of magnetism in layered heterostructures.
Submitted 2 December, 2023;
originally announced December 2023.
-
Fiber-based Ratiometric Optical Thermometry with Silicon-Vacancy in Microdiamonds
Authors:
Md Shakhawath Hossain,
Miguel Bacaoco,
Thi Ngoc Anh Mai,
Guillaume Ponchon,
Chaohao Chen,
Lei Ding,
Yongliang Chen,
Evgeny Ekimov,
Helen Xu,
Alexander S. Solntsev,
Toan Trong Tran
Abstract:
Fiber optic all-optical thermometry is a promising technology to track temperature at a micro-scale while designing efficient and reliable microelectronic devices and components. In this work, we demonstrate a novel real-time ratiometric fiber optic thermometry technique based on silicon-vacancy (SiV) diamond that shows the highest temperature resolution (22.91 KHz^(-1/2) Wcm^(-2)) and spatial resolution (~7.5 um) among all-optical fiber-based thermosensors reported to date. Instead of analyzing the spectral features of temperature-dependent SiV signal, coming from SiV micro-diamond fixed on the fiber tip, an alternative parallel detection method based on filtering optics and photon counters is proposed to read out the sample temperature in real-time. The signal collection efficiency of the fiber is also investigated numerically with semi-analytic ray-optical analysis and then compared with our experimental study. We finally demonstrate the performance of the thermosensor by monitoring the temperature at distinct locations in a lab-built graphite-based microheater device. Our work introduces a reconfigurable method for temperature monitoring in microelectronic, microfluidic devices, or biological environments and unlocks a new direction for fiber-based all-optical thermometry research.
Submitted 29 November, 2023;
originally announced November 2023.
-
Cryogenic Thermal Shock Effects on Optical Properties of Quantum Emitters in Hexagonal Boron Nitride
Authors:
Thi Ngoc Anh Mai,
Sajid Ali,
Md Shakhawath Hossain,
Chaohao Chen,
Lei Ding,
Yongliang Chen,
Alexander S. Solntsev,
Hongwei Mou,
Xiaoxue Xu,
Nikhil Medhekar,
Toan Trong Tran
Abstract:
Solid-state quantum emitters are vital building blocks for quantum information science and quantum technology. Among various types of solid-state emitters discovered to date, color centers in hexagonal boron nitride have garnered tremendous traction in recent years thanks to their environmental robustness, high brightness and room-temperature operation. Most recently, these quantum emitters have been employed for satellite-based quantum key distribution. One of the most important requirements to qualify these emitters for space-based applications is their optical stability against cryogenic thermal shock. Such understanding has, however, remained elusive to date. Here, we report on the effects caused by such thermal shock, which induces random, irreversible changes in the spectral characteristics of the quantum emitters. By employing a combination of structural characterizations and density functional calculations, we attribute the observed changes to lattice strains caused by the cryogenic temperature shock. Our study sheds light on the stability of the quantum emitters under extreme conditions, similar to those encountered in outer space.
Submitted 28 November, 2023;
originally announced November 2023.
-
Scalable and Adaptively Secure Any-Trust Distributed Key Generation and All-hands Checkpointing
Authors:
Hanwen Feng,
Tiancheng Mai,
Qiang Tang
Abstract:
The classical distributed key generation protocols (DKG) are resurging due to their widespread applications in blockchain. While efforts have been made to improve DKG communication, practical large-scale deployments are still yet to come due to various challenges, including the heavy computation and communication (particularly broadcast) overhead in their adversarial cases. In this paper, we propose a practical DKG for DLog-based cryptosystems, which achieves (quasi-)linear computation and communication per-node cost with the help of a common coin, even in the face of the maximal amount of Byzantine nodes. Moreover, our protocol is secure against adaptive adversaries, which can corrupt less than half of all nodes. The key to our improvements lies in delegating the most costly operations to an Any-Trust group together with a set of techniques for adaptive security. This group is randomly sampled and consists of a small number of individuals. The population only trusts that at least one member in the group is honest, without knowing which one. Moreover, we present a generic transformer that enables us to efficiently deploy a conventional distributed protocol like our DKG, even when the participants have different weights. Additionally, we introduce an extended broadcast channel based on a blockchain and data dispersal network (such as IPFS), enabling reliable broadcasting of arbitrary-size messages at the cost of constant-size blockchain storage.
Submitted 5 May, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Key Factors Affecting European Reactions to AI in European Full and Flawed Democracies
Authors:
Long Pham,
Barry O'Sullivan,
Tai Tan Mai
Abstract:
This study examines the key factors that affect European reactions to artificial intelligence (AI) in the context of both full and flawed democracies in Europe. Analysing a dataset of 4,006 respondents, categorised into full democracies and flawed democracies based on the Democracy Index developed by the Economist Intelligence Unit (EIU), this research identifies crucial factors that shape European attitudes toward AI in these two types of democracies. The analysis reveals noteworthy findings. Firstly, it is observed that flawed democracies tend to exhibit higher levels of trust in government entities compared to their counterparts in full democracies. Additionally, individuals residing in flawed democracies demonstrate a more positive attitude toward AI when compared to respondents from full democracies. However, the study finds no significant difference in AI awareness between the two types of democracies, indicating a similar level of general knowledge about AI technologies among European citizens. Moreover, the study reveals that trust in AI measures, specifically "Trust AI Solution", does not significantly vary between full and flawed democracies. This suggests that despite the differences in democratic quality, both types of democracies have similar levels of confidence in AI solutions.
Submitted 4 October, 2023;
originally announced November 2023.
-
Inverse Factorized Q-Learning for Cooperative Multi-agent Imitation Learning
Authors:
The Viet Bui,
Tien Mai,
Thanh Hong Nguyen
Abstract:
This paper concerns imitation learning (IL) (i.e., the problem of learning to mimic expert behaviors from demonstrations) in cooperative multi-agent systems. The learning problem under consideration poses several challenges, characterized by high-dimensional state and action spaces and intricate inter-agent dependencies. In a single-agent setting, IL can be done efficiently through an inverse soft-Q learning process given expert demonstrations. However, extending this framework to a multi-agent context introduces the need to simultaneously learn both local value functions to capture local observations and individual actions, and a joint value function for exploiting centralized learning. In this work, we introduce a novel multi-agent IL algorithm designed to address these challenges. Our approach enables centralized learning by leveraging mixing networks to aggregate decentralized Q functions. A main advantage of this approach is that the weights of the mixing networks can be trained using information derived from global states. We further establish conditions for the mixing networks under which the multi-agent objective function exhibits convexity within the Q function space. We present extensive experiments conducted on some challenging competitive and cooperative multi-agent game environments, including an advanced version of the Star-Craft multi-agent challenge (i.e., SMACv2), which demonstrate the effectiveness of our proposed algorithm compared to existing state-of-the-art multi-agent IL algorithms.
Submitted 10 October, 2023;
originally announced October 2023.
-
Numerical modeling of hydrogel scaffold anisotropy during extrusion-based 3D printing for tissue engineering
Authors:
V. T. Mai,
R. Chatelin,
E. -J. Courtial,
C. Boulocher,
R. Rieger
Abstract:
Extrusion-based 3D printing is a recent and widely used tissue engineering tool that provides precise 3D control of bioinks to create organ-size biomaterial-based objects composed of hierarchically organized cellularized scaffolds. The internal organization of scaffold constituents should mimic the structural anisotropy of targeted tissue to stimulate proliferative cellular behavior during 3D cell culture. Both the choice of polymers constituting the bioink and the topological properties during the extrusion process greatly influence the structural anisotropy and cellular response of tissue engineering constructs. The bioink used in our study was a hydrogel made of three constituents: fibrinogen, alginate and gelatin. These components provide biocompatibility, printability and conservation of 3D shape after printing. The topological properties in flowing polymers are dictated by macromolecule conformation, i.e., their orientation and degree of stretch. In this study, we used the micro-macro approach to describe the orientation state of hydrogel macromolecules during the extrusion process, offering a two-scale description of fluid behavior. The goal of our study was to use the Fokker-Planck equation, which describes the evolution of the probability distribution function over time, to represent the real state of the constituent population in the representative elementary volume within a hydrogel during extrusion-based 3D printing. Our data suggest that for a tubular nozzle syringe, constituent orientation is driven by a high shear rate, which overcomes the fluid rheological behavior. Also, the interaction coefficient, which represents the microscopic interaction between fluid particles, overcomes hydrogel behavior for constituent orientation in the prediction model.
Submitted 8 October, 2023; v1 submitted 4 October, 2023;
originally announced October 2023.
-
Understanding the limitations of self-supervised learning for tabular anomaly detection
Authors:
Kimberly T. Mai,
Toby Davies,
Lewis D. Griffin
Abstract:
While self-supervised learning has improved anomaly detection in computer vision and natural language processing, it is unclear whether tabular data can benefit from it. This paper explores the limitations of self-supervision for tabular anomaly detection. We conduct several experiments spanning various pretext tasks on 26 benchmark datasets to understand why this is the case. Our results confirm representations derived from self-supervision do not improve tabular anomaly detection performance compared to using the raw representations of the data. We show this is due to neural networks introducing irrelevant features, which reduces the effectiveness of anomaly detectors. However, we demonstrate that using a subspace of the neural network's representation can recover performance.
Submitted 14 March, 2024; v1 submitted 15 September, 2023;
originally announced September 2023.
-
From SMOTE to Mixup for Deep Imbalanced Classification
Authors:
Wei-Chao Cheng,
Tan-Ha Mai,
Hsuan-Tien Lin
Abstract:
Given imbalanced data, it is hard to train a good classifier using deep learning because of the poor generalization of minority classes. Traditionally, the well-known synthetic minority oversampling technique (SMOTE) for data augmentation, a data mining approach for imbalanced learning, has been used to improve this generalization. However, it is unclear whether SMOTE also benefits deep learning. In this work, we study why the original SMOTE is insufficient for deep learning, and enhance SMOTE using soft labels. Connecting the resulting soft SMOTE with Mixup, a modern data augmentation technique, leads to a unified framework that puts traditional and modern data augmentation techniques under the same umbrella. A careful study within this framework shows that Mixup improves generalization by implicitly achieving uneven margins between majority and minority classes. We then propose a novel margin-aware Mixup technique that more explicitly achieves uneven margins. Extensive experimental results demonstrate that our proposed technique yields state-of-the-art performance on deep imbalanced classification while achieving superior performance on extremely imbalanced data. The code is open-sourced in our developed package https://github.com/ntucllab/imbalanced-DL to foster future research in this direction.
Submitted 3 November, 2023; v1 submitted 29 August, 2023;
originally announced August 2023.
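For readers unfamiliar with the Mixup augmentation named in the abstract above, here is a minimal sketch of the standard technique: two examples and their one-hot labels are combined convexly, which yields soft labels. It does not reproduce the margin-aware variant proposed in the paper, and the function names `mixup_pair` and `mixup_batch` are hypothetical.

```python
import numpy as np

def mixup_pair(x1, y1, x2, y2, lam):
    """Convexly combine two examples and their one-hot labels with weight lam."""
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2  # soft label: class weights still sum to 1
    return x, y

def mixup_batch(x, y_onehot, alpha=1.0, rng=None):
    """Mix each example in a batch with a randomly chosen partner from the same batch."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)     # mixing weight drawn from Beta(alpha, alpha)
    perm = rng.permutation(len(x))   # partner index for every example
    return mixup_pair(x, y_onehot, x[perm], y_onehot[perm], lam)
```

With `lam = 0.25`, mixing a class-0 example with a class-1 example gives the soft label `[0.25, 0.75]`, so the mixed point counts mostly toward class 1.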
-
Mimicking To Dominate: Imitation Learning Strategies for Success in Multiagent Competitive Games
Authors:
The Viet Bui,
Tien Mai,
Thanh Hong Nguyen
Abstract:
Training agents in multi-agent competitive games presents significant challenges due to their intricate nature. These challenges are exacerbated by dynamics influenced not only by the environment but also by opponents' strategies. Existing methods often struggle with slow convergence and instability. To address this, we harness the potential of imitation learning to comprehend and anticipate opponents' behavior, aiming to mitigate uncertainties with respect to the game dynamics. Our key contributions include: (i) a new multi-agent imitation learning model for predicting next moves of the opponents -- our model works with hidden opponents' actions and local observations; (ii) a new multi-agent reinforcement learning algorithm that combines our imitation learning model and policy training into one single training process; and (iii) extensive experiments in three challenging game environments, including an advanced version of the Star-Craft multi-agent challenge (i.e., SMACv2). Experimental results show that our approach achieves superior performance compared to existing state-of-the-art multi-agent RL algorithms.
Submitted 20 August, 2023;
originally announced August 2023.
-
Computing the noncommutative inner rank by means of operator-valued free probability theory
Authors:
Johannes Hoffmann,
Tobias Mai,
Roland Speicher
Abstract:
We address the noncommutative version of Edmonds' problem, which asks to determine the inner rank of a matrix in noncommuting variables. We provide an algorithm for the calculation of this inner rank by relating the problem to the distribution of a basic object in free probability theory, namely operator-valued semicircular elements. We have to solve a matrix-valued quadratic equation, for which we provide precise analytical and numerical control on the fixed point algorithm for solving the equation. Numerical examples show the efficiency of the algorithm.
Submitted 28 June, 2024; v1 submitted 7 August, 2023;
originally announced August 2023.
-
A general method for estimating zonal transmission interface limits from nodal network data
Authors:
Patrick R. Brown,
Clayton P. Barrows,
Jarrad G. Wright,
Gregory L. Brinkman,
Sourabh Dalvi,
Jiazi Zhang,
Trieu Mai
Abstract:
Capacity expansion models for the electric power system often employ zonal (rather than nodal) resolution, necessitating estimates of aggregate power transfer limits across the interfaces between model zones. Interface limits between planning areas are sometimes published, but they are not generalizable to arbitrary zone shapes. There is thus a need for a reproducible method for estimating interface transfer limits (ITLs) between user-defined zones directly from nodal transmission system data. Here, we present a simple method for estimating ITLs using a DC power flow approximation via the power transfer distribution factor (PTDF) matrix. Linear optimization is performed to identify the distribution of power flows that maximizes the total flow on interface-crossing lines, subject to individual line ratings, limits on bus injection/withdrawal, and the relationships among flows, injections, and withdrawals imposed by the PTDF matrix. We demonstrate the application of the method on a 134-zone ~65000-bus system, and we explore the influence of flow direction, contingency level, and zone size on the estimated ITLs. There is significant heterogeneity in the ratio of the ITL to the sum of interface-crossing line ratings, which highlights the importance of accounting for the physical constraints on power flows imposed by Kirchhoff's laws when estimating zonal ITLs.
Submitted 7 August, 2023;
originally announced August 2023.
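The linear optimization described in the abstract above can be written out roughly as follows. This is a sketch under assumed notation, not the paper's exact formulation: $\Phi$ denotes the PTDF matrix, $p$ the vector of net bus injections, $f$ the line flows, $\mathcal{I}$ the set of interface-crossing lines, $s_\ell \in \{+1,-1\}$ an orientation sign fixing the flow direction across the interface, $\bar{f}_\ell$ the line ratings, and $p_n^{\min}, p_n^{\max}$ the injection/withdrawal limits.

```latex
\begin{aligned}
\max_{p,\,f}\quad & \sum_{\ell \in \mathcal{I}} s_\ell\, f_\ell
  && \text{(total flow across the interface)} \\
\text{s.t.}\quad & f = \Phi\, p
  && \text{(DC power flow via the PTDF matrix)} \\
& -\bar{f}_\ell \le f_\ell \le \bar{f}_\ell \quad \forall \ell
  && \text{(individual line ratings)} \\
& p_n^{\min} \le p_n \le p_n^{\max} \quad \forall n
  && \text{(bus injection/withdrawal limits)} \\
& \textstyle\sum_n p_n = 0
  && \text{(power balance, assumed here)}
\end{aligned}
```

Because the rating constraints bind on individual lines before every interface-crossing line is fully loaded, the optimum is in general below $\sum_{\ell \in \mathcal{I}} \bar{f}_\ell$, which is consistent with the heterogeneity in the ITL-to-rating ratio noted in the abstract.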
-
A reduced-rank approach to predicting multiple binary responses through machine learning
Authors:
The Tien Mai
Abstract:
This paper investigates the problem of simultaneously predicting multiple binary responses by utilizing a shared set of covariates. Our approach incorporates machine learning techniques for binary classification, without making assumptions about the underlying observations. Instead, our focus lies on a group of predictors, aiming to identify the one that minimizes prediction error. Unlike previous studies that primarily address estimation error, we directly analyze the prediction error of our method using PAC-Bayesian bounds techniques. In this paper, we introduce a pseudo-Bayesian approach capable of handling incomplete response data. Our strategy is efficiently implemented using the Langevin Monte Carlo method. Through simulation studies and a practical application using real data, we demonstrate the effectiveness of our proposed method, producing comparable or sometimes superior results compared to the current state-of-the-art method.
Submitted 6 March, 2024; v1 submitted 9 June, 2023;
originally announced June 2023.
-
Network-based Representations and Dynamic Discrete Choice Models for Multiple Discrete Choice Analysis
Authors:
Hung Tran,
Tien Mai
Abstract:
In many choice modeling applications, people's demand is frequently characterized as multiple discrete, which means that people choose multiple items simultaneously. The analysis and prediction of people's behavior in multiple discrete choice situations pose several challenges. In this paper, to address this, we propose a random utility maximization (RUM) based model that considers each subset of choice alternatives as a composite alternative, where individuals choose a subset according to the RUM framework. While this approach offers a natural and intuitive modeling approach for multiple-choice analysis, the large number of subsets of choices in the formulation makes its estimation and application intractable. To overcome this challenge, we introduce directed acyclic graph (DAG) based representations of choices where each node of the DAG is associated with an elemental alternative and additional information such as the number of selected elemental alternatives. Our innovation is to show that the multi-choice model is equivalent to a recursive route choice model on the DAG, leading to the development of new efficient estimation algorithms based on dynamic programming. In addition, the DAG representations enable us to bring some advanced route choice models to capture the correlation between subset choice alternatives. Numerical experiments based on synthetic and real datasets show many advantages of our modeling approach and the proposed estimation algorithms.
Submitted 7 June, 2023;
originally announced June 2023.
-
CLImage: Human-Annotated Datasets for Complementary-Label Learning
Authors:
Hsiu-Hsuan Wang,
Tan-Ha Mai,
Nai-Xuan Ye,
Wei-I Lin,
Hsuan-Tien Lin
Abstract:
Complementary-label learning (CLL) is a weakly-supervised learning paradigm that aims to train a multi-class classifier using only complementary labels, which indicate classes to which an instance does not belong. Despite numerous algorithmic proposals for CLL, their practical applicability remains unverified for two reasons. Firstly, these algorithms often rely on assumptions about the generation of complementary labels, and it is not clear how far the assumptions are from reality. Secondly, their evaluation has been limited to synthetic datasets. To gain insights into the real-world performance of CLL algorithms, we developed a protocol to collect complementary labels from human annotators. Our efforts resulted in the creation of four datasets: CLCIFAR10, CLCIFAR20, CLMicroImageNet10, and CLMicroImageNet20, derived from well-known classification datasets CIFAR10, CIFAR100, and TinyImageNet200. These datasets represent the very first real-world CLL datasets. Through extensive benchmark experiments, we discovered a notable decrease in performance when transitioning from synthetic datasets to real-world datasets. We investigated the key factors contributing to the decrease with a thorough dataset-level ablation study. Our analyses highlight annotation noise as the most influential factor in the real-world datasets. In addition, we discover that the biased nature of human-annotated complementary labels and the difficulty of validating with only complementary labels are two outstanding barriers to practical CLL. These findings suggest that the community focus more research efforts on developing CLL algorithms and validation schemes that are robust to noisy and biased complementary-label distributions.
Submitted 22 June, 2024; v1 submitted 14 May, 2023;
originally announced May 2023.
-
Constrained Assortment Optimization under the Cross-Nested Logit Model
Authors:
Cuong Le,
Tien Mai
Abstract:
We study the assortment optimization problem under general linear constraints, where the customer choice behavior is captured by the Cross-Nested Logit model. In this problem, there is a set of products organized into multiple subsets (or nests), where each product can belong to more than one nest. The aim is to find an assortment to offer to customers so that the expected revenue is maximized. We show that, under the Cross-Nested Logit model, the assortment problem is NP-hard, even without any constraints. To tackle the assortment optimization problem, we develop a new discretization mechanism to approximate the problem by a linear fractional program with a performance guarantee of $\frac{1 - ε}{1+ε}$, for any accuracy level $ε>0$. We then show that optimal solutions to the approximate problem can be obtained by solving mixed-integer linear programs. We further show that our discretization approach can also be applied to solve a joint assortment optimization and pricing problem, as well as an assortment problem under a mixture of Cross-Nested Logit models to account for multiple classes of customers. Our empirical results on a large number of randomly generated test instances demonstrate that, under a performance guarantee of 90%, the percentage gaps between the objective values obtained from our approximation methods and the optimal expected revenues are no larger than 1.2%.
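As a worked reading of the guarantee above, assuming the reported 90% performance guarantee refers to the ratio $\frac{1 - ε}{1+ε}$ (the abstract does not state this link explicitly), the corresponding accuracy level follows from:

\[
\frac{1-ε}{1+ε} \ge 0.9
\;\Longleftrightarrow\; 1 - ε \ge 0.9 + 0.9\,ε
\;\Longleftrightarrow\; ε \le \frac{0.1}{1.9} = \frac{1}{19} \approx 0.0526 .
\]

Under that assumption, any accuracy level $ε \le 1/19$ would yield a guarantee of at least 90%.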
Submitted 18 April, 2023;
originally announced April 2023.
-
Dynamic Vector Bin Packing for Online Resource Allocation in the Cloud
Authors:
Aniket Murhekar,
David Arbour,
Tung Mai,
Anup Rao
Abstract:
Several cloud-based applications, such as cloud gaming, rent servers to execute jobs which arrive in an online fashion. Each job has a resource demand and must be dispatched to a cloud server which has enough resources to execute the job, which departs after its completion. Under the `pay-as-you-go' billing model, the server rental cost is proportional to the total time that servers are actively running jobs. The problem of efficiently allocating a sequence of online jobs to servers without exceeding the resource capacity of any server while minimizing total server usage time can be modelled as a variant of the dynamic bin packing problem (DBP), called MinUsageTime DBP.
In this work, we initiate the study of the problem with multi-dimensional resource demands (e.g. CPU/GPU usage, memory requirement, bandwidth usage, etc.), called MinUsageTime Dynamic Vector Bin Packing (DVBP). We study the competitive ratio (CR) of Any Fit packing algorithms for this problem. We show almost-tight bounds on the CR of three specific Any Fit packing algorithms, namely First Fit, Next Fit, and Move To Front. We prove that the CR of Move To Front is at most $(2μ+1)d +1$, where $μ$ is the ratio of the max/min item durations. For $d=1$, this significantly improves the previously known upper bound of $6μ+7$ (Kamali & Lopez-Ortiz, 2015). We then prove the CR of First Fit and Next Fit are bounded by $(μ+2)d+1$ and $2μd+1$, respectively. Next, we prove a lower bound of $(μ+1)d$ on the CR of any Any Fit packing algorithm, an improved lower bound of $2μd$ for Next Fit, and a lower bound of $2μ$ for Move To Front in the 1-D case. All our bounds improve or match the best-known bounds for the 1-D case. Finally, we experimentally study the average-case performance of these algorithms on randomly generated synthetic data, and observe that Move To Front outperforms other Any Fit packing algorithms.
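The Move To Front rule discussed above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Job`, `Server`, and `move_to_front_usage` are hypothetical, server capacity is assumed to be normalized to 1 in every resource dimension, each job is assumed to depart strictly after it arrives, and departures are processed before arrivals that occur at the same time.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    arrival: float
    departure: float   # assumed strictly greater than arrival
    demand: tuple      # one entry per resource, each in [0, 1]


@dataclass(eq=False)   # identity-based equality so list.remove targets the exact server
class Server:
    opened_at: float
    load: list
    jobs: set = field(default_factory=set)


def move_to_front_usage(jobs, dims):
    """Total time servers are active when jobs are packed with Move To Front."""
    events = []
    for idx, job in enumerate(jobs):
        events.append((job.departure, 0, idx))  # kind 0: departure, sorts first on ties
        events.append((job.arrival, 1, idx))    # kind 1: arrival
    events.sort()

    open_servers = []  # front of the list is the most recently used server
    placed = {}        # job index -> server currently running it
    total = 0.0
    for time, kind, idx in events:
        job = jobs[idx]
        if kind == 1:
            # Any Fit property: open a new server only if no open server fits.
            target = None
            for server in open_servers:
                if all(server.load[k] + job.demand[k] <= 1.0 + 1e-12 for k in range(dims)):
                    target = server
                    break
            if target is None:
                target = Server(opened_at=time, load=[0.0] * dims)
            else:
                open_servers.remove(target)
            open_servers.insert(0, target)       # move the chosen server to the front
            for k in range(dims):
                target.load[k] += job.demand[k]
            target.jobs.add(idx)
            placed[idx] = target
        else:
            server = placed.pop(idx)
            server.jobs.discard(idx)
            for k in range(dims):
                server.load[k] -= job.demand[k]
            if not server.jobs:                  # an empty server is released
                open_servers.remove(server)
                total += time - server.opened_at
    return total


example_jobs = [
    Job(0, 4, (0.6, 0.2)),
    Job(1, 3, (0.5, 0.1)),
    Job(2, 5, (0.3, 0.3)),
]
usage = move_to_front_usage(example_jobs, dims=2)
```

In this toy instance the third job lands on the most recently used server rather than the oldest one, which is the behavior that separates Move To Front from First Fit; the sketch says nothing about the competitive-ratio bounds themselves.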
Submitted 17 April, 2023;
originally announced April 2023.
-
PaaS: Planning as a Service for reactive driving in CARLA Leaderboard
Authors:
Nhat Hao Truong,
Huu Thien Mai,
Tuan Anh Tran,
Minh Quang Tran,
Duc Duy Nguyen,
Ngoc Viet Phuong Pham
Abstract:
End-to-end deep learning approaches have proven efficient in autonomous driving and robotics. Because they use deep learning techniques for decision-making, those systems are often referred to as black boxes, and their results are driven by data. In this paper, we propose PaaS (Planning as a Service), a vanilla module that generates local trajectory plans for autonomous driving in the CARLA simulator. Our method was submitted to the International CARLA Autonomous Driving Leaderboard (CADL), a platform for evaluating the driving proficiency of autonomous agents in realistic traffic scenarios. Our approach focuses on reactive planning in the Frenet frame under the constraints of complex urban streets and driver comfort. The planner generates a collection of feasible trajectories, leveraging heuristic cost functions with a controllable driving-style factor to choose the optimal-control path that satisfies safe-travelling criteria. PaaS provides solutions sufficient to handle challenging traffic situations in CADL. Under the strict evaluation of the CADL Map Track, our approach ranked 3rd out of 9 submissions by driving score. However, with its focus on minimizing maneuver risk and ensuring passenger safety, our infraction-penalty figures dominate those of the two leading submissions by 20 percent.
Submitted 14 June, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
The Stable Matching Lattice under Changed Preferences, and Associated Algorithms
Authors:
Rohith Reddy Gangam,
Tung Mai,
Nitya Raju,
Vijay V. Vazirani
Abstract:
[MV18] introduced a fundamental new algorithmic question on stable matching, namely finding a matching that is stable under two ``nearby'' instances, where ``nearby'' meant that in going from instance $A$ to $B$, only one agent changes its preference list. By first establishing a sequence of structural results on the lattices of $A$ and $B$, [MV18] and [GMRV22] settled all algorithmic questions related to this case. The current paper essentially settles the general case.
Assume that instance $B$ is obtained from $A$, both on $n$ workers and $n$ firms, via changes in the preferences of $p$ workers and $q$ firms. If so, we will denote the change by $(p, q)$. Thus [MV18] and [GMRV22] settled the case $(0, 1)$, since they adopt the convention that one firm changes its preferences. Let $\mathcal{M}_A$ and $\mathcal{M}_B$ be the sets of stable matchings of instances $A$ and $B$, and let $\mathcal{L}_A$ and $\mathcal{L}_B$ be their lattices. Our results are:
1. For $(0, n)$, $\mathcal{M}_A \cap \mathcal{M}_B$ is a sublattice of $\mathcal{L}_A$ and of $\mathcal{L}_B$. We can efficiently obtain the worker-optimal and firm-optimal stable matchings in $\mathcal{M}_A \cap \mathcal{M}_B$. We also obtain the associated partial order, as promised by Birkhoff's Representation Theorem, and use it to enumerate these matchings with polynomial delay.
2. For $(1, n)$, the only missing results are the partial order and enumeration.
3. We give an example with $(2, 2)$ for which $\mathcal{M}_A \cap \mathcal{M}_B$ fails to be a sublattice of $\mathcal{L}_A$.
In light of the fact that for $(n, n)$, determining if $(\mathcal{M}_A \cap \mathcal{M}_B) = \emptyset$ is NP-hard [MO19], a number of open questions arise; in particular, closing the gap between $(2, 2)$ and $(n, n)$.
Submitted 20 July, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
Optimal Sketching Bounds for Sparse Linear Regression
Authors:
Tung Mai,
Alexander Munteanu,
Cameron Musco,
Anup B. Rao,
Chris Schwiegelshohn,
David P. Woodruff
Abstract:
We study oblivious sketching for $k$-sparse linear regression under various loss functions such as an $\ell_p$ norm, or from a broad class of hinge-like loss functions, which includes the logistic and ReLU losses. We show that for sparse $\ell_2$ norm regression, there is a distribution over oblivious sketches with $Θ(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor. This extends to $\ell_p$ loss with an additional additive $O(k\log(k/\varepsilon)/\varepsilon^2)$ term in the upper bound. This establishes a surprising separation from the related sparse recovery problem, which is an important special case of sparse regression. For this problem, under the $\ell_2$ norm, we observe an upper bound of $O(k \log (d)/\varepsilon + k\log(k/\varepsilon)/\varepsilon^2)$ rows, showing that sparse recovery is strictly easier to sketch than sparse regression. For sparse regression under hinge-like loss functions including sparse logistic and sparse ReLU regression, we give the first known sketching bounds that achieve $o(d)$ rows showing that $O(μ^2 k\log(μn d/\varepsilon)/\varepsilon^2)$ rows suffice, where $μ$ is a natural complexity parameter needed to obtain relative error bounds for these loss functions. We again show that this dimension is tight, up to lower order terms and the dependence on $μ$. Finally, we show that similar sketching bounds can be achieved for LASSO regression, a popular convex relaxation of sparse regression, where one aims to minimize $\|Ax-b\|_2^2+λ\|x\|_1$ over $x\in\mathbb{R}^d$. We show that sketching dimension $O(\log(d)/(λ\varepsilon)^2)$ suffices and that the dependence on $d$ and $λ$ is tight.
Submitted 5 April, 2023;
originally announced April 2023.
-
Susceptibility to Influence of Large Language Models
Authors:
Lewis D Griffin,
Bennett Kleinberg,
Maximilian Mozes,
Kimberly T Mai,
Maria Vau,
Matthew Caldwell,
Augustine Marvor-Parker
Abstract:
Two studies tested the hypothesis that a Large Language Model (LLM) can be used to model psychological change following exposure to influential input. The first study tested a generic mode of influence - the Illusory Truth Effect (ITE) - where earlier exposure to a statement (through, for example, rating its interest) boosts a later truthfulness test rating. Data was collected from 1000 human participants using an online experiment, and 1000 simulated participants using engineered prompts and LLM completion. 64 ratings per participant were collected, using all exposure-test combinations of the attributes: truth, interest, sentiment and importance. The results for human participants reconfirmed the ITE, and demonstrated an absence of effect for attributes other than truth, and when the same attribute is used for exposure and test. The same pattern of effects was found for LLM-simulated participants. The second study concerns a specific mode of influence - populist framing of news to increase its persuasion and political mobilization. Data from LLM-simulated participants was collected and compared to previously published data from a 15-country experiment on 7286 human participants. Several effects previously demonstrated from the human study were replicated by the simulated study, including effects that surprised the authors of the human study by contradicting their theoretical expectations (anti-immigrant framing of news decreases its persuasion and mobilization); but some significant relationships found in human data (modulation of the effectiveness of populist framing according to relative deprivation of the participant) were not present in the LLM data. Together the two studies support the view that LLMs have potential to act as models of the effect of influence.
Submitted 10 March, 2023;
originally announced March 2023.
-
Quantifying the common genetic variability of bacterial traits
Authors:
T. Tien Mai,
Gerry Tonkin-Hill,
John A. Lees,
Jukka Corander
Abstract:
The study of common heritability, or co-heritability, among multiple traits has been widely established in quantitative and molecular genetics. However, in bacteria, genome-based estimation of heritability has only been considered very recently and no methods are currently available for considering co-heritability. Here we introduce such a method and demonstrate its usefulness by multi-trait analyses of the three major human pathogens \textit{Escherichia coli}, \textit{Neisseria gonorrhoeae} and \textit{Streptococcus pneumoniae}. We anticipate that the increased availability of high-throughput genomic and phenotypic screens of bacterial populations will spawn ample future opportunities to understand the common molecular basis of different traits in bacteria.
Submitted 22 February, 2023;
originally announced February 2023.
-
Tackling Stackelberg Network Interdiction against a Boundedly Rational Adversary
Authors:
Tien Mai,
Avinandan Bose,
Arunesh Sinha,
Thanh H. Nguyen
Abstract:
This work studies Stackelberg network interdiction games -- an important class of games in which a defender first allocates (randomized) defense resources to a set of critical nodes on a graph while an adversary chooses its path to attack these nodes accordingly. We consider a boundedly rational adversary in which the adversary's response model is based on a dynamic form of classic logit-based discrete choice models. We show that the problem of finding an optimal interdiction strategy for the defender in the rational setting is NP-hard. The resulting optimization is in fact non-convex and additionally, involves complex terms that sum over exponentially many paths. We tackle these computational challenges by presenting new efficient approximation algorithms with bounded solution guarantees. First, we address the exponentially-many-path challenge by proposing a polynomial-time dynamic programming-based formulation. We then show that the gradient of the non-convex objective can also be computed in polynomial time, which allows us to use a gradient-based method to solve the problem efficiently. Second, we identify a restricted problem that is convex and hence gradient-based methods find the global optimal solution for this restricted problem. We further identify mild conditions under which this restricted problem provides a bounded approximation for the original problem.
Submitted 28 January, 2023;
originally announced January 2023.
-
Solving Richly Constrained Reinforcement Learning through State Augmentation and Reward Penalties
Authors:
Hao Jiang,
Tien Mai,
Pradeep Varakantham,
Minh Huy Hoang
Abstract:
Constrained Reinforcement Learning has been employed to enforce safety constraints on a policy through the use of expected cost constraints. The key challenge is in handling the expected cost accumulated under the policy, and not just in a single step. Existing methods have developed innovative ways of converting this cost constraint over the entire policy to constraints over local decisions (at each time step). While such approaches have provided good solutions with regard to the objective, they can be either overly aggressive or overly conservative with respect to costs. This is owing to the use of estimates for "future" or "backward" costs in local cost constraints.
To that end, we provide an equivalent unconstrained formulation of constrained RL that has an augmented state space and reward penalties. This intuitive formulation is general and has interesting theoretical properties. More importantly, it provides a new paradigm for solving constrained RL problems effectively. As we show in our experimental results, we are able to outperform leading approaches on multiple benchmark problems from the literature.
Submitted 31 May, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
Spectroscopy of photoionization from the $^1E$ singlet state in nitrogen$-$vacancy centers in diamond
Authors:
Sean M. Blakley,
Thuc T. Mai,
Stephen J. Moxim,
Jason T. Ryan,
Adam J. Biacchi,
Angela R. Hight Walker,
Robert D. McMichael
Abstract:
The $^1E-^1A_1$ singlet manifold of the negatively charged nitrogen vacancy $(NV^-)$ center in diamond plays a central role in the quantum information and quantum sensing applications of the $NV^-$ center. However, the energy of this manifold within the diamond bandgap and with respect to the $^3A_2-^3E$ triplet manifold has not been measured directly. Using field-quenching effects on photoluminescence (PL) spectra, we report on the energy gap between the $^1E-^1A_1$ singlet manifold and the $^3A_2$ and $^3E$ ground and excited triplet states of the $NV^-$ as a function of excitation wavelength and power, temperature, and applied magnetic field in a heavily nitrogen-doped sample. Increased PL and decreased zero-phonon line width from the $NV^0$ were observed in the presence of an applied magnetic field, indicating ionization from the long-lived $^1E$ singlet state. A temperature-dependent ionization threshold between 532 nm and 550 nm was found, locating the singlet states within the diamond band gap.
Submitted 26 February, 2024; v1 submitted 24 January, 2023;
originally announced January 2023.
-
Warning: Humans Cannot Reliably Detect Speech Deepfakes
Authors:
Kimberly T. Mai,
Sergi D. Bray,
Toby Davies,
Lewis D. Griffin
Abstract:
Speech deepfakes are artificial voices generated by machine learning models. Previous literature has highlighted deepfakes as one of the biggest security threats arising from progress in artificial intelligence due to their potential for misuse. However, studies investigating human detection capabilities are limited. We presented genuine and deepfake audio to n = 529 individuals and asked them to identify the deepfakes. We ran our experiments in English and Mandarin to understand if language affects detection performance and decision-making rationale. We found that detection capability is unreliable. Listeners only correctly spotted the deepfakes 73% of the time, and there was no difference in detectability between the two languages. Increasing listener awareness by providing examples of speech deepfakes only improves results slightly. As speech synthesis algorithms improve and become more realistic, we can expect the detection task to become harder. The difficulty of detecting speech deepfakes confirms their potential for misuse and signals that defenses against this threat are needed.
Submitted 2 August, 2023; v1 submitted 18 January, 2023;
originally announced January 2023.
-
One-loop contributions to decays $e_b\to e_a γ$ and $(g-2)_{e_a}$ anomalies, and Ward identity
Authors:
L. T. Hue,
H. N. Long,
V. H. Binh,
H. L. T. Mai,
T. Phong Nguyen
Abstract:
In this paper, we will present analytic formulas to express one-loop contributions to lepton flavor violating decays $e_b\to e_a γ$, which are also relevant to the anomalous dipole magnetic moments of charged leptons $e_a$. These formulas were computed in the unitary gauge, using the well-known Passarino-Veltman notations. We also show that our results are consistent with those calculated previously in the 't Hooft-Veltman gauge, or in the limit of zero lepton masses. At the one-loop level, we show that the appearance of fermion-scalar-vector type diagrams in the unitary gauge will violate the Ward Identity relating to an external photon. As a result, the validation of the Ward Identity guarantees that the photon always couples with two identical particles in an arbitrary triple coupling vertex containing a photon.
Submitted 25 May, 2023; v1 submitted 13 January, 2023;
originally announced January 2023.
-
Ring-Exchange Interaction Effects on Magnons in Dirac Magnet CoTiO$_3$
Authors:
Yufei Li,
Thuc T. Mai,
M. Karaki,
E. V. Jasper,
K. F. Garrity,
C. Lyon,
D. Shaw,
T. DeLazzer,
A. J. Biacchi,
R. L. Dally,
D. M. Heligman,
J. Gdanski,
T. Adel,
M. F. Muñoz,
A. Giovannone,
A. Pawbake,
C. Faugeras,
J. R. Simpson,
K. Ross,
N. Trivedi,
Y. M. Lu,
A. R. Hight Walker,
R. Valdés Aguilar
Abstract:
The magnetic interactions that determine magnetic order and magnon energies typically involve only two spins. While rare, multi-spin interactions can also appear in quantum magnets and be the driving force in the ground state selection and in the nature of its excitations. By performing time-domain terahertz and magneto-Raman spectroscopy measurements combined with theoretical modeling, we determine the origin of the magnon excitation gap in Dirac antiferromagnet CoTiO$_3$. By adding a ring-exchange interaction in a hexagonal plaquette of the honeycomb lattice to both an XXZ spin model and to a low energy spin-orbital flavor wave model, a gap is generated in the magnon spectrum at the Brillouin zone center. With this addition, the flavor wave model reproduces a large swath of experimental results including terahertz, Raman, inelastic neutron scattering, and magnetization experiments.
Submitted 4 June, 2024; v1 submitted 10 December, 2022;
originally announced December 2022.
-
Binary-Continuous Sum-of-ratios Optimization: Discretization, Approximations, and Convex Reformulations
Authors:
Tien Mai,
Ngan Ha Duong,
Thuy Anh Ta
Abstract:
We study a class of non-convex sum-of-ratios programs which can be used for decision-making in prominent areas such as product assortment and price optimization, facility location, and security games. Such an optimization problem involves both continuous and binary decision variables and is known to be highly non-convex and intractable to solve. We explore a discretization approach to approximate the optimization problem and show that the approximate program can be reformulated as mixed-integer linear or second-order cone programs, which can be conveniently handled by an off-the-shelf solver (e.g., CPLEX or GUROBI). We further establish (mild) conditions under which solutions to the approximate problem converge to optimal solutions as the number of discretization points increases. We also provide approximation bounds for solutions obtained from the approximate problem. We show how our approach applies to product assortment and price optimization, maximum covering facility location, and Bayesian Stackelberg security games and provide experimental results to evaluate the efficiency of our approach.
Submitted 3 November, 2022;
originally announced November 2022.
-
Imitating Opponent to Win: Adversarial Policy Imitation Learning in Two-player Competitive Games
Authors:
The Viet Bui,
Tien Mai,
Thanh H. Nguyen
Abstract:
Recent research on vulnerabilities of deep reinforcement learning (RL) has shown that adversarial policies adopted by an adversary agent can influence a target RL agent (victim agent) to perform poorly in a multi-agent environment. In existing studies, adversarial policies are directly trained based on experiences of interacting with the victim agent. This approach has a key shortcoming: knowledge derived from historical interactions may not generalize properly to unexplored policy regions of the victim agent, making the trained adversarial policy significantly less effective. In this work, we design a new effective adversarial policy learning algorithm that overcomes this shortcoming. The core idea of our new algorithm is to create a new imitator that imitates the victim agent's policy, while the adversarial policy is trained based not only on interactions with the victim agent but also on feedback from the imitator that forecasts the victim's intention. By doing so, we can leverage the capability of imitation learning to capture the underlying characteristics of the victim policy based only on sample trajectories of the victim. Our victim imitation learning model differs from prior models in that the environment's dynamics are driven by the adversary's policy and keep changing during the adversarial policy training. We provide a provable bound to guarantee a desired imitating policy when the adversary's policy becomes stable. We further strengthen our adversarial policy learning by making our imitator a stronger version of the victim. Finally, our extensive experiments using four competitive MuJoCo game environments show that our proposed adversarial policy learning algorithm outperforms state-of-the-art algorithms.
Submitted 30 October, 2022;
originally announced October 2022.