-
MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning
Authors:
Chenyu Wang,
Weixin Luo,
Qianyu Chen,
Haonan Mai,
Jindi Guo,
Sixun Dong,
Xiaohua Xuan,
Zhengxin Li,
Lin Ma,
Shenghua Gao
Abstract:
Recently, the astonishing performance of large language models (LLMs) in natural language comprehension and generation tasks has triggered extensive exploration of using them as central controllers to build agent systems. Multiple studies focus on bridging LLMs to external tools to extend their application scenarios. However, current LLMs' ability to perceive tool-use instructions is limited to a single text query, which may result in ambiguity in understanding users' real intentions. LLMs are expected to eliminate that ambiguity by perceiving the information in visually or auditorily grounded instructions. Therefore, in this paper, we propose MLLM-Tool, a system incorporating open-source LLMs and multi-modal encoders so that the learnt LLMs can be conscious of multi-modal input instructions and then select the function-matched tool correctly. To facilitate the evaluation of the model's capability, we collect a dataset consisting of multi-modal input tools from HuggingFace. Another important feature of our dataset is that it contains multiple potential choices for the same instruction, due to the existence of identical and synonymous functions, which provides more potential solutions for the same query. The experiments reveal that our MLLM-Tool is capable of recommending appropriate tools for multi-modal instructions. Code and data are available at https://github.com/MLLM-Tool/MLLM-Tool.
Submitted 23 January, 2024; v1 submitted 19 January, 2024;
originally announced January 2024.
-
Shadows Don't Lie and Lines Can't Bend! Generative Models don't know Projective Geometry...for now
Authors:
Ayush Sarkar,
Hanlin Mai,
Amitabh Mahapatra,
Svetlana Lazebnik,
D. A. Forsyth,
Anand Bhattad
Abstract:
Generative models can produce impressively realistic images. This paper demonstrates that generated images have geometric features different from those of real images. We build a set of collections of generated images, prequalified to fool simple, signal-based classifiers into believing they are real. We then show that prequalified generated images can be identified reliably by classifiers that only look at geometric properties. We use three such classifiers. All three classifiers are denied access to image pixels, and look only at derived geometric features. The first classifier looks at the perspective field of the image, the second looks at lines detected in the image, and the third looks at relations between detected objects and shadows. Our procedure detects generated images more reliably than SOTA local signal based detectors, for images from a number of distinct generators. Saliency maps suggest that the classifiers can identify geometric problems reliably. We conclude that current generators cannot reliably reproduce geometric properties of real images.
Submitted 30 May, 2024; v1 submitted 28 November, 2023;
originally announced November 2023.
-
Prefix-Tuning Based Unsupervised Text Style Transfer
Authors:
Huiyu Mai,
Wenhao Jiang,
Zhihong Deng
Abstract:
Unsupervised text style transfer aims at training a generative model that can alter the style of the input sentence while preserving its content without using any parallel data. In this paper, we employ powerful pre-trained large language models and present a new prefix-tuning-based method for unsupervised text style transfer. We construct three different kinds of prefixes, i.e., shared prefix, style prefix, and content prefix, to encode task-specific information, target style, and the content information of the input sentence, respectively. Compared to embeddings used by previous works, the proposed prefixes can provide richer information for the model. Furthermore, we adopt a recursive way of using language models in the process of style transfer. This strategy provides a more effective way for the interactions between the input sentence and GPT-2, helps the model construct more informative prefixes, and thus, helps improve the performance. Evaluations on the well-known datasets show that our method outperforms the state-of-the-art baselines. Results, analysis of ablation studies, and subjective evaluations from humans are also provided for a deeper understanding of the proposed method.
Submitted 23 October, 2023;
originally announced October 2023.
-
On Prediction Feature Assignment in the Heckman Selection Model
Authors:
Huy Mai,
Xintao Wu
Abstract:
Under missing-not-at-random (MNAR) sample selection bias, the performance of a prediction model is often degraded. This paper focuses on one classic instance of MNAR sample selection bias where a subset of samples have non-randomly missing outcomes. The Heckman selection model and its variants have commonly been used to handle this type of sample selection bias. The Heckman model uses two separate equations to model the prediction and selection of samples, where the selection features include all prediction features. When using the Heckman model, the prediction features must be properly chosen from the set of selection features. However, choosing the proper prediction features is a challenging task for the Heckman model. This is especially the case when the number of selection features is large. Existing approaches that use the Heckman model often provide a manually chosen set of prediction features. In this paper, we propose Heckman-FA as a novel data-driven framework for obtaining prediction features for the Heckman model. Heckman-FA first trains an assignment function that determines whether or not a selection feature is assigned as a prediction feature. Using the parameters of the trained function, the framework extracts a suitable set of prediction features based on the goodness-of-fit of the prediction model given the chosen prediction features and the correlation between noise terms of the prediction and selection equations. Experimental results on real-world datasets show that Heckman-FA produces a robust regression model under MNAR sample selection bias.
Submitted 22 April, 2024; v1 submitted 14 September, 2023;
originally announced September 2023.
-
From SMOTE to Mixup for Deep Imbalanced Classification
Authors:
Wei-Chao Cheng,
Tan-Ha Mai,
Hsuan-Tien Lin
Abstract:
Given imbalanced data, it is hard to train a good classifier using deep learning because of the poor generalization of minority classes. Traditionally, the well-known synthetic minority oversampling technique (SMOTE) for data augmentation, a data mining approach for imbalanced learning, has been used to improve this generalization. However, it is unclear whether SMOTE also benefits deep learning. In this work, we study why the original SMOTE is insufficient for deep learning, and enhance SMOTE using soft labels. Connecting the resulting soft SMOTE with Mixup, a modern data augmentation technique, leads to a unified framework that puts traditional and modern data augmentation techniques under the same umbrella. A careful study within this framework shows that Mixup improves generalization by implicitly achieving uneven margins between majority and minority classes. We then propose a novel margin-aware Mixup technique that more explicitly achieves uneven margins. Extensive experimental results demonstrate that our proposed technique yields state-of-the-art performance on deep imbalanced classification while achieving superior performance on extremely imbalanced data. The code is open-sourced in our developed package https://github.com/ntucllab/imbalanced-DL to foster future research in this direction.
Submitted 3 November, 2023; v1 submitted 29 August, 2023;
originally announced August 2023.
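To make the link between SMOTE and Mixup in the entry above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not spell out how the soft labels are built, so the function names (`interpolate`, `mixup_pair`) and the choice to mix one-hot labels with the same weight as the inputs are hypothetical. The margin-aware variant proposed in the paper is not shown.

```python
import random


def one_hot(label, num_classes):
    """Return a one-hot list for an integer class label."""
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def blend(u, v, lam):
    """Convex combination lam * u + (1 - lam) * v of two equal-length lists."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(u, v)]


def interpolate(x_i, y_i, x_j, y_j, lam, num_classes, soft_labels=True):
    """Create one synthetic sample between (x_i, y_i) and (x_j, y_j).

    soft_labels=False mimics classic SMOTE, where the synthetic point keeps
    the hard label of the seed sample x_i.
    soft_labels=True is an ASSUMED reading of "soft SMOTE": the labels are
    interpolated with the same weight as the features.
    """
    x_new = blend(x_i, x_j, lam)
    if soft_labels:
        y_new = blend(one_hot(y_i, num_classes), one_hot(y_j, num_classes), lam)
    else:
        y_new = one_hot(y_i, num_classes)
    return x_new, y_new


def mixup_pair(x_i, y_i, x_j, y_j, num_classes, alpha=1.0, rng=random):
    """Standard Mixup on one pair: weight drawn from Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    return interpolate(x_i, y_i, x_j, y_j, lam, num_classes, soft_labels=True)
```

With `lam = 0.25`, blending `[0.0, 0.0]` (class 0) and `[2.0, 4.0]` (class 1) gives the point `[1.5, 3.0]` with soft label `[0.25, 0.75]`. In this sketch the only difference between the soft-label interpolation and Mixup is how the partner sample and the weight are chosen, which is the sense in which one framework can cover both.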
-
A Robust Classifier Under Missing-Not-At-Random Sample Selection Bias
Authors:
Huy Mai,
Wen Huang,
Wei Du,
Xintao Wu
Abstract:
The shift between the training and testing distributions is commonly due to sample selection bias, a type of bias caused by non-random sampling of examples to be included in the training set. Although there are many approaches proposed to learn a classifier under sample selection bias, few address the case where a subset of labels in the training set are missing-not-at-random (MNAR) as a result of the selection process. In statistics, Greene's method formulates this type of sample selection with logistic regression as the prediction model. However, we find that simply integrating this method into a robust classification framework is not effective for this bias setting. In this paper, we propose BiasCorr, an algorithm that improves on Greene's method by modifying the original training set in order for a classifier to learn under MNAR sample selection bias. We provide a theoretical guarantee for the improvement of BiasCorr over Greene's method by analyzing its bias. Experimental results on real-world datasets demonstrate that BiasCorr produces robust classifiers and can be extended to outperform state-of-the-art classifiers that have been proposed to train under sample selection bias.
Submitted 24 May, 2023;
originally announced May 2023.
-
PaaS: Planning as a Service for reactive driving in CARLA Leaderboard
Authors:
Nhat Hao Truong,
Huu Thien Mai,
Tuan Anh Tran,
Minh Quang Tran,
Duc Duy Nguyen,
Ngoc Viet Phuong Pham
Abstract:
End-to-end deep learning approaches have been proven to be efficient in autonomous driving and robotics. Because they use deep learning techniques for decision-making, those systems are often referred to as black boxes, and their results are driven by data. In this paper, we propose PaaS (Planning as a Service), a vanilla module to generate local trajectory planning for autonomous driving in CARLA simulation. Our method was submitted to the International CARLA Autonomous Driving Leaderboard (CADL), which is a platform to evaluate the driving proficiency of autonomous agents in realistic traffic scenarios. Our approach focuses on reactive planning in the Frenet frame under the constraints of complex urban streets and driver comfort. The planner generates a collection of feasible trajectories, leveraging heuristic cost functions with a controllable driving style factor to choose the optimal-control path that satisfies safe travelling criteria. PaaS can provide sufficient solutions to handle challenging traffic situations in CADL. Under the strict evaluation of the CADL Map Track, our approach ranked 3rd out of 9 submissions in terms of driving score. However, with the focus on minimizing the risk of maneuvers and ensuring passenger safety, our infraction penalty figures outperform those of the two leading submissions by 20 percent.
Submitted 14 June, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
The Short Text Matching Model Enhanced with Knowledge via Contrastive Learning
Authors:
Ruiqiang Liu,
Qiqiang Zhong,
Mengmeng Cui,
Hanjie Mai,
Qiang Zhang,
Shaohua Xu,
Xiangzheng Liu,
Yanlong Du
Abstract:
In recent years, short text matching tasks have been widely applied in the fields of advertising search and recommendation. The difficulty lies in the lack of semantic information and the word ambiguity caused by the short length of the text. Previous works have introduced complement sentences or knowledge bases to provide additional feature information. However, these methods have not fully modeled the interaction between the original sentence and the complement sentence, and have not considered the noise that may arise from introducing external knowledge bases. Therefore, this paper proposes a short text matching model that combines contrastive learning and external knowledge. The model uses a generative model to generate corresponding complement sentences and uses contrastive learning to guide the model to obtain a more semantically meaningful encoding of the original sentence. In addition, to avoid noise, we use keywords as the main semantics of the original sentence to retrieve corresponding knowledge words in the knowledge base and construct a knowledge graph. A graph encoding model is used to integrate the knowledge base information into the model. Our model achieves state-of-the-art performance on two publicly available Chinese text matching datasets, demonstrating its effectiveness.
Submitted 19 December, 2023; v1 submitted 7 April, 2023;
originally announced April 2023.
-
Sums of squares representations on singular loci
Authors:
Ngoc Hoang Anh Mai,
Victor Magron
Abstract:
The problem of characterizing a real polynomial $f$ as a sum of squares of polynomials on a real algebraic variety $V$ dates back to the pioneering work of Hilbert in [Mathematische Annalen 32.3 (1888): 342-350]. In this paper, we investigate this problem with a focus on cases where the real zeros of $f$ on $V$ are singular points of $V$. By using optimality conditions and irreducible decomposition, we provide a positive answer to the following essential question of polynomial optimization: Are there always exact semidefinite programs to compute the minimum value attained by a given polynomial over a given real algebraic variety? Our answer implies that Lasserre's hierarchy, which is known as a bridge between convex and non-convex programs with algebraic structures, has finite convergence not only in the generic case but also in the general case. As a result, we constructively prove that each hyperbolic program is equivalent to a semidefinite program.
Submitted 9 March, 2023;
originally announced March 2023.
-
One-loop contributions to decays $e_b\to e_a \gamma$ and $(g-2)_{e_a}$ anomalies, and Ward identity
Authors:
L. T. Hue,
H. N. Long,
V. H. Binh,
H. L. T. Mai,
T. Phong Nguyen
Abstract:
In this paper, we will present analytic formulas to express one-loop contributions to lepton flavor violating decays $e_b\to e_a \gamma$, which are also relevant to the anomalous dipole magnetic moments of charged leptons $e_a$. These formulas were computed in the unitary gauge, using the well-known Passarino-Veltman notations. We also show that our results are consistent with those calculated previously in the 't Hooft-Veltman gauge, or in the limit of zero lepton masses. At the one-loop level, we show that the appearance of fermion-scalar-vector type diagrams in the unitary gauge will violate the Ward Identity relating to an external photon. As a result, the validation of the Ward Identity guarantees that the photon always couples with two identical particles in an arbitrary triple coupling vertex containing a photon.
Submitted 25 May, 2023; v1 submitted 13 January, 2023;
originally announced January 2023.
-
A Nichtnegativstellensatz on singular varieties under the denseness of regular loci
Authors:
Ngoc Hoang Anh Mai
Abstract:
Let $V$ be a real algebraic variety with singularities and $f$ be a real polynomial non-negative on $V$. Assume that the regular locus of $V$ is dense in $V$ by the usual topology. Using Hironaka's resolution of singularities and Demmel--Nie--Powers' Nichtnegativstellensatz, we obtain a sum of squares-based representation that characterizes the non-negativity of $f$ on $V$. This representation allows us to build up exact semidefinite relaxations for polynomial optimization problems whose optimal solutions are possibly singularities of the constraint sets.
Submitted 9 March, 2023; v1 submitted 22 November, 2022;
originally announced November 2022.
-
Semi-algebraic description of the closure of the image of a semi-algebraic set under a polynomial
Authors:
Ngoc Hoang Anh Mai
Abstract:
Given a polynomial $f$ and a semi-algebraic set $S$, we provide a symbolic algorithm to find the equations and inequalities defining a semi-algebraic set $Q$ which is identical to the closure of the image of $S$ under $f$, i.e., \begin{equation} Q=\overline{f(S)}\,. \end{equation} Consequently, every polynomial optimization problem whose optimum value is finite has an equivalent form with attained optimum value, i.e., \begin{equation} \min \limits_{t\in Q} t =\inf\limits_{x\in S} f(x) \end{equation} whenever the right-hand side is finite. Given $d$ as the upper bound on the degrees of $f$ and polynomials defining $S$, we prove that our method requires $O(d^{O(n)})$ arithmetic operations to produce polynomials of degrees at most $d^{O(n)}$ defining $\overline{f(S)}$.
Submitted 25 October, 2022;
originally announced October 2022.
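A small worked example may help show why the closure is needed in the entry above. The set $S$ and polynomial $f$ below are chosen here for illustration and are not taken from the paper. On the hyperbola $x_1x_2=1$ the coordinate $x_1$ is never zero, so $f(x)=x_1^2$ takes every positive value but never reaches its infimum; passing to $Q=\overline{f(S)}$ adds the missing point $0$, and the minimum over $Q$ is attained.

```latex
\begin{equation}
S=\{x\in\mathbb R^2\,:\,x_1x_2=1\}\,,\quad f(x)=x_1^2
\quad\Longrightarrow\quad
f(S)=(0,\infty)\,,\quad Q=\overline{f(S)}=[0,\infty)\,,\quad
\min\limits_{t\in Q} t = 0 = \inf\limits_{x\in S} f(x)\,.
\end{equation}
```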
-
A direct time-of-flight image sensor with in-pixel surface detection and dynamic vision
Authors:
Istvan Gyongy,
Ahmet T. Erdogan,
Neale A. W. Dutton,
Germán Mora Martín,
Alistair Gorman,
Hanning Mai,
Francesco Mattioli Della Rocca,
Robert K. Henderson
Abstract:
3D flash LIDAR is an alternative to the traditional scanning LIDAR systems, promising precise depth imaging in a compact form factor, and free of moving parts, for applications such as self-driving cars, robotics and augmented reality (AR). Typically implemented using single-photon, direct time-of-flight (dToF) receivers in image sensor format, the operation of the devices can be hindered by the large number of photon events needing to be processed and compressed in outdoor scenarios, limiting frame rates and scalability to larger arrays. We here present a 64x32 pixel (256x128 SPAD) dToF imager that overcomes these limitations by using pixels with embedded histogramming, which lock onto and track the return signal. This reduces the size of output data frames considerably, enabling maximum frame rates in the 10 kFPS range or 100 kFPS for direct depth readings. The sensor offers selective readout of pixels detecting surfaces, or those sensing motion, leading to reduced power consumption and off-chip processing requirements. We demonstrate the application of the sensor in mid-range LIDAR.
Submitted 23 September, 2022;
originally announced September 2022.
-
EMaP: Explainable AI with Manifold-based Perturbations
Authors:
Minh N. Vu,
Huy Q. Mai,
My T. Thai
Abstract:
In the last few years, many explanation methods based on the perturbations of input data have been introduced to improve our understanding of decisions made by black-box models. The goal of this work is to introduce a novel perturbation scheme so that more faithful and robust explanations can be obtained. Our study focuses on the impact of perturbing directions on the data topology. We show that perturbing along the orthogonal directions of the input manifold better preserves the data topology, both in the worst-case analysis of the discrete Gromov-Hausdorff distance and in the average-case analysis via persistent homology. From those results, we introduce the EMaP algorithm, which realizes the orthogonal perturbation scheme. Our experiments show that EMaP not only improves the explainers' performance but also helps them overcome a recently developed attack against perturbation-based methods.
Submitted 17 September, 2022;
originally announced September 2022.
-
Tractable hierarchies of convex relaxations for polynomial optimization on the nonnegative orthant
Authors:
Ngoc Hoang Anh Mai,
Victor Magron,
Jean-Bernard Lasserre,
Kim-Chuan Toh
Abstract:
We consider polynomial optimization problems (POP) on a semialgebraic set contained in the nonnegative orthant (every POP on a compact set can be put in this format by a simple translation of the origin). Such a POP can be converted to an equivalent POP by squaring each variable. Using even symmetry and the concept of factor width, we propose a hierarchy of semidefinite relaxations based on the extension of Pólya's Positivstellensatz by Dickinson-Povh. As its distinguishing and crucial feature, the maximal matrix size of each resulting semidefinite relaxation can be chosen arbitrarily and in addition, we prove that the sequence of values returned by the new hierarchy converges to the optimal value of the original POP at the rate $O(\varepsilon^{-c})$ if the semialgebraic set has nonempty interior. When applied to (i) robustness certification of multi-layer neural networks and (ii) computation of positive maximal singular values, our method based on Pólya's Positivstellensatz provides better bounds and runs several hundred times faster than the standard Moment-SOS hierarchy.
Submitted 13 September, 2022;
originally announced September 2022.
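The conversion "by squaring each variable" mentioned in the entry above can be written out as follows. This is a sketch of the standard substitution $x_i=y_i^2$ with notation introduced here ($f$ for the objective, $g_j$ for the constraint polynomials, $y$ for the new variables); the abstract does not give the paper's exact formulation. Every point of the nonnegative orthant is the coordinate-wise square of some $y\in\mathbb R^n$, so the two problems have the same optimal value, and the new polynomials are even in each $y_i$, which is the even symmetry the abstract refers to.

```latex
\begin{equation}
\inf\limits_{x\in\mathbb R^n}\{f(x)\,:\,g_j(x)\ge 0\,,\,x_i\ge 0\}
=\inf\limits_{y\in\mathbb R^n}\{f(y_1^2,\dots,y_n^2)\,:\,g_j(y_1^2,\dots,y_n^2)\ge 0\}\,.
\end{equation}
```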
-
Exploring PROTAC cooperativity with coarse-grained alchemical methods
Authors:
Huanghao Mai,
Matthew H. Zimmer,
Thomas F. Miller III
Abstract:
Proteolysis targeting chimera (PROTAC) is a novel drug modality that facilitates the degradation of a target protein by inducing proximity with an E3 ligase. In this work, we present a new computational framework to model the cooperativity between PROTAC-E3 binding and PROTAC-target binding principally through protein-protein interactions (PPIs) induced by the PROTAC. Due to the scarcity and low resolution of experimental measurements, the physical and chemical drivers of these non-native PPIs remain to be elucidated. We develop a coarse-grained (CG) approach to model interactions in the target-PROTAC-E3 complexes, which enables converged thermodynamic estimations using alchemical free energy calculation methods despite an unconventional scale of perturbations. With minimal parameterization, we successfully capture fundamental principles of cooperativity, including the optimality of intermediate PROTAC linker lengths that originates from configurational entropy. We qualitatively characterize the dependency of cooperativity on PROTAC linker lengths and protein charges and shapes. Minimal inclusion of sequence- and conformation-specific features in our current forcefield, however, limits quantitative modeling to reproduce experimental measurements, but further development of the CG model may allow for efficient computational screening to optimize PROTAC cooperativity.
Submitted 17 November, 2022; v1 submitted 12 August, 2022;
originally announced August 2022.
-
A symbolic algorithm for exact polynomial optimization strengthened with Fritz John conditions
Authors:
Ngoc Hoang Anh Mai
Abstract:
Consider a polynomial optimization problem. Adding polynomial equations generated by the Fritz John conditions to the constraint set does not change the optimal value. As proved in [arXiv:2205.04254 (2022)], the objective polynomial has finitely many values on the new constraint set under some genericity assumption. Based on this, we provide an algorithm that allows us to compute this optimal value exactly. Our method depends on the computation of real radical generators and Gröbner bases. Finally, we apply our method to solve some instances of mathematical programs with complementarity constraints.
Submitted 12 November, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
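For readers unfamiliar with the Fritz John conditions used in the entry above, the textbook form for minimizing $f$ subject to $g_j(x)\ge 0$, $j=1,\dots,m$, is recalled below. This is the standard statement, written with multipliers $\lambda_0,\dots,\lambda_m$ introduced here; the exact polynomial equations the paper adds to the constraint set are not given in the abstract and may be encoded differently. At any local minimizer $x^\star$ there exist multipliers, not all zero, such that:

```latex
\begin{equation}
\lambda_0\nabla f(x^\star)=\sum_{j=1}^m\lambda_j\nabla g_j(x^\star)\,,\quad
\lambda_j\,g_j(x^\star)=0\,,\quad
\lambda_j\ge 0\,,\quad j=0,\dots,m\,,\quad
(\lambda_0,\dots,\lambda_m)\ne 0\,.
\end{equation}
```

Unlike the Karush-Kuhn-Tucker conditions, these hold without any constraint qualification, because $\lambda_0$ is allowed to vanish.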
-
Complexity for exact polynomial optimization strengthened with Fritz John conditions
Authors:
Ngoc Hoang Anh Mai
Abstract:
Let $f,g_1,\dots,g_m$ be polynomials of degree at most $d$ with real coefficients in a vector of variables $x=(x_1,\dots,x_n)$. Assume that $f$ is non-negative on a basic semi-algebraic set $S$ defined by polynomial inequalities $g_j(x)\ge 0$, for $j=1,\dots,m$. Our previous work [arXiv:2205.04254 (2022)] has stated several representations of $f$ based on the Fritz John conditions. This paper provides some explicit degree bounds depending on $n$, $m$, and $d$ for these representations. In application to polynomial optimization, we obtain explicit rates of finite convergence of the hierarchies of semidefinite relaxations based on these representations.
Submitted 15 November, 2022; v1 submitted 24 May, 2022;
originally announced May 2022.
-
On the exactness for polynomial optimization strengthened with Fritz John conditions
Authors:
Ngoc Hoang Anh Mai
Abstract:
We utilize the same technique as in [arXiv:2205.04254 (2022)] to provide some representations of polynomials non-negative on a basic semi-algebraic set, defined by polynomial inequalities, under more general conditions. Based on each representation, we obtain semidefinite programs which return a sequence of values that finitely converges to the optimal value of a given polynomial optimization problem under a generic assumption. Consequently, we can compute exactly the minimal value of any polynomial over a basic convex semi-algebraic set which is defined by the inequalities of concave polynomials.
△ Less
Submitted 11 October, 2022; v1 submitted 17 May, 2022;
originally announced May 2022.
-
Exact polynomial optimization strengthened with Fritz John conditions
Authors:
Ngoc Hoang Anh Mai
Abstract:
Let $f,g_1,\dots,g_m$ be polynomials with real coefficients in a vector of variables $x=(x_1,\dots,x_n)$. Denote by $\text{diag}(g)$ the diagonal matrix with coefficients $g=(g_1,\dots,g_m)$ and denote by $\nabla g$ the Jacobian of $g$. Let $C$ be the set of critical points defined by \begin{equation}
C=\{x\in\mathbb R^n\,:\,\text{rank}(\varphi(x))< m\}\quad\text{with}\quad\varphi:=\begin{bmatrix} \nabla g\\ \text{diag}(g) \end{bmatrix}\,. \end{equation} Assume that the image of $C$ under $f$, denoted by $f(C)$, is empty or finite. (Our assumption holds generically since $C$ is empty in a Zariski open set in the space of the coefficients of $g_1,\dots,g_m$ with given degrees.) We provide a sequence of values, returned by semidefinite programs, that converges finitely to the minimal value attained by $f$ over the basic semi-algebraic set $S$ defined by \begin{equation}
S:=\{x\in\mathbb R^n\,:\,g_j(x)\ge 0\,,\,j=1,\dots,m\}\,. \end{equation} Consequently, we can compute exactly the minimal value of any polynomial with real coefficients in $x$ over one of the following sets: the unit ball, the unit hypercube and the unit simplex. Under a slightly more general assumption, we extend this result to the minimization of any polynomial over a basic convex semi-algebraic set that has non-empty interior and is defined by the inequalities of concave polynomials.
Submitted 21 January, 2023; v1 submitted 9 May, 2022;
originally announced May 2022.
-
A rigidity result of spectral gap on Finsler manifolds and its application
Authors:
Cong Hung Mai
Abstract:
We investigate the rigidity problem for the sharp spectral gap on Finsler manifolds with the weighted Ricci curvature bound $\text{Ric}_{\infty} \geq K > 0$. Our main results show that if equality holds, the manifold necessarily admits a diffeomorphic splitting (or an isometric splitting in the particular class of Berwald spaces). This splitting phenomenon is comparable to the Cheeger-Gromoll type splitting theorem by Ohta. As corollaries, we also obtain rigidity results for the logarithmic Sobolev and Bakry-Ledoux isoperimetric inequalities via needle decomposition.
Submitted 23 July, 2022; v1 submitted 5 April, 2022;
originally announced April 2022.
-
Quantitative estimates for the Bakry-Ledoux isoperimetric inequality. II
Authors:
Cong Hung Mai,
Shin-ichi Ohta
Abstract:
Concerning quantitative isoperimetry for a weighted Riemannian manifold satisfying $\mathrm{Ric}_{\infty} \ge 1$, we give an $L^1$-estimate exhibiting that the push-forward of the reference measure by the guiding function (arising from the needle decomposition) is close to the Gaussian measure. We also show $L^p$- and $W_2$-estimates in the $1$-dimensional case.
Submitted 26 July, 2022; v1 submitted 7 March, 2022;
originally announced March 2022.
-
Tractable semidefinite bounds of positive maximal singular values
Authors:
Victor Magron,
Ngoc Hoang Anh Mai,
Yoshio Ebihara,
Hayato Waki
Abstract:
We focus on computing certified upper bounds for the positive maximal singular value (PMSV) of a given matrix. The PMSV problem boils down to maximizing a quadratic polynomial on the intersection of the unit sphere and the nonnegative orthant. We provide a hierarchy of tractable semidefinite relaxations to approximate the value of the latter polynomial optimization problem as closely as desired. This hierarchy is based on an extension of Pólya's representation theorem. Doing so, positive polynomials can be decomposed as weighted sums of squares of $s$-nomials, where $s$ can be a priori fixed ($s=1$ corresponds to monomials, $s=2$ corresponds to binomials, etc.). This in turn allows us to control the size of the resulting semidefinite relaxations.
Submitted 17 February, 2022;
originally announced February 2022.
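As a hedged illustration of the optimization problem named in the abstract above, and assuming the PMSV of a real matrix is the largest value of $\|Mx\|_2$ over nonnegative unit vectors $x$, its square can be written as follows. The symbols $M$, $m$, $n$ and $\sigma_+(M)$ are notation introduced here for illustration and do not come from the listing. \begin{equation} \sigma_+(M)^2=\max\{x^\top M^\top M x\,:\,\|x\|_2=1\,,\,x\ge 0\}\,,\quad M\in\mathbb R^{m\times n}\,. \end{equation} The objective $x^\top M^\top M x=\|Mx\|_2^2$ is a quadratic polynomial in $x$, and the feasible set is the intersection of the unit sphere with the nonnegative orthant, which matches the description in the abstract.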
-
Stability Analysis of Recurrent Neural Networks by IQC with Copositive Multipliers
Authors:
Yoshio Ebihara,
Hayato Waki,
Victor Magron,
Ngoc Hoang Anh Mai,
Dimitri Peaucelle,
Sophie Tarbouriech
Abstract:
This paper is concerned with the stability analysis of the recurrent neural networks (RNNs) by means of the integral quadratic constraint (IQC) framework. The rectified linear unit (ReLU) is typically employed as the activation function of the RNN, and the ReLU has specific nonnegativity properties regarding its input and output signals. Therefore, it is effective if we can derive IQC-based stability conditions with multipliers taking care of such nonnegativity properties. However, such nonnegativity (linear) properties are hardly captured by the existing multipliers defined on the positive semidefinite cone. To get around this difficulty, we loosen the standard positive semidefinite cone to the copositive cone, and employ copositive multipliers to capture the nonnegativity properties. We show that, within the framework of the IQC, we can employ copositive multipliers (or their inner approximation) together with existing multipliers such as Zames-Falb multipliers and polytopic bounding multipliers, and this directly enables us to ensure that the introduction of the copositive multipliers leads to better (no more conservative) results. We finally illustrate the effectiveness of the IQC-based stability conditions with the copositive multipliers by numerical examples.
Submitted 9 February, 2022;
originally announced February 2022.
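The nonnegativity properties of the ReLU mentioned in the abstract above can be illustrated with a small numerical check. This is a minimal sketch, not the paper's formulation: the names `relu` and `relu_properties_hold` are hypothetical, and the three properties tested are standard facts about the ReLU that are assumed here to be the kind of properties the abstract refers to.

```python
def relu(w: float) -> float:
    """Rectified linear unit: z = max(0, w)."""
    return max(0.0, w)


def relu_properties_hold(samples) -> bool:
    """Check three standard ReLU facts for each input w with output z = relu(w):
    z >= 0, z - w >= 0, and z * (z - w) == 0 (complementarity)."""
    for w in samples:
        z = relu(w)
        if z < 0.0:               # output signal is nonnegative
            return False
        if z - w < 0.0:           # output is never smaller than the input
            return False
        if z * (z - w) != 0.0:    # at least one of z and z - w vanishes
            return False
    return True


if __name__ == "__main__":
    print(relu_properties_hold([-2.0, -0.5, 0.0, 0.3, 4.0]))
```

The first two properties are linear inequalities on the input and output signals, which is the kind of constraint that the abstract says multipliers on the positive semidefinite cone capture poorly.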
-
On the complexity of Putinar-Vasilescu's Positivstellensatz
Authors:
Ngoc Hoang Anh Mai,
Victor Magron
Abstract:
We provide a new degree bound on the weighted sum-of-squares (SOS) polynomials for Putinar-Vasilescu's Positivstellensatz. This leads to another Positivstellensatz saying that if $f$ is a polynomial of degree at most $2 d_f$ nonnegative on a semialgebraic set having nonempty interior defined by finitely many polynomial inequalities $g_j(x)\ge 0$, $j=1,\dots,m$ with $g_1:=L-\|x\|_2^2$ for some $L>0$, then there exist positive constants $\bar c$ and $c$ depending on $f,g_j$ such that for any $\varepsilon>0$, for all $k\ge \bar c \varepsilon^{-c}$, $f$ has the decomposition \begin{equation} \begin{array}{l} (1+\|x\|_2^2)^k(f+\varepsilon)=\sigma_0+\sum_{j=1}^m \sigma_j g_j \,, \end{array} \end{equation} for some SOS polynomials $\sigma_j$ being such that the degrees of $\sigma_0,\sigma_j g_j$ are at most $2(d_f+k)$. Here $\|\cdot\|_2$ denotes the $\ell_2$ vector norm. As a consequence, we obtain a converging hierarchy of semidefinite relaxations for lower bounds in polynomial optimization on basic compact semialgebraic sets. The complexity of this hierarchy is $\mathcal{O}(\varepsilon^{-c})$ for prescribed accuracy $\varepsilon>0$. In particular, if $m=L=1$ then $c=65$, yielding the complexity $\mathcal{O}(\varepsilon^{-65})$ for the minimization of a polynomial on the unit ball. Our result improves the complexity bound $\mathcal{O}(\exp(\varepsilon^{-c}))$ due to Nie and Schweighofer in [Journal of Complexity 23.1 (2007): 135-150].
Submitted 27 May, 2021; v1 submitted 23 April, 2021;
originally announced April 2021.
-
The Constant Trace Property in Noncommutative Optimization
Authors:
Ngoc Hoang Anh Mai,
Abhishek Bhardwaj,
Victor Magron
Abstract:
In this article, we show that each semidefinite relaxation of a ball-constrained noncommutative polynomial optimization problem can be cast as a semidefinite program with a constant trace matrix variable. We then demonstrate how this constant trace property can be exploited via first order numerical methods to solve efficiently the semidefinite relaxations of the noncommutative problem.
Submitted 3 February, 2021;
originally announced February 2021.
-
Comparing different subgradient methods for solving convex optimization problems with functional constraints
Authors:
Thi Lan Dinh,
Ngoc Hoang Anh Mai
Abstract:
We consider the problem of minimizing a convex, nonsmooth function subject to a closed convex constraint domain. The methods that we propose are modifications of subgradient methods based on Metel--Takeda's paper [Optimization Letters 15.4 (2021): 1491-1504] and Boyd's works [Lecture notes of EE364b, Stanford University, Spring 2013-14, pp. 1-39]. While the former has complexity $\mathcal{O}(\varepsilon^{-2r})$ for all $r> 1$, the complexity of the latter is $\mathcal{O}(\varepsilon^{-2})$. We perform some comparisons between these two methods using several test examples.
Submitted 21 January, 2023; v1 submitted 4 January, 2021;
originally announced January 2021.
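For readers unfamiliar with the family of methods compared in the abstract above, a textbook projected subgradient iteration can be sketched as follows. This is a generic illustration under stated assumptions and is not the authors' algorithm: the function name `projected_subgradient`, the step size $1/\sqrt{k}$, and the one-dimensional test problem (minimize $|x-3|$ over the interval $[0,2]$) are all chosen here for illustration.

```python
import math


def projected_subgradient(f, subgrad, project, x0, iters=2000):
    """Textbook projected subgradient method with diminishing steps 1/sqrt(k).

    f        objective (convex, possibly nonsmooth)
    subgrad  returns one subgradient of f at x
    project  Euclidean projection onto the closed convex feasible set
    Returns the best feasible point seen and its objective value, since
    subgradient steps need not decrease the objective monotonically.
    """
    x = project(x0)
    best_x, best_f = x, f(x)
    for k in range(1, iters + 1):
        g = subgrad(x)
        x = project(x - g / math.sqrt(k))
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f


# Illustrative problem: minimize |x - 3| subject to 0 <= x <= 2.
f = lambda x: abs(x - 3.0)
# At x == 3 any value in [-1, 1] is a valid subgradient; -1 is used here.
subgrad = lambda x: 1.0 if x > 3.0 else -1.0
project = lambda x: min(max(x, 0.0), 2.0)

best_x, best_f = projected_subgradient(f, subgrad, project, x0=0.0)
```

On this toy problem the iterates reach the boundary point $x=2$ after a few steps and stay there, so the best value found is $1$.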
-
Exploiting constant trace property in large-scale polynomial optimization
Authors:
Ngoc Hoang Anh Mai,
Jean-Bernard Lasserre,
Victor Magron,
Jie Wang
Abstract:
We prove that every semidefinite moment relaxation of a polynomial optimization problem (POP) with a ball constraint can be reformulated as a semidefinite program involving a matrix with constant trace property (CTP). As a result such moment relaxations can be solved efficiently by first-order methods that exploit CTP, e.g., the conditional gradient-based augmented Lagrangian method. We also extend this CTP-exploiting framework to large-scale POPs with different sparsity structures. The efficiency and scalability of our framework are illustrated on second-order moment relaxations for various randomly generated quadratically constrained quadratic programs.
Submitted 16 December, 2020;
originally announced December 2020.
-
$l_2$ Induced Norm Analysis of Discrete-Time LTI Systems for Nonnegative Input Signals and Its Application to Stability Analysis of Recurrent Neural Networks
Authors:
Yoshio Ebihara,
Hayato Waki,
Victor Magron,
Ngoc Hoang Anh Mai,
Dimitri Peaucelle,
Sophie Tarbouriech
Abstract:
In this paper, we focus on the "positive" $l_2$ induced norm of discrete-time linear time-invariant systems where the input signals are restricted to be nonnegative. To cope with the nonnegativity of the input signals, we employ copositive programming as the mathematical tool for the analysis. Then, by applying an inner approximation to the copositive cone, we derive numerically tractable semidefinite programming problems for the upper and lower bound computation of the "positive" $l_2$ induced norm. This norm is typically useful for the stability analysis of feedback systems constructed from an LTI system and nonlinearities where the nonlinear elements provide only nonnegative signals. As a concrete example, we illustrate the usefulness of the "positive" $l_2$ induced norm for the stability analysis of recurrent neural networks with activation functions being rectified linear units.
Submitted 25 November, 2020;
originally announced November 2020.
-
Cancer image classification based on DenseNet model
Authors:
Ziliang Zhong,
Muhang Zheng,
Huafeng Mai,
Jianan Zhao,
Xinyi Liu
Abstract:
Computer-aided diagnosis establishes methods for robust assessment of medical image-based examination. Image processing introduced a promising strategy to facilitate disease classification and detection while diminishing unnecessary expenses. In this paper, we propose a novel metastatic cancer image classification model based on the DenseNet block, which can effectively identify metastatic cancer in small image patches taken from larger digital pathology scans. We evaluate the proposed approach on a slightly modified version of the PatchCamelyon (PCam) benchmark dataset provided by a Kaggle competition, which packs the clinically relevant task of metastasis detection into a straightforward binary image classification task. The experiments indicate that our model outperforms other classical methods such as ResNet34 and VGG19. Moreover, we also conduct a data augmentation experiment and study the relationship between the number of batches processed and the loss value during the training and validation process.
Submitted 22 November, 2020;
originally announced November 2020.
-
Protein-based microsphere biolasers fabricated by dehydration
Authors:
Toan Van Nguyen,
Nhat Van Pham,
Hanh Hong Mai,
Dung Chi Duong,
Hai Hoang Le,
Riccardo Sapienza,
Van Duong Ta
Abstract:
Biolasers made of biological materials have attracted a great deal of research attention due to their biocompatibility and biodegradability, which give them potential for biosensors and biointegration. However, current fabrication methods for biolasers suffer from several limitations: they involve complicated processes, are time-consuming, and are environmentally unfriendly. In this work, a novel approach with green processes for fabricating solid-state microsphere biolasers is demonstrated. By using dehydration via a modified Microglassification™ technology, dye-doped bovine serum albumin (BSA) droplets can quickly (in less than 10 minutes) and easily turn into solid microspheres with diameters ranging from 10 to 150 μm. The size of the microspheres can be effectively controlled by changing either the concentration of the BSA solution or the diameter of the initial droplets. The fabricated microspheres can act as efficient microlasers under optical pulse excitation. A lasing threshold of 7.8 μJ mm$^{-2}$ and quality (Q) factors of about 1700 to 3100 are obtained. The size dependence of the lasing characteristics has been investigated, and the results show good agreement with whispering gallery mode (WGM) theory. Our finding contributes an effective technique for the fabrication of high-Q-factor microlasers that may have potential applications in biological and chemical sensors.
Submitted 15 September, 2020;
originally announced September 2020.
-
A hierarchy of spectral relaxations for polynomial optimization
Authors:
Ngoc Hoang Anh Mai,
Victor Magron,
Jean-Bernard Lasserre
Abstract:
We show that (i) any constrained polynomial optimization problem (POP) has an equivalent formulation on a variety contained in a Euclidean sphere and (ii) the resulting semidefinite relaxations in the moment-SOS hierarchy have the constant trace property (CTP) for the involved matrices. We then exploit the CTP to avoid solving the semidefinite relaxations via interior-point methods and rather use ad-hoc spectral methods that minimize the largest eigenvalue of a matrix pencil. Convergence to the optimal value of the semidefinite relaxation is guaranteed. As a result we obtain a hierarchy of nonsmooth "spectral relaxations" of the initial POP. The efficiency and robustness of this spectral hierarchy are tested against several equality constrained POPs on a sphere as well as on a sample of randomly generated quadratically constrained quadratic problems (QCQPs).
Submitted 17 July, 2020;
originally announced July 2020.
-
CS-TSSOS: Correlative and term sparsity for large-scale polynomial optimization
Authors:
Jie Wang,
Victor Magron,
Jean B. Lasserre,
Ngoc Hoang Anh Mai
Abstract:
This work proposes a new moment-SOS hierarchy, called CS-TSSOS, for solving large-scale sparse polynomial optimization problems. Its novelty is to exploit simultaneously correlative sparsity and term sparsity by combining advantages of two existing frameworks for sparse polynomial optimization. The former is due to Waki et al. while the latter was initially proposed by Wang et al. and later exploited in the TSSOS hierarchy. In doing so we obtain CS-TSSOS -- a two-level hierarchy of semidefinite programming relaxations with (i) the crucial property of involving blocks of SDP matrices and (ii) the guarantee of convergence to the global optimum under certain conditions. We demonstrate its efficiency and scalability on several large-scale instances of the celebrated Max-Cut problem and the important industrial optimal power flow problem, involving up to six thousand variables and tens of thousands of constraints.
Submitted 8 June, 2021; v1 submitted 6 May, 2020;
originally announced May 2020.
-
A sparse version of Reznick's Positivstellensatz
Authors:
Ngoc Hoang Anh Mai,
Victor Magron,
Jean-Bernard Lasserre
Abstract:
If $f$ is a positive definite form, Reznick's Positivstellensatz [Mathematische Zeitschrift. 220 (1995), pp. 75--97] states that there exists $k\in\mathbf{N}$ such that ${\| x \|^{2k}_2}f$ is a sum of squares of polynomials. Assuming that $f$ can be written as a sum of forms $\sum_{l=1}^p f_l$, where each $f_l$ depends on a subset of the initial variables, and assuming that these subsets satisfy the so-called running intersection property, we provide a sparse version of Reznick's Positivstellensatz. Namely, there exists $k \in \mathbf{N}$ such that $f=\sum_{l = 1}^p {{\sigma_l}/{H_l^{k}}}$, where $\sigma_l$ is a sum of squares of polynomials, $H_l$ is a uniform polynomial denominator, and both polynomials $\sigma_l,H_l$ involve the same variables as $f_l$, for each $l=1,\dots,p$. In other words, the sparsity pattern of $f$ is also reflected in this sparse version of Reznick's certificate of positivity. We next use this result to also obtain positivity certificates for (i) polynomials nonnegative on the whole space and (ii) polynomials nonnegative on a (possibly non-compact) basic semialgebraic set, assuming that the input data satisfy the running intersection property. Both are sparse versions of a positivity certificate due to Putinar and Vasilescu.
Submitted 13 February, 2020; v1 submitted 12 February, 2020;
originally announced February 2020.
-
Positivity certificates and polynomial optimization on non-compact semialgebraic sets
Authors:
Ngoc Hoang Anh Mai,
Jean-Bernard Lasserre,
Victor Magron
Abstract:
In a first contribution, we revisit two certificates of positivity on (possibly non-compact) basic semialgebraic sets due to Putinar and Vasilescu [Comptes Rendus de l'Académie des Sciences-Series I-Mathematics, 328(6) (1999) pp. 495-499]. We use Jacobi's technique from [Mathematische Zeitschrift, 237(2) (2001) pp. 259-273] to provide an alternative proof with an effective degree bound on the sums of squares multipliers in such certificates. As a consequence, it allows one to define a hierarchy of semidefinite relaxations for a general polynomial optimization problem. Convergence of this hierarchy to a neighborhood of the optimal value as well as strong duality and analysis are guaranteed. In a second contribution, we introduce a new numerical method for solving systems of polynomial inequalities and equalities with possibly uncountably many solutions. As a bonus, one may apply this method to obtain approximate global optimizers in polynomial optimization.
Submitted 6 December, 2019; v1 submitted 26 November, 2019;
originally announced November 2019.
-
Overcharging of zinc ion in the structure of zinc finger protein is needed for DNA binding stability
Authors:
Ly Hai Nguyen,
Tuyen Thanh Tran,
Lien Ngoc Thi Truong,
Hanh Hong Mai,
Toan T. Nguyen
Abstract:
The zinc finger structure, in which a Zn$^{2+}$ ion binds to four cysteine or histidine amino acids in a tetrahedral arrangement, is a very common motif of nucleic acid binding proteins. The corresponding interaction model is present in 3% of the genes of the human genome. As a result, zinc fingers have been shown to be extremely useful in various therapeutic and research capacities, as well as in biotechnology. In the stable configuration, the cysteine amino acids are deprotonated and become negatively charged. This means the Zn$^{2+}$ ion is overscreened by the four cysteine charges (overcharged). The question is whether this overcharged configuration is also stable when such a negatively charged zinc finger binds to a negatively charged DNA molecule. Using all-atom molecular dynamics simulations up to the microsecond range of an androgen receptor protein dimer, we investigate how the deprotonated state of cysteine influences its structure, dynamics, and function in binding to DNA molecules. Our results show that the deprotonated state of the cysteine residues is essential for mechanical stabilization of the functional, folded conformation. Not only does this state stabilize the protein structure, it also stabilizes the protein-DNA binding complex. The differences in structural and energetic properties of the two (sequence-identical) monomers are also investigated, showing the strong influence of DNA on the structure of zinc fingers upon complexation. Our result has potential impact on a better molecular understanding of one of the most common classes of zinc fingers.
Submitted 21 February, 2020; v1 submitted 23 November, 2019;
originally announced November 2019.
-
Quantitative estimates for the Bakry-Ledoux isoperimetric inequality
Authors:
Cong Hung Mai,
Shin-ichi Ohta
Abstract:
We establish a quantitative isoperimetric inequality for weighted Riemannian manifolds with $\mathrm{Ric}_{\infty} \ge 1$. Precisely, we give an upper bound of the volume of the symmetric difference between a Borel set and a sub-level (or super-level) set of the associated guiding function (arising from the needle decomposition), in terms of the deficit in Bakry-Ledoux's Gaussian isoperimetric inequality. This is the first quantitative isoperimetric inequality on noncompact spaces besides Euclidean and Gaussian spaces. Our argument makes use of Klartag's needle decomposition (also called localization), and is inspired by a recent work of Cavalletti, Maggi and Mondino on compact spaces. Besides the quantitative isoperimetry, a reverse Poincaré inequality for the guiding function that we have as a key step, as well as the way we use it, are of independent interest.
Submitted 28 February, 2021; v1 submitted 30 October, 2019;
originally announced October 2019.
-
Hydrodynamic Effects on the Motility of Crawling Eukaryotic Cells
Authors:
Melissa H. Mai,
Brian A. Camley
Abstract:
Eukaryotic cell motility is crucial during development, wound healing, the immune response, and cancer metastasis. Some eukaryotic cells can swim, but cells more commonly adhere to and crawl along the extracellular matrix. We study the relationship between hydrodynamics and adhesion that describes whether a cell is swimming, crawling, or combining these motions. Our simple model of a cell, based on the three-sphere swimmer, is capable of both swimming and crawling. As cell-matrix adhesion strength increases, the influence of hydrodynamics on migration diminishes. Cells with significant adhesion can crawl with speeds much larger than their nonadherent, swimming counterparts. We predict that, while most eukaryotic cells are in the strong-adhesion limit, increasing environment viscosity or decreasing cell-matrix adhesion could lead to significant hydrodynamic effects even in crawling cells. Signatures of hydrodynamic effects include dependence of cell speed on the medium viscosity or the presence of a nearby substrate and the presence of interactions between noncontacting cells. These signatures will be suppressed at large adhesion strengths, but even strongly adherent cells will generate relevant fluid flows that will advect nearby passive particles and swimmers.
Submitted 29 August, 2019;
originally announced August 2019.
-
Generating Sentiment-Preserving Fake Online Reviews Using Neural Language Models and Their Human- and Machine-based Detection
Authors:
David Ifeoluwa Adelani,
Haotian Mai,
Fuming Fang,
Huy H. Nguyen,
Junichi Yamagishi,
Isao Echizen
Abstract:
Advanced neural language models (NLMs) are widely used in sequence generation tasks because they are able to produce fluent and meaningful sentences. They can also be used to generate fake reviews, which can then be used to attack online review systems and influence the buying decisions of online shoppers. To perform such attacks, it is necessary for experts to train a tailored LM for a specific t…
▽ More
Advanced neural language models (NLMs) are widely used in sequence generation tasks because they are able to produce fluent and meaningful sentences. They can also be used to generate fake reviews, which can then be used to attack online review systems and influence the buying decisions of online shoppers. To perform such attacks, it is necessary for experts to train a tailored LM for a specific topic. In this work, we show that a low-skilled threat model can be built just by combining publicly available LMs and show that the produced fake reviews can fool both humans and machines. In particular, we use the GPT-2 NLM to generate a large number of high-quality reviews based on a review with the desired sentiment and then using a BERT based text classifier (with accuracy of 96%) to filter out reviews with undesired sentiments. Because none of the words in the review are modified, fluent samples like the training data can be generated from the learned distribution. A subjective evaluation with 80 participants demonstrated that this simple method can produce reviews that are as fluent as those written by people. It also showed that the participants tended to distinguish fake reviews randomly. Three countermeasures, Grover, GLTR, and OpenAI GPT-2 detector, were found to be difficult to accurately detect fake review.
Submitted 3 December, 2019; v1 submitted 22 July, 2019;
originally announced July 2019.
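The generate-then-filter idea described in the abstract above can be illustrated with a minimal sketch. The function names `generate_and_filter`, `toy_generate`, and `toy_classify` are hypothetical and not taken from the paper; the toy stand-ins only mimic the roles that a language model (GPT-2 in the paper) and a sentiment classifier (BERT-based in the paper) would play.

```python
from typing import Callable, List


def generate_and_filter(
    seed_review: str,
    target_sentiment: str,
    generate: Callable[[str, int], List[str]],
    classify: Callable[[str], str],
    n_candidates: int = 4,
) -> List[str]:
    """Keep only generated candidates whose predicted sentiment matches the target."""
    candidates = generate(seed_review, n_candidates)
    return [text for text in candidates if classify(text) == target_sentiment]


# Toy stand-ins (hypothetical): a real pipeline would call a language model
# and a trained sentiment classifier at these two points.
def toy_generate(seed: str, n: int) -> List[str]:
    endings = ["I loved it.", "I hated it.", "Great value.", "Terrible service."]
    return [f"{seed} {endings[i % len(endings)]}" for i in range(n)]


def toy_classify(text: str) -> str:
    negative_words = ("hated", "terrible")
    return "negative" if any(w in text.lower() for w in negative_words) else "positive"


kept = generate_and_filter(
    "The food arrived quickly.", "positive", toy_generate, toy_classify
)
print(kept)
```

Passing the generator and classifier as callables keeps the filtering step independent of any particular model, which is the point the abstract makes about combining publicly available components.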
-
Neutral Higgs decays $H \rightarrow Z γ,γγ$ in 3-3-1 models
Authors:
H. T. Hung,
T. T. Hong,
H. H. Phuong,
H. L. T. Mai,
L. T. Hue
Abstract:
The significance of new physics appearing in the loop-induced decays of neutral Higgs bosons into the diboson pairs $γγ$ and $Zγ$ is discussed in the framework of the 3-3-1 models based on a recent work~\cite{Okada:2016whh}, where the Higgs sector becomes effectively the same as that in the two Higgs doublet models (2HDM) after the first symmetry breaking from the $SU(3)_L$ scale into the electroweak scale. For a large $SU(3)_L$ scale $v_3\simeq10$ TeV, the dominant one-loop contributions to the two decay amplitudes arise from only the single charged Higgs boson predicted by the 2HDM, so that the experimental constraint on the signal strength $μ^{331}_{γγ}$ of the Standard Model-like Higgs boson decay $h\rightarrow γγ$ results in a strict upper bound on the signal strength $μ^{331}_{Zγ}$ of the decay $h\rightarrow\, Zγ$. For a particular model with lower $v_3$ around 3 TeV, contributions from heavy charged gauge and Higgs bosons may be of the same order and may therefore give strong destructive or constructive correlations. As a by-product, a deviation from the SM prediction $|μ^{331}_{γγ}-1| \le 0.04$ still allows $|μ^{331}_{Zγ}-1|$ to reach values near 0.1. We also show that a $CP$-even neutral Higgs boson $h^0_3$, predicted by the 3-3-1 models but beyond the 2HDM, has the interesting property that the branching ratio Br$(h^0_3\rightarrow γγ)$ is very sensitive to the parameter $β$ used to distinguish different 3-3-1 models.
Submitted 7 October, 2019; v1 submitted 15 July, 2019;
originally announced July 2019.
-
Yet again on iteration improvement for averaged expected cost control for 1D ergodic diffusions
Authors:
Svetlana Anulova,
Hilmar Mai,
Alexander Veretennikov
Abstract:
The paper is a full version of the short presentation in \cite{amv17}. Ergodic control for a one-dimensional controlled diffusion is tackled; both drift and diffusion coefficients may depend on a strategy, which is assumed Markovian. The ergodic HJB equation is established, and existence and uniqueness of its solution are proved, as well as the convergence of the reward improvement algorithm.
Submitted 10 August, 2020; v1 submitted 27 December, 2018;
originally announced December 2018.
-
Persistent Homology and Euler Integral Transforms
Authors:
Robert Ghrist,
Rachel Levanger,
Huy Mai
Abstract:
The Euler calculus -- an integral calculus based on Euler characteristic as a valuation on constructible functions -- is shown to be an incisive tool for answering questions about injectivity and invertibility of recent transforms based on persistent homology for shape characterization.
Submitted 14 June, 2018; v1 submitted 12 April, 2018;
originally announced April 2018.
-
PotentialNet for Molecular Property Prediction
Authors:
Evan N. Feinberg,
Debnil Sur,
Zhenqin Wu,
Brooke E. Husic,
Huanghao Mai,
Yang Li,
Saisai Sun,
Jianyi Yang,
Bharath Ramsundar,
Vijay S. Pande
Abstract:
The arc of drug discovery entails a multiparameter optimization problem spanning vast length scales. The key parameters range from solubility (angstroms) to protein-ligand binding (nanometers) to in vivo toxicity (meters). Through feature learning---instead of feature engineering---deep neural networks promise to outperform both traditional physics-based and knowledge-based machine learning models for predicting molecular properties pertinent to drug discovery. To this end, we present the PotentialNet family of graph convolutions. These models are specifically designed for and achieve state-of-the-art performance for protein-ligand binding affinity. We further validate these deep neural networks by setting new standards of performance in several ligand-based tasks. In parallel, we introduce a new metric, the Regression Enrichment Factor $EF_χ^{(R)}$, to measure the early enrichment of computational models for chemical data. Finally, we introduce a cross-validation strategy based on structural homology clustering that can more accurately measure model generalizability, which crucially distinguishes the aims of machine learning for drug discovery from standard machine learning tasks.
Submitted 22 October, 2018; v1 submitted 12 March, 2018;
originally announced March 2018.
-
Convergent Actor-Critic Algorithms Under Off-Policy Training and Function Approximation
Authors:
Hamid Reza Maei
Abstract:
We present the first class of policy-gradient algorithms that work with both state-value and policy function-approximation, and are guaranteed to converge under off-policy training. Our solution targets problems in reinforcement learning where the action representation adds to the curse of dimensionality; that is, with continuous or large action sets, thus making it infeasible to estimate state-action value functions (Q functions). Using state-value functions helps to lift the curse and, as a result, naturally turns our policy-gradient solution into a classical Actor-Critic architecture whose Actor uses the state-value function for the update. Our algorithms, Gradient Actor-Critic and Emphatic Actor-Critic, are derived based on the exact gradient of the averaged state-value function objective and thus are guaranteed to converge to its optimal solution, while maintaining all the desirable properties of classical Actor-Critic methods with no additional hyper-parameters. To our knowledge, this is the first time that convergent off-policy learning methods have been extended to classical Actor-Critic methods with function approximation.
Submitted 21 February, 2018;
originally announced February 2018.
-
Rigidity for the isoperimetric inequality of negative effective dimension on weighted Riemannian manifolds
Authors:
Cong Hung Mai
Abstract:
We study, on a weighted Riemannian manifold of Ric$_{N} \geq K > 0$ for $N < -1$, when equality holds in the isoperimetric inequality. Our main theorem asserts that such a manifold is necessarily isometric to the warped product $\mathbb{R} \times_{\cosh(\sqrt{K/(1-N)}t)} Σ^{n-1}$ of hyperbolic nature, where $Σ^{n-1}$ is an $(n-1)$-dimensional manifold with lower weighted Ricci curvature bound and $\mathbb{R}$ is equipped with a hyperbolic cosine measure. This is a similar phenomenon to the equality condition of Poincaré inequality. Moreover, every isoperimetric minimizer set is isometric to a half-space in an appropriate sense.
Submitted 17 June, 2018; v1 submitted 19 December, 2017;
originally announced December 2017.
-
On Riemannian manifolds with positive weighted Ricci curvature of negative effective dimension
Authors:
Cong Hung Mai
Abstract:
In this paper, we investigate complete Riemannian manifolds satisfying the lower weighted Ricci curvature bound $\mathrm{Ric}_{N} \geq K$ with $K>0$ for the negative effective dimension $N<0$. We analyze two $1$-dimensional examples of constant curvature $\mathrm{Ric}_N \equiv K$ with finite and infinite total volumes. We also discuss when the first nonzero eigenvalue of the Laplacian takes its minimum under the same condition $\mathrm{Ric}_N \ge K>0$, as a counterpart to the classical Obata rigidity theorem. Our main theorem shows that, if $N<-1$ and the minimum is attained, then the manifold splits off the real line as a warped product of hyperbolic nature.
Submitted 9 October, 2018; v1 submitted 20 April, 2017;
originally announced April 2017.
-
Deep Reinforcement Learning for Visual Object Tracking in Videos
Authors:
Da Zhang,
Hamid Maei,
Xin Wang,
Yuan-Fang Wang
Abstract:
In this paper we introduce a fully end-to-end approach for visual tracking in videos that learns to predict the bounding box locations of a target object at every frame. An important insight is that the tracking problem can be considered as a sequential decision-making process and historical semantics encode highly relevant information for future decisions. Based on this intuition, we formulate our model as a recurrent convolutional neural network agent that interacts with a video over time, and our model can be trained with reinforcement learning (RL) algorithms to learn good tracking policies that pay attention to continuous, inter-frame correlation and maximize tracking performance in the long run. The proposed tracking algorithm achieves state-of-the-art performance in an existing tracking benchmark and operates at frame-rates faster than real-time. To the best of our knowledge, our tracker is the first neural-network tracker that combines convolutional and recurrent networks with RL algorithms.
Submitted 10 April, 2017; v1 submitted 31 January, 2017;
originally announced January 2017.
-
A Batch, Off-Policy, Actor-Critic Algorithm for Optimizing the Average Reward
Authors:
S. A. Murphy,
Y. Deng,
E. B. Laber,
H. R. Maei,
R. S. Sutton,
K. Witkiewitz
Abstract:
We develop an off-policy actor-critic algorithm for learning an optimal policy from a training set composed of data from multiple individuals. This algorithm is developed with a view towards its use in mobile health.
Submitted 18 July, 2016;
originally announced July 2016.
-
Jump filtering and efficient drift estimation for Lévy-driven SDE's
Authors:
Arnaud Gloter,
Dasha Loukianova,
Hilmar Mai
Abstract:
The problem of drift estimation for the solution $X$ of a stochastic differential equation with Lévy-type jumps is considered under discrete high-frequency observations with a growing observation window. An efficient and asymptotically normal estimator for the drift parameter is constructed under minimal conditions on the jump behavior and the sampling scheme. In the case of a bounded jump measure density these conditions reduce to $n Δ_n^{3-ε}\to 0,$ where $n$ is the number of observations and $Δ_n$ is the maximal sampling step. This result relaxes the condition $nΔ_n^2 \to 0$ usually required for joint estimation of drift and diffusion coefficient for SDE's with jumps. The main challenge in this estimation problem stems from the appearance of the unobserved continuous part $X^c$ in the likelihood function. In order to construct the drift estimator we recover this continuous part from discrete observations. More precisely, we estimate, in a nonparametric way, stochastic integrals with respect to $X^c$. Convergence results of independent interest are proved for these nonparametric estimators. Finally, we illustrate the behavior of our drift estimator for a number of popular Lévy-driven models from finance.
Submitted 16 March, 2016;
originally announced March 2016.
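To see what the relaxed sampling condition in the abstract above buys, consider, purely as an illustrative assumption that is not part of the abstract, equidistant sampling with step $Δ_n = n^{-α}$ for some $0 < α < 1$, so that the observation window $nΔ_n$ grows. The two conditions then translate into thresholds on $α$:

```latex
\[
n\Delta_n^{2} = n^{1-2\alpha} \to 0 \iff \alpha > \tfrac{1}{2},
\qquad
n\Delta_n^{3-\varepsilon} = n^{1-(3-\varepsilon)\alpha} \to 0 \iff \alpha > \tfrac{1}{3-\varepsilon}.
\]
```

Under this assumption, sampling rates with $1/(3-ε) < α \le 1/2$ satisfy the relaxed condition but not the usual one, so coarser sampling grids become admissible.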
-
Generalized Post-Widder inversion formula with application to statistics
Authors:
Denis Belomestny,
Hilmar Mai,
John Schoenmakers
Abstract:
In this work we derive an inversion formula for the Laplace transform of a density observed on a curve in the complex domain, which generalizes the well known Post-Widder formula. We establish convergence of our inversion method and derive the corresponding convergence rates for the case of a Laplace transform of a smooth density. As an application we consider the problem of statistical inference for variance-mean mixture models. We construct a nonparametric estimator for the mixing density based on the generalized Post-Widder formula, derive bounds for its root mean square error and give a brief numerical example.
Submitted 30 November, 2015;
originally announced November 2015.
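For orientation, the classical Post-Widder formula that the abstract above generalizes can be stated as follows. The notation $F$ for the Laplace transform and $f$ for the density is chosen here for illustration and may differ from the paper's. The classical formula only uses derivatives of $F$ at the real points $k/t$; the paper's extension to a curve in the complex domain is not reproduced here.

```latex
\[
F(s) = \int_0^\infty e^{-st} f(t)\,dt,
\qquad
f(t) = \lim_{k\to\infty} \frac{(-1)^k}{k!}
\left(\frac{k}{t}\right)^{k+1} F^{(k)}\!\left(\frac{k}{t}\right),
\quad t > 0,
\]
```

where the limit holds at continuity points of $f$, for instance when $f$ is bounded and continuous.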