-
Recursive variational Gaussian approximation with the Whittle likelihood for linear non-Gaussian state space models
Authors:
Bao Anh Vu,
David Gunawan,
Andrew Zammit-Mangion
Abstract:
Parameter inference for linear and non-Gaussian state space models is challenging because the likelihood function contains an intractable integral over the latent state variables. Exact inference using Markov chain Monte Carlo is computationally expensive, particularly for long time series data. Variational Bayes methods are useful when exact inference is infeasible. These methods approximate the posterior density of the parameters by a simple and tractable distribution found through optimisation. In this paper, we propose a novel sequential variational Bayes approach that makes use of the Whittle likelihood for computationally efficient parameter inference in this class of state space models. Our algorithm, which we call Recursive Variational Gaussian Approximation with the Whittle Likelihood (R-VGA-Whittle), updates the variational parameters by processing data in the frequency domain. At each iteration, R-VGA-Whittle requires the gradient and Hessian of the Whittle log-likelihood, which are available in closed form for a wide class of models. Through several examples using a linear Gaussian state space model and a univariate/bivariate non-Gaussian stochastic volatility model, we show that R-VGA-Whittle provides good approximations to posterior distributions of the parameters and is very computationally efficient when compared to asymptotically exact methods such as Hamiltonian Monte Carlo.
Submitted 22 June, 2024;
originally announced June 2024.
-
Blurry-Consistency Segmentation Framework with Selective Stacking on Differential Interference Contrast 3D Breast Cancer Spheroid
Authors:
Thanh-Huy Nguyen,
Thi Kim Ngan Ngo,
Mai Anh Vu,
Ting-Yuan Tu
Abstract:
The ability of three-dimensional (3D) spheroid modeling to study the invasive behavior of breast cancer cells has drawn increased attention. The deep learning-based image processing framework is very effective at speeding up the cell morphological analysis process. Out-of-focus photos taken while capturing 3D cells under several z-slices, however, could negatively impact the deep learning model. In this work, we created a new algorithm to handle blurry images while preserving the stacked image quality. Furthermore, we proposed a unique training architecture that leverages consistency training to help reduce the bias of the model when dense-slice stacking is applied. Additionally, the model's stability is increased under the sparse-slice stacking effect by utilizing the self-training approach. The new blurring stacking technique and training flow are combined with the suggested architecture and self-training mechanism to provide an innovative yet easy-to-use framework. Our methods produced noteworthy experimental outcomes in terms of both quantitative and qualitative aspects.
Submitted 8 June, 2024;
originally announced June 2024.
-
What do Transformers Know about Government?
Authors:
Jue Hou,
Anisia Katinskaia,
Lari Kotilainen,
Sathianpong Trangcasanchai,
Anh-Duc Vu,
Roman Yangarber
Abstract:
This paper investigates what insights about linguistic features and what knowledge about the structure of natural language can be obtained from the encodings in transformer language models. In particular, we explore how BERT encodes the government relation between constituents in a sentence. We use several probing classifiers, and data from two morphologically rich languages. Our experiments show that information about government is encoded across all transformer layers, but predominantly in the early layers of the model. We find that, for both languages, a small number of attention heads encode enough information about the government relations to enable us to train a classifier capable of discovering new, previously unknown types of government, never seen in the training data. Currently, data is lacking for the research community working on grammatical constructions, and government in particular. We release the Government Bank -- a dataset defining the government relations for thousands of lemmas in the languages in our experiments.
Submitted 22 April, 2024;
originally announced April 2024.
-
On the convergence analysis of one-shot inversion methods
Authors:
Marcella Bonazzoli,
Houssem Haddar,
Tuan Anh Vu
Abstract:
When an inverse problem is solved by a gradient-based optimization algorithm, the corresponding forward and adjoint problems, which are introduced to compute the gradient, can also be solved iteratively. The idea of iterating at the same time on the inverse problem unknown and on the forward and adjoint problem solutions yields the concept of one-shot inversion methods. We are especially interested in the case where the inner iterations for the direct and adjoint problems are incomplete, that is, stopped before achieving a high accuracy on their solutions. Here, we focus on general linear inverse problems and generic fixed-point iterations for the associated forward problem. We analyze variants of the so-called multi-step one-shot methods, in particular semi-implicit schemes with a regularization parameter. We establish sufficient conditions on the descent step for convergence, by studying the eigenvalues of the block matrix of the coupled iterations. Several numerical experiments are provided to illustrate the convergence of these methods in comparison with the classical gradient descent, where the forward and adjoint problems are solved exactly by a direct solver instead. We observe that very few inner iterations are enough to guarantee good convergence of the inversion algorithm, even in the presence of noisy data.
Submitted 11 April, 2024;
originally announced April 2024.
-
ToXCL: A Unified Framework for Toxic Speech Detection and Explanation
Authors:
Nhat M. Hoang,
Xuan Long Do,
Duc Anh Do,
Duc Anh Vu,
Luu Anh Tuan
Abstract:
The proliferation of online toxic speech is a pertinent problem posing threats to demographic groups. While explicit toxic speech contains offensive lexical signals, implicit one consists of coded or indirect language. Therefore, it is crucial for models not only to detect implicit toxic speech but also to explain its toxicity. This draws a unique need for unified frameworks that can effectively detect and explain implicit toxic speech. Prior works mainly formulated the task of toxic speech detection and explanation as a text generation problem. Nonetheless, models trained using this strategy can be prone to suffer from the consequent error propagation problem. Moreover, our experiments reveal that the detection results of such models are much lower than those that focus only on the detection task. To bridge these gaps, we introduce ToXCL, a unified framework for the detection and explanation of implicit toxic speech. Our model consists of three modules: a (i) Target Group Generator to generate the targeted demographic group(s) of a given post; an (ii) Encoder-Decoder Model in which the encoder focuses on detecting implicit toxic speech and is boosted by a (iii) Teacher Classifier via knowledge distillation, and the decoder generates the necessary explanation. ToXCL achieves new state-of-the-art effectiveness, and outperforms baselines significantly.
Submitted 20 May, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
An ergodic and isotropic zero-conductance model with arbitrarily strong local connectivity
Authors:
Martin Heida,
Benedikt Jahnel,
Anh Duc Vu
Abstract:
We exhibit a percolating ergodic and isotropic lattice model in all but at least two dimensions that has zero effective conductivity in all spatial directions and for all non-trivial choices of the connectivity parameter. The model is based on the so-called randomly stretched lattice where we additionally elongate layers containing few open edges.
Submitted 11 March, 2024;
originally announced March 2024.
-
Integrating Preprocessing Methods and Convolutional Neural Networks for Effective Tumor Detection in Medical Imaging
Authors:
Ha Anh Vu
Abstract:
This research presents a machine-learning approach for tumor detection in medical images using convolutional neural networks (CNNs). The study focuses on preprocessing techniques to enhance image features relevant to tumor detection, followed by developing and training a CNN model for accurate classification. Various image processing techniques, including Gaussian smoothing, bilateral filtering, and K-means clustering, are employed to preprocess the input images and highlight tumor regions. The CNN model is trained and evaluated on a dataset of medical images, with augmentation and data generators utilized to enhance model generalization. Experimental results demonstrate the effectiveness of the proposed approach in accurately detecting tumors in medical images, paving the way for improved diagnostic tools in healthcare.
Submitted 25 February, 2024;
originally announced February 2024.
-
Magnetosheath ion field-aligned anisotropy and implications for ion leakage to the foreshock
Authors:
Terry Zixu Liu,
Vassilis Angelopoulos,
Hui Zhang,
Andrew Vu,
Joachim Raeder
Abstract:
The ion foreshock is highly dynamic, disturbing the bow shock and the magnetosphere-ionosphere system. To forecast foreshock-driven space weather effects, it is necessary to model foreshock ions as a function of upstream shock parameters. Case studies in the accompanying paper show that magnetosheath ions sometimes exhibit strong field-aligned anisotropy towards the upstream direction, which may be responsible for enhancing magnetosheath leakage and therefore foreshock ion density. To understand the conditions leading to such an anisotropy and the potential for enhanced leakage, we perform case studies and a statistical study of magnetosheath and foreshock region data surrounding ~500 THEMIS bow shock crossings. We quantify the anisotropy using the heat flux along the field-aligned direction. We show that the strong field-aligned heat flux persists across the entire magnetosheath from the magnetopause to the bow shock. Ion distribution functions reveal that the strong heat flux is caused by a secondary thermal population. We find that stronger anisotropy events exhibit heat flux preferentially towards the upstream direction near the bow shock and occur under larger IMF strength and larger solar wind dynamic pressure and/or energy flux. Additionally, we show that near the bow shock, magnetosheath leakage is a significant contributor to foreshock ions, and through enhancing the leakage the magnetosheath ion anisotropy can modulate the foreshock ion velocity and density. Our results imply that likely due to field line draping and compression against the magnetopause that leads to a directional mirror force, modeling the foreshock ions necessitates a more global accounting of downstream conditions.
Submitted 2 February, 2024;
originally announced February 2024.
-
Fast & Fair: Efficient Second-Order Robust Optimization for Fairness in Machine Learning
Authors:
Allen Minch,
Hung Anh Vu,
Anne Marie Warren
Abstract:
This project explores adversarial training techniques to develop fairer Deep Neural Networks (DNNs) to mitigate the inherent bias they are known to exhibit. DNNs are susceptible to inheriting bias with respect to sensitive attributes such as race and gender, which can lead to life-altering outcomes (e.g., demographic bias in facial recognition software used to arrest a suspect). We propose a robust optimization problem, which we demonstrate can improve fairness in several datasets, both synthetic and real-world, using an affine linear model. Leveraging second order information, we are able to find a solution to our optimization problem more efficiently than a purely first order method.
Submitted 3 January, 2024;
originally announced January 2024.
-
Improving Multimodal Sentiment Analysis: Supervised Angular Margin-based Contrastive Learning for Enhanced Fusion Representation
Authors:
Cong-Duy Nguyen,
Thong Nguyen,
Duc Anh Vu,
Luu Anh Tuan
Abstract:
The effectiveness of a model is heavily reliant on the quality of the fusion representation of multiple modalities in multimodal sentiment analysis. Moreover, each modality is extracted from raw input and integrated with the rest to construct a multimodal representation. Although previous methods have proposed multimodal representations and achieved promising results, most of them focus on forming positive and negative pairs, neglecting the variation in sentiment scores within the same class. Additionally, they fail to capture the significance of unimodal representations in the fusion vector. To address these limitations, we introduce a framework called Supervised Angular-based Contrastive Learning for Multimodal Sentiment Analysis. This framework aims to enhance discrimination and generalizability of the multimodal representation and overcome biases in the fusion vector's modality. Our experimental results, along with visualizations on two widely used datasets, demonstrate the effectiveness of our approach.
Submitted 3 December, 2023;
originally announced December 2023.
-
ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions
Authors:
Phuoc Pham Van Long,
Duc Anh Vu,
Nhat M. Hoang,
Xuan Long Do,
Anh Tuan Luu
Abstract:
Mathematical questioning is crucial for assessing students' problem-solving skills. Since manually creating such questions requires substantial effort, automatic methods have been explored. Existing state-of-the-art models rely on fine-tuning strategies and struggle to generate questions that heavily involve multiple steps of logical and arithmetic reasoning. Meanwhile, large language models (LLMs) such as ChatGPT have excelled in many NLP tasks involving logical and arithmetic reasoning. Nonetheless, their applications in generating educational questions are underutilized, especially in the field of mathematics. To bridge this gap, we take the first step to conduct an in-depth analysis of ChatGPT in generating pre-university math questions. Our analysis is categorized into two main settings: context-aware and context-unaware. In the context-aware setting, we evaluate ChatGPT on existing math question-answering benchmarks covering elementary, secondary, and tertiary classes. In the context-unaware setting, we evaluate ChatGPT in generating math questions for each lesson from pre-university math curriculums that we crawl. Our crawling results in TopicMath, a comprehensive and novel collection of pre-university math curriculums collected from 121 math topics and 428 lessons from elementary, secondary, and tertiary classes. Through this analysis, we aim to provide insight into the potential of ChatGPT as a math questioner.
Submitted 27 February, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
A long-range contact process in a random environment
Authors:
Benedikt Jahnel,
Anh Duc Vu
Abstract:
We study survival and extinction of a long-range infection process on a diluted one-dimensional lattice in discrete time. The infection can spread to distant vertices according to a Pareto distribution, however spreading is also prohibited at random times. We prove a phase transition in the recovery parameter via block arguments. This contributes to a line of research on directed percolation with long-range correlations in nonstabilizing random environments.
Submitted 18 October, 2023;
originally announced October 2023.
-
Manta Ray Inspired Flapping-Wing Blimp
Authors:
Kentaro Nojima-Schmunk,
David Turzak,
Kevin Kim,
Andrew Vu,
James Yang,
Sreeauditya Motukuri,
Ningshi Yao,
Daigo Shishika
Abstract:
Lighter-than-air vehicles, or blimps, are an evolving platform in robotics with several beneficial properties such as energy efficiency, collision resistance, and ability to work in close proximity to human users. While existing blimp designs have mainly used propeller-based propulsion, we focus our attention to an alternate locomotion method, flapping wings. Specifically, this paper introduces a flapping-wing blimp inspired by manta rays, in contrast to existing research on flapping-wing vehicles that draw inspiration from insects or birds. We present the overall design and control scheme of the blimp as well as the analysis on how the wing performs. The effects of wing shape and flapping characteristics on the thrust generation are studied experimentally. We also demonstrate that the flapping-wing blimp has a significant range advantage over a propeller-based system.
Submitted 16 October, 2023;
originally announced October 2023.
-
Few-Shot Object Detection via Synthetic Features with Optimal Transport
Authors:
Anh-Khoa Nguyen Vu,
Thanh-Toan Do,
Vinh-Tiep Nguyen,
Tam Le,
Minh-Triet Tran,
Tam V. Nguyen
Abstract:
Few-shot object detection aims to simultaneously localize and classify the objects in an image with limited training samples. However, most existing few-shot object detection methods focus on extracting the features of a few samples of novel classes that lack diversity. Hence, they may not be sufficient to capture the data distribution. To address that limitation, in this paper, we propose a novel approach in which we train a generator to generate synthetic data for novel classes. Still, directly training a generator on the novel class is not effective due to the lack of novel data. To overcome that issue, we leverage the large-scale dataset of base classes. Our overarching goal is to train a generator that captures the data variations of the base dataset. We then transform the captured variations into novel classes by generating synthetic data with the trained generator. To encourage the generator to capture data variations on base classes, we propose to train the generator with an optimal transport loss that minimizes the optimal transport distance between the distributions of real and synthetic data. Extensive experiments on two benchmark datasets demonstrate that the proposed method outperforms the state of the art. Source code will be available.
Submitted 29 August, 2023; v1 submitted 28 August, 2023;
originally announced August 2023.
-
Surjective Span 6 Cellular Automata
Authors:
Hung Anh Vu,
Nate Schnitzer,
Ethan Ewing
Abstract:
Using FSA and the construction algorithm, we generated a list of surjective span 6 cellular automata as a modest sample for our FDense program. We wanted to experimentally quantify Mike Boyle's conjecture which states that the jointly periodic points of one-dimensional cellular automata are dense. Furthermore, we wanted to know if the cardinality of cellular automata on N symbols is greater than or equal to the square root of N.
Submitted 10 June, 2023;
originally announced June 2023.
-
R-VGAL: A Sequential Variational Bayes Algorithm for Generalised Linear Mixed Models
Authors:
Bao Anh Vu,
David Gunawan,
Andrew Zammit-Mangion
Abstract:
Models with random effects, such as generalised linear mixed models (GLMMs), are often used for analysing clustered data. Parameter inference with these models is difficult because of the presence of cluster-specific random effects, which must be integrated out when evaluating the likelihood function. Here, we propose a sequential variational Bayes algorithm, called Recursive Variational Gaussian Approximation for Latent variable models (R-VGAL), for estimating parameters in GLMMs. The R-VGAL algorithm operates on the data sequentially, requires only a single pass through the data, and can provide parameter updates as new data are collected without the need of re-processing the previous data. At each update, the R-VGAL algorithm requires the gradient and Hessian of a "partial" log-likelihood function evaluated at the new observation, which are generally not available in closed form for GLMMs. To circumvent this issue, we propose using an importance-sampling-based approach for estimating the gradient and Hessian via Fisher's and Louis' identities. We find that R-VGAL can be unstable when traversing the first few data points, but that this issue can be mitigated by using a variant of variational tempering in the initial steps of the algorithm. Through illustrations on both simulated and real datasets, we show that R-VGAL provides good approximations to the exact posterior distributions, that it can be made robust through tempering, and that it is computationally efficient.
Submitted 18 April, 2024; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Validity Constraints for Data Analysis Workflows
Authors:
Florian Schintke,
Ninon De Mecquenem,
David Frantz,
Vanessa Emanuela Guarino,
Marcus Hilbrich,
Fabian Lehmann,
Rebecca Sattler,
Jan Arne Sparka,
Daniel Speckhard,
Hermann Stolte,
Anh Duc Vu,
Ulf Leser
Abstract:
Porting a scientific data analysis workflow (DAW) to a cluster infrastructure, a new software stack, or even only a new dataset with some notably different properties is often challenging. Despite the structured definition of the steps (tasks) and their interdependencies during a complex data analysis in the DAW specification, relevant assumptions may remain unspecified and implicit. Such hidden assumptions often lead to crashing tasks without a reasonable error message, poor performance in general, non-terminating executions, or silent wrong results of the DAW, to name only a few possible consequences. Searching for the causes of such errors and drawbacks in a distributed compute cluster managed by a complex infrastructure stack, where DAWs for large datasets typically are executed, can be tedious and time-consuming.
We propose validity constraints (VCs) as a new concept for DAW languages to alleviate this situation. A VC is a constraint specifying some logical conditions that must be fulfilled at certain times for DAW executions to be valid. When defined together with a DAW, VCs help to improve the portability, adaptability, and reusability of DAWs by making implicit assumptions explicit. Once specified, VC can be controlled automatically by the DAW infrastructure, and violations can lead to meaningful error messages and graceful behaviour (e.g., termination or invocation of repair mechanisms). We provide a broad list of possible VCs, classify them along multiple dimensions, and compare them to similar concepts one can find in related fields. We also provide a first sketch for VCs' implementation into existing DAW infrastructures.
Submitted 15 May, 2023;
originally announced May 2023.
-
Correlation visualization under missing values: a comparison between imputation and direct parameter estimation methods
Authors:
Nhat-Hao Pham,
Khanh-Linh Vo,
Mai Anh Vu,
Thu Nguyen,
Michael A. Riegler,
Pål Halvorsen,
Binh T. Nguyen
Abstract:
Correlation matrix visualization is essential for understanding the relationships between variables in a dataset, but missing data can pose a significant challenge in estimating correlation coefficients. In this paper, we compare the effects of various missing data methods on the correlation plot, focusing on two common missing patterns: random and monotone. We aim to provide practical strategies and recommendations for researchers and practitioners in creating and analyzing the correlation plot. Our experimental results suggest that while imputation is commonly used for missing data, using imputed data for plotting the correlation matrix may lead to a significantly misleading inference of the relation between the features. We recommend using DPER, a direct parameter estimation approach, for plotting the correlation matrix based on its performance in the experiments.
Submitted 5 September, 2023; v1 submitted 10 May, 2023;
originally announced May 2023.
-
Blockwise Principal Component Analysis for monotone missing data imputation and dimensionality reduction
Authors:
Tu T. Do,
Mai Anh Vu,
Tuan L. Vo,
Hoang Thien Ly,
Thu Nguyen,
Steven A. Hicks,
Michael A. Riegler,
Pål Halvorsen,
Binh T. Nguyen
Abstract:
Monotone missing data is a common problem in data analysis. However, imputation combined with dimensionality reduction can be computationally expensive, especially with the increasing size of datasets. To address this issue, we propose a Blockwise principal component analysis Imputation (BPI) framework for dimensionality reduction and imputation of monotone missing data. The framework conducts Principal Component Analysis (PCA) on the observed part of each monotone block of the data and then imputes on merging the obtained principal components using a chosen imputation technique. BPI can work with various imputation techniques and can significantly reduce imputation time compared to conducting dimensionality reduction after imputation. This makes it a practical and efficient approach for large datasets with monotone missing data. Our experiments validate the improvement in speed. In addition, our experiments also show that while applying MICE imputation directly on missing data may not yield convergence, applying BPI with MICE for the data may lead to convergence.
Submitted 10 January, 2024; v1 submitted 10 May, 2023;
originally announced May 2023.
-
Effects of sub-word segmentation on performance of transformer language models
Authors:
Jue Hou,
Anisia Katinskaia,
Anh-Duc Vu,
Roman Yangarber
Abstract:
Language modeling is a fundamental task in natural language processing, which has been thoroughly explored with various architectures and hyperparameters. However, few studies focus on the effect of sub-word segmentation on the performance of language models (LMs). In this paper, we compare GPT and BERT models trained with the statistical segmentation algorithm BPE vs. two unsupervised algorithms for morphological segmentation -- Morfessor and StateMorph. We train the models for several languages -- including ones with very rich morphology -- and compare their performance with different segmentation algorithms, vocabulary sizes, and model sizes. The results show that training with morphological segmentation allows the LMs to: 1. achieve lower perplexity, 2. converge more efficiently in terms of training time, and 3. achieve equivalent or better evaluation scores on downstream tasks. Lastly, we show 4. that LMs of smaller size using morphological segmentation can perform comparably to models of larger size trained with BPE -- both in terms of (1) perplexity and (3) scores on downstream tasks. Points (2) and (4) impact on sustainability of LMs, since they reduce the model cost: size and computation time. While (2) reduces cost only in the training phase, (4) does so also in the inference phase.
Submitted 26 October, 2023; v1 submitted 9 May, 2023;
originally announced May 2023.
-
Instance-level Few-shot Learning with Class Hierarchy Mining
Authors:
Anh-Khoa Nguyen Vu,
Thanh-Toan Do,
Nhat-Duy Nguyen,
Vinh-Tiep Nguyen,
Thanh Duc Ngo,
Tam V. Nguyen
Abstract:
Few-shot learning is proposed to tackle the problem of scarce training data in novel classes. However, prior works in instance-level few-shot learning have paid less attention to effectively utilizing the relationship between categories. In this paper, we exploit the hierarchical information to leverage discriminative and relevant features of base classes to effectively classify novel objects. These features are extracted from abundant data of base classes, which could be utilized to reasonably describe classes with scarce data. Specifically, we propose a novel superclass approach that automatically creates a hierarchy considering base and novel classes as fine-grained classes for few-shot instance segmentation (FSIS). Based on the hierarchical information, we design a novel framework called Soft Multiple Superclass (SMS) to extract relevant features or characteristics of classes in the same superclass. A new class assigned to the superclass is easier to classify by leveraging these relevant features. Besides, in order to effectively train the hierarchy-based-detector in FSIS, we apply the label refinement to further describe the associations between fine-grained classes. The extensive experiments demonstrate the effectiveness of our method on FSIS benchmarks. Code is available online.
Submitted 14 April, 2023;
originally announced April 2023.
-
The Art of Camouflage: Few-shot Learning for Animal Detection and Segmentation
Authors:
Thanh-Danh Nguyen,
Anh-Khoa Nguyen Vu,
Nhat-Duy Nguyen,
Vinh-Tiep Nguyen,
Thanh Duc Ngo,
Thanh-Toan Do,
Minh-Triet Tran,
Tam V. Nguyen
Abstract:
Camouflaged object detection and segmentation is a new and challenging research topic in computer vision. There is a serious issue of lacking data of camouflaged objects such as camouflaged animals in natural scenes. In this paper, we address the problem of few-shot learning for camouflaged object detection and segmentation. To this end, we first collect a new dataset, CAMO-FS, for the benchmark. We then propose a novel method to efficiently detect and segment the camouflaged objects in the images. In particular, we introduce the instance triplet loss and the instance memory storage. The extensive experiments demonstrated that our proposed method achieves state-of-the-art performance on the newly collected dataset.
Submitted 21 January, 2024; v1 submitted 14 April, 2023;
originally announced April 2023.
-
No Easy Way Out: the Effectiveness of Deplatforming an Extremist Forum to Suppress Hate and Harassment
Authors:
Anh V. Vu,
Alice Hutchings,
Ross Anderson
Abstract:
Legislators and policymakers worldwide are debating options for suppressing illegal, harmful and undesirable material online. Drawing on several quantitative data sources, we show that deplatforming an active community to suppress online hate and harassment, even with a substantial concerted effort involving several tech firms, can be hard. Our case study is the disruption of the largest and longest-running harassment forum Kiwi Farms in late 2022, which is probably the most extensive industry effort to date. Despite the active participation of a number of tech companies over several consecutive months, this campaign failed to shut down the forum and remove its objectionable content. While briefly raising public awareness, it led to rapid platform displacement and traffic fragmentation. Part of the activity decamped to Telegram, while traffic shifted from the primary domain to previously abandoned alternatives. The forum experienced intermittent outages for several weeks, after which the community leading the campaign lost interest, traffic was directed back to the main domain, users quickly returned, and the forum was back online and became even more connected. The forum members themselves stopped discussing the incident shortly thereafter, and the net effect was that forum activity, active users, threads, posts and traffic were all cut by about half. Deplatforming a community without a court order raises philosophical issues about censorship versus free speech; ethical and legal issues about the role of industry in online content moderation; and practical issues on the efficacy of private-sector versus government action. Deplatforming a dispersed community using a series of court orders against individual service providers appears unlikely to be very effective if the censor cannot incapacitate the key maintainers, whether by arresting them, enjoining them or otherwise deterring them.
Submitted 13 April, 2024; v1 submitted 14 April, 2023;
originally announced April 2023.
-
Long time behavior for collisional strongly magnetized plasma in three space dimensions
Authors:
Mihaï Bostan,
Anh-Tuan Vu
Abstract:
We consider the long time evolution of a population of charged particles, under strong magnetic fields and collision mechanisms. We derive a fluid model and justify the asymptotic behavior toward smooth solutions of this regime. In three space dimensions, a constraint occurs along the parallel direction. To eliminate the corresponding Lagrange multiplier, we average along the magnetic lines.
Submitted 3 January, 2024; v1 submitted 29 March, 2023;
originally announced March 2023.
-
Asymptotic behavior of the two-dimensional Vlasov-Poisson-Fokker-Planck equation with a strong external magnetic field
Authors:
Mihaï Bostan,
Anh-Tuan Vu
Abstract:
The subject matter of the paper concerns the Vlasov-Poisson-Fokker-Planck (VPFP) equations in the context of magnetic confinement. We study the long-time behavior of the VPFP system with an intense external magnetic field, when neglecting the curvature of the magnetic lines. When the intensity of the magnetic field tends to infinity, the long-time behavior of the particle concentration is described by a first-order nonlinear hyperbolic equation of the Euler type for fluid mechanics. More exactly, when the magnetic field is uniform, we find the vorticity formulation of the incompressible Euler equations in two-dimensional space. Our proofs rely on the modulated energy method.
Submitted 28 December, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
-
Bounds on Functionality and Symmetric Difference -- Two Intriguing Graph Parameters
Authors:
Pavel Dvořák,
Lukáš Folwarczný,
Michal Opler,
Pavel Pudlák,
Robert Šámal,
Tung Anh Vu
Abstract:
[Alecu et al.: Graph functionality, JCTB2021] define functionality, a graph parameter that generalizes graph degeneracy. They research the relation of functionality to many other graph parameters (tree-width, clique-width, VC-dimension, etc.). Extending their research, we prove a logarithmic lower bound for the functionality of the random graph $G(n,p)$ for a large range of $p$. Previously known graphs have functionality logarithmic in the number of vertices. We show that for every graph $G$ on $n$ vertices we have $\mathrm{fun}(G) \le O(\sqrt{n \log n})$ and we give a nearly matching $\Omega(\sqrt{n})$ lower bound provided by projective planes.
Further, we study a related graph parameter \emph{symmetric difference}, the minimum of $|N(u) \Delta N(v)|$ over all pairs of vertices of the ``worst possible'' induced subgraph. It was observed by Alecu et al. that $\mathrm{fun}(G) \le \mathrm{sd}(G)+1$ for every graph $G$. We compare $\mathrm{fun}$ and $\mathrm{sd}$ for the class $\mathrm{INT}$ of interval graphs and $\mathrm{CA}$ of circular-arc graphs. We let $\mathrm{INT}_n$ denote the $n$-vertex interval graphs, similarly for $\mathrm{CA}_n$. Alecu et al. ask whether $\mathrm{fun}(\mathrm{INT})$ is bounded. Dallard et al. answer this positively in a recent preprint. On the other hand, we show that $\Omega(\sqrt[4]{n}) \leq \mathrm{sd}(\mathrm{INT}_n) \leq O(\sqrt[3]{n})$. For the related class $\mathrm{CA}$ we show that $\mathrm{sd}(\mathrm{CA}_n) = \Theta(\sqrt{n})$. We propose a follow-up question: is $\mathrm{fun}(\mathrm{CA})$ bounded?
Submitted 23 February, 2023;
originally announced February 2023.
-
Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification
Authors:
Ihsan Ullah,
Dustin Carrión-Ojeda,
Sergio Escalera,
Isabelle Guyon,
Mike Huisman,
Felix Mohr,
Jan N van Rijn,
Haozhe Sun,
Joaquin Vanschoren,
Phan Anh Vu
Abstract:
We introduce Meta-Album, an image classification meta-dataset designed to facilitate few-shot learning, transfer learning, meta-learning, among other tasks. It includes 40 open datasets, each having at least 20 classes with 40 examples per class, with verified licences. They stem from diverse domains, such as ecology (fauna and flora), manufacturing (textures, vehicles), human actions, and optical character recognition, featuring various image scales (microscopic, human scales, remote sensing). All datasets are preprocessed, annotated, and formatted uniformly, and come in 3 versions (Micro $\subset$ Mini $\subset$ Extended) to match users' computational resources. We showcase the utility of the first 30 datasets on few-shot learning problems. The other 10 will be released shortly after. Meta-Album is already more diverse and larger (in number of datasets) than similar efforts, and we are committed to keep enlarging it via a series of competitions. As competitions terminate, their test data are released, thus creating a rolling benchmark, available through OpenML.org. Our website https://meta-album.github.io/ contains the source code of challenge winning methods, baseline methods, data loaders, and instructions for contributing either new datasets or algorithms to our expandable meta-dataset.
Submitted 16 February, 2023;
originally announced February 2023.
-
Conditional expectation with regularization for missing data imputation
Authors:
Mai Anh Vu,
Thu Nguyen,
Tu T. Do,
Nhan Phan,
Nitesh V. Chawla,
Pål Halvorsen,
Michael A. Riegler,
Binh T. Nguyen
Abstract:
Missing data frequently occurs in datasets across various domains, such as medicine, sports, and finance. In many cases, to enable proper and reliable analyses of such data, the missing values are often imputed, and it is necessary that the method used has a low root mean square error (RMSE) between the imputed and the true values. In addition, for some critical applications, it is also often a requirement that the imputation method is scalable and the logic behind the imputation is explainable, which is especially difficult for complex methods that are, for example, based on deep learning. Based on these considerations, we propose a new algorithm named "conditional Distribution-based Imputation of Missing Values with Regularization" (DIMV). DIMV operates by determining the conditional distribution of a feature that has missing entries, using the information from the fully observed features as a basis. As will be illustrated via experiments in the paper, DIMV (i) gives a low RMSE for the imputed values compared to state-of-the-art methods; (ii) is fast and scalable; (iii) is explainable as coefficients in a regression model, allowing reliable and trustworthy analysis, which makes it a suitable choice for critical domains where understanding is important, such as medicine and finance; (iv) can provide an approximate confidence region for the missing values in a given sample; (v) is suitable for both small and large scale data; (vi) in many scenarios, does not require as many parameters as deep learning approaches; (vii) handles multicollinearity in imputation effectively; and (viii) is robust to the normality assumption on which its theoretical grounds rely.
Submitted 11 September, 2023; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Linguistic Constructs as the Representation of the Domain Model in an Intelligent Language Tutoring System
Authors:
Anisia Katinskaia,
Jue Hou,
Anh-Duc Vu,
Roman Yangarber
Abstract:
This paper presents the development of an AI-based language learning platform Revita. It is a freely available intelligent online tutor, developed to support learners of multiple languages, from low-intermediate to advanced levels. It has been in pilot use by hundreds of students at several universities, whose feedback and needs are shaping the development. One of the main emerging features of Revita is the introduction of a system of linguistic constructs as the representation of domain knowledge. The system of constructs is developed in close collaboration with experts in language teaching. Constructs define the types of exercises, the content of the feedback, and enable the detailed modeling and evaluation of learning progress.
Submitted 3 December, 2022;
originally announced December 2022.
-
Towards Advanced Monitoring for Scientific Workflows
Authors:
Jonathan Bader,
Joel Witzke,
Soeren Becker,
Ansgar Lößer,
Fabian Lehmann,
Leon Doehler,
Anh Duc Vu,
Odej Kao
Abstract:
Scientific workflows consist of thousands of highly parallelized tasks executed in a distributed environment involving many components. Automatic tracing and investigation of the components' and tasks' performance metrics, traces, and behavior are necessary to support the end user with a level of abstraction since the large amount of data cannot be analyzed manually. The execution and monitoring of scientific workflows involves many components, the cluster infrastructure, its resource manager, the workflow, and the workflow tasks. All components in such an execution environment access different monitoring metrics and provide metrics on different abstraction levels. The combination and analysis of observed metrics from different components and their interdependencies are still widely unregarded.
We specify four different monitoring layers that can serve as an architectural blueprint for the monitoring responsibilities and the interactions of components in the scientific workflow execution context. We describe the different monitoring metrics subject to the four layers and how the layers interact. Finally, we examine five state-of-the-art scientific workflow management systems (SWMS) in order to assess which steps are needed to enable our four-layer-based approach.
Submitted 18 July, 2023; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Generalized $k$-Center: Distinguishing Doubling and Highway Dimension
Authors:
Andreas Emil Feldmann,
Tung Anh Vu
Abstract:
We consider generalizations of the $k$-Center problem in graphs of low doubling and highway dimension. For the Capacitated $k$-Supplier with Outliers (CkSwO) problem, we show an efficient parameterized approximation scheme (EPAS) when the parameters are $k$, the number of outliers, and the doubling dimension of the supplier set. On the other hand, we show that for the Capacitated $k$-Center problem, which is a special case of CkSwO, obtaining a parameterized approximation scheme (PAS) is $\mathrm{W[1]}$-hard when the parameters are $k$ and the highway dimension. This is the first known example of a problem for which it is hard to obtain a PAS for highway dimension, while simultaneously admitting an EPAS for doubling dimension.
Submitted 1 September, 2022;
originally announced September 2022.
-
Getting Bored of Cyberwar: Exploring the Role of Low-level Cybercrime Actors in the Russia-Ukraine Conflict
Authors:
Anh V. Vu,
Daniel R. Thomas,
Ben Collier,
Alice Hutchings,
Richard Clayton,
Ross Anderson
Abstract:
There has been substantial commentary on the role of cyberattacks carried out by low-level cybercrime actors in the Russia-Ukraine conflict. We analyse 358k website defacement attacks, 1.7M UDP amplification DDoS attacks, 1764 posts made by 372 users on Hack Forums mentioning the two countries, and 441 Telegram announcements (with 58k replies) of a volunteer hacking group for two months before and four months after the invasion. We find the conflict briefly but notably caught the attention of low-level cybercrime actors, with significant increases in online discussion and both types of attacks targeting Russia and Ukraine. However, there was little evidence of high-profile actions; the role of these players in the ongoing hybrid warfare is minor, and they should be separated from persistent and motivated 'hacktivists' in state-sponsored operations. Their involvement in the conflict appears to have been short-lived and fleeting, with a clear loss of interest in discussing the situation and carrying out both website defacement and DDoS attacks against either Russia or Ukraine after just a few weeks.
Submitted 13 April, 2024; v1 submitted 22 August, 2022;
originally announced August 2022.
-
Convergence analysis of multi-step one-shot methods for linear inverse problems
Authors:
Marcella Bonazzoli,
Houssem Haddar,
Tuan Anh Vu
Abstract:
In this work we are interested in general linear inverse problems where the corresponding forward problem is solved iteratively using fixed point methods. Then one-shot methods, which iterate at the same time on the forward problem solution and on the inverse problem unknown, can be applied. We analyze two variants of the so-called multi-step one-shot methods and establish sufficient conditions on the descent step for their convergence, by studying the eigenvalues of the block matrix of the coupled iterations. Several numerical experiments are provided to illustrate the convergence of these methods in comparison with the classical usual and shifted gradient descent. In particular, we observe that very few inner iterations on the forward problem are enough to guarantee good convergence of the inversion algorithm.
Submitted 26 July, 2022; v1 submitted 21 July, 2022;
originally announced July 2022.
-
Continuum Percolation in a Nonstabilizing Environment
Authors:
Benedikt Jahnel,
Sanjoy Kumar Jhawar,
Anh Duc Vu
Abstract:
We prove phase transitions for continuum percolation in a Boolean model based on a Cox point process with nonstabilizing directing measure. The directing measure, which can be seen as a stationary random environment for the classical Poisson--Boolean model, is given by a planar rectangular Poisson line process. This Manhattan grid type construction features long-range dependencies in the environment, leading to absence of a sharp phase transition for the associated Cox--Boolean model. The phase transitions are established under individually as well as jointly varying parameters. Our proofs rest on discretization arguments and a comparison to percolation on randomly stretched lattices established in Hoffman 2005.
Submitted 9 May, 2023; v1 submitted 30 May, 2022;
originally announced May 2022.
-
Improved long time accuracy for projection methods for Navier-Stokes equations using EMAC formulation
Authors:
Sean Ingimarson,
Monika Neda,
Leo Rebholz,
Jorge Reyes,
An Vu
Abstract:
We consider a pressure correction temporal discretization for the incompressible Navier-Stokes equations in EMAC form. We prove stability and error estimates for the case of mixed finite element spatial discretization, and in particular that the Gronwall constant's exponential dependence on the Reynolds number is removed (for sufficiently smooth true solutions) or at least significantly reduced compared to the commonly used skew-symmetric formulation. We also show the method preserves momentum and angular momentum, and while it does not preserve energy it does admit an energy inequality. Several numerical tests show the advantages EMAC can have over other commonly used formulations of the nonlinearity. Additionally, we discuss extensions of the results to the usual Crank-Nicolson temporal discretization.
Submitted 6 September, 2022; v1 submitted 10 May, 2022;
originally announced May 2022.
-
Auto-balanced common shock claim models
Authors:
Greg Taylor,
Phuong Anh Vu
Abstract:
The paper is concerned with common shock models of claim triangles. These are usually constructed as linear combinations of shock components and idiosyncratic components. Previous literature has discussed the unbalanced property of such models, whereby the shocks may over- or under-contribute to some observations. The literature has also introduced corrections for this. The present paper discusses 'auto-balanced' models, in which all shock and idiosyncratic components contribute to observations such that their proportionate contributions are constant from one observation to another. The conditions for auto-balance are found to be simple and applicable to a wide range of model structures. Numerical illustrations are given.
Submitted 23 December, 2021;
originally announced December 2021.
-
ExtremeBB: A Database for Large-Scale Research into Online Hate, Harassment, the Manosphere and Extremism
Authors:
Anh V. Vu,
Lydia Wilson,
Yi Ting Chua,
Ilia Shumailov,
Ross Anderson
Abstract:
We introduce ExtremeBB, a textual database of over 53.5M posts made by 38.5k users on 12 extremist bulletin board forums promoting online hate, harassment, the manosphere and other forms of extremism. It enables large-scale analyses of qualitative and quantitative historical trends going back two decades: measuring hate speech and toxicity; tracing the evolution of different strands of extremist ideology; tracking the relationships between online subcultures, extremist behaviours, and real-world violence; and monitoring extremist communities in near real time. This can shed light not only on the spread of problematic ideologies but also the effectiveness of interventions. ExtremeBB comes with a robust ethical data-sharing regime that allows us to share data with academics worldwide. Since 2020, access has been granted to 49 licensees in 16 research groups from 12 institutions.
Submitted 20 August, 2023; v1 submitted 8 November, 2021;
originally announced November 2021.
-
Stochastic Homogenization on Irregularly Perforated Domains
Authors:
Martin Heida,
Benedikt Jahnel,
Anh Duc Vu
Abstract:
We study stochastic homogenization of a quasilinear parabolic PDE with nonlinear microscopic Robin conditions on a perforated domain. The focus of our work lies on the underlying geometry that does not allow standard homogenization techniques to be applied directly. Instead we prove homogenization on a regularized geometry and demonstrate afterwards that the form of the homogenized equation is independent from the regularization. Then we pass to the regularization limit to obtain the anticipated limit equation. Furthermore, we show that Boolean models of Poisson point processes are covered by our approach.
Submitted 7 October, 2021;
originally announced October 2021.
-
Nonlocal Phonon Heat Transport Seen in 1-d Pulses
Authors:
Philip B. Allen,
Nhat Ahn Nghiem Vu
Abstract:
Phonons are the main heat carriers in semiconductor devices. In small devices, heat is not driven by a local temperature gradient, but by local points of heat input and removal. This complicates theoretical modeling. Study of the propagation of vibrational energy from an initial localized pulse provides insight into nonlocal phonon heat transport. We report simulations of pulse propagation in one dimension. The 1d case has tricky anomalies, but provides the simplest pictures of the evolution from initially ballistic toward longer time diffusive propagation. Our results show surprising details, such as diverse results from different definitions of atomistic local energy, and failure to exhibit pure diffusion at long times. Boltzmann phonon gas theory, including external energy insertion, is applied to this inherently time-dependent and nonlocal problem. The solution, using relaxation time approximation for impurity scattering, does not closely agree with the simulated results.
Submitted 26 April, 2022; v1 submitted 1 June, 2021;
originally announced June 2021.
-
Optimal fire allocation in a combat model of mixed NCW type
Authors:
My A. Vu,
Nam H. Nguyen,
Hanh Le T. Nguyen,
Anh N. Ta,
Mong H. Nguyen
Abstract:
In this work, we introduce a nonlinear Lanchester model of NCW-type and study the problem of finding the optimal fire allocation for this model. A Blue party $B$ fights against a Red party consisting of $A$ and $R$, where $A$ is an independent force and $R$ fights with support from a supply unit $N$. A battle may consist of several stages, but we consider the problem of finding the optimal fire allocation for $B$ in the first stage only. An optimal fire allocation is a set of three non-negative numbers whose sum equals one, such that the remaining force of $B$ is maximal at any instant. In order to tackle this problem, we introduce the notion of \textit{threatening rates}, which are computed for $A, R, N$ at the beginning of the battle. Numerical illustrations are presented to justify the theoretical findings.
Submitted 7 April, 2021;
originally announced April 2021.
-
MAGNeto: An Efficient Deep Learning Method for the Extractive Tags Summarization Problem
Authors:
Hieu Trong Phung,
Anh Tuan Vu,
Tung Dinh Nguyen,
Lam Thanh Do,
Giang Nam Ngo,
Trung Thanh Tran,
Ngoc C. Lê
Abstract:
In this work, we study a new image annotation task named Extractive Tags Summarization (ETS). The goal is to extract important tags from the context lying in an image and its corresponding tags. We adjust some state-of-the-art deep learning models to utilize both visual and textual information. Our proposed solution consists of different widely used blocks like convolutional and self-attention layers, together with a novel idea of combining auxiliary loss functions and the gating mechanism to glue and elevate these fundamental components and form a unified architecture. Besides, we introduce a loss function that aims to reduce the imbalance of the training data and a simple but effective data augmentation technique dedicated to alleviating the effect of outliers on the final results. Last but not least, we explore an unsupervised pre-training strategy to further boost the performance of the model by making use of the abundant amount of available unlabeled data. Our model achieves a 90% $F_\text{1}$ score on the public NUS-WIDE benchmark and a 50% $F_\text{1}$ score on a noisy large-scale real-world private dataset. Source code for reproducing the experiments is publicly available at: https://github.com/pixta-dev/labteam
Submitted 9 November, 2020;
originally announced November 2020.
-
Heterogeneous Multi-Agent Reinforcement Learning for Unknown Environment Mapping
Authors:
Ceyer Wakilpoor,
Patrick J. Martin,
Carrie Rebhuhn,
Amanda Vu
Abstract:
Reinforcement learning in heterogeneous multi-agent scenarios is important for real-world applications but presents challenges beyond those seen in homogeneous settings and simple benchmarks. In this work, we present an actor-critic algorithm that allows a team of heterogeneous agents to learn decentralized control policies for covering an unknown environment. This task is of interest to national security and emergency response organizations that would like to enhance situational awareness in hazardous areas by deploying teams of unmanned aerial vehicles. To solve this multi-agent coverage path planning problem in unknown environments, we augment a multi-agent actor-critic architecture with a new state encoding structure and triplet learning loss to support heterogeneous agent learning. We developed a simulation environment that includes real-world environmental factors such as turbulence, delayed communication, and agent loss, to train teams of agents as well as probe their robustness and flexibility to such disturbances.
Submitted 6 October, 2020;
originally announced October 2020.
-
Optimizing fire allocation in a NCW-type model
Authors:
Nam Hong Nguyen,
My Anh Vu,
Dinh Van Bui,
Anh Ngoc Ta,
Manh Duc Hy
Abstract:
In this paper, we introduce a non-linear Lanchester model of NCW-type and investigate an optimization problem for this model, where only the Red force is supplied by several supply agents. The optimal fire allocation of the Blue force is sought in the form of a piece-wise constant function of time. A threatening rate is computed for the Red force and each of its supply agents at the beginning of each stage of the combat. These rates can be used to derive the optimal decision for the Blue force: whether to focus its firepower on the Red force itself or on one of its supply agents. This optimal fire allocation is derived and proved by considering an optimization problem for the number of Blue force troops. Numerical experiments are included to demonstrate the theoretical results.
Submitted 12 August, 2020;
originally announced August 2020.
-
On unbalanced data and common shock models in stochastic loss reserving
Authors:
Benjamin Avanzi,
Gregory Clive Taylor,
Phuong Anh Vu,
Bernard Wong
Abstract:
Introducing common shocks is a popular dependence modelling approach, with some recent applications in loss reserving. The main advantage of this approach is the ability to capture structural dependence coming from known relationships. In addition, it helps with the parsimonious construction of correlation matrices of large dimensions. However, complications arise in the presence of "unbalanced data", that is, when the (expected) magnitude of observations over a single triangle, or between triangles, can vary substantially. Specifically, if a single common shock is applied to all of these cells, it can contribute insignificantly to the larger values and/or swamp the smaller ones, unless careful adjustments are made. This problem is further complicated in applications involving negative claim amounts. In this paper, we address this problem in the loss reserving context using a common shock Tweedie approach for unbalanced data. We show that the solution not only provides a much better balance of the common shock proportions relative to the unbalanced data, but is also parsimonious. Finally, the common shock Tweedie model also provides distributional tractability.
Submitted 17 May, 2020; v1 submitted 7 May, 2020;
originally announced May 2020.
-
Some Applications of Lie Groups in Theory of Technical Progress
Authors:
Le Anh Vu,
Duong Quang Hoa,
Nguyen Minh Tri,
Ha Van Hieu
Abstract:
In recent decades, some interesting applications of Lie theory have emerged in the theory of technological progress. Firstly, we will discuss some results of R. Sato in \cite{rS1980} and \cite{rS1981} about the application of Lie groups to modeling in the theory of technical progress. Next, we will describe the result on the Romanian economy of G. Zaman and Z. Goschin in \cite{ZG2010}. Finally, by using Sato's results and applying the method of G. Zaman and Z. Goschin, we give an estimation of the GDP function of Viet Nam for the 1995-2018 period and give several important observations about the impact of technical progress on the economic growth of Viet Nam.
Submitted 14 March, 2020;
originally announced April 2020.
-
A multivariate evolutionary generalised linear model framework with adaptive estimation for claims reserving
Authors:
Benjamin Avanzi,
Gregory Clive Taylor,
Phuong Anh Vu,
Bernard Wong
Abstract:
In this paper, we develop a multivariate evolutionary generalised linear model (GLM) framework for claims reserving, which allows for dynamic features of claims activity in conjunction with dependency across business lines to accurately assess claims reserves. We extend the traditional GLM reserving framework on two fronts: GLM fixed factors are allowed to evolve in a recursive manner, and dependence is incorporated in the specification of these factors using a common shock approach.
We consider factors that evolve across accident years in conjunction with factors that evolve across calendar years. This two-dimensional evolution of factors is unconventional as a traditional evolutionary model typically considers the evolution in one single time dimension. This creates challenges for the estimation process, which we tackle in this paper. We develop the formulation of a particle filtering algorithm with parameter learning procedure. This is an adaptive estimation approach which updates evolving factors of the framework recursively over time.
We implement and illustrate our model with a simulated data set, as well as a set of real data from a Canadian insurer.
Submitted 15 April, 2020;
originally announced April 2020.
-
Eisenstein series whose Fourier coefficients are zeta functions of binary Hermitian forms
Authors:
Jorge Flórez,
Cihan Karabulut,
An Hoa Vu
Abstract:
In this paper we investigate a result of Ueno on the modularity of generating series associated to the zeta functions of binary Hermitian forms previously studied by Elstrodt et al. We improve his result by showing that the generating series are Eisenstein series. As a consequence we obtain an explicit formula for the special values of zeta functions associated with binary Hermitian forms.
Submitted 22 February, 2020;
originally announced February 2020.
-
Early Findings from Field Trials of Heavy-Duty Truck Connected Eco-Driving System
Authors:
Ziran Wang,
Yuan-Pu Hsu,
Alexander Vu,
Francisco Caballero,
Peng Hao,
Guoyuan Wu,
Kanok Boriboonsomsin,
Matthew J. Barth,
Aravind Kailas,
Pascal Amar,
Eddie Garmon,
Sandeep Tanugula
Abstract:
In recent years, the development of connected and automated vehicle (CAV) technology has inspired numerous advanced applications targeted at improving existing transportation systems. As one of the widely studied applications of CAV technology, connected eco-driving takes advantage of Signal Phase and Timing (SPaT) information from traffic signals to enable CAVs to approach and depart from signalized intersections in an energy-efficient manner. However, the majority of connected eco-driving studies have been numerical or microscopic traffic simulations. Only a few studies have implemented the application on real vehicles, and even fewer have focused on heavy-duty trucks. In this study, we developed a connected eco-driving system and equipped it on a heavy-duty diesel truck using cellular-based wireless communications. Field trials were conducted in the City of Carson, California, along two corridors with six connected signalized intersections capable of communicating their SPaT information. Early results showed the benefits of the system in smoothing the speed profiles of the equipped truck when approaching the connected signalized intersections.
Submitted 31 July, 2019;
originally announced August 2019.
-
Cohomology of some families of Lie algebras and quadratic Lie algebras
Authors:
Cao Tran Tu Hai,
Duong Minh Thanh,
Le Anh Vu
Abstract:
The paper studies the cohomology of Lie algebras and quadratic Lie algebras. Firstly, we propose to describe the cohomology of the $MD(n,1)$-class, which was introduced in \cite{LHNCN16}. This class contains the Heisenberg Lie algebras. In 1983, L. J. Santharoubane \cite{San83} computed the cohomology of Heisenberg Lie algebras. In this paper, we will completely describe the cohomology of the remaining algebras of the $MD(n, 1)$-class. Finally, we will be concerned with the cohomology of quadratic Lie algebras. In 1985, A. Medina and P. Revoy \cite{MR85} computed the second Betti number of the generalized real diamond Lie algebras. In this paper, we will compute the second Betti number of the generalized complex diamond Lie algebras by using the super-Poisson bracket.
Submitted 27 March, 2019;
originally announced March 2019.
-
Traffic Density Estimation using a Convolutional Neural Network
Authors:
Julian Nubert,
Nicholas Giai Truong,
Abel Lim,
Herbert Ilhan Tanujaya,
Leah Lim,
Mai Anh Vu
Abstract:
The goal of this project is to introduce and present a machine learning application that aims to improve the quality of life of people in Singapore. In particular, we investigate the use of machine learning solutions to tackle the problem of traffic congestion in Singapore. In layman's terms, we seek to make Singapore (or any other city) a smoother place. To accomplish this aim, we present an end-to-end system comprising (1) a traffic density estimation algorithm at traffic lights/junctions and (2) a suitable traffic signal control algorithm that makes use of the density information for better traffic control. Traffic density estimates can be obtained from traffic junction images using various machine learning techniques (combined with CV tools). After research into various advanced machine learning methods, we decided on convolutional neural networks (CNNs). We conducted experiments on our algorithms, using the publicly available traffic camera dataset published by the Land Transport Authority (LTA), to demonstrate the feasibility of this approach. With these traffic density estimates, different traffic algorithms can be applied to minimize congestion at traffic junctions in general.
Submitted 5 September, 2018;
originally announced September 2018.