-
LoRaLay: A Multilingual and Multimodal Dataset for Long Range and Layout-Aware Summarization
Authors:
Laura Nguyen,
Thomas Scialom,
Benjamin Piwowarski,
Jacopo Staiano
Abstract:
Text Summarization is a popular task and an active area of research for the Natural Language Processing community. By definition, it requires accounting for long input texts, a characteristic which poses computational challenges for neural models. Moreover, real-world documents come in a variety of complex, visually rich layouts. This information is of great relevance, whether to highlight salient content or to encode long-range interactions between textual passages. Yet, all publicly available summarization datasets only provide plain text content. To facilitate research on how to exploit visual/layout information to better capture long-range dependencies in summarization models, we present LoRaLay, a collection of datasets for long-range summarization with accompanying visual/layout information. We extend existing and popular English datasets (arXiv and PubMed) with layout information and propose four novel datasets -- consistently built from scholarly resources -- covering French, Spanish, Portuguese, and Korean languages. Further, we propose new baselines merging layout-aware and long-range models -- two orthogonal approaches -- and obtain state-of-the-art results, showing the importance of combining both lines of research.
Submitted 26 January, 2023;
originally announced January 2023.
-
The Seiberg-Witten equations for multiple-spinors on $4-$manifolds with definite intersection forms
Authors:
Minh Lam Nguyen
Abstract:
In this note, we present a proof of Donaldson's Diagonalization Theorem via an abelian gauge-theoretic variant of the Seiberg-Witten equations for multiple spinors. Like the other proof of Donaldson's theorem using the standard Seiberg-Witten theory, Elkies' theorem also plays a key role in our argument.
Submitted 26 January, 2023; v1 submitted 23 January, 2023;
originally announced January 2023.
-
ViHOS: Hate Speech Spans Detection for Vietnamese
Authors:
Phu Gia Hoang,
Canh Duc Luu,
Khanh Quoc Tran,
Kiet Van Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
The rise in hateful and offensive language directed at other users is one of the adverse side effects of the increased use of social networking platforms. This could make it difficult for human moderators to review tagged comments filtered by classification systems. To help address this issue, we present the ViHOS (Vietnamese Hate and Offensive Spans) dataset, the first human-annotated corpus containing 26k spans on 11k comments. We also provide definitions of hateful and offensive spans in Vietnamese comments as well as detailed annotation guidelines. Besides, we conduct experiments with various state-of-the-art models. Specifically, XLM-R$_{Large}$ achieved the best F1-scores in Single span detection and All spans detection, while PhoBERT$_{Large}$ obtained the highest in Multiple spans detection. Finally, our error analysis demonstrates the difficulties in detecting specific types of spans in our data for future research.
Disclaimer: This paper contains real comments that could be considered profane, offensive, or abusive.
Submitted 26 January, 2023; v1 submitted 24 January, 2023;
originally announced January 2023.
-
An abelian gauge-theoretic variant of the Seiberg-Witten equations for multiple-spinors
Authors:
Minh Lam Nguyen
Abstract:
We consider a variant of the Seiberg-Witten equations for multiple-spinors. The moduli space of solutions to our generalized Seiberg-Witten equations in the setting of Kähler surfaces has a direct relation with ASD connections of holomorphic vector bundles. Also in the Kähler setting, we construct a numerical invariant from the equations that detects a notion of $φ$-stability of $SU(n)$-holomorphic vector bundles, where $φ$ is some prescribed non-trivial holomorphic section.
Submitted 26 January, 2023; v1 submitted 23 January, 2023;
originally announced January 2023.
-
Refutations of pebble minimization via output languages
Authors:
Sandra Kiefer,
Lê Thành Dũng Nguyên,
Cécilia Pradic
Abstract:
Polyregular functions are the class of string-to-string functions definable by pebble transducers, an extension of finite-state automata with outputs and multiple two-way reading heads (pebbles) with a stack discipline. If a polyregular function can be computed with $k$ pebbles, then its output length is bounded by a polynomial of degree $k$ in the input length. But Bojańczyk has shown that the converse fails.
In this paper, we provide two alternative easier proofs. The first establishes by elementary means that some quadratic polyregular function requires 3 pebbles. The second proof - just as short, albeit less elementary - shows a stronger statement: for every $k$, there exists some polyregular function with quadratic growth whose output language differs from that of any $k$-fold composition of macro tree transducers (and which therefore cannot be computed by a $k$-pebble transducer). Along the way, we also refute a conjectured logical characterization of polyblind functions.
Submitted 20 June, 2023; v1 submitted 22 January, 2023;
originally announced January 2023.
-
Quenched lattice fluctuations in optically driven SrTiO3
Authors:
M. Fechner,
M. Först,
G. Orenstein,
V. Krapivin,
A. S. Disa,
M. Buzzi,
A. von Hoegen,
G. de la Pena,
Q. L Nguyen,
R. Mankowsky,
M. Sander,
H. Lemke,
Y. Deng,
M. Trigo,
A. Cavalleri
Abstract:
Many functionally relevant ferroic phenomena in quantum materials can be manipulated by driving the lattice coherently with optical and terahertz pulses. New physical phenomena and non-equilibrium phases that have no equilibrium counterpart have been discovered following these protocols. The underlying structural dynamics has been mostly studied by recording the average atomic position along dynamical structural coordinates with elastic scattering methods. However, crystal lattice fluctuations, which are known to influence phase transitions in equilibrium, are also expected to determine these dynamics but have rarely been explored. Here, we study the driven dynamics of the quantum paraelectric SrTiO3, in which mid-infrared drives have been shown to induce a metastable ferroelectric state. Crucial in these physics is the competition between the polar instability and antiferrodistortive rotations, which in equilibrium frustrate the formation of long-range ferroelectricity. We make use of high intensity mid-infrared optical pulses to resonantly drive a Ti-O stretching mode at 17 THz, and we measure the resulting change in lattice fluctuations using time-resolved x-ray diffuse scattering at a free electron laser. After a prompt increase, we observe a long-lived quench in R-point antiferrodistortive lattice fluctuations. The enhancement and reduction in lattice fluctuations are explained theoretically by considering fourth-order nonlinear phononic interactions and third-order coupling to the driven optical phonon and to lattice strain, respectively. These observations provide a number of new and testable hypotheses for the physics of light-induced ferroelectricity.
Submitted 20 January, 2023;
originally announced January 2023.
-
A phonon laser in the quantum regime
Authors:
T. Behrle,
T. L. Nguyen,
F. Reiter,
D. Baur,
B. de Neeve,
M. Stadler,
M. Marinelli,
F. Lancellotti,
S. F. Yelin,
J. P. Home
Abstract:
We demonstrate a trapped-ion system with two competing dissipation channels, implemented independently on two ion species co-trapped in a Paul trap. By controlling coherent spin-oscillator couplings and optical pumping rates we explore the phase diagram of this system, which exhibits a regime analogous to that of a (phonon) laser but operates close to the quantum ground state with an average phonon number of $\bar{n}<10$. We demonstrate phase locking of the oscillator to an additional resonant drive, and also observe the phase diffusion of the resulting state under dissipation by reconstructing the quantum state from a measurement of the characteristic function.
Submitted 19 January, 2023;
originally announced January 2023.
-
2 inch Molecular Organic Glass Scintillator for Neutron-Gamma Discrimination
Authors:
Martyna Grodzicka-Kobylka,
Tomasz Szczesniak,
Marek Moszyński,
Lukasz Swiderski,
Kamil Brylew,
Patrick L. Feng,
Lucas Q. Nguyen,
Joey S. Carlson,
Jose J. Valiente-Dobón,
Jan Trzuskowski,
Agnieszka Misiarz,
Łukasz Talarek,
Paweł Zając
Abstract:
In this manuscript we report on the scintillation properties and pulse shape discrimination (PSD) performance of a new organic glass scintillator. Two cylindrical samples with dimensions of 2x2 inches were tested. Additionally, these two samples were used in a stacked configuration in order to measure the PSD characteristics of a sample with a size of 2x4 inches. The study covers the measurements of neutron/gamma discrimination capability, emission spectra, photoelectron yield and analysis of the light pulse shapes originating from events related to gamma-rays and fast neutrons. The results were compared to data recorded previously using an EJ-276 plastic scintillator, an EJ-309 liquid scintillator and a stilbene single crystal.
Submitted 17 January, 2023;
originally announced January 2023.
-
Variational Bayes Inference for Data Detection in Cell-Free Massive MIMO
Authors:
Ly V. Nguyen,
Hien Quoc Ngo,
Le-Nam Tran,
A. Lee Swindlehurst,
Duy H. N. Nguyen
Abstract:
Cell-free massive MIMO is a promising technology for beyond-5G networks. Through the deployment of many cooperating access points (APs), the technology can significantly enhance user coverage and spectral efficiency compared to traditional cellular systems. Since the APs are distributed over a large area, the level of favorable propagation in cell-free massive MIMO is less than the one in colocated massive MIMO. As a result, the current linear processing schemes are not close to the optimal ones when the number of AP antennas is not very large. The aim of this paper is to develop nonlinear variational Bayes (VB) methods for data detection in cell-free massive MIMO systems. Contrary to existing work in the literature, which only attained point estimates of the transmit data symbols, the proposed methods aim to obtain the posterior distribution and the Bayes estimate of the data symbols. We develop the VB methods according to the levels of cooperation among the APs. Simulation results show significant performance advantages of the developed VB methods over the linear processing techniques.
Submitted 10 January, 2023;
originally announced January 2023.
-
Integrating Semantic Information into Sketchy Reading Module of Retro-Reader for Vietnamese Machine Reading Comprehension
Authors:
Hang Thi-Thu Le,
Viet-Duc Ho,
Duc-Vu Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
Machine Reading Comprehension has become one of the most advanced and popular research topics in the fields of Natural Language Processing in recent years. The classification of answerability questions is a relatively significant sub-task in machine reading comprehension; however, there haven't been many studies. Retro-Reader is one of the studies that has solved this problem effectively. However, the encoders of most traditional machine reading comprehension models in general and Retro-Reader, in particular, have not been able to exploit the contextual semantic information of the context completely. Inspired by SemBERT, we use semantic role labels from the SRL task to add semantics to pre-trained language models such as mBERT, XLM-R, PhoBERT. This experiment was conducted to compare the influence of semantics on the classification of answerability for the Vietnamese machine reading comprehension. Additionally, we hope this experiment will enhance the encoder for the Retro-Reader model's Sketchy Reading Module. The improved Retro-Reader model's encoder with semantics was first applied to the Vietnamese Machine Reading Comprehension task and obtained positive results.
Submitted 1 January, 2023;
originally announced January 2023.
-
Leveraging Semantic Representations Combined with Contextual Word Representations for Recognizing Textual Entailment in Vietnamese
Authors:
Quoc-Loc Duong,
Duc-Vu Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
Recognizing Textual Entailment (RTE) is a significant problem with a reasonably active research community. The proposed approaches to this problem are quite diverse and follow many different directions. For Vietnamese, the RTE problem is relatively new, but it plays a vital role in natural language understanding systems. Currently, methods to solve this problem based on contextual word representation learning models have given outstanding results. However, Vietnamese is a semantically rich language. Therefore, in this paper, we present an experiment combining semantic word representation through the SRL task with the context representation of BERT-related models for the RTE problem. The experimental results give conclusions about the influence and role of semantic representation on Vietnamese in understanding natural language. The experimental results show that the semantic-aware contextual representation model has about 1% higher performance than the model that does not incorporate semantic representation. In addition, the effects on the data domain in Vietnamese are also higher than those in English. This result also shows the positive influence of SRL on the RTE problem in Vietnamese.
Submitted 1 January, 2023;
originally announced January 2023.
-
Is word segmentation necessary for Vietnamese sentiment classification?
Authors:
Duc-Vu Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
To the best of our knowledge, this paper made the first attempt to answer whether word segmentation is necessary for Vietnamese sentiment classification. To do this, we presented five pre-trained monolingual S4-based language models for Vietnamese, including one model without word segmentation, and four models using RDRsegmenter, uitnlp, pyvi, or underthesea toolkits in the data pre-processing phase. According to comprehensive experimental results on two corpora, including the VLSP2016-SA corpus of technical article reviews from the news and social media and the UIT-VSFC corpus of the educational survey, we have two suggestions. Firstly, using traditional classifiers like Naive Bayes or Support Vector Machines, word segmentation may not be necessary for the Vietnamese sentiment classification corpus, which comes from the social domain. Secondly, word segmentation is necessary for Vietnamese sentiment classification when word segmentation is used before using the BPE method and feeding into the deep learning model. In this way, the RDRsegmenter is the stable toolkit for word segmentation among the uitnlp, pyvi, and underthesea toolkits.
Submitted 1 January, 2023;
originally announced January 2023.
-
Attentive Deep Neural Networks for Legal Document Retrieval
Authors:
Ha-Thanh Nguyen,
Manh-Kien Phi,
Xuan-Bach Ngo,
Vu Tran,
Le-Minh Nguyen,
Minh-Phuong Tu
Abstract:
Legal text retrieval serves as a key component in a wide range of legal text processing tasks such as legal question answering, legal case entailment, and statute law retrieval. The performance of legal text retrieval depends, to a large extent, on the representation of text, both query and legal documents. Based on good representations, a legal text retrieval model can effectively match the query to its relevant documents. Because legal documents often contain long articles and only some parts are relevant to queries, it is quite a challenge for existing models to represent such documents. In this paper, we study the use of attentive neural network-based text representation for statute law document retrieval. We propose a general approach using deep neural networks with attention mechanisms. Based on it, we develop two hierarchical architectures with sparse attention to represent long sentences and articles, and we name them Attentive CNN and Paraformer. The methods are evaluated on datasets of different sizes and characteristics in English, Japanese, and Vietnamese. Experimental results show that: i) Attentive neural methods substantially outperform non-neural methods in terms of retrieval performance across datasets and languages; ii) Pretrained transformer-based models achieve better accuracy on small datasets at the cost of high computational complexity while lighter weight Attentive CNN achieves better accuracy on large datasets; and iii) Our proposed Paraformer outperforms state-of-the-art methods on COLIEE dataset, achieving the highest recall and F2 scores in the top-N retrieval task.
Submitted 12 December, 2022;
originally announced December 2022.
-
The average connectivity matrix of a graph
Authors:
Linh Nguyen,
Suil O
Abstract:
For a graph $G$ and for two distinct vertices $u$ and $v$, let $κ(u,v)$ be the maximum number of vertex-disjoint paths joining $u$ and $v$ in $G$. The average connectivity matrix of an $n$-vertex connected graph $G$, written $A_{\barκ}(G)$, is the $n\times n$ matrix whose $(u,v)$-entry is $κ(u,v)/{n \choose 2}$; let $ρ(A_{\barκ}(G))$ be the spectral radius of $A_{\barκ}(G)$. In this paper, we investigate some spectral properties of the matrix. In particular, we prove that for any $n$-vertex connected graph $G$, we have $ρ(A_{\barκ}(G)) \le \frac{4α'(G)}n$, which implies a result of Kim and O stating that for any connected graph $G$, we have $\barκ(G) \le 2 α'(G)$, where $\barκ(G)=\sum_{u,v \in V(G)}\frac{κ(u,v)}{n\choose 2}$ and $α'(G)$ is the maximum size of a matching in $G$; equality holds only when $G$ is a complete graph with an odd number of vertices. Also, for bipartite graphs, we improve the bound, namely $ρ(A_{\barκ}(G)) \le \frac{(n-α'(G))(4α'(G) - 2)}{n(n-1)}$, and equality in the bound holds only when $G$ is a complete balanced bipartite graph.
Submitted 28 December, 2022;
originally announced December 2022.
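The step from the spectral bound to the Kim and O bound in the abstract above can be sketched with a Rayleigh-quotient argument. This is only an illustrative derivation under two assumptions that the abstract does not state explicitly: the diagonal entries of $A_{\barκ}(G)$ are zero, and the sum defining $\barκ(G)$ runs over unordered pairs of distinct vertices. The paper's own proof may proceed differently.

```latex
% Assumptions (not stated in the abstract): A = A_{\bar\kappa}(G) has zero
% diagonal, and \bar\kappa(G) sums over unordered pairs {u,v} with u != v.
% A is real, symmetric and nonnegative, so \rho(A) is its largest eigenvalue
% and dominates the Rayleigh quotient of the all-ones vector \mathbf{1}.
\[
\rho(A) \;\ge\; \frac{\mathbf{1}^{\top} A\,\mathbf{1}}{\mathbf{1}^{\top}\mathbf{1}}
\;=\; \frac{1}{n}\sum_{u \ne v}\frac{\kappa(u,v)}{\binom{n}{2}}
\;=\; \frac{2\,\bar\kappa(G)}{n}.
\]
% Combining with the upper bound quoted in the abstract:
\[
\frac{2\,\bar\kappa(G)}{n} \;\le\; \rho(A) \;\le\; \frac{4\,\alpha'(G)}{n}
\quad\Longrightarrow\quad
\bar\kappa(G) \;\le\; 2\,\alpha'(G).
\]
```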
-
Convexification Numerical Method for a Coefficient Inverse Problem for the Riemannian Radiative Transfer Equation
Authors:
Michael V. Klibanov,
Jingzhi Li,
Loc H. Nguyen,
Vladimir G. Romanov,
Zhipeng Yang
Abstract:
The first globally convergent numerical method for a Coefficient Inverse Problem (CIP) for the Riemannian Radiative Transfer Equation (RRTE) is constructed. This is a version of the so-called "convexification" method, which has been pursued by this research group for a number of years for some other CIPs for PDEs. Those PDEs are significantly different from the RRTE. The presence of the Carleman Weight Function (CWF) in the numerical scheme is the key element which ensures the global convergence. Convergence analysis is presented along with the results of numerical experiments, which confirm the theory. RRTE governs the propagation of photons in the diffuse medium in the case when they propagate along geodesic lines between their collisions. Geodesic lines are generated by the spatially variable dielectric constant of the medium.
Submitted 12 April, 2023; v1 submitted 23 December, 2022;
originally announced December 2022.
-
BDSP: A Fair Blockchain-enabled Framework for Privacy-Enhanced Enterprise Data Sharing
Authors:
Lam Duc Nguyen,
James Hoang,
Qin Wang,
Qinghua Lu,
Sherry Xu,
Shiping Chen
Abstract:
Across industries, there is an ever-increasing rate of data sharing for collaboration and innovation between organizations and their customers, partners, suppliers, and internal teams. However, many enterprises are restricted from freely sharing data due to regulatory restrictions across different regions, performance issues in moving large volumes of data, or requirements to maintain autonomy. In such situations, the enterprise can benefit from the concept of federated learning, in which machine learning models are constructed at various geographic sites. In this paper, we introduce a general framework, namely BDSP, to share data among enterprises based on Blockchain and federated learning techniques. Specifically, we propose a transparency contribution accounting mechanism to estimate the valuation of data and implement a proof-of-concept for further evaluation. The extensive experimental results show that the proposed BDSP achieves competitive performance, with training accuracy more than 5% higher and communication overhead about 3 times lower than baseline approaches.
Submitted 16 December, 2022;
originally announced December 2022.
-
Law to Binary Tree -- An Formal Interpretation of Legal Natural Language
Authors:
Ha-Thanh Nguyen,
Vu Tran,
Ngoc-Cam Le,
Thi-Thuy Le,
Quang-Huy Nguyen,
Le-Minh Nguyen,
Ken Satoh
Abstract:
Knowledge representation and reasoning in law are essential to facilitate the automation of legal analysis and decision-making tasks. In this paper, we propose a new approach based on legal science, specifically legal taxonomy, for representing and reasoning with legal documents. Our approach interprets the regulations in legal documents as binary trees, which facilitates legal reasoning systems to make decisions and resolve logical contradictions. The advantages of this approach are twofold. First, legal reasoning can be performed on the basis of the binary tree representation of the regulations. Second, the binary tree representation of the regulations is more understandable than the existing sentence-based representations. We provide an example of how our approach can be used to interpret the regulations in a legal document.
△ Less
Submitted 16 December, 2022;
originally announced December 2022.
-
Improving Depression estimation from facial videos with face alignment, training optimization and scheduling
Authors:
Manuel Lage Cañellas,
Constantino Álvarez Casado,
Le Nguyen,
Miguel Bordallo López
Abstract:
Deep learning models have shown promising results in recognizing depressive states using video-based facial expressions. While successful models typically leverage 3D-CNNs or video distillation techniques, the different use of pretraining, data augmentation, preprocessing, and optimization techniques across experiments makes it difficult to make fair architectural comparisons. We propose instead to enhance two simple models based on ResNet-50 that use only static spatial information by using two specific face alignment methods and improved data augmentation, optimization, and scheduling techniques. Our extensive experiments on benchmark datasets obtain similar results to sophisticated spatio-temporal models for single streams, while the score-level fusion of two different streams outperforms state-of-the-art methods. Our findings suggest that specific modifications in the preprocessing and training process result in noticeable differences in the performance of the models and could mask the effect originally attributed to the use of different neural network architectures.
Submitted 13 December, 2022;
originally announced December 2022.
-
Generalizing DP-SGD with Shuffling and Batch Clipping
Authors:
Marten van Dijk,
Phuong Ha Nguyen,
Toan N. Nguyen,
Lam M. Nguyen
Abstract:
Classical differential private DP-SGD implements individual clipping with random subsampling, which forces a mini-batch SGD approach. We provide a general differential private algorithmic framework that goes beyond DP-SGD and allows any possible first order optimizers (e.g., classical SGD and momentum based SGD approaches) in combination with batch clipping, which clips an aggregate of computed gradients rather than summing clipped gradients (as is done in individual clipping). The framework also admits sampling techniques beyond random subsampling such as shuffling. Our DP analysis follows the $f$-DP approach and introduces a new proof technique which allows us to derive simple closed form expressions and to also analyse group privacy. In particular, for $E$ epochs work and groups of size $g$, we show a $\sqrt{g E}$ DP dependency for batch clipping with shuffling.
Submitted 25 July, 2023; v1 submitted 12 December, 2022;
originally announced December 2022.
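The contrast the abstract above draws between individual clipping (sum of clipped gradients) and batch clipping (clip of an aggregate) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names are hypothetical, the aggregate is taken to be a plain sum, clipping is L2-norm rescaling to a bound `c`, and the Gaussian noise that a real DP mechanism adds afterwards is deliberately left out.

```python
import math
from typing import List

Vector = List[float]


def l2_norm(v: Vector) -> float:
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def clip(v: Vector, c: float) -> Vector:
    """Rescale v so that its L2 norm is at most c (unchanged if already within c)."""
    norm = l2_norm(v)
    scale = min(1.0, c / norm) if norm > 0 else 1.0
    return [x * scale for x in v]


def vector_sum(vectors: List[Vector]) -> Vector:
    """Coordinate-wise sum of equally sized vectors."""
    return [sum(column) for column in zip(*vectors)]


def individual_clipping(per_example_grads: List[Vector], c: float) -> Vector:
    """Clip every per-example gradient first, then sum the clipped gradients."""
    return vector_sum([clip(g, c) for g in per_example_grads])


def batch_clipping(per_example_grads: List[Vector], c: float) -> Vector:
    """Aggregate (here: sum, an assumption) the gradients first, then clip once."""
    return clip(vector_sum(per_example_grads), c)


if __name__ == "__main__":
    grads = [[3.0, 4.0], [0.0, 1.0]]  # hypothetical per-example gradients
    print(individual_clipping(grads, c=1.0))
    print(batch_clipping(grads, c=1.0))
```

With individual clipping the norm of the result can grow with the number of examples (up to the batch size times `c`), whereas with batch clipping it is bounded by `c` regardless of batch size.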
-
Variational Bayes for Joint Channel Estimation and Data Detection in Few-Bit Massive MIMO Systems
Authors:
Ly V. Nguyen,
A. Lee Swindlehurst,
Duy H. N. Nguyen
Abstract:
Massive multiple-input multiple-output (MIMO) communications using low-resolution analog-to-digital converters (ADCs) is a promising technology for providing high spectral and energy efficiency with affordable hardware cost and power consumption. However, the use of low-resolution ADCs requires special signal processing methods for channel estimation and data detection since the resulting system is severely non-linear. This paper proposes joint channel estimation and data detection methods for massive MIMO systems with low-resolution ADCs based on the variational Bayes (VB) inference framework. We first derive matched-filter quantized VB (MF-QVB) and linear minimum mean-squared error quantized VB (LMMSE-QVB) detection methods assuming the channel state information (CSI) is available. Then we extend these methods to the joint channel estimation and data detection (JED) problem and propose two methods we refer to as MF-QVB-JED and LMMSE-QVB-JED. Unlike conventional VB-based detection methods that assume knowledge of the second-order statistics of the additive noise, we propose to float the noise variance/covariance matrix as an unknown random variable that is used to account for both the noise and the residual inter-user interference. We also present practical aspects of the QVB framework to improve its implementation stability. Finally, we show via numerical results that the proposed VB-based methods provide robust performance and also significantly outperform existing methods.
Submitted 3 December, 2022;
originally announced December 2022.
-
An Empirical Study on Snapshot DAOs
Authors:
Qin Wang,
Guangsheng Yu,
Yilin Sai,
Caijun Sun,
Lam Duc Nguyen,
Sherry Xu,
Shiping Chen
Abstract:
Decentralized Autonomous Organization (DAO) is an organization constructed by automatically executed rules such as via smart contracts, holding features of the permissionless committee, transparent proposals, and fair contribution by stakeholders. As of Nov 2022, DAO has impacted over \$11.2B market caps. However, there are no substantial studies focused on this emerging field. To fill the gap, we start from the ground truth by empirically studying the breadth and depth of the DAO markets in mainstream public chain ecosystems in this paper. We dive into the most widely adoptable DAO launchpad, \textit{Snapshot}, which covers 95\% in the wild DAO projects for data collection and analysis. By integrating extensive enrolled DAOs and corresponding data measurements, we explore statistical data from Snapshot and try to demystify its undiscovered truths by delivering a series of summarised insights. We also present DAO status, patterns, distribution, and trends. To our knowledge, this is the first empirical study putting concentration on DAO spaces.
Submitted 19 May, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
FedDCT: Federated Learning of Large Convolutional Neural Networks on Resource Constrained Devices using Divide and Collaborative Training
Authors:
Quan Nguyen,
Hieu H. Pham,
Kok-Seng Wong,
Phi Le Nguyen,
Truong Thao Nguyen,
Minh N. Do
Abstract:
We introduce FedDCT, a novel distributed learning paradigm that enables the usage of large, high-performance CNNs on resource-limited edge devices. As opposed to traditional FL approaches, which require each client to train the full-size neural network independently during each training round, the proposed FedDCT allows a cluster of several clients to collaboratively train a large deep learning model by dividing it into an ensemble of several small sub-models and train them on multiple devices in parallel while maintaining privacy. In this collaborative training process, clients from the same cluster can also learn from each other, further improving their ensemble performance. In the aggregation stage, the server takes a weighted average of all the ensemble models trained by all the clusters. FedDCT reduces the memory requirements and allows low-end devices to participate in FL. We empirically conduct extensive experiments on standardized datasets, including CIFAR-10, CIFAR-100, and two real-world medical datasets HAM10000 and VAIPE. Experimental results show that FedDCT outperforms a set of current SOTA FL methods with interesting convergence behaviors. Furthermore, compared to other existing approaches, FedDCT achieves higher accuracy and substantially reduces the number of communication rounds (with $4-8$ times fewer memory requirements) to achieve the desired accuracy on the testing dataset without incurring any extra training cost on the server side.
Submitted 18 September, 2023; v1 submitted 20 November, 2022;
originally announced November 2022.
-
Programmable Heisenberg interactions between Floquet qubits
Authors:
Long B. Nguyen,
Yosep Kim,
Akel Hashim,
Noah Goss,
Brian Marinelli,
Bibek Bhandari,
Debmalya Das,
Ravi K. Naik,
John Mark Kreikebaum,
Andrew N. Jordan,
David I. Santiago,
Irfan Siddiqi
Abstract:
The fundamental trade-off between robustness and tunability is a central challenge in the pursuit of quantum simulation and fault-tolerant quantum computation. In particular, many emerging quantum architectures are designed to achieve high coherence at the expense of having fixed spectra and consequently limited types of controllable interactions. Here, by adiabatically transforming fixed-frequency superconducting circuits into modifiable Floquet qubits, we demonstrate an XXZ Heisenberg interaction with fully adjustable anisotropy. This interaction model is on one hand the basis for many-body quantum simulation of spin systems, and on the other hand the primitive for an expressive quantum gate set. To illustrate the robustness and versatility of our Floquet protocol, we tailor the Heisenberg Hamiltonian and implement two-qubit iSWAP, CZ, and SWAP gates with estimated fidelities of 99.32(3)%, 99.72(2)%, and 98.93(5)%, respectively. In addition, we implement a Heisenberg interaction between higher energy levels and employ it to construct a three-qubit CCZ gate with a fidelity of 96.18(5)%. Importantly, the protocol is applicable to various fixed-frequency high-coherence platforms, thereby unlocking a suite of essential interactions for high-performance quantum information processing. From a broader perspective, our work provides compelling avenues for future exploration of quantum electrodynamics and optimal control using the Floquet framework.
Submitted 18 November, 2022;
originally announced November 2022.
-
Evidence for neutrino emission from the nearby active galaxy NGC 1068
Authors:
IceCube Collaboration,
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
J. M. Alameddine,
C. Alispach,
A. A. Alves Jr.,
N. M. Amin,
K. Andeen,
T. Anderson,
G. Anton,
C. Argüelles,
Y. Ashida,
S. Axani,
X. Bai,
A. Balagopal V.,
A. Barbano,
S. W. Barwick,
B. Bastian,
V. Basu,
S. Baur,
R. Bay
, et al. (361 additional authors not shown)
Abstract:
We report three searches for high energy neutrino emission from astrophysical objects using data recorded with IceCube between 2011 and 2020. Improvements over previous work include new neutrino reconstruction and data calibration methods. In one search, the positions of 110 a priori selected gamma-ray sources were analyzed individually for a possible surplus of neutrinos over atmospheric and cosmic background expectations. We found an excess of $79_{-20}^{+22}$ neutrinos associated with the nearby active galaxy NGC 1068 at a significance of 4.2$\,σ$. The excess, which is spatially consistent with the direction of the strongest clustering of neutrinos in the Northern Sky, is interpreted as direct evidence of TeV neutrino emission from a nearby active galaxy. The inferred flux exceeds the potential TeV gamma-ray flux by at least one order of magnitude.
Submitted 8 February, 2024; v1 submitted 17 November, 2022;
originally announced November 2022.
-
A Comparative Study of Question Answering over Knowledge Bases
Authors:
Khiem Vinh Tran,
Hao Phu Phan,
Khang Nguyen Duc Quach,
Ngan Luu-Thuy Nguyen,
Jun Jo,
Thanh Tam Nguyen
Abstract:
Question answering over knowledge bases (KBQA) has become a popular approach to help users extract information from knowledge bases. Although several systems exist, choosing one suitable for a particular application scenario is difficult. In this article, we provide a comparative study of six representative KBQA systems on eight benchmark datasets. In that, we study various question types, properties, languages, and domains to provide insights on where existing systems struggle. On top of that, we propose an advanced mapping algorithm to aid existing models in achieving superior results. Moreover, we also develop a multilingual corpus COVID-KGQA, which encourages COVID-19 research and multilingualism for the diversity of future AI. Finally, we discuss the key findings and their implications as well as performance guidelines and some future improvements. Our source code is available at \url{https://github.com/tamlhp/kbqa}.
Submitted 15 November, 2022;
originally announced November 2022.
-
Efficient Integration of Multi-Order Dynamics and Internal Dynamics in Stock Movement Prediction
Authors:
Thanh Trung Huynh,
Minh Hieu Nguyen,
Thanh Tam Nguyen,
Phi Le Nguyen,
Matthias Weidlich,
Quoc Viet Hung Nguyen,
Karl Aberer
Abstract:
Advances in deep neural network (DNN) architectures have enabled new prediction techniques for stock market data. Unlike other multivariate time-series data, stock markets show two unique characteristics: (i) \emph{multi-order dynamics}, as stock prices are affected by strong non-pairwise correlations (e.g., within the same industry); and (ii) \emph{internal dynamics}, as each individual stock shows some particular behaviour. Recent DNN-based methods capture multi-order dynamics using hypergraphs, but rely on the Fourier basis in the convolution, which is both inefficient and ineffective. In addition, they largely ignore internal dynamics by adopting the same model for each stock, which implies a severe information loss.
In this paper, we propose a framework for stock movement prediction to overcome the above issues. Specifically, the framework includes temporal generative filters that implement a memory-based mechanism onto an LSTM network in an attempt to learn individual patterns per stock. Moreover, we employ hypergraph attentions to capture the non-pairwise correlations. Here, using the wavelet basis instead of the Fourier basis, enables us to simplify the message passing and focus on the localized convolution. Experiments with US market data over six years show that our framework outperforms state-of-the-art methods in terms of profit and stability. Our source code and data are available at \url{https://github.com/thanhtrunghuynh93/estimate}.
Submitted 24 November, 2022; v1 submitted 10 November, 2022;
originally announced November 2022.
-
The Impact of Inter-grain Phases on the Ionic Conductivity of LAGP Solid Electrolyte Prepared by Spark Plasma Sintering
Authors:
Sorina Cretu,
David G. Bradley,
Omer Ulas Kudu,
Li Patrick Wen Feng,
Linh Lan Nguyen,
Tuan Tu Nguyen,
Arash Jamali,
Jean-Noel Chotard,
Vincent Seznec,
John V. Hanna,
Arnaud Demortière,
Martial Duchamp
Abstract:
Li1.5Al0.5Ge1.5(PO4)3 (LAGP) is a promising oxide solid electrolyte for all-solid-state batteries due to its excellent air stability, wide electrochemical stability window and cost-effective precursor materials. However, further improvement in their ionic conductivity performance is hindered by the presence of inter-grain phases leading to a major obstacle to the advanced design of oxide based solid-state electrolytes. This study establishes and quantifies the influence of inter-grain phases, their 3D morphology, and formed compositions on the overall ion conductivity properties of LAGP pellets fabricated under different Spark plasma sintering conditions. Based on complementary techniques, such as PEIS, XRD, 3D FIB-SEM tomography and solid-state MAS NMR coupled with DFT modelling, a deep insight into the inter-grain phase microstructures is obtained revealing that the inter-grain region is comprised of Li4P2O7 and a disordered Li9Al3(P2O7)3(PO4)2 phase. We demonstrate that optimal ionic conductivity for the LAGP system is achieved for the 680 °C SPS preparation when the disordered Li9Al3(P2O7)3(PO4)2 phase dominates the inter-grain region composition with reduced contributions from the highly ordered Li4P2O7 phases.
Submitted 11 November, 2022;
originally announced November 2022.
-
A Solution for a Fundamental Problem of 3D Inference based on 2D Representations
Authors:
Thien An L. Nguyen
Abstract:
3D inference from monocular vision using neural networks is an important research area of computer vision. The area has diverse applications, and many proposed solutions have shown remarkable performance. Although many efforts have been invested, there are still unanswered questions, some of which are fundamental. In this paper, I discuss a problem that I hope will come to be known as a generalization of the Blind Perspective-n-Point (Blind PnP) problem for object-driven 3D inference based on 2D representations. The vital difference between the fundamental problem and the Blind PnP problem is that 3D inference parameters in the fundamental problem are attached directly to 3D points and the camera concept will be represented through the sharing of the parameters of these points. By providing an explainable and robust gradient-descent solution based on 2D representations for an important special case of the problem, the paper opens up a new approach for using available information-based learning methods to solve problems related to 3D object pose estimation from 2D images.
Submitted 9 November, 2022;
originally announced November 2022.
-
Depth-based Sampling and Steering Constraints for Memoryless Local Planners
Authors:
Thai Binh Nguyen,
Linh Nguyen,
Tanveer Choudhury,
Kathleen Keogh,
Manzur Murshed
Abstract:
By utilizing only depth information, the paper introduces a novel but efficient local planning approach that enhances not only computational efficiency but also planning performances for memoryless local planners. The sampling is first proposed to be based on the depth data which can identify and eliminate a specific type of in-collision trajectories in the sampled motion primitive library. More specifically, all the obscured primitives' endpoints are found through querying the depth values and excluded from the sampled set, which can significantly reduce the computational workload required in collision checking. On the other hand, we furthermore propose a steering mechanism also based on the depth information to effectively prevent an autonomous vehicle from getting stuck when facing a large convex obstacle, providing a higher level of autonomy for a planning system. Our steering technique is theoretically proved to be complete in scenarios of convex obstacles. To evaluate effectiveness of the proposed DEpth based both Sampling and Steering (DESS) methods, we implemented them in the synthetic environments where a quadrotor was simulated flying through a cluttered region with multiple size-different obstacles. The obtained results demonstrate that the proposed approach can considerably decrease computing time in local planners, where more trajectories can be evaluated while the best path with much lower cost can be found. More importantly, the success rates calculated by the fact that the robot successfully navigated to the destinations in different testing scenarios are always higher than 99.6% on average.
Submitted 5 November, 2022;
originally announced November 2022.
-
1-D Convolutional Graph Convolutional Networks for Fault Detection in Distributed Energy Systems
Authors:
Bang L. H. Nguyen,
Tuyen Vu,
Thai-Thanh Nguyen,
Mayank Panwar,
Rob Hovsapian
Abstract:
This paper presents a 1-D convolutional graph neural network for fault detection in microgrids. The combination of 1-D convolutional neural networks (1D-CNN) and graph convolutional networks (GCN) helps extract both spatial and temporal correlations from the voltage measurements in microgrids. The fault detection scheme includes fault event detection, fault type and phase classification, and fault location. Five neural network models are trained to handle these tasks. Transfer learning and fine-tuning are applied to reduce training efforts. The combined recurrent graph convolutional neural networks (1D-CGCN) is compared with the traditional ANN structure on the Potsdam 13-bus microgrid dataset. The achieved accuracies are 99.27%, 98.1%, 98.75%, and 95.6% for fault detection, fault type classification, fault phase identification, and fault location, respectively.
Submitted 5 November, 2022;
originally announced November 2022.
-
Hierarchical Control of Grid-Connected Hydrogen Electrolyzer Providing Grid Services
Authors:
Bang L. H. Nguyen,
Mayank Panwar,
Rob Hovsapian,
Yashodhan Agalgaokar,
Tuyen Vu
Abstract:
This paper presents the operation modes and control architecture of the grid-connected hydrogen electrolyzer systems for the provision of frequency and voltage supports. The analysis is focused on the primary and secondary loops in the hierarchical control scheme. At the power converter inner control loop, the voltage- and current-control modes are analyzed. At the primary level, the droop and opposite droop control strategies to provide voltage and frequency support are described. Coordination between primary control and secondary, tertiary reserves is discussed. The case studies and real-time simulation results are provided using Typhoon HIL to back the theoretical investigation.
Submitted 5 November, 2022;
originally announced November 2022.
-
A Large-Scale Study of a Sleep Tracking and Improving Device with Closed-loop and Personalized Real-time Acoustic Stimulation
Authors:
Anh Nguyen,
Galen Pogoncheff,
Ban Xuan Dong,
Nam Bui,
Hoang Truong,
Nhat Pham,
Linh Nguyen,
Hoang Huu Nguyen,
Sy Duong-Quy,
Sangtae Ha,
Tam Vu
Abstract:
Various intervention therapies ranging from pharmaceutical to hi-tech tailored solutions have been available to treat difficulty in falling asleep commonly caused by insomnia in modern life. However, current techniques largely remain ill-suited, ineffective, and unreliable due to their lack of precise real-time sleep tracking, in-time feedback on the therapies, an ability to keep people asleep during the night, and a large-scale effectiveness evaluation. Here, we introduce a novel sleep aid system, called Earable, that can continuously sense multiple head-based physiological signals and simultaneously enable closed-loop auditory stimulation to entrain brain activities in time for effective sleep promotion. We develop the system in a lightweight, comfortable, and user-friendly headband with a comprehensive set of algorithms and dedicated own-designed audio stimuli. We conducted multiple protocols from 883 sleep studies on 377 subjects (241 women, 119 men) wearing either a gold-standard device (PSG), Earable, or both concurrently. We demonstrate that our system achieves (1) a strong correlation (0.89 +/- 0.03) between the physiological signals acquired by Earable and those from the gold-standard PSG, (2) an 87.8 +/- 5.3% agreement on sleep scoring using our automatic real-time sleep staging algorithm with the consensus scored by three sleep technicians, and (3) a successful non-pharmacological stimulation alternative to effectively shorten the duration of sleep falling by 24.1 +/- 0.1 minutes. These results show that the efficacy of Earable exceeds existing techniques in intentions to promote fast falling asleep, track sleep state accurately, and achieve high social acceptance for real-time closed-loop personalized neuromodulation-based home sleep care.
Submitted 4 November, 2022;
originally announced November 2022.
-
Miko Team: Deep Learning Approach for Legal Question Answering in ALQAC 2022
Authors:
Hieu Nguyen Van,
Dat Nguyen,
Phuong Minh Nguyen,
Minh Le Nguyen
Abstract:
We introduce efficient deep learning-based methods for legal document processing including Legal Document Retrieval and Legal Question Answering tasks in the Automated Legal Question Answering Competition (ALQAC 2022). In this competition, we achieve 1\textsuperscript{st} place in the first task and 3\textsuperscript{rd} place in the second task. Our method is based on the XLM-RoBERTa model that is pre-trained from a large amount of unlabeled corpus before fine-tuning to the specific tasks. The experimental results showed that our method works well in legal retrieval information tasks with limited labeled data. Besides, this method can be applied to other information retrieval tasks in low-resource languages.
Submitted 3 November, 2022;
originally announced November 2022.
-
Ultrafast x-ray scattering reveals composite amplitude collective mode in the Weyl charge density wave material (TaSe$_4$)$_2$I
Authors:
Quynh L. Nguyen,
Ryan A. Duncan,
Gal Orenstein,
Yijing Huang,
Viktor Krapivin,
Gilberto de la Pena,
Chance Ornelas-Skarin,
David A. Reis,
Peter Abbamonte,
Simon Bettler,
Matthieu Chollet,
Matthias C. Hoffmann,
Matthew Hurley,
Soyeun Kim,
Patrick S. Kirchmann,
Yuya Kubota,
Fahad Mahmood,
Alexander Miller,
Taito Osaka,
Kejian Qu,
Takahiro Sato,
Daniel P. Shoemaker,
Nicholas Sirica,
Sanghoon Song,
Jade Stanton
, et al. (5 additional authors not shown)
Abstract:
We report ultrafast x-ray scattering experiments of the quasi-1D charge density wave (CDW) material (TaSe$_4$)$_2$I following photoexcitation with femtosecond infrared laser pulses. From the time-dependent diffraction signal at the CDW sidebands we identify an amplitude mode derived primarily from the transverse acoustic component of the CDW static distortion. The dynamics of this acoustic amplitude mode are described well by a model of a displacive excitation, which we interpret as mediated through a coupling to the optical phonon component associated with the tetramerization of the Ta chains.
Submitted 23 December, 2022; v1 submitted 31 October, 2022;
originally announced October 2022.
-
Improved Learning-augmented Algorithms for k-means and k-medians Clustering
Authors:
Thy Nguyen,
Anamay Chaturvedi,
Huy Lê Nguyen
Abstract:
We consider the problem of clustering in the learning-augmented setting, where we are given a data set in $d$-dimensional Euclidean space, and a label for each data point given by an oracle indicating what subsets of points should be clustered together. This setting captures situations where we have access to some auxiliary information about the data set relevant for our clustering objective, for instance the labels output by a neural network. Following prior work, we assume that there are at most an $α\in (0,c)$ for some $c<1$ fraction of false positives and false negatives in each predicted cluster, in the absence of which the labels would attain the optimal clustering cost $\mathrm{OPT}$.
For a dataset of size $m$, we propose a deterministic $k$-means algorithm that produces centers with improved bound on clustering cost compared to the previous randomized algorithm while preserving the $O( d m \log m)$ runtime. Furthermore, our algorithm works even when the predictions are not very accurate, i.e. our bound holds for $α$ up to $1/2$, an improvement over $α$ being at most $1/7$ in the previous work. For the $k$-medians problem we improve upon prior work by achieving a biquadratic improvement in the dependence of the approximation factor on the accuracy parameter $α$ to get a cost of $(1+O(α))\mathrm{OPT}$, while requiring essentially just $O(md \log^3 m/α)$ runtime.
Submitted 1 March, 2023; v1 submitted 30 October, 2022;
originally announced October 2022.
-
Streaming Submodular Maximization with Differential Privacy
Authors:
Anamay Chaturvedi,
Huy Lê Nguyen,
Thy Nguyen
Abstract:
In this work, we study the problem of privately maximizing a submodular function in the streaming setting. Extensive work has been done on privately maximizing submodular functions in the general case when the function depends upon the private data of individuals. However, when the size of the data stream drawn from the domain of the objective function is large or arrives very fast, one must privately optimize the objective within the constraints of the streaming setting. We establish fundamental differentially private baselines for this problem and then derive better trade-offs between privacy and utility for the special case of decomposable submodular functions. A submodular function is decomposable when it can be written as a sum of submodular functions; this structure arises naturally when each summand function models the utility of an individual and the goal is to study the total utility of the whole population as in the well-known Combinatorial Public Projects Problem. Finally, we complement our theoretical analysis with experimental corroboration.
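As a small illustration of the decomposable structure described in the abstract, the sketch below builds a function that is a sum of per-individual submodular utilities and checks the diminishing-returns property exhaustively on a toy ground set. It is an assumption-laden example, not the paper's method: the "satisfied if at least one wanted project is chosen" utility, the project names, and all function names are hypothetical, and nothing here is private or streaming.

```python
from itertools import combinations

def make_utility(wanted):
    """One individual's utility: 1 if S contains any project they want, else 0."""
    wanted = frozenset(wanted)
    return lambda S: 1 if wanted & frozenset(S) else 0

def decomposable(summands):
    """Total utility F(S) = sum of the individual utilities f_i(S)."""
    return lambda S: sum(f(S) for f in summands)

def marginal_gain(f, S, e):
    """Gain from adding element e to set S."""
    return f(frozenset(S) | {e}) - f(frozenset(S))

def is_submodular(f, ground):
    """Exhaustively check f(A + e) - f(A) >= f(B + e) - f(B) for all A <= B, e not in B."""
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    for B in subsets:
        for A in subsets:
            if not A <= B:
                continue
            for e in ground:
                if e in B:
                    continue
                if marginal_gain(f, A, e) < marginal_gain(f, B, e):
                    return False
    return True

# Hypothetical population: each person lists the public projects they would use.
people = [{"park", "library"}, {"library"}, {"park", "pool"}]
ground = ["park", "library", "pool"]
F = decomposable([make_utility(w) for w in people])

print(F({"park"}), F({"library"}), F({"park", "library"}))
print("submodular:", is_submodular(F, ground))
```

Each summand depends only on one individual's data. The abstract identifies this structure as the special case that admits better privacy-utility trade-offs.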
Submitted 25 October, 2022;
originally announced October 2022.
-
Observation of the massive Lee-Fukuyama phason in a charge density wave insulator
Authors:
Soyeun Kim,
Yinchuan Lv,
Xiao-Qi Sun,
Chengxi Zhao,
Nina Bielinski,
Azel Murzabekova,
Kejian Qu,
Ryan A. Duncan,
Quynh L. D. Nguyen,
Mariano Trigo,
Daniel P. Shoemaker,
Barry Bradlyn,
Fahad Mahmood
Abstract:
The lowest-lying fundamental excitation of an incommensurate charge density wave (CDW) material is widely believed to be a massless phason -- a collective modulation of the phase of the CDW order parameter. However, as first pointed out by Lee and Fukuyama, long-range Coulomb interactions should push the phason energy up to the plasma energy of the CDW condensate, resulting in a massive phason and a fully gapped spectrum. Whether such behavior occurs in a CDW system has been unresolved for more than four decades. Using time-domain THz emission spectroscopy, we investigate this issue in the material (TaSe$_4$)$_2$I, a classical example of a quasi-one-dimensional CDW insulator. Upon transient photoexcitation at low temperatures, we find the material strikingly emits coherent, narrow-band THz radiation. The frequency, polarization and temperature-dependence of the emitted radiation imply the existence of a phason that acquires mass by coupling to long-range Coulomb interaction. Our observations constitute the first direct evidence of the massive "Lee-Fukuyama" phason and highlight the potential applicability of fundamental collective modes of correlated materials as compact and robust sources of THz radiation.
Submitted 25 October, 2022;
originally announced October 2022.
-
Reconstructing a space-dependent source term via the quasi-reversibility method
Authors:
Loc H. Nguyen,
Huong T. Vu
Abstract:
The aim of this paper is to solve an important inverse source problem which arises from the well-known inverse scattering problem. We propose to truncate the Fourier series of the solution to the governing equation with respect to a special basis of L2. By this, we obtain a system of linear elliptic equations. Solutions to this system are the Fourier coefficients of the solution to the governing equation. After computing these Fourier coefficients, we can directly find the desired source function. Numerical examples are presented.
Submitted 17 October, 2022;
originally announced October 2022.
-
Classification of cow diet based on milk mid infrared spectra: a data analysis competition at the "International workshop of spectroscopy and chemometrics 2022"
Authors:
Maria Frizzarin,
Giulio Visentin,
Alessandro Ferragina,
Elena Hayes,
Antonio Bevilacqua,
Bhaskar Dhariyal,
Katarina Domijan,
Hussain Khan,
Georgiana Ifrim,
Thach Le Nguyen,
Joe Meagher,
Laura Menchetti,
Ashish Singh,
Suzy Whoriskey,
Robert Williamson,
Martina Zappaterra,
Alessandro Casa
Abstract:
In April 2022, the Vistamilk SFI Research Centre organized the second edition of the "International Workshop on Spectroscopy and Chemometrics - Applications in Food and Agriculture". Within this event, a data challenge was organized among participants of the workshop. The competition aimed at developing a prediction model to discriminate dairy cows' diet based on milk spectral information collected in the mid-infrared region. In fact, the development of an accurate and reliable discriminant model for dairy cows' diet can provide important authentication tools for dairy processors to guarantee product origin for dairy food manufacturers from grass-fed animals. Different statistical and machine learning modelling approaches were employed during the workshop, with different pre-processing steps involved and different degrees of complexity. The present paper aims to describe the statistical methods adopted by participants to develop such classification models.
Submitted 10 October, 2022;
originally announced October 2022.
-
A Stochastic Differential Equation Model for Predator-Avoidance Fish Schooling
Authors:
Aditya Dewanto Hartono,
Linh Thi Hoai Nguyen,
Ton Viet Ta
Abstract:
This paper presents a system of stochastic differential equations (SDEs) as a mathematical model to describe the spatial-temporal dynamics of a predator-prey system in an artificial aquatic environment with schooling behavior imposed upon the associated prey. The proposed model follows the particle-like approach where interactions among the associated units are manifested through a combination of attractive and repulsive forces analogous to those occurring in molecular physics. Two hunting tactics of the predator are proposed and integrated into the general model, namely the center-attacking and the nearest-attacking strategy. Emphasis is placed upon demonstrating the capacity of the proposed model in: (i) discovering the predator-avoidance patterns of the schooling prey, and (ii) showing the benefit of constituting a large prey school in better escaping the predator's attack. Based on numerical simulations of the proposed model, four predator-avoidance patterns of the schooling prey are discovered, namely Split and Reunion, Split and Separate into Two Groups, Scattered, and Maintain Formation and Distance. The proposed model also successfully demonstrates the benefit of constituting a large group of schooling prey in mitigating predation risk. Such findings are in agreement with real-life observations of the natural aquatic ecosystem, hence confirming the validity and exactitude of the proposed model.
Submitted 8 October, 2022;
originally announced October 2022.
-
Multi-stream Fusion for Class Incremental Learning in Pill Image Classification
Authors:
Trong-Tung Nguyen,
Hieu H. Pham,
Phi Le Nguyen,
Thanh Hung Nguyen,
Minh Do
Abstract:
Classifying pill categories from real-world images is crucial for various smart healthcare applications. Although existing approaches in image classification might achieve a good performance on fixed pill categories, they fail to handle novel instances of pill categories that are frequently presented to the learning algorithm. To this end, a trivial solution is to train the model with novel classes. However, this may result in a phenomenon known as catastrophic forgetting, in which the system forgets what it learned in previous classes. In this paper, we address this challenge by introducing the class incremental learning (CIL) ability to traditional pill image classification systems. Specifically, we propose a novel incremental multi-stream intermediate fusion framework enabling incorporation of an additional guidance information stream that best matches the domain of the problem into various state-of-the-art CIL methods. From this framework, we consider color-specific information of pill images as a guidance stream and devise an approach, namely "Color Guidance with Multi-stream intermediate fusion"(CG-IMIF) for solving CIL pill image classification task. We conduct comprehensive experiments on real-world incremental pill image classification dataset, namely VAIPE-PCIL, and find that the CG-IMIF consistently outperforms several state-of-the-art methods by a large margin in different task settings. Our code, data, and trained model are available at https://github.com/vinuni-vishc/CG-IMIF.
Submitted 5 October, 2022;
originally announced October 2022.
-
High Probability Convergence for Accelerated Stochastic Mirror Descent
Authors:
Alina Ene,
Huy L. Nguyen
Abstract:
In this work, we describe a generic approach to show convergence with high probability for stochastic convex optimization. In previous works, either the convergence is only in expectation or the bound depends on the diameter of the domain. Instead, we show high probability convergence with bounds depending on the initial distance to the optimal solution as opposed to the domain diameter. The algorithms use step sizes analogous to the standard settings and are universal to Lipschitz functions, smooth functions, and their linear combinations.
Submitted 2 October, 2022;
originally announced October 2022.
-
Fast and Robust Video-Based Exercise Classification via Body Pose Tracking and Scalable Multivariate Time Series Classifiers
Authors:
Ashish Singh,
Antonio Bevilacqua,
Thach Le Nguyen,
Feiyan Hu,
Kevin McGuinness,
Martin OReilly,
Darragh Whelan,
Brian Caulfield,
Georgiana Ifrim
Abstract:
Technological advancements have spurred the usage of machine learning based applications in sports science. Physiotherapists, sports coaches and athletes actively look to incorporate the latest technologies in order to further improve performance and avoid injuries. While wearable sensors are very popular, their use is hindered by constraints on battery power and sensor calibration, especially for use cases which require multiple sensors to be placed on the body. Hence, there is renewed interest in video-based data capture and analysis for sports science. In this paper, we present the application of classifying S&C exercises using video. We focus on the popular Military Press exercise, where the execution is captured with a video-camera using a mobile device, such as a mobile phone, and the goal is to classify the execution into different types. Since video recordings need a lot of storage and computation, this use case requires data reduction, while preserving the classification accuracy and enabling fast prediction. To this end, we propose an approach named BodyMTS to turn video into time series by employing body pose tracking, followed by training and prediction using multivariate time series classifiers. We analyze the accuracy and robustness of BodyMTS and show that it is robust to different types of noise caused by either video quality or pose estimation factors. We compare BodyMTS to state-of-the-art deep learning methods which classify human activity directly from videos and show that BodyMTS achieves similar accuracy, but with reduced running time and model engineering effort. Finally, we discuss some of the practical aspects of employing BodyMTS in this application in terms of accuracy and robustness under reduced data quality and size. We show that BodyMTS achieves an average accuracy of 87%, which is significantly higher than the accuracy of human domain experts.
Submitted 2 October, 2022;
originally announced October 2022.
-
META-STORM: Generalized Fully-Adaptive Variance Reduced SGD for Unbounded Functions
Authors:
Zijian Liu,
Ta Duy Nguyen,
Thien Hang Nguyen,
Alina Ene,
Huy L. Nguyen
Abstract:
We study the application of variance reduction (VR) techniques to general non-convex stochastic optimization problems. In this setting, the recent work STORM [Cutkosky-Orabona '19] overcomes the drawback of having to compute gradients of "mega-batches" that earlier VR methods rely on. There, STORM utilizes recursive momentum to achieve the VR effect and is then later made fully adaptive in STORM+ [Levy et al., '21], where full-adaptivity removes the requirement for obtaining certain problem-specific parameters such as the smoothness of the objective and bounds on the variance and norm of the stochastic gradients in order to set the step size. However, STORM+ crucially relies on the assumption that the function values are bounded, excluding a large class of useful functions. In this work, we propose META-STORM, a generalized framework of STORM+ that removes this bounded function values assumption while still attaining the optimal convergence rate for non-convex optimization. META-STORM not only maintains full-adaptivity, removing the need to obtain problem specific parameters, but also improves the convergence rate's dependency on the problem parameters. Furthermore, META-STORM can utilize a large range of parameter settings that subsumes previous methods allowing for more flexibility in a wider range of settings. Finally, we demonstrate the effectiveness of META-STORM through experiments across common deep learning tasks. Our algorithm improves upon the previous work STORM+ and is competitive with widely used algorithms after the addition of per-coordinate update and exponential moving average heuristics.
Submitted 29 September, 2022;
originally announced September 2022.
-
On the Convergence of AdaGrad(Norm) on $\R^{d}$: Beyond Convexity, Non-Asymptotic Rate and Acceleration
Authors:
Zijian Liu,
Ta Duy Nguyen,
Alina Ene,
Huy L. Nguyen
Abstract:
Existing analysis of AdaGrad and other adaptive methods for smooth convex optimization is typically for functions with bounded domain diameter. In unconstrained problems, previous works guarantee an asymptotic convergence rate without an explicit constant factor that holds true for the entire function class. Furthermore, in the stochastic setting, only a modified version of AdaGrad, different from the one commonly used in practice, in which the latest gradient is not used to update the stepsize, has been analyzed. Our paper aims at bridging these gaps and developing a deeper understanding of AdaGrad and its variants in the standard setting of smooth convex functions as well as the more general setting of quasar convex functions. First, we demonstrate new techniques to explicitly bound the convergence rate of the vanilla AdaGrad for unconstrained problems in both deterministic and stochastic settings. Second, we propose a variant of AdaGrad for which we can show the convergence of the last iterate, instead of the average iterate. Finally, we give new accelerated adaptive algorithms and their convergence guarantee in the deterministic setting with explicit dependency on the problem parameters, improving upon the asymptotic rate shown in previous works.
Submitted 4 October, 2023; v1 submitted 29 September, 2022;
originally announced September 2022.
-
An Efficient Algorithm for Fair Multi-Agent Multi-Armed Bandit with Low Regret
Authors:
Matthew Jones,
Huy Lê Nguyen,
Thy Nguyen
Abstract:
Recently a multi-agent variant of the classical multi-armed bandit was proposed to tackle fairness issues in online learning. Inspired by a long line of work in social choice and economics, the goal is to optimize the Nash social welfare instead of the total utility. Unfortunately previous algorithms either are not efficient or achieve sub-optimal regret in terms of the number of rounds $T$. We propose a new efficient algorithm with lower regret than even previous inefficient ones. For $N$ agents, $K$ arms, and $T$ rounds, our approach has a regret bound of $\tilde{O}(\sqrt{NKT} + NK)$. This is an improvement to the previous approach, which has regret bound of $\tilde{O}( \min(NK, \sqrt{N} K^{3/2})\sqrt{T})$. We also complement our efficient algorithm with an inefficient approach with $\tilde{O}(\sqrt{KT} + N^2K)$ regret. The experimental findings confirm the effectiveness of our efficient algorithm compared to the previous approaches.
Submitted 23 September, 2022;
originally announced September 2022.
-
Orbital-selective time-domain signature of nematicity dynamics in the charge-density-wave phase of La$_{1.65}$Eu$_{0.2}$Sr$_{0.15}$CuO$_4$
Authors:
Martin Bluschke,
Naman K. Gupta,
Hoyoung Jang,
Ali A. Husain,
Byungjune Lee,
MengXing Na,
Brandon Dos Remedios,
Steef Smit,
Peter Moen,
Sang-Youn Park,
Minseok Kim,
Dogeun Jang,
Hyeongi Choi,
Ronny Sutarto,
Alexander H. Reid,
Georgi L. Dakovski,
Giacomo Coslovich,
Quynh L. Nguyen,
Nicolas G. Burdet,
Ming-Fu Lin,
Alexandre Revcolevschi,
Jae-Hoon Park,
Jochen Geck,
Joshua J. Turner,
Andrea Damascelli
, et al. (1 additional authors not shown)
Abstract:
Understanding the interplay between charge, nematic, and structural ordering tendencies in cuprate superconductors is critical to unraveling their complex phase diagram. Using pump-probe time-resolved resonant x-ray scattering on the (0 0 1) Bragg peak at the Cu $L_3$ and O $K$ resonances, we investigate non-equilibrium dynamics of $Q_a = Q_b = 0$ nematic order and its association with both charge density wave (CDW) order and lattice dynamics in La$_{1.65}$Eu$_{0.2}$Sr$_{0.15}$CuO$_4$. The orbital selectivity of the resonant x-ray scattering cross-section allows nematicity dynamics associated with the planar O 2$p$ and Cu 3$d$ states to be distinguished from the response of anisotropic lattice distortions. A direct time-domain comparison of CDW translational-symmetry breaking and nematic rotational-symmetry breaking reveals that these broken symmetries remain closely linked in the photoexcited state, consistent with the stability of CDW topological defects in the investigated pump fluence regime.
Submitted 9 September, 2023; v1 submitted 23 September, 2022;
originally announced September 2022.
-
The complexity of unsupervised learning of lexicographic preferences
Authors:
Hélène Fargier,
Pierre-François Gimenez,
Jérôme Mengin,
Bao Ngoc Le Nguyen
Abstract:
This paper considers the task of learning users' preferences on a combinatorial set of alternatives, as generally used by online configurators, for example. In many settings, only a set of selected alternatives during past interactions is available to the learner. Fargier et al. [2018] propose an approach to learn, in such a setting, a model of the users' preferences that ranks previously chosen alternatives as high as possible; and an algorithm to learn, in this setting, a particular model of preferences: lexicographic preferences trees (LP-trees). In this paper, we study complexity-theoretical problems related to this approach. We give an upper bound on the sample complexity of learning an LP-tree, which is logarithmic in the number of attributes. We also prove that computing the LP tree that minimises the empirical risk can be done in polynomial time when restricted to the class of linear LP-trees.
Submitted 23 September, 2022;
originally announced September 2022.
-
SMTCE: A Social Media Text Classification Evaluation Benchmark and BERTology Models for Vietnamese
Authors:
Luan Thanh Nguyen,
Kiet Van Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
Text classification is a typical natural language processing or computational linguistics task with various interesting applications. As the number of users on social media platforms increases, data acceleration promotes emerging studies on Social Media Text Classification (SMTC) or social media text mining on these valuable resources. In contrast to English, Vietnamese, one of the low-resource languages, is still not concentrated on and exploited thoroughly. Inspired by the success of the GLUE, we introduce the Social Media Text Classification Evaluation (SMTCE) benchmark, as a collection of datasets and models across a diverse set of SMTC tasks. With the proposed benchmark, we implement and analyze the effectiveness of a variety of multilingual BERT-based models (mBERT, XLM-R, and DistilmBERT) and monolingual BERT-based models (PhoBERT, viBERT, vELECTRA, and viBERT4news) for tasks in the SMTCE benchmark. Monolingual models outperform multilingual models and achieve state-of-the-art results on all text classification tasks. It provides an objective assessment of multilingual and monolingual BERT-based models on the benchmark, which will benefit future studies about BERTology in the Vietnamese language.
Submitted 21 September, 2022;
originally announced September 2022.
-
FedToken: Tokenized Incentives for Data Contribution in Federated Learning
Authors:
Shashi Raj Pandey,
Lam Duc Nguyen,
Petar Popovski
Abstract:
Incentives that compensate for the involved costs in the decentralized training of a Federated Learning (FL) model act as a key stimulus for clients' long-term participation. However, it is challenging to convince clients for quality participation in FL due to the absence of: (i) full information on the client's data quality and properties; (ii) the value of client's data contributions; and (iii) the trusted mechanism for monetary incentive offers. This often leads to poor efficiency in training and communication. While several works focus on strategic incentive designs and client selection to overcome this problem, there is a major knowledge gap in terms of an overall design tailored to the foreseen digital economy, including Web 3.0, while simultaneously meeting the learning objectives. To address this gap, we propose a contribution-based tokenized incentive scheme, namely \texttt{FedToken}, backed by blockchain technology that ensures fair allocation of tokens amongst the clients that corresponds to the valuation of their data during model training. Leveraging the engineered Shapley-based scheme, we first approximate the contribution of local models during model aggregation, then strategically schedule clients lowering the communication rounds for convergence and anchor ways to allocate \emph{affordable} tokens under a constrained monetary budget. Extensive simulations demonstrate the efficacy of our proposed method.
Submitted 3 November, 2022; v1 submitted 20 September, 2022;
originally announced September 2022.