-
Maximum Length RLL Sequences in de Bruijn Graph
Authors:
Yeow Meng Chee,
Tuvi Etzion,
Tien Long Nguyen,
Duy Hoang Ta,
Vinh Duc Tran,
Van Khu Vu
Abstract:
A timing and synchronization system based on a de Bruijn sequence has been proposed and studied recently for a channel associated with quantum communication that requires reliable synchronization. To avoid a long period of no-pulse in such a system, on-off pulses are used to simulate a zero and on-on pulses are used to simulate a one. However, these sequences have high redundancy. To reduce the redundancy, run-length limited sequences in the de Bruijn graph are proposed for the same purpose. The maximum length of such sequences in the de Bruijn graph is studied, and an efficient algorithm to construct a large set of these sequences is presented. A maximum length sequence for which the position of each window can be computed efficiently is constructed. Finally, an enumeration of the number of such sequences is given and some generalizations are discussed.
Submitted 3 March, 2024;
originally announced March 2024.
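The pulse encoding described in this abstract (on-off for a zero, on-on for a one) can be illustrated with a small sketch. This is not the authors' construction: it builds a binary de Bruijn sequence with the standard Lyndon-word method, applies the encoding, and measures the longest no-pulse run. The function names `de_bruijn`, `encode_pulses`, and `longest_run` are hypothetical and chosen for illustration only.

```python
def de_bruijn(n: int) -> list[int]:
    """Binary de Bruijn sequence of order n (standard Lyndon-word construction)."""
    a = [0] * (2 * n)
    seq: list[int] = []

    def db(t: int, p: int) -> None:
        if t > n:
            # Emit the Lyndon word only when its length p divides n.
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, 2):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return seq


def encode_pulses(bits: list[int]) -> list[int]:
    """Map 0 -> on-off (1, 0) and 1 -> on-on (1, 1), as described in the abstract."""
    pulses: list[int] = []
    for b in bits:
        pulses.extend((1, b))
    return pulses


def longest_run(seq: list[int], symbol: int) -> int:
    """Length of the longest run of `symbol` in the (non-cyclic) sequence."""
    best = current = 0
    for s in seq:
        current = current + 1 if s == symbol else 0
        best = max(best, current)
    return best


if __name__ == "__main__":
    n = 4
    bits = de_bruijn(n)
    pulses = encode_pulses(bits)
    print("raw longest no-pulse run:    ", longest_run(bits, 0))
    print("encoded longest no-pulse run:", longest_run(pulses, 0))
    print("length overhead factor:      ", len(pulses) / len(bits))
```

Because every encoded symbol starts with an "on" pulse, no two "off" pulses are ever adjacent, but the encoded sequence is twice as long. That doubling is the redundancy the abstract aims to reduce.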
-
Real-time hybrid controls of energy storage and load shedding for integrated power and energy systems of ships
Authors:
Linh Vu,
Thai-Thanh Nguyen,
Bang Le-Huy Nguyen,
Md Isfakul Anam,
Tuyen Vu
Abstract:
This paper presents an original energy management methodology to enhance the resilience of ship power systems. The integration of various energy storage systems (ESS), including battery energy storage systems (BESS) and super-capacitor energy storage systems (SCESS), in modern ship power systems poses challenges in designing an efficient energy management system (EMS). The EMS proposed in this paper aims to achieve multiple objectives. The primary objective is to minimize shed loads, while the secondary objective is to effectively manage different types of ESS. Considering the diverse ramp-rate characteristics of generators, SCESS, and BESS, the proposed EMS exploits these differences to determine an optimal long-term schedule for minimizing shed loads. Furthermore, the proposed EMS balances the state-of-charge (SoC) of ESS and prioritizes the SCESS's SoC levels to ensure the efficient operation of BESS and SCESS. For better computational efficiency, we introduce the receding horizon optimization method, enabling real-time EMS implementation. A comparison with the fixed horizon optimization (FHO) validates its effectiveness. Simulation studies and results demonstrate that the proposed EMS efficiently manages generators, BESS, and SCESS, ensuring system resilience under generation shortages. Additionally, the proposed methodology significantly reduces the computational burden compared to the FHO technique while maintaining acceptable resilience performance.
Submitted 2 March, 2024;
originally announced March 2024.
-
Closed-loop Equilibria for Mean-Field Games in Randomly Switching Environments with General Discounting Costs
Authors:
Hongwei Mei,
Son Luu Nguyen,
George Yin
Abstract:
This work is devoted to finding the closed-loop equilibria for a class of mean-field games (MFGs) with infinitely many symmetric players in a common switching environment when the cost functional is under general discount in time. There are two key challenges in the application of the well-known Hamilton-Jacobi-Bellman and Fokker-Planck (HJB-FP) approach to our problems: the path-dependence due to the conditional mean-field interaction and the time-inconsistency due to the general discounting cost. To overcome the difficulties, a theory for a class of systems of path-dependent equilibrium Hamilton-Jacobi-Bellman equations (HJBs) is developed. Then closed-loop equilibrium strategies can be identified through a two-step verification procedure. It should be noted that the closed-loop equilibrium strategies obtained satisfy a new form of local optimality in the Nash sense. The theory obtained extends the HJB-FP approach for classical MFGs to more general conditional MFGs with general discounting costs.
Submitted 29 February, 2024;
originally announced March 2024.
-
Stochastic ISTA/FISTA Adaptive Step Search Algorithms for Convex Composite Optimization
Authors:
Lam M. Nguyen,
Katya Scheinberg,
Trang H. Tran
Abstract:
We develop and analyze stochastic variants of ISTA and of a full backtracking FISTA algorithm [Beck and Teboulle, 2009, Scheinberg et al., 2014] for composite optimization without the assumption that the stochastic gradient is an unbiased estimator. This work extends the analysis of inexact fixed-step ISTA/FISTA in [Schmidt et al., 2011] to the case of stochastic gradient estimates and an adaptive step-size parameter chosen by backtracking. It also extends the framework for analyzing stochastic line-search methods in [Cartis and Scheinberg, 2018] to the proximal gradient framework as well as to accelerated first-order methods.
Submitted 23 February, 2024;
originally announced February 2024.
-
MSTAR: Multi-Scale Backbone Architecture Search for Timeseries Classification
Authors:
Tue M. Cao,
Nhat H. Tran,
Hieu H. Pham,
Hung T. Nguyen,
Le P. Nguyen
Abstract:
Most previous approaches to Time Series Classification (TSC) highlight the significance of receptive fields and frequencies while overlooking the time resolution. As a result, they unavoidably suffer from scalability issues, because they integrate an extensive range of receptive fields into classification models. Other methods, while better adapted to large datasets, require manual design and still cannot reach the optimal architecture because each dataset is unique. We overcome these challenges by proposing a novel multi-scale search space and a framework for Neural Architecture Search (NAS), which addresses both frequency and time resolution and discovers the suitable scale for a specific dataset. We further show that our model can serve as a backbone for a powerful Transformer module with both untrained and pre-trained weights. Our search space reaches state-of-the-art performance on four datasets in four different domains while introducing more than ten highly fine-tuned models for each dataset.
Submitted 21 February, 2024;
originally announced February 2024.
-
LangXAI: Integrating Large Vision Models for Generating Textual Explanations to Enhance Explainability in Visual Perception Tasks
Authors:
Truong Thanh Hung Nguyen,
Tobias Clement,
Phuc Truong Loc Nguyen,
Nils Kemmerzell,
Van Binh Truong,
Vo Thanh Khang Nguyen,
Mohamed Abdelaal,
Hung Cao
Abstract:
LangXAI is a framework that integrates Explainable Artificial Intelligence (XAI) with advanced vision models to generate textual explanations for visual recognition tasks. Despite XAI advancements, an understanding gap persists for end-users with limited domain knowledge in artificial intelligence and computer vision. LangXAI addresses this by furnishing text-based explanations for classification, object detection, and semantic segmentation model outputs to end-users. Preliminary results demonstrate LangXAI's enhanced plausibility, with high BERTScore across tasks, fostering a more transparent and reliable AI framework on vision tasks for end-users.
Submitted 19 February, 2024;
originally announced February 2024.
-
Non-equilibrium pathways to emergent polar supertextures
Authors:
Vladimir A. Stoica,
Tiannan Yang,
Sujit Das,
Yue Cao,
Huaiyu Wang,
Yuya Kubota,
Cheng Dai,
Hari Padmanabhan,
Yusuke Sato,
Anudeep Mangu,
Quynh L. Nguyen,
Zhan Zhang,
Disha Talreja,
Marc E. Zajac,
Donald A. Walko,
Anthony D. DiChiara,
Shigeki Owada,
Kohei Miyanishi,
Kenji Tamasaku,
Takahiro Sato,
James M. Glownia,
Vincent Esposito,
Silke Nelson,
Matthias C. Hoffmann,
Richard D. Schaller
, et al. (9 additional authors not shown)
Abstract:
Ultrafast stimuli can stabilize metastable states of matter inaccessible by equilibrium means. Establishing the spatiotemporal link between ultrafast excitation and metastability is crucial to understanding these phenomena. Here, we use single-shot optical-pump, X-ray-probe measurements to provide snapshots of the emergence of a persistent polar vortex supercrystal in a heterostructure that hosts a fine balance between built-in electrostatic and elastic frustrations by design. By perturbing this balance with photoinduced charges, a starting heterogeneous mixture of polar phases disorders within a few picoseconds, resulting in a soup state composed of disordered ferroelectric and suppressed vortex orders. On the pico-to-nanosecond timescales, transient labyrinthine fluctuations form in this soup along with a recovering vortex order. On longer timescales, these fluctuations are progressively quenched by dynamical strain modulations, which drive the collective emergence of a single supercrystal phase. Our results, corroborated by dynamical phase-field modeling, reveal how ultrafast excitation of designer systems generates pathways for persistent metastability.
Submitted 18 February, 2024;
originally announced February 2024.
-
Computationally Predicted Electronic Properties and Energetics of Native Defects in Cubic Boron Nitride
Authors:
Ngoc Linh Nguyen,
Hung The Dang,
Tien Lam Pham,
Thi Minh Hoa Nghiem
Abstract:
In this study, we employ a first-principles approach to conduct a comprehensive investigation of the properties of nine common native point defects in cubic boron nitride. This analysis combines standard semi-local and dielectric hybrid density-exchange-correlation functional calculations, encompassing vacancies, interstitials, antisites, and their complexes. Our findings elucidate the influence of these defects on the structural and electronic characteristics of cubic boron nitride, such as local structures, formation energy, magnetism, and the energies of defect states within the band gap. Notably, we accurately simulate the photoluminescent spectra of cubic boron nitride induced by these defects, demonstrating excellent agreement with experimental observations. This outcome indicates that the prominent peaks in the photoluminescent spectrum at 2.5 and 2.8 eV can be attributed to the nitrogen to boron antisite (N$_{\rm B}$) and boron interstitial (B$_{\rm i}$) defects, respectively. Additionally, we investigate the energetic stability of defects under various charge states, providing valuable references for benchmarking purposes.
Submitted 13 February, 2024;
originally announced February 2024.
-
Maximizing NFT Incentives: References Make You Rich
Authors:
Guangsheng Yu,
Qin Wang,
Caijun Sun,
Lam Duc Nguyen,
H. M. N. Dilum Bandara,
Shi** Chen
Abstract:
In this paper, we study how to optimize existing Non-Fungible Token (NFT) incentives. Upon exploring a large number of NFT-related standards and real-world projects, we come across an unexpected finding. That is, the current NFT incentive mechanisms, often organized in an isolated and one-time-use fashion, tend to overlook their potential for scalable organizational structures.
We propose, analyze, and implement a novel reference incentive model, which is inherently structured as a Directed Acyclic Graph (DAG)-based NFT network. This model aims to maximize connections (or references) between NFTs, enabling each isolated NFT to expand its network and accumulate rewards derived from subsequent or subscribed ones. We conduct both theoretical and practical analyses of the model, demonstrating its optimal utility.
Submitted 9 February, 2024;
originally announced February 2024.
-
(Almost) Affine Higher-Order Tree Transducers
Authors:
Lê Thành Dũng Tito Nguyên,
Gabriele Vanoni
Abstract:
We investigate the tree-to-tree functions computed by "affine $λ$-transducers": tree automata whose memory consists of an affine $λ$-term instead of a finite state. They can be seen as variations on Gallot, Lemay and Salvati's Linear High-Order Deterministic Tree Transducers. When the memory is almost purely affine (à la Kanazawa), we show that these machines can be translated to tree-walking transducers (and with a purely affine memory, we get a reversible tree-walking transducer). This leads to a proof of an inexpressivity conjecture of \titocecilia on "implicit automata" in an affine $λ$-calculus. The key technical tool in our proofs is the Interaction Abstract Machine (IAM), an operational avatar of the "geometry of interaction" semantics of linear logic. We work with ad-hoc specializations to (almost) affine $λ$-terms of a tree-generating version of the IAM.
Submitted 8 February, 2024;
originally announced February 2024.
-
Optimizing Visibility-based Search in Polygonal Domains
Authors:
Kien C. Huynh,
Joseph S. B. Mitchell,
Linh Nguyen,
Valentin Polishchuk
Abstract:
Given a geometric domain $P$, visibility-based search problems seek routes for one or more mobile agents ("watchmen") to move within $P$ in order to be able to see a portion (or all) of $P$, while optimizing objectives, such as the length(s) of the route(s), the size (e.g., area or volume) of the portion seen, the probability of detecting a target distributed within $P$ according to a prior distribution, etc. The classic watchman route problem seeks a shortest route for an observer, with omnidirectional vision, to see all of $P$. In this paper we study bicriteria optimization problems for a single mobile agent within a polygonal domain $P$ in the plane, with the criteria of route length and area seen. Specifically, we address the problem of computing a minimum length route that sees at least a specified area of $P$ (minimum length, for a given area quota). We also study the problem of computing a length-constrained route that sees as much area as possible. We provide hardness results and approximation algorithms. In particular, for a simple polygon $P$ we provide the first fully polynomial-time approximation scheme for the problem of computing a shortest route seeing an area quota, as well as a (slightly more efficient) polynomial dual approximation. We also consider polygonal domains $P$ (with holes) and the special case of a planar domain consisting of a union of lines. Our results yield the first approximation algorithms for computing a time-optimal search route in $P$ to guarantee some specified probability of detection of a static target within $P$, randomly distributed in $P$ according to a given prior distribution.
Submitted 18 April, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Hidden domain boundary dynamics towards crystalline perfection
Authors:
A. Mangu,
V. A. Stoica,
H. Zheng,
T. Yang,
M. Zhang,
H. Wang,
Q. L. Nguyen,
S. Song,
S. Das,
P. Meisenheimer,
E. Donoway,
M. Chollet,
Y. Sun,
J. J. Turner,
J. W. Freeland,
H. Wen,
L. W. Martin,
L. -Q. Chen,
V. Gopalan,
D. Zhu,
Y. Cao,
A. M. Lindenberg
Abstract:
A central paradigm of non-equilibrium physics concerns the dynamics of heterogeneity and disorder, impacting processes ranging from the behavior of glasses to the emergent functionality of active matter. Understanding these complex mesoscopic systems requires probing the microscopic trajectories associated with irreversible processes, the role of fluctuations and entropy growth, and the timescales on which non-equilibrium responses are ultimately maintained. Approaches that illuminate these processes in model systems may enable a more general understanding of other heterogeneous non-equilibrium phenomena, and potentially define ultimate speed and energy cost limits for information processing technologies. Here, we apply ultrafast single shot x-ray photon correlation spectroscopy to resolve the non-equilibrium, heterogeneous, and irreversible mesoscale dynamics during a light-induced phase transition. This approach defines a new way of capturing the nucleation of the induced phase, the formation of transient mesoscale defects at the boundaries of the nuclei, and the eventual annihilation of these defects, even in systems with complex polarization topologies. A non-equilibrium response spanning >10 orders of magnitude in timescales is observed, with multistep behavior similar to the plateaus observed in supercooled liquids and glasses. We show how the observed time-dependent long-time correlations can be understood in terms of the stochastic dynamics of domain walls, encoded in effective waiting-time distributions with power-law tails. This work defines new possibilities for probing the non-equilibrium and correlated dynamics of disordered and heterogeneous media.
Submitted 21 March, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
VlogQA: Task, Dataset, and Baseline Models for Vietnamese Spoken-Based Machine Reading Comprehension
Authors:
Thinh Phuoc Ngo,
Khoa Tran Anh Dang,
Son T. Luu,
Kiet Van Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension (MRC) tasks and provides insights into the challenges and opportunities associated with using real-world data for such tasks. The existing MRC corpora in Vietnamese mainly focus on formal written documents such as Wikipedia articles, online newspapers, or textbooks. In contrast, VlogQA consists of 10,076 question-answer pairs based on 1,230 transcript documents sourced from YouTube, an extensive source of user-uploaded content, covering the topics of food and travel. By capturing the spoken language of native Vietnamese speakers in natural settings, an obscure corner overlooked in Vietnamese research, the corpus provides a valuable resource for future research in reading comprehension tasks for the Vietnamese language. Regarding performance evaluation, our deep-learning models achieved a highest F1 score of 75.34% on the test set, indicating significant progress in machine reading comprehension for Vietnamese spoken language data. In terms of exact match (EM), the highest score we achieved is 53.97%, which reflects the challenge of processing spoken-based content and highlights the need for further improvement.
Submitted 6 April, 2024; v1 submitted 4 February, 2024;
originally announced February 2024.
-
Physical Layer Location Privacy in SIMO Communication Using Fake Paths Injection
Authors:
Trong Duy Tran,
Maxime Ferreira Da Costa,
Linh Trung Nguyen
Abstract:
Fake path injection is an emerging paradigm for inducing privacy over wireless networks. In this paper, fake paths are injected by the transmitter into a SIMO multipath communication channel to conceal her physical location from an eavesdropper. A novel statistical privacy metric is defined as the ratio between the largest (resp. smallest) eigenvalues of Bob's (resp. Eve's) Cramér-Rao lower bound on the SIMO multipath channel parameters to assess the privacy enhancements. Leveraging the spectral properties of generalized Vandermonde matrices, bounds on the privacy margin of the proposed scheme are derived. Specifically, it is shown that the privacy margin increases quadratically in the inverse of the separation between the true and the fake paths from Eve's perspective. Numerical simulations further showcase the approach's benefit.
Submitted 2 February, 2024;
originally announced February 2024.
-
Employing Label Models on ChatGPT Answers Improves Legal Text Entailment Performance
Authors:
Chau Nguyen,
Le-Minh Nguyen
Abstract:
The objective of legal text entailment is to ascertain whether the assertions in a legal query logically follow from the information provided in one or multiple legal articles. ChatGPT, a large language model, is robust in many natural language processing tasks, including legal text entailment: when we set the temperature = 0 (the ChatGPT answers are deterministic) and prompt the model, it achieves 70.64% accuracy on the COLIEE 2022 dataset, which outperforms the previous SOTA of 67.89%. On the other hand, if the temperature is larger than zero, ChatGPT answers are not deterministic, leading to inconsistent answers and fluctuating results. We propose to leverage label models (a fundamental component of weak supervision techniques) to integrate the provisional answers by ChatGPT into consolidated labels. In this way, we treat ChatGPT's provisional answers as noisy predictions that can be consolidated by label models. The experimental results demonstrate that this approach can attain an accuracy of 76.15%, a significant improvement of 8.26 percentage points over the prior state-of-the-art benchmark. Additionally, we analyze the instances where ChatGPT produces incorrect answers and classify the errors, offering insights that could guide potential enhancements in future research.
Submitted 31 January, 2024;
originally announced January 2024.
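The idea of consolidating several noisy ChatGPT answers into one label can be sketched with plain majority voting, the simplest baseline usually compared against learned label models in weak supervision. This is a stand-in for illustration, not the label model used in the paper. The "Y"/"N" label strings, the tie-breaking default, and the function name `majority_label` are assumptions.

```python
from collections import Counter


def majority_label(answers: list[str], default: str = "N") -> str:
    """Consolidate noisy answers for one query by majority vote.

    `answers` holds the provisional labels from repeated sampling (assumed to
    be "Y" or "N"). Ties and empty input fall back to `default`, which is an
    arbitrary choice made for this sketch.
    """
    counts = Counter(answers)
    if not counts:
        return default
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return default  # no strict majority
    return ranked[0][0]


if __name__ == "__main__":
    # Hypothetical provisional answers for three queries, five samples each.
    samples = {
        "q1": ["Y", "Y", "N", "Y", "Y"],
        "q2": ["N", "Y", "N", "N", "Y"],
        "q3": ["Y", "N", "Y", "N", "Y"],
    }
    for query, answers in samples.items():
        print(query, majority_label(answers))
```

A learned label model differs from this baseline in that it estimates how reliable each noisy source is instead of weighting every vote equally.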
-
Channel Characterization of UAV-RIS-aided Systems with Adaptive Phase-shift Configuration
Authors:
Thanh Luan Nguyen,
Georges Kaddoum,
Tri Nhu Do,
Zygmunt J. Haas
Abstract:
This letter considers a UAV aiding communication between a ground transmitter and a ground receiver in the presence of co-channel interference. A discrete-time Markov process is adopted to model the complex nature of the Air-to-Ground (A2G) channel, including the occurrence of Line-of-Sight, Non-Line-of-Sight, and blockage events. Moreover, an adaptive phase-shift-enabled Reconfigurable Intelligent Surface (RIS) is deployed to combat A2G blockage events. Novel frameworks based on the shadowed Rician distribution are proposed to derive closed-form expressions for the Ground-to-Air/A2G SINR distributions. Numerical results show that RISs with large numbers of elements, e.g., 256 RIS elements, improve end-to-end Outage Probability (OP) and reduce blockages.
Submitted 30 January, 2024;
originally announced January 2024.
-
User Association Optimization for IRS-aided Terahertz Networks: A Matching Theory Approach
Authors:
Muddasir Rahim,
Thanh Luan Nguyen,
Georges Kaddoum,
Tri Nhu Do
Abstract:
Terahertz (THz) communication is a promising technology for future wireless communications, offering data rates of up to several terabits-per-second (Tbps). However, the range of THz band communications is often limited by high pathloss and molecular absorption. To overcome these challenges, this paper proposes intelligent reconfigurable surfaces (IRSs) to enhance THz communication systems. Specifically, we introduce an angle-based trigonometric channel model to evaluate the effectiveness of IRS-aided THz networks. Additionally, to maximize the sum rate, we formulate the source-IRS-destination matching problem, which is a mixed-integer nonlinear programming (MINLP) problem. To solve this non-deterministic polynomial-time hard (NP-hard) problem, the paper proposes a Gale-Shapley-based solution that obtains stable matches between sources and IRSs, as well as between destinations and IRSs in the first and second sub-problems, respectively.
Submitted 26 January, 2024;
originally announced January 2024.
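The stable-matching step named in this abstract builds on the classic Gale-Shapley deferred-acceptance algorithm, which can be sketched generically as follows. This is not the paper's formulation. The preference lists here are made up, whereas in the paper they would presumably be derived from achievable rates. The function name `gale_shapley` and the labels `S1`, `I1`, and so on are hypothetical, and the sketch assumes one-to-one matching with complete preference lists.

```python
def gale_shapley(
    proposer_prefs: dict[str, list[str]],
    receiver_prefs: dict[str, list[str]],
) -> dict[str, str]:
    """Proposer-optimal stable matching via deferred acceptance.

    Each dict maps an agent to its preference list, most preferred first.
    Every receiver is assumed to rank every proposer.
    """
    # rank[r][p] is the position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)} for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next receiver to try
    engaged: dict[str, str] = {}                  # receiver -> current proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p        # r trades up and releases its previous partner
            free.append(current)
        else:
            free.append(p)        # r rejects p, who will try its next choice

    return {p: r for r, p in engaged.items()}


if __name__ == "__main__":
    sources = {"S1": ["I1", "I2"], "S2": ["I1", "I2"]}
    irss = {"I1": ["S2", "S1"], "I2": ["S1", "S2"]}
    print(gale_shapley(sources, irss))
```

In this toy instance both sources prefer `I1`, but `I1` prefers `S2`, so `S1` ends up with `I2`, and no source-IRS pair would both prefer each other over their assigned partners.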
-
Statistical Characterization of RIS-assisted UAV Communications in Terrestrial and Non-Terrestrial Networks Under Channel Aging
Authors:
Thanh Luan Nguyen,
Georges Kaddoum,
Tri Nhu Do,
Zygmunt J. Haas
Abstract:
This paper studies the statistical characterization of ground-to-air (G2A) and reconfigurable intelligent surface (RIS)-assisted air-to-ground (A2G) communications with unmanned aerial vehicles (UAVs) in terrestrial and non-terrestrial networks under the impact of channel aging.
We first model the G2A and A2G signal-to-noise ratios (SNRs) as non-central complex Gaussian quadratic random variables (RVs) and derive their exact probability density functions, offering a unique characterization for the A2G SNR as the product of two scaled non-central chi-square RVs. Moreover, we also find that, for a large number of RIS elements, the RIS-assisted A2G channel can be characterized as a single Rician fading channel.
Our results reveal the presence of channel hardening in A2G communication under low UAV speeds, where we derive the maximum target spectral efficiency (SE) for a system to maintain a consistent required outage level. Meanwhile, high UAV speeds, exceeding 50 m/s, lead to a significant performance degradation, which cannot be mitigated by increasing the number of RIS elements.
Submitted 30 January, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
Landscape of nuclear deformation softness with spherical quasi-particle random phase approximation
Authors:
Le-Anh Nguyen,
Minh-Loc Bui,
Panagiota Papakonstantinou,
Naftali Auerbach
Abstract:
We investigate the stability and softness of nuclei against quadrupole, octupole, and hexadecapole deformation. By applying the spherical Skyrme-force Hartree-Fock Bardeen-Cooper-Schrieffer quasi-particle random phase approximation, we diagnose ground-state deformation when imaginary solutions are obtained, i.e., the spherical ground state collapses. We also calculate the multipole polarizability in spherical nuclei with no collapse, as a measure of softness. This numerically light and theoretically sound method is found able to capture deformation patterns across the nuclide chart. The connection between the intrinsic shape of nuclei and the dynamics of their low-lying collective states is established and the role of shell structure is discussed.
Submitted 11 January, 2024;
originally announced January 2024.
-
Multi-objective Feature Selection in Remote Health Monitoring Applications
Authors:
Le Ngu Nguyen,
Constantino Álvarez Casado,
Manuel Lage Cañellas,
Anirban Mukherjee,
Nhi Nguyen,
Dinesh Babu Jayagopi,
Miguel Bordallo López
Abstract:
Radio frequency (RF) signals have facilitated the development of non-contact human monitoring tasks, such as vital signs measurement, activity recognition, and user identification. In some specific scenarios, an RF signal analysis framework may prioritize the performance of one task over that of others. In response to this requirement, we employ a multi-objective optimization approach inspired by biological principles to select discriminative features that enhance the accuracy of breathing patterns recognition while simultaneously impeding the identification of individual users. This approach is validated using a novel vital signs dataset consisting of 50 subjects engaged in four distinct breathing patterns. Our findings indicate a remarkable result: a substantial divergence in accuracy between breathing recognition and user identification. As a complementary viewpoint, we present a contrariwise result to maximize user identification accuracy and minimize the system's capacity for breathing activity recognition.
Submitted 10 January, 2024;
originally announced January 2024.
-
CAPTAIN at COLIEE 2023: Efficient Methods for Legal Information Retrieval and Entailment Tasks
Authors:
Chau Nguyen,
Phuong Nguyen,
Thanh Tran,
Dat Nguyen,
An Trieu,
Tin Pham,
Anh Dang,
Le-Minh Nguyen
Abstract:
The Competition on Legal Information Extraction/Entailment (COLIEE) is held annually to encourage advancements in the automatic processing of legal texts. Processing legal documents is challenging due to the intricate structure and meaning of legal language. In this paper, we outline our strategies for tackling Task 2, Task 3, and Task 4 in the COLIEE 2023 competition. Our approach involved utilizing appropriate state-of-the-art deep learning methods, designing methods based on domain characteristics observation, and applying meticulous engineering practices and methodologies to the competition. As a result, our performance in these tasks has been outstanding, with first places in Task 2 and Task 3, and promising results in Task 4. Our source code is available at https://github.com/Nguyen2015/CAPTAIN-COLIEE2023/tree/coliee2023.
Submitted 7 January, 2024;
originally announced January 2024.
-
Empowering high-dimensional quantum computing by traversing the dual bosonic ladder
Authors:
Long B. Nguyen,
Noah Goss,
Karthik Siva,
Yosep Kim,
Ed Younis,
Bingcheng Qing,
Akel Hashim,
David I. Santiago,
Irfan Siddiqi
Abstract:
High-dimensional quantum information processing has emerged as a promising avenue to transcend hardware limitations and advance the frontiers of quantum technologies. Harnessing the untapped potential of the so-called qudits necessitates the development of quantum protocols beyond the established qubit methodologies. Here, we present a robust, hardware-efficient, and extensible approach for operating multidimensional solid-state systems using Raman-assisted two-photon interactions. To demonstrate its efficacy, we construct a set of multi-qubit operations, realize highly entangled multidimensional states including atomic squeezed states and Schrödinger cat states, and implement programmable entanglement distribution along a qudit array. Our work illuminates the quantum electrodynamics of strongly driven multi-qudit systems and provides the experimental foundation for the future development of high-dimensional quantum applications.
Submitted 29 December, 2023;
originally announced December 2023.
-
Physics-informed Graphical Neural Network for Power System State Estimation
Authors:
Quang-Ha Ngo,
Bang L. H. Nguyen,
Tuyen V. Vu,
Jianhua Zhang,
Tuan Ngo
Abstract:
State estimation is highly critical for accurately observing the dynamic behavior of the power grids and minimizing risks from cyber threats. However, existing state estimation methods encounter challenges in accurately capturing power system dynamics, primarily because of limitations in encoding the grid topology and sparse measurements. This paper proposes a physics-informed graphical learning state estimation method to address these limitations by leveraging both domain physical knowledge and a graph neural network (GNN). We employ a GNN architecture that can handle the graph-structured data of power systems more effectively than traditional data-driven methods. The physics-based knowledge is constructed from the branch current formulation, making the approach adaptable to both transmission and distribution systems. The validation results on three IEEE test systems show that the proposed method achieves a mean square error more than 20% lower than that of the conventional methods.
Submitted 29 December, 2023;
originally announced December 2023.
-
Count What You Want: Exemplar Identification and Few-shot Counting of Human Actions in the Wild
Authors:
Yifeng Huang,
Duc Duy Nguyen,
Lam Nguyen,
Cuong Pham,
Minh Hoai
Abstract:
This paper addresses the task of counting human actions of interest using sensor data from wearable devices. We propose a novel exemplar-based framework, allowing users to provide exemplars of the actions they want to count by vocalizing predefined sounds "one", "two", and "three". Our method first localizes temporal positions of these utterances from the audio sequence. These positions serve as the basis for identifying exemplars representing the action class of interest. A similarity map is then computed between the exemplars and the entire sensor data sequence, which is further fed into a density estimation module to generate a sequence of estimated density values. Summing these density values provides the final count. To develop and evaluate our approach, we introduce a diverse and realistic dataset consisting of real-world data from 37 subjects and 50 action categories, encompassing both sensor and audio data. The experiments on this dataset demonstrate the viability of the proposed method in counting instances of actions from new classes and subjects that were not part of the training data. On average, the discrepancy between the predicted count and the ground truth value is 7.47, significantly lower than the errors of the frequency-based and transformer-based methods. Our project, code and dataset can be found at https://github.com/cvlab-stonybrook/ExRAC.
Submitted 28 December, 2023;
originally announced December 2023.
-
On Partial Optimal Transport: Revising the Infeasibility of Sinkhorn and Efficient Gradient Methods
Authors:
Anh Duc Nguyen,
Tuan Dung Nguyen,
Quang Minh Nguyen,
Hoang H. Nguyen,
Lam M. Nguyen,
Kim-Chuan Toh
Abstract:
This paper studies the Partial Optimal Transport (POT) problem between two unbalanced measures with at most $n$ supports and its applications in various AI tasks such as color transfer or domain adaptation. There is hence the need for fast approximations of POT with increasingly large problem sizes in arising applications. We first theoretically and experimentally investigate the infeasibility of the state-of-the-art Sinkhorn algorithm for POT due to its incompatible rounding procedure, which consequently degrades its qualitative performance in real world applications like point-cloud registration. To this end, we propose a novel rounding algorithm for POT, and then provide a feasible Sinkhorn procedure with a revised computation complexity of $\mathcal{\widetilde O}(n^2/\varepsilon^4)$. Our rounding algorithm also permits the development of two first-order methods to approximate the POT problem. The first algorithm, Adaptive Primal-Dual Accelerated Gradient Descent (APDAGD), finds an $\varepsilon$-approximate solution to the POT problem in $\mathcal{\widetilde O}(n^{2.5}/\varepsilon)$, which is better in $\varepsilon$ than revised Sinkhorn. The second method, Dual Extrapolation, achieves the computation complexity of $\mathcal{\widetilde O}(n^2/\varepsilon)$, thereby being the best in the literature. We further demonstrate the flexibility of POT compared to standard OT as well as the practicality of our algorithms on real applications where two marginal distributions are unbalanced.
Submitted 22 December, 2023; v1 submitted 21 December, 2023;
originally announced December 2023.
-
One step closer to unbiased aleatoric uncertainty estimation
Authors:
Wang Zhang,
Ziwen Ma,
Subhro Das,
Tsui-Wei Weng,
Alexandre Megretski,
Luca Daniel,
Lam M. Nguyen
Abstract:
Neural networks are powerful tools in various applications, and quantifying their uncertainty is crucial for reliable decision-making. In the deep learning field, the uncertainties are usually categorized into aleatoric (data) and epistemic (model) uncertainty. In this paper, we point out that the existing popular variance attenuation method highly overestimates aleatoric uncertainty. To address this issue, we propose a new estimation method by actively de-noising the observed data. By conducting a broad range of experiments, we demonstrate that our proposed approach provides a much closer approximation to the actual data uncertainty than the standard method.
Submitted 20 December, 2023; v1 submitted 16 December, 2023;
originally announced December 2023.
-
IncepSE: Leveraging InceptionTime's performance with Squeeze and Excitation mechanism in ECG analysis
Authors:
Tue Minh Cao,
Nhat Hong Tran,
Le Phi Nguyen,
Hieu Huy Pham,
Hung Thanh Nguyen
Abstract:
Our study focuses on the potential for modifications of Inception-like architecture within the electrocardiogram (ECG) domain. To this end, we introduce IncepSE, a novel network characterized by strategic architectural incorporation that leverages the strengths of both InceptionTime and channel attention mechanisms. Furthermore, we propose a training setup that employs stabilization techniques aimed at tackling the formidable challenges of the severely imbalanced PTB-XL dataset and gradient corruption. By this means, we set a new best result for supervised deep learning models across the majority of tasks. Our model consistently surpasses InceptionTime by substantial margins relative to other state-of-the-art methods in this domain, notably a 0.013 AUROC improvement in the "all" task, while also mitigating the inherent dataset fluctuations during training.
Submitted 16 November, 2023;
originally announced December 2023.
-
ComOM at VLSP 2023: A Dual-Stage Framework with BERTology and Unified Multi-Task Instruction Tuning Model for Vietnamese Comparative Opinion Mining
Authors:
Dang Van Thin,
Duong Ngoc Hao,
Ngan Luu-Thuy Nguyen
Abstract:
The ComOM shared task aims to extract comparative opinions from product reviews in the Vietnamese language. There are two sub-tasks: (1) Comparative Sentence Identification (CSI) and (2) Comparative Element Extraction (CEE). The first task is to identify whether the input is a comparative review, and the purpose of the second task is to extract the quintuplets mentioned in the comparative review. To address this task, our team proposes a two-stage system based on fine-tuning a BERTology model for the CSI task and unified multi-task instruction tuning for the CEE task. In addition, we apply a simple data augmentation technique to increase the size of the dataset for training our model in the second stage. Experimental results show that our approach outperforms the other competitors and has achieved the top score on the official private test.
Submitted 14 December, 2023;
originally announced December 2023.
-
Aerial STAR-RIS Empowered MEC: A DRL Approach for Energy Minimization
Authors:
Pyae Sone Aung,
Loc X. Nguyen,
Yan Kyaw Tun,
Zhu Han,
Choong Seon Hong
Abstract:
Multi-access Edge Computing (MEC) addresses computational and battery limitations in devices by allowing them to offload computation tasks. To overcome the difficulties in establishing line-of-sight connections, integrating unmanned aerial vehicles (UAVs) has proven beneficial, offering enhanced data exchange, rapid deployment, and mobility. The utilization of reconfigurable intelligent surfaces (RIS), specifically simultaneously transmitting and reflecting RIS (STAR-RIS) technology, further extends coverage capabilities and introduces flexibility in MEC. This study explores the integration of UAV and STAR-RIS to facilitate communication between IoT devices and an MEC server. The formulated problem aims to minimize energy consumption for IoT devices and aerial STAR-RIS by jointly optimizing task offloading, aerial STAR-RIS trajectory, amplitude and phase shift coefficients, and transmit power. Given the non-convexity of the problem and the dynamic environment, solving it directly within a polynomial time frame is challenging. Therefore, deep reinforcement learning (DRL), particularly proximal policy optimization (PPO), is introduced for its sample efficiency and stability. Simulation results illustrate the effectiveness of the proposed system compared to benchmark schemes in the literature.
Submitted 14 December, 2023;
originally announced December 2023.
-
Abusive Span Detection for Vietnamese Narrative Texts
Authors:
Nhu-Thanh Nguyen,
Khoa Thi-Kim Phan,
Duc-Vu Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
Abuse in its various forms, including physical, psychological, verbal, sexual, financial, and cultural, has a negative impact on mental health. However, there are limited studies on applying natural language processing (NLP) in this field in Vietnam. Therefore, we aim to contribute by building a human-annotated Vietnamese dataset for detecting abusive content in Vietnamese narrative texts. We sourced these texts from VnExpress, Vietnam's popular online newspaper, where readers often share stories containing abusive content. Identifying and categorizing abusive spans in these texts posed significant challenges during dataset creation, but it also motivated our research. We experimented with lightweight baseline models by freezing PhoBERT and XLM-RoBERTa and using their hidden states in a BiLSTM to assess the complexity of the dataset. According to our experimental results, PhoBERT outperforms other models in both labeled and unlabeled abusive span detection tasks. These results indicate that it has the potential for future improvements.
Submitted 12 December, 2023;
originally announced December 2023.
-
A Deep Learning-Based System for Automatic Case Summarization
Authors:
Minh Duong,
Long Nguyen,
Yen Vuong,
Trong Le,
Ha-Thanh Nguyen
Abstract:
This paper presents a deep learning-based system for efficient automatic case summarization. Leveraging state-of-the-art natural language processing techniques, the system offers both supervised and unsupervised methods to generate concise and relevant summaries of lengthy legal case documents. The user-friendly interface allows users to browse the system's database of legal case documents, select their desired case, and choose their preferred summarization method. The system generates comprehensive summaries for each subsection of the legal text as well as an overall summary. This demo streamlines legal case document analysis, potentially benefiting legal professionals by reducing workload and increasing efficiency. Future work will focus on refining summarization techniques and exploring the application of our methods to other types of legal texts.
Submitted 12 December, 2023;
originally announced December 2023.
-
Non-contact Multimodal Indoor Human Monitoring Systems: A Survey
Authors:
Le Ngu Nguyen,
Praneeth Susarla,
Anirban Mukherjee,
Manuel Lage Cañellas,
Constantino Álvarez Casado,
Xiaoting Wu,
Olli Silvén,
Dinesh Babu Jayagopi,
Miguel Bordallo López
Abstract:
Indoor human monitoring systems leverage a wide range of sensors, including cameras, radio devices, and inertial measurement units, to collect extensive data from users and the environment. These sensors contribute diverse data modalities, such as video feeds from cameras, received signal strength indicators and channel state information from WiFi devices, and three-axis acceleration data from inertial measurement units. In this context, we present a comprehensive survey of multimodal approaches for indoor human monitoring systems, with a specific focus on their relevance in elderly care. Our survey primarily highlights non-contact technologies, particularly cameras and radio devices, as key components in the development of indoor human monitoring systems. Throughout this article, we explore well-established techniques for extracting features from multimodal data sources. Our exploration extends to methodologies for fusing these features and harnessing multiple modalities to improve the accuracy and robustness of machine learning models. Furthermore, we conduct comparative analysis across different data modalities in diverse human monitoring tasks and undertake a comprehensive examination of existing multimodal datasets. This extensive survey not only highlights the significance of indoor human monitoring systems but also affirms their versatile applications. In particular, we emphasize their critical role in enhancing the quality of elderly care, offering valuable insights into the development of non-contact monitoring solutions applicable to the needs of aging populations.
Submitted 11 December, 2023;
originally announced December 2023.
-
Improved Frequency Estimation Algorithms with and without Predictions
Authors:
Anders Aamand,
Justin Y. Chen,
Huy Lê Nguyen,
Sandeep Silwal,
Ali Vakilian
Abstract:
Estimating frequencies of elements appearing in a data stream is a key task in large-scale data analysis. Popular sketching approaches to this problem (e.g., CountMin and CountSketch) come with worst-case guarantees that probabilistically bound the error of the estimated frequencies for any possible input. The work of Hsu et al. (2019) introduced the idea of using machine learning to tailor sketching algorithms to the specific data distribution they are being run on. In particular, their learning-augmented frequency estimation algorithm uses a learned heavy-hitter oracle which predicts which elements will appear many times in the stream. We give a novel algorithm, which in some parameter regimes, already theoretically outperforms the learning based algorithm of Hsu et al. without the use of any predictions. Augmenting our algorithm with heavy-hitter predictions further reduces the error and improves upon the state of the art. Empirically, our algorithms achieve superior performance in all experiments compared to prior approaches.
Submitted 12 December, 2023;
originally announced December 2023.
-
RIS-Aided Interference Cancellation for Joint Device-to-Device and Cellular Communications
Authors:
Ly V. Nguyen,
A. Lee Swindlehurst
Abstract:
Joint device-to-device (D2D) and cellular communication is a promising technology for enhancing the spectral efficiency of future wireless networks. However, the interference management problem is challenging since the operating devices and the cellular users share the same spectrum. The emerging reconfigurable intelligent surfaces (RIS) technology is a potentially ideal solution for this interference problem since RISs can shape the wireless channel in desired ways. This paper considers an RIS-aided joint D2D and cellular communication system where the RIS is exploited to cancel interference to the D2D links and maximize the minimum signal-to-interference-plus-noise ratio (SINR) of the device pairs and cellular users. First, we adopt a popular alternating optimization (AO) approach to solve the minimum SINR maximization problem. Then, we propose an interference cancellation (IC)-based approach whose complexity is much lower than that of the AO algorithm. We derive a representation for the RIS phase shift vector which cancels the interference to the D2D links. Based on this representation, the RIS phase shift optimization problem is transformed into an effective D2D channel optimization. We show that the AO approach can converge faster and can even give better performance when it is initialized by the proposed IC solution. We also show that for the case of a single D2D pair, the proposed IC approach can be implemented with limited feedback from the single receive device.
Submitted 6 December, 2023;
originally announced December 2023.
-
On the variants of SVM methods applied to GPR data to classify tack coat characteristics in French pavements: two experimental case studies
Authors:
Grégory Andreoli,
Amine Ihamouten,
Mai Lan Nguyen,
Yannick Fargier,
Cyrille Fauchard,
Jean-Michel Simonin,
Viktoriia Buliuk,
David Souriou,
Xavier Dérobert
Abstract:
Among the commonly used non-destructive techniques, the Ground Penetrating Radar (GPR) is one of the most widely adopted today for assessing pavement conditions in France. However, conventional radar systems and their forward processing methods have shown their limitations for the physical and geometrical characterization of very thin layers such as tack coats. In contrast, the use of Machine Learning methods applied to GPR with an inverse approach showed that it was numerically possible to identify the tack coat characteristics despite masking effects due to low time-frequency resolution noted in the raw B-scans. Thus, we propose in this paper to apply the inverse approach based on Machine Learning, already validated in previous works on numerical data, to two experimental cases with different pavement structures. The first case corresponds to a validation on known pavement structures at the Gustave Eiffel University (Nantes, France) with its pavement fatigue carousel, and the second case focuses on a new real road in the Vendée department (France). In both case studies, the performance of the SVM/SVR methods showed the efficiency of supervised learning methods in classifying and estimating the emulsion proportioning in the tack coats.
Submitted 6 December, 2023;
originally announced December 2023.
-
LLMs Accelerate Annotation for Medical Information Extraction
Authors:
Akshay Goel,
Almog Gueta,
Omry Gilon,
Chang Liu,
Sofia Erell,
Lan Huong Nguyen,
Xiaohong Hao,
Bolous Jaber,
Shashir Reddy,
Rupesh Kartha,
Jean Steiner,
Itay Laish,
Amir Feder
Abstract:
The unstructured nature of clinical notes within electronic health records often conceals vital patient-related information, making it challenging to access or interpret. To uncover this hidden information, specialized Natural Language Processing (NLP) models are required. However, training these models necessitates large amounts of labeled data, a process that is both time-consuming and costly when relying solely on human experts for annotation. In this paper, we propose an approach that combines Large Language Models (LLMs) with human expertise to create an efficient method for generating ground truth labels for medical text annotation. By utilizing LLMs in conjunction with human annotators, we significantly reduce the human annotation burden, enabling the rapid creation of labeled datasets. We rigorously evaluate our method on a medical information extraction task, demonstrating that our approach not only substantially cuts down on human intervention but also maintains high accuracy. The results highlight the potential of using LLMs to improve the utilization of unstructured clinical data, allowing for the swift deployment of tailored NLP solutions in healthcare.
Submitted 4 December, 2023;
originally announced December 2023.
-
Determining initial conditions for nonlinear hyperbolic equations with time dimensional reduction and the Carleman contraction
Authors:
Trong D. Dang,
Loc H. Nguyen,
Huong T. T. Vu
Abstract:
This paper aims to determine the initial conditions for quasi-linear hyperbolic equations that include nonlocal elements. We suggest a method where we approximate the solution of the hyperbolic equation by truncating its Fourier series in the time domain with a polynomial-exponential basis. This truncation effectively removes the time variable, transforming the problem into a system of quasi-linear elliptic equations. We refer to this technique as the "time dimensional reduction method." To numerically solve this system comprehensively without the need for an accurate initial estimate, we used the newly developed Carleman contraction principle. We show the efficiency of our method through various numerical examples. The time dimensional reduction method stands out not only for its precise solutions but also for its remarkable speed in computation.
Submitted 11 June, 2024; v1 submitted 2 December, 2023;
originally announced December 2023.
-
Constellation Shaping under Phase Noise Impairment for Sub-THz Communications
Authors:
Dileepa Marasinghe,
Le Hang Nguyen,
Jafar Mohammadi,
Yejian Chen,
Thorsten Wild,
Nandana Rajatheva
Abstract:
The large untapped spectrum in the sub-THz allows for ultra-high throughput communication to realize many seemingly impossible applications in 6G. One of the challenges in radio communications in sub-THz is the hardware impairments. Specifically, phase noise is one key hardware impairment, which is accentuated as we increase the frequency and bandwidth. Furthermore, the moderate output power of the sub-THz power amplifier demands limits on peak to average power ratio (PAPR) signal design. Single carrier frequency domain equalization (SC-FDE) has been identified as a suitable candidate for sub-THz, although some challenges such as phase noise and PAPR still remain to be tackled. In this work, we design a phase noise robust, modest PAPR SC waveform by geometrically shaping the constellation under practical conditions. We formulate the waveform optimization problem in its augmented Lagrangian form and use a back-propagation-inspired technique to obtain a constellation design that is numerically robust to phase noise, while maintaining a relatively low PAPR compared to the conventional waveforms.
Submitted 21 March, 2024; v1 submitted 21 November, 2023;
originally announced November 2023.
-
A Supervised Contrastive Learning Pretrain-Finetune Approach for Time Series
Authors:
Trang H. Tran,
Lam M. Nguyen,
Kyongmin Yeo,
Nam Nguyen,
Roman Vaculin
Abstract:
Foundation models have recently gained attention within the field of machine learning thanks to their efficiency in broad data processing. While researchers have attempted to extend this success to time series models, the main challenge is effectively extracting representations and transferring knowledge from pretraining datasets to the target finetuning dataset. To tackle this issue, we introduce a novel pretraining procedure that leverages supervised contrastive learning to distinguish features within each pretraining dataset. This pretraining phase enables a probabilistic similarity metric, which assesses the likelihood of a univariate sample being closely related to one of the pretraining datasets. Subsequently, using this similarity metric as a guide, we propose a fine-tuning procedure designed to enhance the accurate prediction of the target data by aligning it more closely with the learned dynamics of the pretraining datasets. Our experiments have shown promising results which demonstrate the efficacy of our approach.
Submitted 20 November, 2023;
originally announced November 2023.
-
Correlated Attention in Transformers for Multivariate Time Series
Authors:
Quang Minh Nguyen,
Lam M. Nguyen,
Subhro Das
Abstract:
Multivariate time series (MTS) analysis prevails in real-world applications such as finance, climate science and healthcare. The various self-attention mechanisms, the backbone of the state-of-the-art Transformer-based models, efficiently discover the temporal dependencies, yet cannot well capture the intricate cross-correlation between different features of MTS data, which inherently stems from complex dynamical systems in practice. To this end, we propose a novel correlated attention mechanism, which not only efficiently captures feature-wise dependencies, but can also be seamlessly integrated within the encoder blocks of existing well-known Transformers to gain efficiency improvement. In particular, correlated attention operates across feature channels to compute cross-covariance matrices between queries and keys with different lag values, and selectively aggregate representations at the sub-series level. This architecture facilitates automated discovery and representation learning of not only instantaneous but also lagged cross-correlations, while inherently capturing time series auto-correlation. When combined with prevalent Transformer baselines, the correlated attention mechanism constitutes a better alternative for encoder-only architectures, which are suitable for a wide range of tasks including imputation, anomaly detection and classification. Extensive experiments on the aforementioned tasks consistently underscore the advantages of the correlated attention mechanism in enhancing base Transformer models, and demonstrate our state-of-the-art results in imputation, anomaly detection and classification.
Submitted 20 November, 2023;
originally announced November 2023.
-
The three-dimensional Seiberg-Witten equations for 3/2-spinors: a compactness theorem
Authors:
Ahmad Reza Haj Saeedi Sadegh,
Minh Lam Nguyen
Abstract:
The Rarita-Schwinger-Seiberg-Witten (RS-SW) equations are defined similarly to the classical Seiberg-Witten equations, where a geometric non-Dirac-type operator, called the Rarita-Schwinger operator, replaces the Dirac operator. In dimension four, the RS-SW equation was first considered by the second-named author. The variational approach will also give us a three-dimensional version of the equations. The RS-SW equations share some features with the multiple-spinor Seiberg-Witten equations, where the moduli space of solutions could be non-compact. In this note, we prove a compactness theorem regarding the moduli space of solutions of the RS-SW equations defined on 3-manifolds.
Submitted 20 November, 2023;
originally announced November 2023.
-
Gendec: A Machine Learning-based Framework for Gender Detection from Japanese Names
Authors:
Duong Tien Pham,
Luan Thanh Nguyen
Abstract:
Every human has their own name, a fundamental aspect of their identity and cultural heritage. The name often conveys a wealth of information, including details about an individual's background, ethnicity, and, especially, their gender. By detecting gender through the analysis of names, researchers can unlock valuable insights into linguistic patterns and cultural norms, which can be applied to practical applications. Hence, this work presents a novel dataset for Japanese name gender detection comprising 64,139 full names in romaji, hiragana, and kanji forms, along with their biological genders. Moreover, we propose Gendec, a framework for gender detection from Japanese names that leverages diverse approaches, including traditional machine learning techniques and cutting-edge transfer learning models, to predict the gender associated with Japanese names accurately. Through a thorough investigation, the proposed framework is expected to be effective and serve potential applications in various domains.
Submitted 18 November, 2023;
originally announced November 2023.
-
Existence and uniqueness for the non-compact Yamabe problem of negative curvature type
Authors:
Joseph Hogg,
Luc Nguyen
Abstract:
We study existence and uniqueness results for the Yamabe problem on non-compact manifolds of negative curvature type. Our first existence and uniqueness result concerns those manifolds that are asymptotically locally hyperbolic. In this context, our result requires only a partial $C^2$ decay of the metric, namely the full decay of the metric in $C^1$ and the decay of the scalar curvature. In particular, no decay of the Ricci curvature is assumed. In our second result we establish that a local volume ratio condition, when combined with negativity of the scalar curvature at infinity, is sufficient for existence of a solution. Our volume ratio condition appears tight. This paper is based on the DPhil thesis of the first author.
Submitted 17 November, 2023;
originally announced November 2023.
-
PhoGPT: Generative Pre-training for Vietnamese
Authors:
Dat Quoc Nguyen,
Linh The Nguyen,
Chi Tran,
Dung Ngoc Nguyen,
Dinh Phung,
Hung Bui
Abstract:
We open-source a state-of-the-art 4B-parameter generative model series for Vietnamese, which includes the base pre-trained monolingual model PhoGPT-4B and its chat variant, PhoGPT-4B-Chat. The base model, PhoGPT-4B, with exactly 3.7B parameters, is pre-trained from scratch on a Vietnamese corpus of 102B tokens, with an 8192 context length, employing a vocabulary of 20480 token types. The chat variant, PhoGPT-4B-Chat, is the modeling output obtained by fine-tuning PhoGPT-4B on a dataset of 70K instructional prompts and their responses, along with an additional 290K conversations. In addition, we also demonstrate its superior performance compared to previous open-source models. Our PhoGPT models are available at: https://github.com/VinAIResearch/PhoGPT
Submitted 22 March, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
ViCLEVR: A Visual Reasoning Dataset and Hybrid Multimodal Fusion Model for Visual Question Answering in Vietnamese
Authors:
Khiem Vinh Tran,
Hao Phu Phan,
Kiet Van Nguyen,
Ngan Luu Thuy Nguyen
Abstract:
In recent years, Visual Question Answering (VQA) has gained significant attention for its diverse applications, including intelligent car assistance, aiding visually impaired individuals, and document image information retrieval using natural language queries. VQA requires effective integration of information from questions and images to generate accurate answers. Neural models for VQA have made remarkable progress on large-scale datasets, with a primary focus on resource-rich languages like English. To address this, we introduce the ViCLEVR dataset, a pioneering collection for evaluating various visual reasoning capabilities in Vietnamese while mitigating biases. The dataset comprises over 26,000 images and 30,000 question-answer pairs (QAs), each question annotated to specify the type of reasoning involved. Leveraging this dataset, we conduct a comprehensive analysis of contemporary visual reasoning systems, offering valuable insights into their strengths and limitations. Furthermore, we present PhoVIT, a comprehensive multimodal fusion that identifies objects in images based on questions. The architecture effectively employs transformers to enable simultaneous reasoning over textual and visual data, merging both modalities at an early model stage. The experimental findings demonstrate that our proposed model achieves state-of-the-art performance across four evaluation metrics. The accompanying code and dataset have been made publicly accessible at https://github.com/kvt0012/ViCLEVR. This provision seeks to stimulate advancements within the research community, fostering the development of more multimodal fusion algorithms, specifically tailored to address the nuances of low-resource languages, exemplified by Vietnamese.
Submitted 27 October, 2023;
originally announced October 2023.
-
A Method for Network Intrusion Detection Using Flow Sequence and BERT Framework
Authors:
Loc Gia Nguyen,
Kohei Watabe
Abstract:
A Network Intrusion Detection System (NIDS) is a tool that identifies potential threats to a network. Recently, different flow-based NIDS designs utilizing Machine Learning (ML) algorithms have been proposed as solutions to detect intrusions efficiently. However, conventional ML-based classifiers have not seen widespread adoption in the real world due to their poor domain adaptation capability. In this research, our goal is to explore the possibility of using sequences of flows to improve the domain adaptation capability of network intrusion detection systems. Our proposal employs natural language processing techniques and the Bidirectional Encoder Representations from Transformers (BERT) framework, which is an effective technique for modeling data with respect to its context. Early empirical results show that our approach has improved domain adaptation capability compared to previous approaches. The proposed approach provides a new research method for building a robust intrusion detection system.
Submitted 25 October, 2023;
originally announced October 2023.
-
Broadband CPW-based impedance-transformed Josephson parametric amplifier
Authors:
Bingcheng Qing,
Long B. Nguyen,
Xinyu Liu,
Hengjiang Ren,
William P. Livingston,
Noah Goss,
Ahmed Hajr,
Trevor Chistolini,
Zahra Pedramrazi,
David I. Santiago,
Jie Luo,
Irfan Siddiqi
Abstract:
Quantum-limited Josephson parametric amplifiers play a pivotal role in advancing the field of circuit quantum electrodynamics by enabling the fast and high-fidelity measurement of weak microwave signals. Therefore, it is necessary to develop robust parametric amplifiers with low noise, broad bandwidth, and reduced design complexity for microwave detection. However, current broadband parametric amplifiers either have degraded noise performance or rely on complex designs. Here, we present a device based on the broadband impedance-transformed Josephson parametric amplifier (IMPA) that integrates a horn-like coplanar waveguide (CPW) transmission line, which significantly decreases the design and fabrication complexity, while keeping comparable performance. The device shows an instantaneous bandwidth of 700(200) MHz for 15(20) dB gain with an average saturation power of -110 dBm and near quantum-limited added noise. The operating frequency can be tuned over 1.4 GHz using an external flux bias. We further demonstrate the negligible back-action from our device on a transmon qubit. The amplification performance and simplicity of our device promise its wide adaptation in quantum metrology, quantum communication, and quantum information processing.
Submitted 25 October, 2023;
originally announced October 2023.
-
Controlling spin-orbit coupling to tailor type-II Dirac bands
Authors:
Nguyen Huu Lam,
Phuong Lien Nguyen,
Byoung Ki Choi,
Trinh Thi Ly,
Ganbat Duvjir,
Tae Gyu Rhee,
Yong Jin Jo,
Tae Heon Kim,
Chris Jozwiak,
Aaron Bostwick,
Eli Rotenberg,
Younghun Hwang,
Young Jun Chang,
Jaekwang Lee,
Jungdae Kim
Abstract:
NiTe2, a type-II Dirac semimetal with strongly tilted Dirac band, has been explored extensively to understand its intriguing topological properties. Here, using density-functional theory (DFT) calculations, we report that the strength of spin-orbit coupling (SOC) in NiTe2 can be tuned by Se substitution. This results in negative shifts of the bulk Dirac point (BDP) while preserving the type-II Dirac band. Indeed, combined studies using scanning tunneling spectroscopy (STS) and angle-resolved photoemission spectroscopy (ARPES) confirm that the BDP in the NiTe2-xSex alloy moves from +0.1 eV (NiTe2) to -0.3 eV (NiTeSe) depending on the Se concentrations, indicating the effective tunability of type-II Dirac fermions. Our results demonstrate an approach to tailor the type-II Dirac band in NiTe2 by controlling the SOC strength via chalcogen substitution. This approach can be applicable to different types of topological materials.
Submitted 22 October, 2023;
originally announced October 2023.
-
An Efficient Federated Learning Framework for Training Semantic Communication System
Authors:
Loc X. Nguyen,
Huy Q. Le,
Ye Lin Tun,
Pyae Sone Aung,
Yan Kyaw Tun,
Zhu Han,
Choong Seon Hong
Abstract:
Semantic communication has emerged as a pillar for the next generation of communication systems due to its capabilities in alleviating data redundancy. Most semantic communication systems are built upon advanced deep learning models whose training performance heavily relies on data availability. Existing studies often make unrealistic assumptions of a readily accessible data source, whereas in practice, data is mainly created on the client side. Due to privacy and security concerns, the transmission of data, which is necessary for conventional centralized training schemes, is restricted. To address this challenge, we explore semantic communication in a federated learning (FL) setting that utilizes client data without leaking privacy. Additionally, we design our system to tackle the communication overhead by reducing the quantity of information delivered in each global round. In this way, we can save significant bandwidth for resource-limited devices and reduce overall network traffic. Finally, we introduce a mechanism to aggregate the global model from clients, called FedLol. Extensive simulation results demonstrate the effectiveness of our proposed technique compared to baseline methods.
Submitted 9 November, 2023; v1 submitted 19 October, 2023;
originally announced October 2023.
-
DA-TransUNet: Integrating Spatial and Channel Dual Attention with Transformer U-Net for Medical Image Segmentation
Authors:
Guanqun Sun,
Yizhi Pan,
Weikun Kong,
Zichang Xu,
Jianhua Ma,
Teeradaj Racharak,
Le-Minh Nguyen,
Junyi Xin
Abstract:
Accurate medical image segmentation is critical for disease quantification and treatment evaluation. While traditional Unet architectures and their transformer-integrated variants excel in automated segmentation tasks, they lack the ability to harness the intrinsic position and channel features of images. Existing models also struggle with parameter efficiency and computational complexity, often due to the extensive use of Transformers. To address these issues, this study proposes a novel deep medical image segmentation framework, called DA-TransUNet, aiming to integrate the Transformer and dual attention block (DA-Block) into the traditional U-shaped architecture. Unlike earlier transformer-based U-net models, DA-TransUNet utilizes Transformers and DA-Block to integrate not only global and local features, but also image-specific positional and channel features, improving the performance of medical image segmentation. By incorporating a DA-Block at the embedding layer and within each skip connection layer, we substantially enhance feature extraction capabilities and improve the efficiency of the encoder-decoder structure. DA-TransUNet demonstrates superior performance in medical image segmentation tasks, consistently outperforming state-of-the-art techniques across multiple datasets. In summary, DA-TransUNet offers a significant advancement in medical image segmentation, providing an effective and powerful alternative to existing techniques. Our architecture stands out for its ability to improve segmentation accuracy, thereby advancing the field of automated medical image diagnostics. The codes and parameters of our model will be publicly available at https://github.com/SUN-1024/DA-TransUnet.
Submitted 14 November, 2023; v1 submitted 19 October, 2023;
originally announced October 2023.