-
Photoemission of spin-polarized electrons from aligned grains and chiral symmetry breaking
Authors:
Thiem Hoang
Abstract:
The unique biosignature of life on Earth is the homochirality of organic compounds such as amino acids, proteins, and sugars. The origin of this homochirality has remained a mystery for over a century. While high-energy spin-polarized (spin-up or spin-down) electrons (SPEs) from the $β$ decay of radioactive nuclei discovered by Lee and Yang (1956) and Wu et al. (1957) have been proposed as a potential source of symmetry breaking, their exact role in homochirality is much debated. Here we suggest magnetically aligned dust grains as a new source of SPEs due to photoemission of electrons having aligned spins by the Barnett effect. For the interstellar UV radiation field of strength $G_{\rm UV}$, we find that the SPE emission rate is $Γ_{\rm pe}^{\rm SPE}\sim 10^{-14}G_{\rm UV}$ electrons per second per H, the fraction of spin-polarized to total photoelectrons is $\sim 10\%$, and the SPE yield (photoelectron number per UV photon) can reach $\sim 1\%$, using the modern theory of grain alignment. Low-energy SPEs from aligned grains would cause chiral symmetry breaking of interstellar chiral molecules due to spin-selective (dipole-dipole) interactions. Finally, we suggest magnetically aligned grains as chiral agents that facilitate and enrich the chiral asymmetry of chiral molecules. Our proposed mechanism might explain the detection of chiral asymmetry in the ISM, comets, and meteorites due to the ubiquitous UV radiation and magnetically aligned grains, paving the way for understanding the origin and distribution of life in the universe. This mechanism based on magnetic grain alignment implies a role for magnetic fields in chiral symmetry breaking.
Submitted 26 December, 2023;
originally announced December 2023.
-
Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD
Authors:
C. Leloup,
G. Patanchon,
J. Errard,
C. Franceschet,
J. E. Gudmundsson,
S. Henrot-Versillé,
H. Imada,
H. Ishino,
T. Matsumura,
G. Puglisi,
W. Wang,
A. Adler,
J. Aumont,
R. Aurlien,
C. Baccigalupi,
M. Ballardini,
A. J. Banday,
R. B. Barreiro,
N. Bartolo,
A. Basyrov,
M. Bersanelli,
D. Blinov,
M. Bortolami,
T. Brinckmann,
P. Campeti
, et al. (86 additional authors not shown)
Abstract:
We present a study of the impact of an uncertainty in the beam far side-lobe knowledge on the measurement of the Cosmic Microwave Background $B$-mode signal at large scales. This uncertainty is expected to be one of the main sources of systematic effects in future CMB observations. Because it is crucial for all-sky survey missions to take into account the interplay between beam systematic effects and all the data analysis steps, the primary goal of this paper is to provide the methodology to carry out the end-to-end study of their effect for a space-borne CMB polarization experiment, up to the cosmological results in the form of a bias $δr$ on the tensor-to-scalar ratio $r$. LiteBIRD is dedicated to measuring CMB primordial $B$ modes by reaching a sensitivity of $σ\left( r \right) \leq 10^{-3}$ assuming $r=0$. As a demonstration of our framework, we derive the relationship between the knowledge of the beam far side-lobes and the tentatively allocated error budget under given assumptions on design, simulation and component separation method. We assume no mitigation of the far side-lobes effect at any stage of the analysis pipeline. We show that $δr$ is mostly due to the integrated fractional power difference between the estimated beams and the true beams in the far side-lobes region, with little dependence on the actual shape of the beams, for low enough $δr$. Under our set of assumptions, in particular considering the specific foreground cleaning method we used, we find that the integrated fractional power in the far side-lobes should be known at a level as tight as $\sim 10^{-4}$, to achieve the required limit on the bias $δr < 1.9 \times 10^{-5}$. The framework and tools developed for this study can be easily adapted to provide requirements under different design, data analysis frameworks and for other future space-borne experiments beyond LiteBIRD.
Submitted 14 December, 2023;
originally announced December 2023.
-
Enabling End-to-End Secure Federated Learning in Biomedical Research on Heterogeneous Computing Environments with APPFLx
Authors:
Trung-Hieu Hoang,
Jordan Fuhrman,
Ravi Madduri,
Miao Li,
Pranshu Chaturvedi,
Zilinghan Li,
Kibaek Kim,
Minseok Ryu,
Ryan Chard,
E. A. Huerta,
Maryellen Giger
Abstract:
Facilitating large-scale, cross-institutional collaboration in biomedical machine learning projects requires a trustworthy and resilient federated learning (FL) environment to ensure that sensitive information such as protected health information is kept confidential. In this work, we introduce APPFLx, a low-code FL framework that enables the easy setup, configuration, and running of FL experiments across organizational and administrative boundaries while providing secure end-to-end communication, privacy-preserving functionality, and identity management. APPFLx is completely agnostic to the underlying computational infrastructure of participating clients. We demonstrate the capability of APPFLx as an easy-to-use framework for accelerating biomedical studies across institutions and healthcare systems while maintaining the protection of private medical data in two case studies: (1) predicting participant age from electrocardiogram (ECG) waveforms, and (2) detecting COVID-19 disease from chest radiographs. These experiments were performed securely across heterogeneous compute resources, including a mixture of on-premise high-performance computing and cloud computing, and highlight the role of federated learning in improving model generalizability and performance when aggregating data from multiple healthcare systems. Finally, we demonstrate that APPFLx serves as a convenient and easy-to-use framework for accelerating biomedical studies across institutions and healthcare systems while maintaining the protection of private medical data.
Submitted 14 December, 2023;
originally announced December 2023.
-
Amino acid characteristics in protein native state structures
Authors:
Tatjana Škrbić,
Achille Giacometti,
Trinh X. Hoang,
Amos Maritan,
Jayanth R. Banavar
Abstract:
We present a geometrical analysis of the protrusion statistics of side chains in more than 4,000 high-resolution protein structures. We employ a coarse-grained representation of the protein backbone viewed as a linear chain of Cα atoms and consider just the heavy atoms of the side chains. We study the large variety of behaviors of the amino acids based on both rudimentary structural chemistry and geometry. Our geometrical analysis uses a backbone Frenet coordinate system for the common study of all amino acids. Our analysis underscores the richness of the repertoire of amino acids that is available to nature to design protein sequences that fit within the putative native state folds.
Submitted 26 January, 2024; v1 submitted 12 December, 2023;
originally announced December 2023.
-
III. Geometrical framework for thinking about globular proteins: turns in proteins
Authors:
Tatjana Škrbić,
Achille Giacometti,
Trinh X. Hoang,
Amos Maritan,
Jayanth R. Banavar
Abstract:
We have shown recently that the notion of poking pairwise interactions along a chain provides a unifying framework for understanding the formation of both secondary and tertiary protein structure based on symmetry and geometry. $α$-helices and $β$-sheets are found to be special geometries that have systematic poking contacts in a repetitive manner with the contacts being local along the $α$-helix and non-local along a pair of adjacent strands within a $β$-sheet. Pairwise poking interactions also govern tertiary structure formation, but they are weaker and there are no special geometrical constraints as in secondary structure formation. Here we demonstrate that protein turns, the most prevalent non-repetitive structural element in proteins, are instances of local (as in $α$-helices) and isolated (non-repetitive) poking pairwise contacts for which the geometrical constraints are partially relaxed. This simple and purely geometrical definition of protein turns (also sometimes known as reverse turns, $β$-turns, $β$-bends, hairpin bends, $3_{10}$ bends, kinks, widgets, ...) provides a simple framework for unifying them. We present the results of a systematic analysis and identify their structural classes as well as their respective amino acid preferences.
Submitted 12 December, 2023;
originally announced December 2023.
-
Securing MIMO Wiretap Channel with Learning-Based Friendly Jamming under Imperfect CSI
Authors:
Bui Minh Tuan,
Diep N. Nguyen,
Nguyen Linh Trung,
Van-Dinh Nguyen,
Nguyen Van Huynh,
Dinh Thai Hoang,
Marwan Krunz,
Eryk Dutkiewicz
Abstract:
Wireless communications are particularly vulnerable to eavesdropping attacks due to their broadcast nature. To effectively deal with eavesdroppers, existing security techniques usually require accurate channel state information (CSI), e.g., for friendly jamming (FJ), and/or additional computing resources at transceivers, e.g., cryptography-based solutions, which unfortunately may not be feasible in practice. This challenge is even more acute in low-end IoT devices. We thus introduce a novel deep learning-based FJ framework that can effectively defeat eavesdropping attacks with imperfect CSI and even without CSI of legitimate channels. In particular, we first develop an autoencoder-based communication architecture with FJ, namely AEFJ, to jointly maximize the secrecy rate and minimize the block error rate at the receiver without requiring perfect CSI of the legitimate channels. In addition, to deal with the case without CSI, we leverage the mutual information neural estimation (MINE) concept and design a MINE-based FJ scheme that can achieve comparable security performance to the conventional FJ methods that require perfect CSI. Extensive simulations in a multiple-input multiple-output (MIMO) system demonstrate that our proposed solution can effectively deal with eavesdropping attacks in various settings. Moreover, the proposed framework can seamlessly integrate MIMO security and detection tasks into a unified end-to-end learning process. This integrated approach can significantly maximize the throughput and minimize the block error rate, offering a good solution for enhancing communication security in wireless communication systems.
Submitted 12 December, 2023;
originally announced December 2023.
-
Improving the Robustness of 3D Human Pose Estimation: A Benchmark and Learning from Noisy Input
Authors:
Trung-Hieu Hoang,
Mona Zehni,
Huy Phan,
Duc Minh Vo,
Minh N. Do
Abstract:
Despite the promising performance of current 3D human pose estimation techniques, understanding and enhancing their generalization on challenging in-the-wild videos remain an open problem. In this work, we focus on the robustness of 2D-to-3D pose lifters. To this end, we develop two benchmark datasets, namely Human3.6M-C and HumanEva-I-C, to examine the robustness of video-based 3D pose lifters to a wide range of common video corruptions including temporary occlusion, motion blur, and pixel-level noise. We observe the poor generalization of state-of-the-art 3D pose lifters in the presence of corruption and establish two techniques to tackle this issue. First, we introduce Temporal Additive Gaussian Noise (TAGN) as a simple yet effective 2D input pose data augmentation. Additionally, to incorporate the confidence scores output by the 2D pose detectors, we design a confidence-aware convolution (CA-Conv) block. Extensively tested on corrupted videos, the proposed strategies consistently boost the robustness of 3D pose lifters and serve as new baselines for future research.
Submitted 15 April, 2024; v1 submitted 11 December, 2023;
originally announced December 2023.
-
Generative AI for Physical Layer Communications: A Survey
Authors:
Nguyen Van Huynh,
Jiacheng Wang,
Hongyang Du,
Dinh Thai Hoang,
Dusit Niyato,
Diep N. Nguyen,
Dong In Kim,
Khaled B. Letaief
Abstract:
The recent evolution of generative artificial intelligence (GAI) leads to the emergence of groundbreaking applications such as ChatGPT, which not only enhances the efficiency of digital content production, such as text, audio, video, or even network traffic data, but also enriches its diversity. Beyond digital content creation, GAI's capability in analyzing complex data distributions offers great potential for wireless communications, particularly amidst a rapid expansion of new physical layer communication technologies. For example, the diffusion model can learn input signal distributions and use them to improve the channel estimation accuracy, while the variational autoencoder can model channel distribution and infer latent variables for blind channel equalization. Therefore, this paper presents a comprehensive investigation of GAI's applications for communications at the physical layer, ranging from traditional issues, including signal classification, channel estimation, and equalization, to emerging topics, such as intelligent reflecting surfaces and joint source channel coding. We also compare GAI-enabled physical layer communications with those supported by traditional AI, highlighting GAI's inherent capabilities and unique contributions in these areas. Finally, the paper discusses open issues and proposes several future research directions, laying a foundation for further exploration and advancement of GAI in physical layer communications.
Submitted 9 December, 2023;
originally announced December 2023.
-
Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection
Authors:
Tuan Hoang,
Santu Rana,
Sunil Gupta,
Svetha Venkatesh
Abstract:
Recent data-privacy laws have sparked interest in machine unlearning, which involves removing the effect of specific training samples from a learnt model as if they were never present in the original training dataset. The challenge of machine unlearning is to discard information about the ``forget'' data in the learnt model without altering the knowledge about the remaining dataset and to do so more efficiently than the naive retraining approach. To achieve this, we adopt a projected-gradient based learning method, named Projected-Gradient Unlearning (PGU), in which the model takes steps in the orthogonal direction to the gradient subspaces deemed unimportant for the retaining dataset, so that its knowledge is preserved. By utilizing Stochastic Gradient Descent (SGD) to update the model weights, our method can efficiently scale to any model and dataset size. We provide empirical evidence to demonstrate that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics even when the training dataset is no longer accessible. Our code is available at https://github.com/hnanhtuan/projected_gradient_unlearning.
Submitted 7 December, 2023;
originally announced December 2023.
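The abstract above describes taking update steps orthogonal to a chosen gradient subspace. A minimal Python sketch of that single projection operation is given here, assuming the subspace is supplied as a matrix with orthonormal columns. The function name `project_out`, the random data, and the way the subspace is built are illustrative assumptions, not the authors' implementation (see their repository for that).

```python
import numpy as np

def project_out(grad, basis):
    """Remove from `grad` its component inside span(basis).

    `basis` is assumed to have orthonormal columns, so
    basis @ basis.T is the orthogonal projector onto the subspace.
    """
    return grad - basis @ (basis.T @ grad)

# Hypothetical data: a random 2-D subspace of R^4 and a random gradient.
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.normal(size=(4, 2)))  # orthonormal columns, shape (4, 2)
grad = rng.normal(size=4)

step = project_out(grad, basis)  # update direction orthogonal to the subspace
```

An SGD update would then move the weights along `step` instead of `grad`, which leaves the component of the weights inside the protected subspace unchanged.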
-
Constrained Twin Variational Auto-Encoder for Intrusion Detection in IoT Systems
Authors:
Phai Vu Dinh,
Quang Uy Nguyen,
Dinh Thai Hoang,
Diep N. Nguyen,
Son Pham Bao,
Eryk Dutkiewicz
Abstract:
Intrusion detection systems (IDSs) play a critical role in protecting billions of IoT devices from malicious attacks. However, the IDSs for IoT devices face inherent challenges of IoT systems, including the heterogeneity of IoT data/devices, the high dimensionality of training data, and the imbalanced data. Moreover, the deployment of IDSs on IoT systems is challenging, and sometimes impossible, due to the limited resources such as memory/storage and computing capability of typical IoT devices. To tackle these challenges, this article proposes a novel deep neural network/architecture called Constrained Twin Variational Auto-Encoder (CTVAE) that can feed classifiers of IDSs with more separable/distinguishable and lower-dimensional representation data. Additionally, in comparison to the state-of-the-art neural networks used in IDSs, CTVAE requires less memory/storage and computing power, hence making it more suitable for IoT IDSs. Extensive experiments with the 11 most popular IoT botnet datasets show that CTVAE can improve accuracy and F-score in attack detection by around 1% compared to the state-of-the-art machine learning and representation learning methods, whilst the running time for attack detection is lower than 2E-6 seconds and the model size is lower than 1 MB. We also further investigate various characteristics of CTVAE in the latent space and in the reconstruction representation to demonstrate its efficacy compared with current well-known methods.
Submitted 4 December, 2023;
originally announced December 2023.
-
Nonstandard finite difference methods preserving general quadratic Lyapunov functions
Authors:
Manh Tuan Hoang
Abstract:
In this work, we consider a class of dynamical systems described by ordinary differential equations under the assumption that the global asymptotic stability (GAS) of equilibrium points is established based on the Lyapunov stability theory with the help of quadratic Lyapunov functions. We employ Mickens' methodology to construct a family of explicit nonstandard finite difference (NSFD) methods preserving any given quadratic Lyapunov function $V$, i.e. they admit $V$ as a discrete Lyapunov function. Here, the proposed NSFD methods are derived from a novel non-local approximation for the zero vector function.
Through rigorous mathematical analysis, we show that the constructed NSFD methods have the ability to preserve any given quadratic Lyapunov function regardless of the values of the step size. As an important consequence, they are dynamically consistent with respect to the GAS of continuous-time dynamical systems. On the other hand, the positivity of the proposed NSFD methods is investigated. It is proved that they can also preserve the positivity of solutions of continuous-time dynamical systems.
Finally, the theoretical findings are supported by a series of illustrative numerical experiments, in which advantages of the NSFD methods are demonstrated.
Submitted 3 December, 2023;
originally announced December 2023.
-
Navigating Privacy and Copyright Challenges Across the Data Lifecycle of Generative AI
Authors:
Dawen Zhang,
Boming Xia,
Yue Liu,
Xiwei Xu,
Thong Hoang,
Zhenchang Xing,
Mark Staples,
Qinghua Lu,
Liming Zhu
Abstract:
The advent of Generative AI has marked a significant milestone in artificial intelligence, demonstrating remarkable capabilities in generating realistic images, texts, and data patterns. However, these advancements come with heightened concerns over data privacy and copyright infringement, primarily due to the reliance on vast datasets for model training. Traditional approaches like differential privacy, machine unlearning, and data poisoning only offer fragmented solutions to these complex issues. Our paper delves into the multifaceted challenges of privacy and copyright protection within the data lifecycle. We advocate for integrated approaches that combine technical innovation with ethical foresight, holistically addressing these concerns by investigating and devising solutions that are informed by the lifecycle perspective. This work aims to catalyze a broader discussion and inspire concerted efforts towards data privacy and copyright integrity in Generative AI.
Submitted 10 January, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Persistent Test-time Adaptation in Episodic Testing Scenarios
Authors:
Trung-Hieu Hoang,
Duc Minh Vo,
Minh N. Do
Abstract:
Current test-time adaptation (TTA) approaches aim to adapt to environments that change continuously. Yet, when the environments not only change but also recur in a correlated manner over time, such as in the case of day-night surveillance cameras, it is unclear whether the adaptability of these methods is sustained after a long run. This study aims to examine the error accumulation of TTA models when they are repeatedly exposed to previous testing environments, proposing a novel testing setting called episodic TTA. To study this phenomenon, we design a simulation of the TTA process on a simple yet representative $ε$-perturbed Gaussian Mixture Model Classifier and derive theoretical findings revealing the dataset- and algorithm-dependent factors that contribute to the gradual degeneration of TTA methods through time. Our investigation has led us to propose a method, named persistent TTA (PeTTA). PeTTA senses the model divergence towards collapse and adjusts the adaptation strategy of TTA, striking a balance between two primary objectives: adaptation and preventing model collapse. The stability of PeTTA in the face of episodic TTA scenarios has been demonstrated through a set of comprehensive experiments on various benchmarks.
Submitted 16 January, 2024; v1 submitted 29 November, 2023;
originally announced November 2023.
-
A generalized nonstandard finite difference method for a class of autonomous dynamical systems and its applications
Authors:
Manh Tuan Hoang
Abstract:
In this work, a class of continuous-time autonomous dynamical systems describing many important phenomena and processes arising in real-world applications is considered. We apply the nonstandard finite difference (NSFD) methodology proposed by Mickens to design a generalized NSFD method for the dynamical system models under consideration. This method is constructed based on a novel non-local approximation for the right-side functions of the dynamical systems. It is proved by rigorous mathematical analyses that the NSFD method is dynamically consistent with respect to positivity, asymptotic stability and three classes of conservation laws, including direct conservation, generalized conservation and sub-conservation laws. Furthermore, the NSFD method is easy to implement and can be applied to solve a broad range of mathematical models arising in real life. Finally, a set of numerical experiments is performed to illustrate the theoretical findings and to show advantages of the proposed NSFD method.
Submitted 27 November, 2023;
originally announced November 2023.
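For readers unfamiliar with the NSFD idea referred to in the abstract above, a textbook illustration on the logistic equation $y' = y(1-y)$ may help. This is a minimal Python sketch of a classic Mickens-type scheme and is not the generalized method of the paper. It assumes the non-local approximation $y^2 \approx y_n y_{n+1}$ and the denominator function $φ(h) = e^h - 1$; the function names are illustrative.

```python
import math

def nsfd_logistic_step(y, h):
    """One NSFD step for y' = y * (1 - y) with step size h > 0."""
    phi = math.exp(h) - 1.0  # nonstandard denominator function replacing h
    # Non-local approximation: y**2 is replaced by y_n * y_{n+1}, so that
    # (y_{n+1} - y_n) / phi = y_n - y_n * y_{n+1}, solved for y_{n+1}.
    return y * (1.0 + phi) / (1.0 + phi * y)

def integrate(y0, h, steps):
    """Apply the NSFD step `steps` times starting from y0."""
    y = y0
    for _ in range(steps):
        y = nsfd_logistic_step(y, h)
    return y
```

For any $y_0 > 0$ and any $h > 0$ the update is a ratio of positive quantities, so the iterates stay positive and approach the stable equilibrium $y = 1$ even for large steps, which is the kind of dynamic consistency the abstract refers to. For this particular equation and choice of $φ$, the scheme reproduces the exact solution at the grid points.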
-
Breaking Boundaries: Balancing Performance and Robustness in Deep Wireless Traffic Forecasting
Authors:
Romain Ilbert,
Thai V. Hoang,
Zonghua Zhang,
Themis Palpanas
Abstract:
Balancing the trade-off between accuracy and robustness is a long-standing challenge in time series forecasting. While most existing robust algorithms have achieved certain suboptimal performance on clean data, sustaining the same performance level in the presence of data perturbations remains extremely hard. In this paper, we study a wide array of perturbation scenarios and propose novel defense mechanisms against adversarial attacks using real-world telecom data. We compare our strategy against two existing adversarial training algorithms under a range of maximal allowed perturbations, defined using $\ell_{\infty}$-norm, $\in [0.1,0.4]$. Our findings reveal that our hybrid strategy, which is composed of a classifier to detect adversarial examples, a denoiser to eliminate noise from the perturbed data samples, and a standard forecaster, achieves the best performance on both clean and perturbed data. Our optimal model can retain up to $92.02\%$ of the performance of the original forecasting model in terms of Mean Squared Error (MSE) on clean data, while being more robust than the standard adversarially trained models on perturbed data. Its MSE is 2.71$\times$ and 2.51$\times$ lower than those of the compared methods on normal and perturbed data, respectively. In addition, the components of our models can be trained in parallel, resulting in better computational efficiency. Our results indicate that we can optimally balance the trade-off between the performance and robustness of forecasting models by improving the classifier and denoiser, even in the presence of sophisticated and destructive poisoning attacks.
Submitted 28 November, 2023; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Depth and regularity of tableau ideals
Authors:
Do Trong Hoang,
Thanh Vu
Abstract:
We compute the depth and regularity of ideals associated with arbitrary fillings of positive integers to a Young diagram, called the tableau ideals.
Submitted 13 November, 2023;
originally announced November 2023.
-
Utilizing polydispersity in composite fibrous based sound absorbing materials
Authors:
Quang Vu Tran,
Camille Perrot,
Raymond Panneton,
Minh Tan Hoang,
Ludovic Dejaeger,
Valerie Marcel,
Mathieu Jouve
Abstract:
The distribution of fiber diameters plays a crucial role in the transport and sound absorbing properties of three-dimensional random fibrous (3D-RF) composites. Conventionally, volume-weighted averaging of fiber diameters has been utilized as an appropriate microstructural descriptor to predict the static viscous permeability of 3D-RF composites. However, the long wavelength acoustical properties of 3D-RF composites are also sensitive to the smallest fibers; this is particularly true in the high-frequency regime. In our recent research, we demonstrated that an inverse volume-weighted averaging of fiber diameters can effectively serve as a complementary microstructural descriptor to capture the high-frequency behavior of polydisperse fibrous media. In the present work, we review the identification of two representative volume elements (RVEs) which relies on the reconstruction of 3D-RF composites having volume-weighted and inverse volume-weighted averaged fiber diameters in the low-frequency and high-frequency regimes, respectively. We examine the implication of such a weighting procedure on the transport and sound absorbing properties of polydisperse fibrous media, highlighting their potential advantages. Furthermore, we discuss the challenges associated with this research field. Finally, we provide a brief perspective of the future directions and opportunities for advancing this area of study, aiming to overcome challenges and extend the benefits of employing polydispersity as a new lever for the optimization of 3D-RF composites in sound-absorbing materials.
Submitted 12 November, 2023;
originally announced November 2023.
-
Latent Task-Specific Graph Network Simulators
Authors:
Philipp Dahlinger,
Niklas Freymuth,
Michael Volpp,
Tai Hoang,
Gerhard Neumann
Abstract:
Simulating dynamic physical interactions is a critical challenge across multiple scientific domains, with applications ranging from robotics to material science. For mesh-based simulations, Graph Network Simulators (GNSs) pose an efficient alternative to traditional physics-based simulators. Their inherent differentiability and speed make them particularly well-suited for inverse design problems. Yet, adapting to new tasks from limited available data is an important aspect for real-world applications that current methods struggle with. We frame mesh-based simulation as a meta-learning problem and use a recent Bayesian meta-learning method to improve GNSs' adaptability to new scenarios by leveraging context data and handling uncertainties. Our approach, latent task-specific graph network simulator, uses non-amortized task posterior approximations to sample latent descriptions of unknown system properties. Additionally, we leverage movement primitives for efficient full trajectory prediction, effectively addressing the issue of accumulating errors encountered by previous auto-regressive methods. We validate the effectiveness of our approach through various experiments, performing on par with or better than established baseline methods. Movement primitives further allow us to accommodate various types of context data, as demonstrated through the utilization of point clouds during inference. By combining GNSs with meta-learning, we bring them closer to real-world applicability, particularly in scenarios with smaller datasets.
Submitted 9 November, 2023;
originally announced November 2023.
-
Reconstructing Human Pose from Inertial Measurements: A Generative Model-based Compressive Sensing Approach
Authors:
Nguyen Quang Hieu,
Dinh Thai Hoang,
Diep N. Nguyen,
Mohammad Abu Alsheikh
Abstract:
The ability to sense, localize, and estimate the 3D position and orientation of the human body is critical in virtual reality (VR) and extended reality (XR) applications. This becomes more important and challenging with the deployment of VR/XR applications over the next generation of wireless systems such as 5G and beyond. In this paper, we propose a novel framework that can reconstruct the 3D human body pose of the user given sparse measurements from Inertial Measurement Unit (IMU) sensors over a noisy wireless environment. Specifically, our framework enables reliable transmission of compressed IMU signals through noisy wireless channels and effective recovery of such signals at the receiver, e.g., an edge server. This task is very challenging due to the constraints of transmit power, recovery accuracy, and recovery latency. To address these challenges, we first develop a deep generative model at the receiver to recover the data from linear measurements of IMU signals. The linear measurements of the IMU signals are obtained by a linear projection with a measurement matrix based on the compressive sensing theory. The key to the success of our framework lies in the novel design of the measurement matrix at the transmitter, which can not only satisfy power constraints for the IMU devices but also obtain a highly accurate recovery for the IMU signals at the receiver. This can be achieved by extending the set-restricted eigenvalue condition of the measurement matrix and combining it with an upper bound for the power transmission constraint. Our framework can achieve robust performance for recovering 3D human poses from noisy compressed IMU signals. Additionally, our pre-trained deep generative model achieves signal reconstruction accuracy comparable to an optimization-based approach, i.e., Lasso, but is an order of magnitude faster.
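The measurement-and-recovery pipeline described in this abstract can be illustrated with a small sketch. It assumes a random Gaussian measurement matrix and a signal that is sparse in the canonical basis, and it recovers the signal with ISTA applied to the Lasso objective (the optimization-based baseline the abstract mentions). It does not reproduce the paper's deep generative model or its power-constrained measurement-matrix design; all names and parameter values (`ista`, `lam`, the dimensions) are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1."""
    r = A @ x - y
    return 0.5 * float(r @ r) + lam * float(np.abs(x).sum())

def ista(A, y, lam, n_iter=200):
    """Minimize the Lasso objective with ISTA, starting from zero."""
    # Step size 1/L, where L is the squared spectral norm of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

rng = np.random.default_rng(0)
n, m, k = 120, 40, 4                      # signal length, measurements, non-zeros
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)

A = rng.normal(size=(m, n)) / np.sqrt(m)  # assumed Gaussian measurement matrix
y = A @ x_true + 0.01 * rng.normal(size=m)  # compressed, noisy measurements

lam = 0.01
x_hat = ista(A, y, lam)
```

With a step size of 1/L, each ISTA iteration cannot increase the Lasso objective, which makes it a convenient reference solver, though an iterative one. This is the kind of optimization-based recovery that the abstract reports its pre-trained generative model matching in accuracy at much lower latency.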
Submitted 12 May, 2024; v1 submitted 31 October, 2023;
originally announced October 2023.
-
Pre-trained Recommender Systems: A Causal Debiasing Perspective
Authors:
Ziqian Lin,
Hao Ding,
Nghia Trong Hoang,
Branislav Kveton,
Anoop Deoras,
Hao Wang
Abstract:
Recent studies on pre-trained vision/language models have demonstrated the practical benefit of a new, promising solution-building paradigm in AI where models can be pre-trained on broad data describing a generic task space and then adapted successfully to solve a wide range of downstream tasks, even when training data is severely limited (e.g., in zero- or few-shot learning scenarios). Inspired by such progress, we investigate in this paper the possibilities and challenges of adapting such a paradigm to the context of recommender systems, which is less investigated from the perspective of pre-trained models. In particular, we propose to develop a generic recommender that captures universal interaction patterns by training on generic user-item interaction data extracted from different domains, which can then be quickly adapted to improve few-shot learning performance in unseen new domains (with limited data).
However, unlike vision/language data which share strong conformity in the semantic space, universal patterns underlying recommendation data collected across different domains (e.g., different countries or different E-commerce platforms) are often occluded by both in-domain and cross-domain biases implicitly imposed by the cultural differences in their user and item bases, as well as their uses of different e-commerce platforms. As shown in our experiments, such heterogeneous biases in the data tend to hinder the effectiveness of the pre-trained model. To address this challenge, we further introduce and formalize a causal debiasing perspective, which is substantiated via a hierarchical Bayesian deep learning model, named PreRec. Our empirical studies on real-world data show that the proposed model could significantly improve the recommendation performance in zero- and few-shot learning settings under both cross-market and cross-platform scenarios.
Submitted 8 January, 2024; v1 submitted 29 October, 2023;
originally announced October 2023.
-
IDENAS: Internal Dependency Exploration for Neural Architecture Search
Authors:
Anh T. Hoang,
Zsolt J. Viharos
Abstract:
Machine learning is a powerful tool for extracting valuable information and making various predictions from diverse datasets. Traditional algorithms rely on well-defined input and output variables; however, there are scenarios where the distinction between the input and output variables and the underlying, associated (input and output) layers of the model is unknown. Neural Architecture Search (NAS) and Feature Selection have emerged as promising solutions in such scenarios. This research proposes IDENAS, an Internal Dependency-based Exploration for Neural Architecture Search, integrating NAS with feature selection. The methodology explores internal dependencies in the complete parameter space for classification involving both 1D sensor and 2D image data. IDENAS employs a modified encoder-decoder model and the Sequential Forward Search (SFS) algorithm, combining input-output configuration search with embedded feature selection. Experimental results demonstrate IDENAS's superior performance in comparison to other algorithms, showcasing its effectiveness in model development pipelines and automated machine learning. On average, IDENAS achieved significant modelling improvements, underscoring its significant contribution to advancing the state-of-the-art in neural architecture search and feature selection integration.
Submitted 26 October, 2023;
originally announced October 2023.
-
Probing 3D magnetic fields using thermal dust polarization and grain alignment theory
Authors:
Thiem Hoang,
Bao Truong
Abstract:
Magnetic fields are ubiquitous in the universe and are thought to play an important role in various astrophysical processes. Polarization of thermal dust emission from dust grains aligned with the magnetic field is widely used to measure the two-dimensional magnetic field projected onto the plane of the sky (POS), but the component along the line of sight (LOS) is not yet reliably constrained with dust polarization. Here, we introduce a new method to infer three-dimensional (3D) magnetic fields using thermal dust polarization and grain alignment physics. We first develop a physical model of thermal dust polarization using the modern grain alignment theory based on the magnetically enhanced radiative torque (MRAT) alignment theory. We then test this model with synthetic observations of magnetohydrodynamic (MHD) simulations of a filamentary cloud with our updated POLARIS code. Combining the tested physical polarization model with synthetic polarization, we show that the B-field inclination angle can be accurately constrained by the polarization degree from synthetic observations. Compared to the true 3D magnetic fields, our method with grain alignment is more accurate than the previous methods that assume uniform grain alignment. This new technique paves the way for tracing 3D B-fields using thermal dust polarization and grain alignment theory and for constraining dust properties and grain alignment physics.
Submitted 19 February, 2024; v1 submitted 25 October, 2023;
originally announced October 2023.
-
JWST MIRI/MRS Observations and Spectral Models of the Under-luminous Type Ia Supernova 2022xkq
Authors:
J. M. DerKacy,
C. Ashall,
P. Hoeflich,
E. Baron,
M. Shahbandeh,
B. J. Shappee,
J. Andrews,
D. Baade,
E. F Balangan,
K. A. Bostroem,
P. J. Brown,
C. R. Burns,
A. Burrow,
A. Cikota,
T. de Jaeger,
A. Do,
Y. Dong,
I. Dominguez,
O. Fox,
L. Galbany,
E. T. Hoang,
E. Y. Hsiao,
D. Janzen,
J. E. Jencson,
K. Krisciunas
, et al. (22 additional authors not shown)
Abstract:
We present a JWST mid-infrared spectrum of the under-luminous Type Ia Supernova (SN Ia) 2022xkq, obtained with the medium-resolution spectrometer on the Mid-Infrared Instrument (MIRI) $\sim130$ days post-explosion. We identify the first MIR lines beyond 14 $μ$m in SN Ia observations. We find features unique to under-luminous SNe Ia, including: isolated emission of stable Ni, strong blends of [Ti II], and large ratios of singly ionized to doubly ionized species in both [Ar] and [Co]. Comparisons to normal-luminosity SNe Ia spectra at similar phases show a tentative trend between the width of the [Co III] 11.888 $μ$m feature and the SN light curve shape. Using non-LTE-multi-dimensional radiation hydro simulations and the observed electron capture elements we constrain the mass of the exploding white dwarf. The best-fitting model shows that SN 2022xkq is consistent with an off-center delayed-detonation explosion of a near-Chandrasekhar mass WD (M$_{\rm ej}$ $\approx 1.37$ M$_{\odot}$) of high-central density ($ρ_c \geq 2.0\times10^{9}$ g cm$^{-3}$) seen equator on, which produced M($^{56}$Ni) $= 0.324$ M$_{\odot}$ and M($^{58}$Ni) $\geq 0.06$ M$_{\odot}$. The observed line widths are consistent with the overall abundance distribution; and the narrow stable Ni lines indicate little to no mixing in the central regions, favoring central ignition of sub-sonic carbon burning followed by an off-center DDT beginning at a single point. Additional observations may further constrain the physics revealing the presence of additional species including Cr and Mn. Our work demonstrates the power of using the full coverage of MIRI in combination with detailed modeling to elucidate the physics of SNe Ia at a level not previously possible.
Submitted 7 November, 2023; v1 submitted 13 October, 2023;
originally announced October 2023.
-
Tag Your Fish in the Broken Net: A Responsible Web Framework for Protecting Online Privacy and Copyright
Authors:
Dawen Zhang,
Boming Xia,
Yue Liu,
Xiwei Xu,
Thong Hoang,
Zhenchang Xing,
Mark Staples,
Qinghua Lu,
Liming Zhu
Abstract:
The World Wide Web, a ubiquitous source of information, serves as a primary resource for countless individuals, amassing a vast amount of data from global internet users. However, this online data, when scraped, indexed, and utilized for activities like web crawling, search engine indexing, and, notably, AI model training, often diverges from the original intent of its contributors. The ascent of Generative AI has accentuated concerns surrounding data privacy and copyright infringement. Regrettably, the web's current framework falls short in facilitating pivotal actions like consent withdrawal or data copyright claims. While some companies offer voluntary measures, such as crawler access restrictions, these often remain inaccessible to individual users. To empower online users to exercise their rights and enable companies to adhere to regulations, this paper introduces a user-controlled consent tagging framework for online data. It leverages the extensibility of HTTP and HTML in conjunction with the decentralized nature of distributed ledger technology. With this framework, users have the ability to tag their online data at the time of transmission, and subsequently, they can track and request the withdrawal of consent for their data from the data holders. A proof-of-concept system is implemented, demonstrating the feasibility of the framework. This work holds significant potential for contributing to the reinforcement of user consent, privacy, and copyright on the modern internet and lays the groundwork for future insights into creating a more responsible and user-centric web ecosystem.
Submitted 5 November, 2023; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Promoting Robustness of Randomized Smoothing: Two Cost-Effective Approaches
Authors:
Linbo Liu,
Trong Nghia Hoang,
Lam M. Nguyen,
Tsui-Wei Weng
Abstract:
Randomized smoothing has recently attracted attention in the field of adversarial robustness to provide provable robustness guarantees on smoothed neural network classifiers. However, existing works show that vanilla randomized smoothing usually does not provide good robustness performance and often requires (re)training techniques on the base classifier in order to boost the robustness of the resulting smoothed classifier. In this work, we propose two cost-effective approaches to boost the robustness of randomized smoothing while preserving its clean performance. The first approach introduces a new robust training method, AdvMacer, which combines adversarial training and robustness certification maximization for randomized smoothing. We show that AdvMacer can improve the robustness performance of randomized smoothing classifiers compared to SOTA baselines, while being 3x faster to train than the MACER baseline. The second approach introduces a post-processing method, EsbRS, which greatly improves the robustness certificate based on building model ensembles. We explore different aspects of model ensembles that have not been studied by prior works and propose a novel design methodology to further improve robustness of the ensemble based on our theoretical analysis.
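For context, the certificate that vanilla randomized smoothing provides can be sketched in a few lines. The sketch assumes the standard Gaussian-smoothing bound, in which the certified L2 radius is $\frac{\sigma}{2}\left(\Phi^{-1}(p_A) - \Phi^{-1}(p_B)\right)$ for a lower bound $p_A$ on the top-class probability and an upper bound $p_B$ on the runner-up probability. It is not the AdvMacer or EsbRS procedure from the abstract, and the names `certified_radius`, `p_a_lower`, and `p_b_upper` are illustrative.

```python
from statistics import NormalDist

def certified_radius(p_a_lower, p_b_upper, sigma):
    """Certified L2 radius of a Gaussian-smoothed classifier.

    p_a_lower: lower bound on the probability of the predicted class
    p_b_upper: upper bound on the probability of the runner-up class
    sigma:     standard deviation of the Gaussian smoothing noise
    """
    if not 0.0 < p_b_upper <= p_a_lower < 1.0:
        raise ValueError("need 0 < p_b_upper <= p_a_lower < 1")
    phi_inv = NormalDist().inv_cdf  # inverse standard normal CDF
    return 0.5 * sigma * (phi_inv(p_a_lower) - phi_inv(p_b_upper))

# Illustrative values only: 90% vs. 10% class probabilities, sigma = 0.5.
radius = certified_radius(0.9, 0.1, sigma=0.5)
```

The radius grows with the gap between the two class probabilities, which is why training or ensembling methods that widen that gap under noise can enlarge the certificate.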
Submitted 11 October, 2023;
originally announced October 2023.
-
Sample-Driven Federated Learning for Energy-Efficient and Real-Time IoT Sensing
Authors:
Minh Ngoc Luu,
Minh-Duong Nguyen,
Ebrahim Bedeer,
Van Duc Nguyen,
Dinh Thai Hoang,
Diep N. Nguyen,
Quoc-Viet Pham
Abstract:
In the domain of Federated Learning (FL) systems, recent cutting-edge methods heavily rely on ideal-condition convergence analysis. Specifically, these approaches assume that the training datasets on IoT devices possess similar attributes to the global data distribution. However, this approach fails to capture the full spectrum of data characteristics in real-time sensing FL systems. In order to overcome this limitation, we suggest a new system specifically designed for IoT networks with real-time sensing capabilities. Our approach takes into account the generalization gap due to the user's data sampling process. By effectively controlling this sampling process, we can mitigate the overfitting issue and improve overall accuracy. In particular, we first formulate an optimization problem that harnesses the sampling process to concurrently reduce overfitting while maximizing accuracy. In pursuit of this objective, our surrogate optimization problem is adept at handling energy efficiency while optimizing the accuracy with high generalization. To solve the optimization problem with high complexity, we introduce an online reinforcement learning algorithm, named Sample-driven Control for Federated Learning (SCFL) built on the Soft Actor-Critic (A2C) framework. This enables the agent to dynamically adapt and find the global optima even in changing environments. By leveraging the capabilities of SCFL, our system offers a promising solution for resource allocation in FL systems with real-time sensing capabilities.
Submitted 11 October, 2023;
originally announced October 2023.
-
Adversarial Machine Learning for Social Good: Reframing the Adversary as an Ally
Authors:
Shawqi Al-Maliki,
Adnan Qayyum,
Hassan Ali,
Mohamed Abdallah,
Junaid Qadir,
Dinh Thai Hoang,
Dusit Niyato,
Ala Al-Fuqaha
Abstract:
Deep Neural Networks (DNNs) have been the driving force behind many of the recent advances in machine learning. However, research has shown that DNNs are vulnerable to adversarial examples -- input samples that have been perturbed to force DNN-based models to make errors. As a result, Adversarial Machine Learning (AdvML) has gained a lot of attention, and researchers have investigated these vulnerabilities in various settings and modalities. In addition, DNNs have also been found to incorporate embedded bias and often produce unexplainable predictions, which can result in anti-social AI applications. The emergence of new AI technologies that leverage Large Language Models (LLMs), such as ChatGPT and GPT-4, increases the risk of producing anti-social applications at scale. AdvML for Social Good (AdvML4G) is an emerging field that repurposes the AdvML bug to invent pro-social applications. Regulators, practitioners, and researchers should collaborate to encourage the development of pro-social applications and hinder the development of anti-social ones. In this work, we provide the first comprehensive review of the emerging field of AdvML4G. This paper encompasses a taxonomy that highlights the emergence of AdvML4G, a discussion of the differences and similarities between AdvML4G and AdvML, a taxonomy covering social good-related concepts and aspects, an exploration of the motivations behind the emergence of AdvML4G at the intersection of ML4G and AdvML, and an extensive summary of the works that utilize AdvML4G as an auxiliary tool for innovating pro-social applications. Finally, we elaborate upon various challenges and open research issues that require significant attention from the research community.
Submitted 5 October, 2023;
originally announced October 2023.
-
Effect of polydispersity on the transport and sound absorbing properties of three-dimensional random fibrous structures
Authors:
Q. V. Tran,
C. Perrot,
R. Panneton,
M. T. Hoang,
L. Dejaeger,
V. Marcel,
M. Jouve
Abstract:
A technique is proposed that uses a multi-scale approach to calculate transport properties of compressed felts using only image analysis and numerical calculations. From the image analysis, the fiber diameter distribution and fiber orientation are determined. From a known porosity and the latter two characteristics, two representative elementary volumes (REV) are constructed: one based on the volume-weighted average diameter and one on an inverse volume-weighted average diameter. Numerical calculations on the former showed that it correctly estimates viscous and thermal permeabilities, while the latter correctly estimates tortuosity and viscous and thermal characteristic lengths. From these calculations, micro-macro analytical expressions are developed to estimate the transport properties of polydisperse composite felts based solely on open porosity, fiber diameter polydispersity, and fiber orientation. Good agreements are obtained between analytical predictions and measurements of transport properties. The predicted transport properties are also used in the Johnson-Champoux-Allard-Lafarge (JCAL) equivalent fluid model to predict the sound absorption coefficient of the felts. Excellent agreements are obtained with impedance tube measurements.
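The two averaged diameters that define the REVs can be illustrated with a short sketch. The abstract does not give the weighting formulas, so the code assumes cylindrical fibers with volume $V_i = \frac{\pi}{4} D_i^2 L_i$, a volume-weighted mean $\sum_i V_i D_i / \sum_i V_i$, and an inverse volume-weighted mean $\sum_i V_i^{-1} D_i / \sum_i V_i^{-1}$. The function name and sample diameters are hypothetical.

```python
import math

def weighted_mean_diameters(diameters, lengths=None):
    """Return (volume-weighted, inverse volume-weighted) mean fiber diameters.

    Assumes cylindrical fibers; if no lengths are given, all fibers are
    taken to have the same (unit) length.
    """
    if lengths is None:
        lengths = [1.0] * len(diameters)
    volumes = [math.pi / 4.0 * d * d * l for d, l in zip(diameters, lengths)]

    # Weights proportional to volume emphasize the largest fibers.
    d_v = sum(v * d for v, d in zip(volumes, diameters)) / sum(volumes)

    # Weights proportional to 1/volume emphasize the smallest fibers.
    inv = [1.0 / v for v in volumes]
    d_iv = sum(w * d for w, d in zip(inv, diameters)) / sum(inv)
    return d_v, d_iv

# Illustrative diameters only (arbitrary units), equal fiber lengths.
d_v, d_iv = weighted_mean_diameters([1.0, 2.0, 3.0])
```

Under these assumptions the volume-weighted mean sits above the arithmetic mean and the inverse volume-weighted mean sits below it. That is consistent with the abstract's use of the former for permeabilities and the latter for tortuosity and characteristic lengths.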
Submitted 10 June, 2024; v1 submitted 17 September, 2023;
originally announced September 2023.
-
VulnSense: Efficient Vulnerability Detection in Ethereum Smart Contracts by Multimodal Learning with Graph Neural Network and Language Model
Authors:
Phan The Duy,
Nghi Hoang Khoa,
Nguyen Huu Quyen,
Le Cong Trinh,
Vu Trung Kien,
Trinh Minh Hoang,
Van-Hau Pham
Abstract:
This paper presents the VulnSense framework, a comprehensive approach to efficiently detect vulnerabilities in Ethereum smart contracts using a multimodal learning approach on graph-based and natural language processing (NLP) models. Our proposed framework combines three types of features from smart contracts comprising source code, opcode sequences, and control flow graph (CFG) extracted from bytecode. We employ Bidirectional Encoder Representations from Transformers (BERT), Bidirectional Long Short-Term Memory (BiLSTM) and Graph Neural Network (GNN) models to extract and analyze these features. The final layer of our multimodal approach consists of a fully connected layer used to predict vulnerabilities in Ethereum smart contracts. Addressing limitations of existing vulnerability detection methods relying on single-feature or single-model deep learning techniques, our method surpasses accuracy and effectiveness constraints. We assess VulnSense using a collection of 1,769 smart contracts derived from the combination of three datasets: Curated, SolidiFI-Benchmark, and Smartbugs Wild. We then make a comparison with various unimodal and multimodal learning techniques contributed by GNN, BiLSTM and BERT architectures. The experimental outcomes demonstrate the superior performance of our proposed approach, achieving an average accuracy of 77.96\% across all three categories of vulnerable smart contracts.
Submitted 15 September, 2023;
originally announced September 2023.
-
Retail store customer behavior analysis system: Design and Implementation
Authors:
Tuan Dinh Nguyen,
Keisuke Hihara,
Tung Cao Hoang,
Yumeka Utada,
Akihiko Torii,
Naoki Izumi,
Nguyen Thanh Thuy,
Long Quoc Tran
Abstract:
Understanding customer behavior in retail stores plays a crucial role in improving customer satisfaction by adding personalized value to services. Behavior analysis reveals both general and detailed patterns in the interaction of customers with a store's items and other people, providing store managers with insight into customer preferences. Several solutions aim to utilize this data by recognizing specific behaviors through statistical visualization. However, current approaches are limited to the analysis of small customer behavior sets, utilizing conventional methods to detect behaviors. They do not use deep learning techniques such as deep neural networks, which are powerful methods in the field of computer vision. Furthermore, these methods provide limited figures when visualizing the behavioral data acquired by the system. In this study, we propose a framework that includes three primary parts: mathematical modeling of customer behaviors, behavior analysis using an efficient deep learning based system, and individual and group behavior visualization. Each module and the entire system were validated using data from actual situations in a retail store.
Submitted 5 September, 2023;
originally announced September 2023.
-
Companion Animal Disease Diagnostics based on Literal-aware Medical Knowledge Graph Representation Learning
Authors:
Van Thuy Hoang,
Sang Thanh Nguyen,
Sangmyeong Lee,
Jooho Lee,
Luong Vuong Nguyen,
O-Joun Lee
Abstract:
Knowledge graph (KG) embedding has been used to benefit the diagnosis of animal diseases by analyzing electronic medical records (EMRs), such as notes and veterinary records. However, learning representations to capture entities and relations with literal information in KGs is challenging as the KGs show heterogeneous properties and various types of literal information. Meanwhile, the existing methods mostly aim to preserve graph structures surrounding target nodes without considering different types of literals, which could also carry significant information. In this paper, we propose a knowledge graph embedding model for the efficient diagnosis of animal diseases, which could learn various types of literal information and graph structure and fuse them into unified representations, namely LiteralKG. Specifically, we construct a knowledge graph that is built from EMRs along with literal information collected from various animal hospitals. We then fuse different types of entities and node feature information into unified vector representations through gate networks. Finally, we propose a self-supervised learning task to learn graph structure in pretext tasks and then towards various downstream tasks. Experimental results on link prediction tasks demonstrate that our model outperforms the baselines that consist of state-of-the-art models. The source code is available at https://github.com/NSLab-CUK/LiteralKG.
Submitted 31 August, 2023;
originally announced September 2023.
-
A Human-Centric Metaverse Enabled by Brain-Computer Interface: A Survey
Authors:
Howe Yuan Zhu,
Nguyen Quang Hieu,
Dinh Thai Hoang,
Diep N. Nguyen,
Chin-Teng Lin
Abstract:
The growing interest in the Metaverse has generated momentum for members of academia and industry to innovate toward realizing the Metaverse world. The Metaverse is a unique, continuous, and shared virtual world where humans embody a digital form within an online platform. Through a digital avatar, Metaverse users should have a perceptual presence within the environment and can interact and control the virtual world around them. Thus, a human-centric design is a crucial element of the Metaverse. The human users are not only the central entity but also the source of multi-sensory data that can be used to enrich the Metaverse ecosystem. In this survey, we study the potential applications of Brain-Computer Interface (BCI) technologies that can enhance the experience of Metaverse users. By directly communicating with the human brain, the most complex organ in the human body, BCI technologies hold the potential for the most intuitive human-machine system operating at the speed of thought. BCI technologies can enable various innovative applications for the Metaverse through this neural pathway, such as user cognitive state monitoring, digital avatar control, virtual interactions, and imagined speech communications. This survey first outlines the fundamental background of the Metaverse and BCI technologies. We then discuss the current challenges of the Metaverse that can potentially be addressed by BCI, such as motion sickness when users experience virtual environments or the negative emotional states of users in immersive virtual applications. After that, we propose and discuss a new research direction called Human Digital Twin, in which digital twins can create an intelligent and interactable avatar from the user's brain signals. We also present the challenges and potential solutions in synchronizing and communicating between virtual and physical entities in the Metaverse.
Submitted 4 September, 2023;
originally announced September 2023.
-
Securing Blockchain Systems: A Novel Collaborative Learning Framework to Detect Attacks in Transactions and Smart Contracts
Authors:
Tran Viet Khoa,
Do Hai Son,
Chi-Hieu Nguyen,
Dinh Thai Hoang,
Diep N. Nguyen,
Nguyen Linh Trung,
Tran Thi Thuy Quynh,
Trong-Minh Hoang,
Nguyen Viet Ha,
Eryk Dutkiewicz,
Mohammad Abu Alsheikh
Abstract:
With the escalating prevalence of malicious activities exploiting vulnerabilities in blockchain systems, there is an urgent requirement for robust attack detection mechanisms. To address this challenge, this paper presents a novel collaborative learning framework designed to detect attacks in blockchain transactions and smart contracts by analyzing transaction features. Our framework exhibits the capability to classify various types of blockchain attacks, including intricate attacks at the machine code level (e.g., injecting malicious codes to withdraw coins from users unlawfully), which typically necessitate significant time and security expertise to detect. To achieve that, the proposed framework incorporates a unique tool that transforms transaction features into visual representations, facilitating efficient analysis and classification of low-level machine codes. Furthermore, we propose a customized collaborative learning model to enable real-time detection of diverse attack types at distributed mining nodes. In order to create a comprehensive dataset, we deploy a pilot system based on a private Ethereum network and conduct multiple attack scenarios. To the best of our knowledge, our dataset is the most comprehensive and diverse collection of transactions and smart contracts synthesized in a laboratory for cyberattack detection in blockchain systems. Our framework achieves a detection accuracy of approximately 94\% through extensive simulations and real-time experiments with a throughput of over 2,150 transactions per second. These compelling results validate the efficacy of our framework and showcase its adaptability in addressing real-world cyberattack scenarios.
Submitted 26 March, 2024; v1 submitted 30 August, 2023;
originally announced August 2023.
-
Universal Graph Continual Learning
Authors:
Thanh Duc Hoang,
Do Viet Tung,
Duy-Hung Nguyen,
Bao-Sinh Nguyen,
Huy Hoang Nguyen,
Hung Le
Abstract:
We address catastrophic forgetting issues in graph learning as incoming data transitions from one graph distribution to another. Whereas prior studies primarily tackle one setting of graph continual learning such as incremental node classification, we focus on a universal approach wherein each data point in a task can be a node or a graph, and the task varies from node to graph classification. We propose a novel method that enables graph neural networks to excel in this universal setting. Our approach preserves knowledge about past tasks through a rehearsal mechanism that maintains local and global structure consistency across the graphs. We benchmark our method against various continual learning baselines in real-world graph datasets and achieve significant improvement in average performance and forgetting across tasks.
Submitted 26 August, 2023;
originally announced August 2023.
-
Time-to-Pattern: Information-Theoretic Unsupervised Learning for Scalable Time Series Summarization
Authors:
Alireza Ghods,
Trong Nghia Hoang,
Diane Cook
Abstract:
Data summarization is the process of generating interpretable and representative subsets from a dataset. Existing time series summarization approaches often search for recurring subsequences using a set of manually devised similarity functions to summarize the data. However, such approaches are fraught with limitations stemming from an exhaustive search coupled with a heuristic definition of series similarity. Such approaches affect the diversity and comprehensiveness of the generated data summaries. To mitigate these limitations, we introduce an approach to time series summarization, called Time-to-Pattern (T2P), which aims to find a set of diverse patterns that together encode the most salient information, following the notion of minimum description length. T2P is implemented as a deep generative model that learns informative embeddings of the discrete time series on a latent space specifically designed to be interpretable. Our synthetic and real-world experiments reveal that T2P discovers informative patterns, even in noisy and complex settings. Furthermore, our results also showcase the improved performance of T2P over previous work in pattern diversity and processing scalability, which conclusively demonstrate the algorithm's effectiveness for time series summarization.
Submitted 25 August, 2023;
originally announced August 2023.
-
Transitivity-Preserving Graph Representation Learning for Bridging Local Connectivity and Role-based Similarity
Authors:
Van Thuy Hoang,
O-Joun Lee
Abstract:
Graph representation learning (GRL) methods, such as graph neural networks and graph transformer models, have been successfully used to analyze graph-structured data, mainly focusing on node classification and link prediction tasks. However, the existing studies mostly only consider local connectivity while ignoring long-range connectivity and the roles of nodes. In this paper, we propose Unified Graph Transformer Networks (UGT) that effectively integrate local and global structural information into fixed-length vector representations. First, UGT learns local structure by identifying the local substructures and aggregating features of the $k$-hop neighborhoods of each node. Second, we construct virtual edges, bridging distant nodes with structural similarity to capture the long-range dependencies. Third, UGT learns unified representations through self-attention, encoding structural distance and $p$-step transition probability between node pairs. Furthermore, we propose a self-supervised learning task that effectively learns transition probability to fuse local and global structural features, which could then be transferred to other downstream tasks. Experimental results on real-world benchmark datasets over various downstream tasks showed that UGT significantly outperformed baselines that consist of state-of-the-art models. In addition, UGT reaches the expressive power of the third-order Weisfeiler-Lehman isomorphism test (3d-WL) in distinguishing non-isomorphic graph pairs. The source code is available at https://github.com/NSLab-CUK/Unified-Graph-Transformer.
Submitted 18 August, 2023;
originally announced August 2023.
-
APPFLx: Providing Privacy-Preserving Cross-Silo Federated Learning as a Service
Authors:
Zilinghan Li,
Shilan He,
Pranshu Chaturvedi,
Trung-Hieu Hoang,
Minseok Ryu,
E. A. Huerta,
Volodymyr Kindratenko,
Jordan Fuhrman,
Maryellen Giger,
Ryan Chard,
Kibaek Kim,
Ravi Madduri
Abstract:
Cross-silo privacy-preserving federated learning (PPFL) is a powerful tool to collaboratively train robust and generalized machine learning (ML) models without sharing sensitive (e.g., healthcare or financial) local data. To ease and accelerate the adoption of PPFL, we introduce APPFLx, a ready-to-use platform that provides privacy-preserving cross-silo federated learning as a service. APPFLx employs Globus authentication to allow users to easily and securely invite trustworthy collaborators for PPFL, implements several synchronous and asynchronous FL algorithms, streamlines the FL experiment launch process, and enables tracking and visualizing the life cycle of FL experiments, allowing domain experts and ML practitioners to easily orchestrate and evaluate cross-silo FL under one platform. APPFLx is available online at https://appflx.link
Submitted 17 August, 2023;
originally announced August 2023.
-
Wirelessly Powered Federated Learning Networks: Joint Power Transfer, Data Sensing, Model Training, and Resource Allocation
Authors:
Mai Le,
Dinh Thai Hoang,
Diep N. Nguyen,
Won-Joo Hwang,
Quoc-Viet Pham
Abstract:
Federated learning (FL) has found many successes in wireless networks; however, the implementation of FL has been hindered by the energy limitation of mobile devices (MDs) and the availability of training data at MDs. How to integrate wireless power transfer and mobile crowdsensing towards sustainable FL solutions is a research topic entirely missing from the open literature. This work for the first time investigates a resource allocation problem in collaborative sensing-assisted sustainable FL (S2FL) networks with the goal of minimizing the total completion time. We investigate a practical harvesting-sensing-training-transmitting protocol in which energy-limited MDs first harvest energy from RF signals, use it to gain a reward for user participation, sense the training data from the environment, train the local models at MDs, and transmit the model updates to the server. The total completion time minimization problem of jointly optimizing power transfer, transmit power allocation, data sensing, bandwidth allocation, local model training, and data transmission is complicated due to the non-convex objective function, highly non-convex constraints, and strongly coupled variables. We propose a computationally efficient path-following algorithm to obtain the optimal solution via the decomposition technique. In particular, inner convex approximations are developed for the resource allocation subproblem, and the subproblems are solved alternately in an iterative fashion. Simulation results are provided to evaluate the effectiveness of the proposed S2FL algorithm in reducing the completion time by up to 21.45% in comparison with other benchmark schemes. Further, we investigate an extension of our work from frequency division multiple access (FDMA) to non-orthogonal multiple access (NOMA) and show that NOMA can reduce the total completion time of the considered FL system by 8.36% on average.
Submitted 9 August, 2023;
originally announced August 2023.
-
A Benchmarking Study of Matching Algorithms for Knowledge Graph Entity Alignment
Authors:
Nhat-Minh Dao,
Thai V. Hoang,
Zonghua Zhang
Abstract:
Identifying equivalent entities between knowledge graphs (KGs), known as Entity Alignment (EA), is a long-standing challenge. So far, many methods have been proposed, with a recent focus on leveraging Deep Learning to solve this problem. However, we observe that most of the effort has gone into obtaining better representations of entities, rather than improving entity matching from the learned representations. In fact, how to efficiently infer the entity pairs from the similarity matrix, which is essentially a matching problem, has been largely ignored by the community. Motivated by this observation, we conduct an in-depth analysis of existing algorithms that are specifically designed for solving this matching problem, and propose a novel matching method, named Bidirectional Matching (BMat). Our extensive experimental results on public datasets indicate that there is currently no single silver bullet solution for EA. In other words, different classes of entity similarity estimation may require different matching algorithms to reach the best EA results for each class. We finally conclude that using PARIS, the state-of-the-art EA approach, with BMat gives the best combination in terms of EA performance and the algorithm's time and space complexity.
Submitted 7 August, 2023;
originally announced August 2023.
-
Variational quantum algorithm for ergotropy estimation in quantum many-body batteries
Authors:
Duc Tuan Hoang,
Friederike Metz,
Andreas Thomasen,
Tran Duong Anh-Tai,
Thomas Busch,
Thomás Fogarty
Abstract:
Quantum batteries are predicted to have the potential to outperform their classical counterparts and are therefore an important element in the development of quantum technologies. Of particular interest is the role of correlations in many-body quantum batteries and how these can affect the maximal work extraction, quantified by the ergotropy. In this work we simulate the charging process and work extraction of many-body quantum batteries on noisy intermediate-scale quantum (NISQ) devices, and devise the Variational Quantum Ergotropy (VQErgo) algorithm which finds the optimal unitary operation that maximises work extraction from the battery. We test VQErgo by calculating the ergotropy of a many-body quantum battery undergoing transverse field Ising dynamics following a sudden quench. We investigate the battery for different system sizes and charging times, and analyze the minimum number of ansatz circuit repetitions needed for the variational optimization using both ideal and noisy simulators. We also discuss how the growth of long-range correlations can hamper the accuracy of VQErgo in larger systems, requiring increased repetitions of the ansatz circuit to reduce error. Finally, we optimize part of the VQErgo algorithm and calculate the ergotropy on one of IBM's quantum devices.
Submitted 1 February, 2024; v1 submitted 7 August, 2023;
originally announced August 2023.
-
Countering Eavesdroppers with Meta-learning-based Cooperative Ambient Backscatter Communications
Authors:
Nam H. Chu,
Nguyen Van Huynh,
Diep N. Nguyen,
Dinh Thai Hoang,
Shimin Gong,
Tao Shu,
Eryk Dutkiewicz,
Khoa T. Phan
Abstract:
This article introduces a novel lightweight framework using ambient backscattering communications to counter eavesdroppers. In particular, our framework divides an original message into two parts: (i) the active-transmit message transmitted by the transmitter using conventional RF signals and (ii) the backscatter message transmitted by an ambient backscatter tag that backscatters upon the active signals emitted by the transmitter. Notably, the backscatter tag does not generate its own signal, making it difficult for an eavesdropper to detect the backscattered signals unless they have prior knowledge of the system. Here, we assume that without decoding/knowing the backscatter message, the eavesdropper is unable to decode the original message. Even in scenarios where the eavesdropper can capture both messages, reconstructing the original message is a complex task without understanding the intricacies of the message-splitting mechanism. A challenge in our proposed framework is to effectively decode the backscattered signals at the receiver, often accomplished using the maximum likelihood (MLK) approach. However, such a method may require a complex mathematical model together with perfect channel state information (CSI). To address this issue, we develop a novel deep meta-learning-based signal detector that can not only effectively decode the weak backscattered signals without requiring perfect CSI but also quickly adapt to a new wireless environment with very little knowledge. Simulation results show that our proposed learning approach, without requiring perfect CSI and complex mathematical model, can achieve a bit error ratio close to that of the MLK-based approach. They also clearly show the efficiency of the proposed approach in dealing with eavesdropping attacks and the lack of training data for deep learning models in practical scenarios.
Submitted 4 August, 2023;
originally announced August 2023.
-
Fox-Neuwirth cells, quantum shuffle algebras, and character sums of the resultant
Authors:
Anh Trong Nam Hoang
Abstract:
We give an upper bound on character sums of the resultant over pairs of monic square-free polynomials of given degrees, answering a question of Ellenberg and Shusterman in the quadratic case. Our approach is topological: we compute the homology of braid groups on multi-punctured planes and prove a vanishing range for the homology of mixed braid groups with rank-1 local coefficients associated to characters of finite fields. Our method involves constructing a cellular stratification for configuration spaces of multi-punctured planes and relating their twisted homology with more general exponential coefficients to the cohomology of certain bimodules over quantum shuffle algebras.
Submitted 2 August, 2023;
originally announced August 2023.
-
Numerical modeling of thermal dust polarization from aligned grains in the envelope of evolved stars with updated POLARIS
Authors:
Bao Truong,
Thiem Hoang,
Nguyen Chau Giang,
Pham Ngoc Diep,
Dieu D. Nguyen,
Nguyen Bich Ngoc
Abstract:
Magnetic fields are thought to influence the formation and evolution of evolved star envelopes. Thermal dust polarization from magnetically aligned grains is potentially a powerful tool for probing magnetic fields and dust properties in these circumstellar environments. In this paper, we present numerical modeling of thermal dust polarization from the envelope of IK Tau using the magnetically enhanced radiative torque (MRAT) alignment theory implemented in our updated POLARIS code. Due to the strong stellar radiation field, the minimum size required for RAT alignment of silicate grains is $\sim 0.005 - 0.05\,\rm\mu m$. Additionally, ordinary paramagnetic grains can achieve perfect alignment by MRAT in the inner regions of $r < 500\,\rm au$ due to stronger magnetic fields of $B\sim 10$ mG - 1 G, producing a thermal dust polarization degree of $\sim 10\,\%$. The polarization degree can be enhanced to $\sim 20-40\%$ for grains with embedded iron inclusions. We also find that the magnetic field geometry affects the alignment size and the resulting polarization degree due to the projection effect in the plane-of-sky. We also study the spectrum of polarized thermal dust emission and find an increased polarization degree toward $λ> 50\,\rm\mu m$ due to the alignment of small grains by MRAT. Furthermore, we investigate the impact of rotational disruption by RATs (RAT-D) and find that the RAT-D effect causes a decrease in the dust polarization fraction. Finally, we compare our numerical results with available polarization data observed by SOFIA/HAWC+ for constraining dust properties, suggesting grains are unlikely to have embedded iron clusters and might have slightly elongated shapes. Our modeling results suggest further observational studies at far-infrared/sub-millimeter wavelengths to understand the properties of magnetic fields and dust in AGB envelopes.
Submitted 2 August, 2023;
originally announced August 2023.
-
Effects of Grain Magnetic Properties and Grain Growth on Synthetic Dust Polarization of MHD Simulations in Protostellar Environments
Authors:
Nguyen Chau Giang,
Thiem Hoang
Abstract:
Thermal dust polarization is a powerful tool to probe magnetic fields ($\textbf{B}$) and grain properties. However, a systematic study of the dependence of dust polarization on grain properties in protostellar environments is not yet available. In this paper, we post-process a non-ideal MHD simulation of a collapsing protostellar core with our updated POLARIS code to study in detail the effects of iron inclusions and grain growth on thermal dust polarization. We found that superparamagnetic (SPM) grains can produce a high polarization degree of $p \sim 10-40\%$ beyond $\sim 500$ au from the protostar because of their efficient alignment by the magnetically enhanced Radiative Torque mechanism. The magnetic field tangling by turbulence in the envelope causes the decrease in $p$ with increasing emission intensity $I$ as $p\propto I^α$ with the slope $α\sim -0.3$. But within 500 au, SPM grains tend to have inefficient internal alignment (IA) and be aligned with $\textbf{B}$ by RATs only, producing lower $p \sim 1\%$ and a steeper slope of $α\sim -0.6$. For paramagnetic (PM) grains, the alignment loss of grains above $1\,\rm\mu m$ in the inner $\sim 200$ au produces $p \ll 1\%$ and the polarization hole with $α\sim -0.9$. Grain growth can increase $p$ in the envelope for SPM grains, but causes stronger depolarization for SPM grains in the inner $\sim 500$ au and for PM grains in the entire protostellar core. Finally, we found an increase of the polarization angle dispersion function $S$ with iron inclusions and grain growth, implying the dependence of the B-field strength measured using the DCF technique on grain alignment and grain properties.
Submitted 13 March, 2024; v1 submitted 31 July, 2023;
originally announced July 2023.
-
Elastic Entangled Pair and Qubit Resource Management in Quantum Cloud Computing
Authors:
Rakpong Kaewpuang,
Minrui Xu,
Dinh Thai Hoang,
Dusit Niyato,
Han Yu,
Ruidong Li,
Zehui Xiong,
Jiawen Kang
Abstract:
Quantum cloud computing (QCC) offers a promising approach to efficiently provide quantum computing resources, such as quantum computers, to perform resource-intensive tasks. Like traditional cloud computing platforms, QCC providers can offer both reservation and on-demand plans for quantum resource provisioning to satisfy users' requirements. However, the fluctuations in user demand and quantum circuit requirements are challenging for efficient resource provisioning. Furthermore, in distributed QCC, entanglement routing is a critical component of quantum networks that enables remote entanglement communication between users and QCC providers. Further, maintaining entanglement fidelity in quantum networks is challenging due to the requirement for high-quality entanglement routing, especially when accessing the providers over long distances. To address these challenges, we propose a resource allocation model to provision quantum computing and networking resources. In particular, entangled pairs, entanglement routing, qubit resources, and circuits' waiting time are jointly optimized to achieve minimum total costs. We formulate the proposed model based on two-stage stochastic programming, which takes into account the uncertainties of fidelity and qubit requirements, and quantum circuits' waiting time. Furthermore, we apply the Benders decomposition algorithm to divide the proposed model into sub-models to be solved simultaneously. Experimental results demonstrate that our model can achieve the optimal total costs and reduce total costs by up to 49.43\% in comparison to the baseline model.
Submitted 24 July, 2023;
originally announced July 2023.
-
Quantum Software Analytics: Opportunities and Challenges
Authors:
Thong Hoang,
Hoa Khanh Dam,
Tingting Bi,
Qinghua Lu,
Zhenchang Xing,
Liming Zhu,
Lam Duc Nguyen,
Shiping Chen
Abstract:
Quantum computing systems depend on the principles of quantum mechanics to perform multiple challenging tasks more efficiently than their classical counterparts. In classical software engineering, the software life cycle is used to document and structure the processes of design, implementation, and maintenance of software applications. It helps stakeholders understand how to build an application. In this paper, we summarize a set of software analytics topics and techniques in the development life cycle that can be leveraged and integrated into quantum software application development. The results of this work can assist researchers and practitioners in better understanding the quantum-specific emerging development activities, challenges, and opportunities in the next generation of quantum software.
Submitted 20 July, 2023;
originally announced July 2023.
-
Test-takers have a say: understanding the implications of the use of AI in language tests
Authors:
Dawen Zhang,
Thong Hoang,
Shidong Pan,
Yongquan Hu,
Zhenchang Xing,
Mark Staples,
Xiwei Xu,
Qinghua Lu,
Aaron Quigley
Abstract:
Language tests measure a person's ability to use a language in terms of listening, speaking, reading, or writing. Such tests play an integral role in academic, professional, and immigration domains, with entities such as educational institutions, professional accreditation bodies, and governments using them to assess candidate language proficiency. Recent advances in Artificial Intelligence (AI) and the discipline of Natural Language Processing have prompted language test providers to explore AI's potential applicability within language testing, leading to transformative activity patterns surrounding language instruction and learning. However, with concerns over AI's trustworthiness, it is imperative to understand the implications of integrating AI into language testing. This knowledge will enable stakeholders to make well-informed decisions, thus safeguarding community well-being and testing integrity. To understand the concerns and effects of AI usage in language tests, we conducted interviews and surveys with English test-takers. To the best of our knowledge, this is the first empirical study aimed at identifying the implications of AI adoption in language tests from a test-taker perspective. Our study reveals test-taker perceptions and behavioral patterns. Specifically, we identify that AI integration may enhance perceptions of fairness, consistency, and availability. Conversely, it might incite mistrust regarding reliability and interactivity aspects, subsequently influencing the behaviors and well-being of test-takers. These insights provide a better understanding of potential societal implications and assist stakeholders in making informed decisions concerning AI usage in language testing.
Submitted 19 July, 2023;
originally announced July 2023.
-
Are We Ready to Embrace Generative AI for Software Q&A?
Authors:
Bowen Xu,
Thanh-Dat Nguyen,
Thanh Le-Cong,
Thong Hoang,
Jiakun Liu,
Kisub Kim,
Chen Gong,
Changan Niu,
Chenyu Wang,
Bach Le,
David Lo
Abstract:
Stack Overflow, the world's largest software Q&A (SQA) website, is facing a significant traffic drop due to the emergence of generative AI techniques. ChatGPT was banned by Stack Overflow only 6 days after its release. The main reason given by Stack Overflow is that the answers generated by ChatGPT are of low quality. To verify this, we conduct a comparative evaluation of human-written and ChatGPT-generated answers. Our methodology employs both automatic comparison and a manual study. Our results suggest that human-written and ChatGPT-generated answers are semantically similar; however, human-written answers outperform ChatGPT-generated ones consistently across multiple aspects, specifically by 10% on the overall score. We release the data, analysis scripts, and detailed results at https://anonymous.4open.science/r/GAI4SQA-FD5C.
Submitted 12 August, 2023; v1 submitted 19 July, 2023;
originally announced July 2023.
-
Right to be Forgotten in the Era of Large Language Models: Implications, Challenges, and Solutions
Authors:
Dawen Zhang,
Pamela Finckenberg-Broman,
Thong Hoang,
Shidong Pan,
Zhenchang Xing,
Mark Staples,
Xiwei Xu
Abstract:
The Right to be Forgotten (RTBF) was first established as the result of the ruling in Google Spain SL, Google Inc. v AEPD, Mario Costeja González, and was later included as the Right to Erasure under the General Data Protection Regulation (GDPR) of the European Union to give individuals the right to request that organizations delete their personal data. Specifically for search engines, individuals can send requests to organizations to exclude their information from the query results. It was a significant emergent right resulting from the evolution of technology. With the recent development of Large Language Models (LLMs) and their use in chatbots, LLM-enabled software systems have become popular, but they are not exempt from the RTBF. Compared with the indexing approach used by search engines, LLMs store and process information in a completely different way. This poses new challenges for compliance with the RTBF. In this paper, we explore these challenges and provide our insights on how to implement technical solutions for the RTBF, including the use of differential privacy, machine unlearning, model editing, and guardrails. With the rapid advancement of AI and the increasing need to regulate this powerful technology, learning from the case of the RTBF can provide valuable lessons for technical practitioners, legal experts, organizations, and authorities.
Submitted 4 June, 2024; v1 submitted 8 July, 2023;
originally announced July 2023.
-
SeePrivacy: Automated Contextual Privacy Policy Generation for Mobile Applications
Authors:
Shidong Pan,
Zhen Tao,
Thong Hoang,
Dawen Zhang,
Zhenchang Xing,
Xiwei Xu,
Mark Staples,
David Lo
Abstract:
Privacy policies have become the most critical approach to safeguarding individuals' privacy and digital security. To enhance their presentation and readability, researchers have proposed the concept of contextual privacy policies (CPPs), which fragment policies into shorter snippets and display them only in the corresponding contexts. In this paper, we propose a novel multi-modal framework, namely SeePrivacy, designed to automatically generate contextual privacy policies for mobile apps. Our method combines mobile GUI understanding and privacy policy document analysis, yielding an overall 83.6% coverage rate for privacy-related context detection and an accuracy of 0.92 in extracting the corresponding policy segments. Moreover, 96% of the retrieved policy segments can be correctly matched with their contexts. The user study shows that SeePrivacy demonstrates excellent functionality and usability (4.5/5). Specifically, participants exhibit a greater willingness to read CPPs (4.1/5) than original privacy policies (2/5). Our solution effectively assists users in comprehending privacy notices, and this research establishes a solid foundation for further advancements and exploration.
Submitted 9 July, 2023; v1 submitted 4 July, 2023;
originally announced July 2023.