-
ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding
Authors:
Quang P. M. Pham,
Khoi T. N. Nguyen,
Lan C. Ngo,
Truong Do,
Truong Son Hy
Abstract:
Scene graphs have been proven to be useful for various scene understanding tasks due to their compact and explicit nature. However, existing approaches often neglect the importance of maintaining the symmetry-preserving property when generating scene graphs from 3D point clouds. This oversight can diminish the accuracy and robustness of the resulting scene graphs, especially when handling noisy, multi-view 3D data. This work, to the best of our knowledge, is the first to implement an Equivariant Graph Neural Network in semantic scene graph generation from 3D point clouds for scene understanding. Our proposed method, ESGNN, outperforms existing state-of-the-art approaches, demonstrating a significant improvement in scene estimation with faster convergence. ESGNN demands low computational resources and is easy to implement from available frameworks, paving the way for real-time applications such as robotics and computer vision.
Submitted 30 June, 2024;
originally announced July 2024.
-
Ollabench: Evaluating LLMs' Reasoning for Human-centric Interdependent Cybersecurity
Authors:
Tam N. Nguyen
Abstract:
Large Language Models (LLMs) have the potential to enhance Agent-Based Modeling by better representing complex interdependent cybersecurity systems, improving cybersecurity threat modeling and risk management. However, evaluating LLMs in this context is crucial for legal compliance and effective application development. Existing LLM evaluation frameworks often overlook the human factor and cognitive computing capabilities essential for interdependent cybersecurity. To address this gap, I propose OllaBench, a novel evaluation framework that assesses LLMs' accuracy, wastefulness, and consistency in answering scenario-based information security compliance and non-compliance questions. OllaBench is built on a foundation of 24 cognitive behavioral theories and empirical evidence from 38 peer-reviewed papers. OllaBench was used to evaluate 21 LLMs, including both open-weight and commercial models from OpenAI, Anthropic, Google, Microsoft, Meta and so on. The results reveal that while commercial LLMs have the highest overall accuracy scores, there is significant room for improvement. Smaller low-resolution open-weight LLMs are not far behind in performance, and there are significant differences in token efficiency and consistency among the evaluated models. OllaBench provides a user-friendly interface and supports a wide range of LLM platforms, making it a valuable tool for researchers and solution developers in the field of human-centric interdependent cybersecurity and beyond.
Submitted 10 June, 2024;
originally announced June 2024.
-
Geometry-aware framework for deep energy method: an application to structural mechanics with hyperelastic materials
Authors:
Thi Nguyen Khoa Nguyen,
Thibault Dairay,
Raphaël Meunier,
Christophe Millet,
Mathilde Mougeot
Abstract:
Physics-Informed Neural Networks (PINNs) have gained considerable interest in diverse engineering domains thanks to their capacity to integrate physical laws into deep learning models. Recently, geometry-aware PINN-based approaches that employ the strong form of underlying physical system equations have been developed with the aim of integrating geometric information into PINNs. Despite ongoing research, the assessment of PINNs in problems with various geometries remains an active area of investigation. In this work, we introduce a novel physics-informed framework named the Geometry-Aware Deep Energy Method (GADEM) for solving structural mechanics problems on different geometries. As the weak form of the physical system equation (or the energy-based approach) has demonstrated clear advantages compared to the strong form for solving solid mechanics problems, GADEM employs the weak form and aims to infer the solution on multiple shapes of geometries. Integrating a geometry-aware framework into an energy-based method results in an effective physics-informed deep learning model in terms of accuracy and computational cost. Different ways to represent the geometric information and to encode the geometric latent vectors are investigated in this work. We introduce a loss function of GADEM which is minimized based on the potential energy of all considered geometries. An adaptive learning method is also employed for the sampling of collocation points to enhance the performance of GADEM. We present some applications of GADEM to solve solid mechanics problems, including a loading simulation of a toy tire involving contact mechanics and large deformation hyperelasticity. The numerical results of this work demonstrate the remarkable capability of GADEM to infer the solution on various and new shapes of geometries using only one trained model.
Submitted 6 May, 2024;
originally announced May 2024.
-
Emerging Technologies for 6G Non-Terrestrial-Networks: From Academia to Industrial Applications
Authors:
Cong T. Nguyen,
Yuris Mulya Saputra,
Nguyen Van Huynh,
Tan N. Nguyen,
Dinh Thai Hoang,
Diep N Nguyen,
Van-Quan Pham,
Miroslav Voznak,
Symeon Chatzinotas,
Dinh-Hieu Tran
Abstract:
Terrestrial networks form the fundamental infrastructure of modern communication systems, serving more than 4 billion users globally. However, terrestrial networks are facing a wide range of challenges, from coverage and reliability to interference and congestion. As the demands of the 6G era are expected to be much higher, it is crucial to address these challenges to ensure a robust and efficient communication infrastructure for the future. To address these problems, Non-Terrestrial Networks (NTNs) have emerged as a promising solution. NTNs are communication networks that leverage airborne (e.g., unmanned aerial vehicles) and spaceborne vehicles (e.g., satellites) to facilitate ultra-reliable communications and connectivity with high data rates and low latency over expansive regions. This article aims to provide a comprehensive survey on the utilization of network slicing, Artificial Intelligence/Machine Learning (AI/ML), and Open Radio Access Network (ORAN) to address diverse challenges of NTNs from the perspectives of both academia and industry. In particular, we first provide an in-depth tutorial on NTNs and the key enabling technologies, including network slicing, AI/ML, and ORAN. Then, we provide a comprehensive survey on how network slicing and AI/ML have been leveraged to overcome the challenges that NTNs are facing. Moreover, we present how ORAN can be utilized for NTNs. Finally, we highlight important challenges, open issues, and future research directions of NTNs in the 6G era.
Submitted 3 July, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
RepoHyper: Better Context Retrieval Is All You Need for Repository-Level Code Completion
Authors:
Huy N. Phan,
Hoang N. Phan,
Tien N. Nguyen,
Nghi D. Q. Bui
Abstract:
Code Large Language Models (CodeLLMs) have demonstrated impressive proficiency in code completion tasks. However, they often fall short of fully understanding the extensive context of a project repository, such as the intricacies of relevant files and class hierarchies, which can result in less precise completions. To overcome these limitations, we present RepoHyper, a multifaceted framework designed to address the complex challenges associated with repository-level code completion. Central to RepoHyper is the Repo-level Semantic Graph (RSG), a novel semantic graph structure that encapsulates the vast context of code repositories. Furthermore, RepoHyper leverages the Expand and Refine retrieval method, including a graph expansion and a link prediction algorithm applied to the RSG, enabling the effective retrieval and prioritization of relevant code snippets. Our evaluations show that RepoHyper markedly outperforms existing techniques in repository-level code completion, showcasing enhanced accuracy across various datasets when compared to several strong baselines. Our implementation of RepoHyper can be found at https://github.com/FSoft-AI4Code/RepoHyper.
Submitted 16 March, 2024; v1 submitted 10 March, 2024;
originally announced March 2024.
-
Inferring solar differential rotation and viscosity via passive imaging with inertial waves
Authors:
Tram Thi Ngoc Nguyen,
Thorsten Hohage,
Damien Fournier,
Laurent Gizon
Abstract:
The recent discovery of inertial waves on the surface of the Sun offers new possibilities to learn about the solar interior. These waves are long-lived, with a period on the order of the Sun's rotation period ($\sim$27 days), and are sensitive to parameters deep inside the Sun. They are excited by turbulent convection, leading to a passive imaging problem. In this work, we present the forward and inverse problems of reconstructing viscosity and differential rotation on the Sun from cross-covariance observations of these inertial waves.
Submitted 22 March, 2024; v1 submitted 1 March, 2024;
originally announced March 2024.
-
Integrating LLMs for Explainable Fault Diagnosis in Complex Systems
Authors:
Akshay J. Dave,
Tat Nghia Nguyen,
Richard B. Vilim
Abstract:
This paper introduces an integrated system designed to enhance the explainability of fault diagnostics in complex systems, such as nuclear power plants, where operator understanding is critical for informed decision-making. By combining a physics-based diagnostic tool with a Large Language Model, we offer a novel solution that not only identifies faults but also provides clear, understandable explanations of their causes and implications. The system's efficacy is demonstrated through application to a molten salt facility, showcasing its ability to elucidate the connections between diagnosed faults and sensor data, answer operator queries, and evaluate historical sensor anomalies. Our approach underscores the importance of merging model-based diagnostics with advanced AI to improve the reliability and transparency of autonomous systems.
Submitted 8 February, 2024;
originally announced February 2024.
-
Data-Driven Evidence-Based Syntactic Sugar Design
Authors:
David OBrien,
Robert Dyer,
Tien N. Nguyen,
Hridesh Rajan
Abstract:
Programming languages are essential tools for developers, and their evolution plays a crucial role in supporting the activities of developers. One instance of programming language evolution is the introduction of syntactic sugars, which are additional syntax elements that provide alternative, more readable code constructs. However, the process of designing and evolving a programming language has traditionally been guided by anecdotal experiences and intuition. Recent advances in tools and methodologies for mining open-source repositories have enabled developers to make data-driven software engineering decisions. In light of this, this paper proposes an approach for motivating data-driven programming evolution by applying frequent subgraph mining techniques to a large dataset of 166,827,154 open-source Java methods. The dataset is mined by generalizing Java control-flow graphs to capture broad programming language usages and instances of duplication. Frequent subgraphs are then extracted to identify potentially impactful opportunities for new syntactic sugars. Our diverse results demonstrate the benefits of the proposed technique by identifying new syntactic sugars involving a variety of programming constructs that could be implemented in Java, thus simplifying frequent code idioms. This approach can potentially provide valuable insights for Java language designers, and serve as a proof-of-concept for data-driven programming language design and evolution.
Submitted 1 February, 2024;
originally announced February 2024.
-
Representation formulas for maximal monotone operators of type (D) in Banach spaces whose dual spaces are strictly convex
Authors:
Nguyen B. Tran,
Tran N. Nguyen,
Huynh M. Hien
Abstract:
This work deals with a maximal monotone operator $A$ of type (D) in a Banach space whose dual space is strictly convex. We establish some representations for the value $Ax$ at a given point $x$ via its values at nearby points of $x$. We show that the faces of $Ax$ are contained in the set of all weak$^*$ convergent limits of bounded nets of the operator at nearby points of $x$, then we obtain a representation for $Ax$ by use of this set. In addition, representations for the support function of $Ax$ based on the minimal-norm selection of the operator in certain Banach spaces are given.
Submitted 30 December, 2023;
originally announced January 2024.
-
FPGA-based residual amplitude modulation suppression and control for compact atomic clocks
Authors:
Tin Nghia Nguyen,
Thomas R. Schibli
Abstract:
We designed an FPGA fabric to provide phase modulation techniques to lock lasers to optical frequency references. The method incorporates an active residual-amplitude-modulation (RAM) suppression scheme that relies on complex modulation. All the servos required to construct an optical atomic clock are incorporated onto the same low-cost, commercial FPGA chip. We demonstrate a reliable, long-term RAM suppression of 60 dB with the remaining RAM level at -100 dBc and an improved stability of three decades when applied on a two-photon rubidium clock.
Submitted 31 October, 2023;
originally announced November 2023.
-
Large Language Models for Scientific Synthesis, Inference and Explanation
Authors:
Yizhen Zheng,
Huan Yee Koh,
Jiaxin Ju,
Anh T. N. Nguyen,
Lauren T. May,
Geoffrey I. Webb,
Shirui Pan
Abstract:
Large language models are a form of artificial intelligence systems whose primary knowledge consists of the statistical patterns, semantic relationships, and syntactical structures of language. Despite their limited forms of "knowledge", these systems are adept at numerous complex tasks including creative writing, storytelling, translation, question-answering, summarization, and computer code generation. However, they have yet to demonstrate advanced applications in natural science. Here we show how large language models can perform scientific synthesis, inference, and explanation. We present a method for using general-purpose large language models to make inferences from scientific datasets of the form usually associated with special-purpose machine learning algorithms. We show that the large language model can augment this "knowledge" by synthesizing from the scientific literature. When a conventional machine learning system is augmented with this synthesized and inferred knowledge it can outperform the current state of the art across a range of benchmark tasks for predicting molecular properties. This approach has the further advantage that the large language model can explain the machine learning system's predictions. We anticipate that our framework will open new avenues for AI to accelerate the pace of scientific discovery.
Submitted 11 October, 2023;
originally announced October 2023.
-
Bi-level iterative regularization for inverse problems in nonlinear PDEs
Authors:
Tram Thi Ngoc Nguyen
Abstract:
We investigate the ill-posed inverse problem of recovering unknown spatially dependent parameters in nonlinear evolution PDEs. We propose a bi-level Landweber scheme, where the upper-level parameter reconstruction embeds a lower-level state approximation. This can be seen as combining the classical reduced setting and the newer all-at-once setting, allowing us to, respectively, utilize well-posedness of the parameter-to-state map, and to bypass having to solve nonlinear PDEs exactly. Using this, we derive stopping rules for lower- and upper-level iterations and convergence of the bi-level method. We discuss application to parameter identification for the Landau-Lifshitz-Gilbert equation in magnetic particle imaging.
Submitted 5 February, 2024; v1 submitted 31 August, 2023;
originally announced August 2023.
-
Learning in Cooperative Multiagent Systems Using Cognitive and Machine Models
Authors:
Thuy Ngoc Nguyen,
Duy Nhat Phan,
Cleotilde Gonzalez
Abstract:
Developing effective Multi-Agent Systems (MAS) is critical for many applications requiring collaboration and coordination with humans. Despite the rapid advance of Multi-Agent Deep Reinforcement Learning (MADRL) in cooperative MAS, one major challenge is the simultaneous learning and interaction of independent agents in dynamic environments in the presence of stochastic rewards. State-of-the-art MADRL models struggle to perform well in Coordinated Multi-agent Object Transportation Problems (CMOTPs), wherein agents must coordinate with each other and learn from stochastic rewards. In contrast, humans often learn rapidly to adapt to nonstationary environments that require coordination among people. In this paper, motivated by the demonstrated ability of cognitive models based on Instance-Based Learning Theory (IBLT) to capture human decisions in many dynamic decision making tasks, we propose three variants of Multi-Agent IBL models (MAIBL). The idea of these MAIBL algorithms is to combine the cognitive mechanisms of IBLT and the techniques of MADRL models to deal with coordination MAS in stochastic environments from the perspective of independent learners. We demonstrate that the MAIBL models exhibit faster learning and achieve better coordination in a dynamic CMOTP task with various settings of stochastic rewards compared to current MADRL models. We discuss the benefits of integrating cognitive insights into MADRL models.
Submitted 17 August, 2023;
originally announced August 2023.
-
Batch Clipping and Adaptive Layerwise Clipping for Differential Private Stochastic Gradient Descent
Authors:
Toan N. Nguyen,
Phuong Ha Nguyen,
Lam M. Nguyen,
Marten Van Dijk
Abstract:
Each round in Differential Private Stochastic Gradient Descent (DPSGD) transmits a sum of clipped gradients obfuscated with Gaussian noise to a central server, which uses this to update a global model that often represents a deep neural network. Since the clipped gradients are computed separately, which we call Individual Clipping (IC), deep neural networks like resnet-18 cannot use Batch Normalization Layers (BNL), which are a crucial component in deep neural networks for achieving high accuracy. To utilize BNL, we introduce Batch Clipping (BC) where, instead of clipping single gradients as in the original DPSGD, we average and clip batches of gradients. Moreover, the model entries of different layers have different sensitivities to the added Gaussian noise. Therefore, Adaptive Layerwise Clipping methods (ALC), where each layer has its own adaptively finetuned clipping constant, have been introduced and studied, but so far without rigorous DP proofs. In this paper, we propose a new ALC and provide rigorous DP proofs for both BC and ALC. Experiments show that our modified DPSGD with BC and ALC for CIFAR-10 with resnet-18 converges while DPSGD with IC and ALC does not.
Submitted 21 July, 2023;
originally announced July 2023.
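To make the contrast in this abstract concrete, the following is a minimal sketch of the two clipping orders it describes: clipping each gradient separately and summing, versus averaging a batch of gradients and clipping once. It is an illustration only, not the authors' implementation. The function names, the toy gradients, and the noise scale `sigma * c` (a common DP-SGD convention) are assumptions introduced here.

```python
import math
import random


def l2_norm(v):
    """Euclidean norm of a gradient given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def clip(v, c):
    """Rescale v so that its L2 norm is at most the clipping constant c."""
    n = l2_norm(v)
    scale = min(1.0, c / n) if n > 0 else 1.0
    return [x * scale for x in v]


def individual_clipping(grads, c):
    """Clip every per-example gradient on its own, then sum the results."""
    total = [0.0] * len(grads[0])
    for g in grads:
        total = [t + x for t, x in zip(total, clip(g, c))]
    return total


def batch_clipping(grads, c):
    """Average the batch of gradients first, then clip the average once."""
    dim = len(grads[0])
    mean = [sum(g[i] for g in grads) / len(grads) for i in range(dim)]
    return clip(mean, c)


def add_gaussian_noise(v, sigma, c, rng):
    """Assumed noise model: independent N(0, (sigma * c)^2) per coordinate."""
    return [x + rng.gauss(0.0, sigma * c) for x in v]


# Hypothetical per-example gradients for a two-parameter model.
grads = [[3.0, 4.0], [0.0, 0.5], [-6.0, 8.0]]
ic = individual_clipping(grads, 1.0)
bc = batch_clipping(grads, 1.0)
noisy_bc = add_gaussian_noise(bc, sigma=1.0, c=1.0, rng=random.Random(0))
```

In the individual variant, each example contributes a vector of norm at most `c`, so the sum can have norm up to the batch size times `c`. In the batch variant, the whole batch contributes a single vector of norm at most `c`.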
-
Credit Assignment: Challenges and Opportunities in Developing Human-like AI Agents
Authors:
Thuy Ngoc Nguyen,
Chase McDonald,
Cleotilde Gonzalez
Abstract:
Temporal credit assignment is crucial for learning and skill development in natural and artificial intelligence. While computational methods like the TD approach in reinforcement learning have been proposed, it's unclear if they accurately represent how humans handle feedback delays. Cognitive models intend to represent the mental steps by which humans solve problems and perform a number of tasks, but limited research in cognitive science has addressed the credit assignment problem in humans and cognitive models. Our research uses a cognitive model based on a theory of decisions from experience, Instance-Based Learning Theory (IBLT), to test different credit assignment mechanisms in a goal-seeking navigation task with varying levels of decision complexity. Instance-Based Learning (IBL) models simulate the process of making sequential choices with different credit assignment mechanisms, including a new IBL-TD model that combines the IBL decision mechanism with the TD approach. We found that (1) An IBL model that gives equal credit assignment to all decisions is able to match human performance better than other models, including IBL-TD and Q-learning; (2) IBL-TD and Q-learning models underperform compared to humans initially, but eventually, they outperform humans; (3) humans are influenced by decision complexity, while models are not. Our study provides insights into the challenges of capturing human behavior and the potential opportunities to use these models in future AI systems to support human activities.
Submitted 16 July, 2023;
originally announced July 2023.
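The two credit assignment ideas contrasted in this abstract can be sketched on a toy episode. The sketch below is not the authors' IBL or IBL-TD model. It only shows a tabular TD(0) value update next to a rule that gives every decision the same credit. The state names, the learning rate `alpha`, the discount `gamma`, and the choice to use the episode's total reward as the shared credit are assumptions made for illustration.

```python
def equal_credit(rewards):
    """Give every decision in the episode the same credit (here: the total reward)."""
    total = sum(rewards)
    return [total] * len(rewards)


def td0_update(values, states, rewards, alpha=0.5, gamma=1.0):
    """One pass of tabular TD(0) over an episode.

    states has len(rewards) + 1 entries; unseen states start at value 0.
    Returns a new value table and leaves the input untouched.
    """
    v = dict(values)
    for t, r in enumerate(rewards):
        s, s_next = states[t], states[t + 1]
        target = r + gamma * v.get(s_next, 0.0)
        v[s] = v.get(s, 0.0) + alpha * (target - v.get(s, 0.0))
    return v


# Hypothetical episode: three decisions, reward only at the end.
states = ["a", "b", "c", "end"]
rewards = [0.0, 0.0, 1.0]

credits = equal_credit(rewards)
after_one_episode = td0_update({}, states, rewards)
```

After a single episode, the equal-credit rule has credited all three decisions, while TD(0) has only raised the value of the last state before the reward. Earlier states receive credit only over repeated episodes, which is one way to picture the feedback-delay problem the abstract refers to.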
-
Better Language Models of Code through Self-Improvement
Authors:
Hung Quoc To,
Nghi D. Q. Bui,
** Guo,
Tien N. Nguyen
Abstract:
Pre-trained language models for code (PLMCs) have gained attention in recent research. These models are pre-trained on large-scale datasets using multi-modal objectives. However, fine-tuning them requires extensive supervision and is limited by the size of the dataset provided. We aim to improve this issue by proposing a simple data augmentation framework. Our framework utilizes knowledge gained during the pre-training and fine-tuning stage to generate pseudo data, which is then used as training data for the next step. We incorporate this framework into the state-of-the-art language models, such as CodeT5, CodeBERT, and UnixCoder. The results show that our framework significantly improves PLMCs' performance in code-related sequence generation tasks, such as code summarization and code generation in the CodeXGLUE benchmark.
Submitted 9 May, 2023; v1 submitted 2 April, 2023;
originally announced April 2023.
-
Managing Cold-start in The Serverless Cloud with Temporal Convolutional Networks
Authors:
Tam N. Nguyen
Abstract:
Serverless cloud is an innovative cloud service model that frees customers from most cloud management duties. It also offers the same advantages as other cloud models but at much lower costs. As a result, the serverless cloud has been increasingly employed in high-impact areas such as system security, banking, and health care. A big threat to the serverless cloud's performance is cold-start, which occurs when the time needed to provision the cloud resources that serve customers' requests incurs unacceptable costs to the service providers and/or the customers. This paper proposes a novel low-coupling, high-cohesion ensemble policy that addresses the cold-start problem at the infrastructure and function levels of the serverless cloud stack, whereas state-of-the-art policies have a narrower focus. This ensemble policy anchors on the prediction of function instance arrivals 10 to 15 minutes into the future, which is achievable with the temporal convolutional network (TCN) deep-learning method. Benchmarking results on a real-world dataset from a large-scale serverless cloud provider show that TCN outperforms other popular machine learning algorithms for time series. Going beyond cold-start management, the proposed policy and publicly available code can be adopted in solving other cloud problems such as optimizing the provisioning of virtual software-defined network assets.
Submitted 1 April, 2023;
originally announced April 2023.
-
Positive semidefinite interval of matrix pencil and its applications for the generalized trust region subproblems
Authors:
Van-Bong Nguyen,
Thi Ngan Nguyen
Abstract:
We are concerned with finding the set $I_{\succeq}(A,B)$ of real values $μ$ such that the matrix pencil $A+μB$ is positive semidefinite. If $A, B$ are not simultaneously diagonalizable via congruence (SDC), $I_{\succeq}(A,B)$ either is empty or has only one value $μ.$ When $A, B$ are SDC, $I_{\succeq}(A,B),$ if not empty, can be a singleton or an interval. Especially, if $I_{\succeq}(A,B)$ is an i…
▽ More
We are concerned with finding the set $I_{\succeq}(A,B)$ of real values $μ$ such that the matrix pencil $A+μB$ is positive semidefinite. If $A, B$ are not simultaneously diagonalizable via congruence (SDC), $I_{\succeq}(A,B)$ either is empty or has only one value $μ.$ When $A, B$ are SDC, $I_{\succeq}(A,B),$ if not empty, can be a singleton or an interval. Especially, if $I_{\succeq}(A,B)$ is an interval and at least one of the matrices is nonsingular then its interior is the positive definite interval $I_{\succ}(A,B).$ If $A, B$ are both singular, then even $I_{\succeq}(A,B)$ is an interval, its interior may not be $I_{\succ}(A,B),$ but $A, B$ are then decomposed to block diagonals of submatrices $A_1, B_1$ with $B_1$ nonsingular such that $I_{\succeq}(A,B)=I_{\succeq}(A_1,B_1).$ Applying $I_{\succeq}(A,B),$ the hard-case of the generalized trust-region subproblem (GTRS) can be dealt with by only solving a system of linear equations or reduced to the easy-case of a GTRS of smaller size.
Submitted 28 February, 2023;
originally announced February 2023.
-
Fixed-budget online adaptive learning for physics-informed neural networks. Towards parameterized problem inference
Authors:
Thi Nguyen Khoa Nguyen,
Thibault Dairay,
Raphaël Meunier,
Christophe Millet,
Mathilde Mougeot
Abstract:
Physics-Informed Neural Networks (PINNs) have gained much attention in various fields of engineering thanks to their capability of incorporating physical laws into the models. PINNs integrate the physical constraints by minimizing the partial differential equations (PDEs) residuals on a set of collocation points. The distribution of these collocation points appears to have a huge impact on the performance of PINNs and the assessment of the sampling methods for these points is still an active topic. In this paper, we propose a Fixed-Budget Online Adaptive Learning (FBOAL) method, which decomposes the domain into sub-domains, for training collocation points based on local maxima and local minima of the PDEs residuals. The effectiveness of FBOAL is demonstrated for non-parameterized and parameterized problems. The comparison with other adaptive sampling methods is also illustrated. The numerical results demonstrate important gains in terms of the accuracy and computational cost of PINNs with FBOAL over the classical PINNs with non-adaptive collocation points. We also apply FBOAL in a complex industrial application involving coupling between mechanical and thermal fields. We show that FBOAL is able to identify the high-gradient locations and even give better predictions for some physical fields than the classical PINNs with collocation points sampled on a pre-adapted finite element mesh built thanks to numerical expert knowledge. From the present study, it is expected that the use of FBOAL will help to improve the conventional numerical solver in the construction of the mesh.
Submitted 6 March, 2023; v1 submitted 22 December, 2022;
originally announced December 2022.
-
Generalizing DP-SGD with Shuffling and Batch Clipping
Authors:
Marten van Dijk,
Phuong Ha Nguyen,
Toan N. Nguyen,
Lam M. Nguyen
Abstract:
Classical differentially private DP-SGD implements individual clipping with random subsampling, which forces a mini-batch SGD approach. We provide a general differentially private algorithmic framework that goes beyond DP-SGD and allows any possible first-order optimizers (e.g., classical SGD and momentum-based SGD approaches) in combination with batch clipping, which clips an aggregate of computed gradients rather than summing clipped gradients (as is done in individual clipping). The framework also admits sampling techniques beyond random subsampling, such as shuffling. Our DP analysis follows the $f$-DP approach and introduces a new proof technique which allows us to derive simple closed-form expressions and to also analyse group privacy. In particular, for $E$ epochs of work and groups of size $g$, we show a $\sqrt{g E}$ DP dependency for batch clipping with shuffling.
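The distinction between the two clipping styles named in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's framework: it assumes the "aggregate" is a plain sum of per-example gradients, uses L2-norm clipping, and omits the noise that any differentially private mechanism would add afterwards. All function names are hypothetical.

```python
import math


def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def clip(v, c):
    """Scale v down so that its L2 norm is at most c (no change if already smaller)."""
    norm = l2_norm(v)
    scale = min(1.0, c / norm) if norm > 0 else 1.0
    return [x * scale for x in v]


def vector_sum(vectors):
    """Coordinate-wise sum of equally sized vectors."""
    return [sum(col) for col in zip(*vectors)]


def individual_clipping(grads, c):
    """Clip every per-example gradient, then sum the clipped gradients."""
    return vector_sum([clip(g, c) for g in grads])


def batch_clipping(grads, c):
    """Sum the per-example gradients first (assumed aggregate), then clip once."""
    return clip(vector_sum(grads), c)


# Two hypothetical per-example gradients with norms 5 and 1.
grads = [[3.0, 4.0], [0.6, 0.8]]
ind = individual_clipping(grads, 1.0)  # norm can reach (number of examples) * c
bat = batch_clipping(grads, 1.0)       # norm is at most c
```

With individual clipping the summed result can have norm up to the batch size times the clipping constant, whereas batch clipping bounds the norm of the final aggregate directly.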
Submitted 25 July, 2023; v1 submitted 12 December, 2022;
originally announced December 2022.
-
Optimal sizing of renewable energy storage: A comparative study of hydrogen and battery system considering degradation and seasonal storage
Authors:
Son Tay Le,
Tuan Ngoc Nguyen,
Dac-Khuong Bui,
Tuan Duc Ngo
Abstract:
Renewable energy storage (RES) is essential to address the intermittency issues of renewable energy systems, thereby enhancing system stability and reliability. This study presents an optimisation study of the sizing and operational strategy parameters of a grid-connected photovoltaic (PV)-hydrogen/battery system using a Multi-Objective Modified Firefly Algorithm (MOMFA). An operational strategy that utilises the ability of hydrogen to store energy over a long time was also investigated. The proposed method was applied to a real-world distributed energy project located in the tropical climate zone. To further demonstrate the robustness and versatility of the method, another synthetic test case was examined for a location in the subtropical weather zone, which has a high seasonal mismatch. The performance of the proposed MOMFA method is compared with the NSGA-II method, which has been widely used to design renewable energy storage systems in the literature. The results show that MOMFA is more accurate and robust than NSGA-II, owing to the complex and dynamic nature of energy storage systems. The optimisation results show that battery storage systems, as a mature technology, yield better economic performance than current hydrogen storage systems. However, hydrogen storage systems are shown to provide better techno-economic performance and can be a viable long-term storage solution when high penetration of renewable energy is required. The study also shows that the proposed long-term operational strategy can lower component degradation, enhance efficiency, and increase the total economic performance of hydrogen storage systems. The findings of this study can support the implementation of energy storage systems for renewable energy.
Submitted 14 November, 2022;
originally announced November 2022.
-
Security and Reliability Analysis of Satellite-Terrestrial Multi-Relay Networks with Imperfect CSI
Authors:
Tan N. Nguyen,
Dinh-Hieu Tran,
Trinh Van Chien,
Van-Duc Phan,
Miroslav Voznak,
Symeon Chatzinotas
Abstract:
This work investigates the security and reliability of a novel satellite-terrestrial (SatTer) network. Specifically, a satellite attempts to transmit confidential information to a ground user (GU) with the support of multiple relay nodes in the presence of an eavesdropper that tries to overhear the information. A friendly jammer is deployed to improve the secure transmission between the satellite and the relays. Furthermore, satellite-to-relay generalized Rician fading channels and imperfect channel state information (CSI) are considered in order to examine a general system model. In this context, closed-form expressions for the outage probability (OP) and intercept probability (IP) are derived for an amplify-and-forward (AF)-based relaying scheme, which is challenging and has not been studied before. Finally, the exactness of the mathematical analyses is validated through Monte Carlo simulations. Furthermore, the effects of various key parameters (e.g., channel estimation errors, satellite's transmit power, relay's transmit power, number of relays, and fading severity parameter) are examined.
Submitted 23 August, 2022;
originally announced August 2022.
-
A Meta-Analysis of Solar Forecasting Based on Skill Score
Authors:
Thi Ngoc Nguyen,
Felix Müsgens
Abstract:
We conduct the first comprehensive meta-analysis of deterministic solar forecasting based on skill score, screening 1,447 papers from Google Scholar and reviewing the full texts of 320 papers for data extraction. A database of 4,687 points was built and analyzed with multivariate adaptive regression spline modelling, partial dependence plots, and linear regression. The marginal impacts on skill score of ten factors were quantified. The analysis shows the non-linearity and complex interaction between variables in the database. Forecast horizon has a central impact and dominates other factors' impacts. Therefore, the analysis of solar forecasts should be done separately for each horizon. Climate zone variables have statistically significant correlation with skill score. Regarding inputs, historical data and spatial temporal information are highly helpful. For intra-day, sky and satellite images show the most importance. For day-ahead, numerical weather predictions and locally measured meteorological data are very efficient. All forecast models were compared. Ensemble-hybrid models achieve the most accurate forecasts for all horizons. Hybrid models show superiority for intra-hour while image-based methods are the most efficient for intra-day forecasts. More training data can enhance skill score. However, over-fitting is observed when there is too much training data (longer than 2000 days). There has been a substantial improvement in solar forecast accuracy, especially in recent years. More improvement is observed for intra-hour and intra-day than day-ahead forecasts. By controlling for the key differences between forecasts, including location variables, our findings can be applied globally.
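The abstract does not spell out how the skill score is computed. A common definition in the forecasting literature, assumed here, is one minus the ratio of the model's RMSE to the RMSE of a reference forecast such as persistence. The sketch below uses that assumed definition; the function names and the sample irradiance values are hypothetical.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between two equally long sequences."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def skill_score(forecast, reference, observed):
    """Assumed definition: 1 - RMSE(forecast) / RMSE(reference).

    1.0 means a perfect forecast, 0.0 means no better than the reference,
    and negative values mean worse than the reference.
    """
    return 1.0 - rmse(forecast, observed) / rmse(reference, observed)


# Hypothetical irradiance values (W/m^2), for illustration only.
observed = [100.0, 200.0, 300.0, 200.0]
reference = [90.0, 100.0, 200.0, 300.0]   # persistence: previous observation
forecast = [110.0, 190.0, 280.0, 220.0]   # model under evaluation

score = skill_score(forecast, reference, observed)
```

Because the score is normalised by a reference forecast, it allows comparison across sites and horizons with different variability, which is presumably why a meta-analysis would build on it.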
Submitted 12 April, 2023; v1 submitted 22 August, 2022;
originally announced August 2022.
-
Efficient Embedding VNFs in 5G Network Slicing: A Deep Reinforcement Learning Approach
Authors:
Linh Le,
Tu N. Nguyen,
Kun Suo,
Jing He
Abstract:
5G radio access network (RAN) slicing aims to logically split an infrastructure into a set of self-contained programmable RAN slices, where each slice, built on top of the underlying physical RAN (substrate), is a separate logical mobile network that delivers a set of services with similar characteristics. Each RAN slice is constituted by various virtual network functions (VNFs) distributed geographically across numerous substrate nodes. A key challenge in building robust RAN slicing is, therefore, designing a RAN slicing (RS)-configuration scheme that can utilize information such as resource availability in substrate networks, as well as the interdependent relationships among slices, to map (embed) VNFs onto live substrate nodes. With this motivation, we propose a machine-learning-powered RAN slicing scheme that aims to accommodate the maximum number of slices (each a set of connected VNFs) within a given request set. More specifically, we present a deep reinforcement learning scheme called Deep Allocation Agent (DAA). In short, DAA utilizes an empirically designed deep neural network that observes the current states of the substrate network and the requested slices in order to schedule the slices, whose VNFs are then mapped to substrate nodes using an optimization algorithm. DAA is trained towards the goal of maximizing the number of accommodated slices in the given set by using an explicitly designed reward function. Our experimental study shows that, on average, DAA is able to maintain a rate of successfully routed slices above 80% in a resource-limited substrate network, and about 60% in extreme conditions, i.e., when the available resources are much less than the demands.
Submitted 18 July, 2022;
originally announced July 2022.
-
Maximizing Entanglement Routing Rate in Quantum Networks: Approximation Algorithms
Authors:
Tu N. Nguyen,
Dung H. P. Nguyen,
Dang H. Pham,
Bing-Hong Liu,
Hoa N. Nguyen
Abstract:
There will be a fast-paced shift from conventional network systems to novel quantum networks that are supported by quantum entanglement and teleportation, key technologies of the quantum era, to enable secured data transmissions in the next generation of the Internet. Despite this prospect, migration to quantum networks cannot be done at once, especially with respect to quantum routing. In this paper, we study the maximizing entangled routing rate (MERR) problem. In particular, given a set of demands, we try to determine entangled routing paths for the maximum number of demands in the quantum network while meeting the network's fidelity. We first formulate the MERR problem using an integer linear programming (ILP) model to capture the traffic pattern for all demands in the network. We then leverage the theory of ILP relaxation to devise two efficient algorithms, HBRA and RRA, with provable approximation ratios for the objective function. To deal with the challenge of the combinatorial optimization problem in large-scale networks, we also propose the path-length-based approach (PLBA) to solve the MERR problem. Using both simulations and an open quantum network simulator platform to conduct experiments with real-world topologies and traffic matrices, we evaluate the performance of our algorithms and demonstrate their success in maximizing the entangled routing rate.
Submitted 18 July, 2022;
originally announced July 2022.
-
Optimizing Resource Allocation and VNF Embedding in RAN Slicing
Authors:
Tu N. Nguyen,
Kashyab J. Ambarani,
My T. Thai
Abstract:
5G radio access network (RAN) with network slicing methodology plays a key role in the development of the next-generation network system. RAN slicing focuses on splitting the substrate's resources into a set of self-contained programmable RAN slices. Leveraged by network function virtualization (NFV), a RAN slice is constituted by various virtual network functions (VNFs) and virtual links that are embedded as instances on substrate nodes. In this work, we focus on the following fundamental tasks: i) establishing the theoretical foundation for constructing a VNF mapping plan for RAN slice recovery optimization and ii) developing algorithms needed to map/embed VNFs efficiently. In particular, we propose four efficient algorithms, including Resource-based Algorithm (RBA), Connectivity-based Algorithm (CBA), Group-based Algorithm (GBA), and Group-Connectivity-based Algorithm (GCBA) to solve the resource allocation and VNF mapping problem. Extensive experiments are also conducted to validate the robustness of RAN slicing via the proposed algorithms.
Submitted 18 July, 2022;
originally announced July 2022.
-
Towards An Optimal Solution to Place Bistatic Radars for Belt Barrier Coverage with Minimum Cost
Authors:
Tu N. Nguyen,
Bing-Hong Liu,
My T. Thai,
Ivan Djordjevic
Abstract:
With the rapid growth of threats and the increasing sophistication and diversity of intrusion methods, traditional belt barrier systems now face a major challenge in providing high and concrete coverage quality to expand the guarding service market. Recent efforts aim at constructing a belt barrier by deploying bistatic radar(s) on a specific line, regardless of the limitation on deployment locations, such that the width of the barrier does not fall below a specific threshold and the total bistatic radar placement cost is minimized; this is referred to as the Minimum Cost Linear Placement (MCLP) problem. The existing solutions are heuristic, and their validity is tightly bound to the barrier width parameter, in that these solutions only work for a fixed barrier width value. In this work, we propose an optimal solution, referred to as Opt_MCLP, for the "open MCLP problem" that works for the full range of the barrier width. Through rigorous theoretical analysis and experimentation, we demonstrate that the proposed algorithms perform well in terms of placement cost reduction and barrier coverage guarantee.
Submitted 19 July, 2022;
originally announced July 2022.
-
A Multiple-Entanglement Routing Framework for Quantum Networks
Authors:
Tu N. Nguyen,
Kashyab J. Ambarani,
Linh Le,
Ivan Djordjevic,
Zhi-Li Zhang
Abstract:
Quantum networks are gaining momentum in finding applications in a wide range of domains. However, little research has investigated the potential of a quantum network framework to enable highly reliable communications. The goal of this work is to investigate and design a multiple-entanglement routing framework, namely $k$-entangled routing. In particular, $k$-entangled routing will enable $k$ paths connecting all demands (source-destination pairs) in the network. To design $k$-entangled routing, we propose two algorithms, called the Sequential Multi-path Scheduling Algorithm (SMPSA) and the Min-Cut-based Multi-path Scheduling Algorithm (MCSA). In addition, we evaluate the performance of the proposed algorithms and models through a realistic quantum network simulator, NetSquid, that models the stochastic processes underlying quantum communications. The results show that the proposed algorithms (SMPSA and MCSA) largely enhance the network's traffic flexibility. The proposed paradigms would lay the foundation for further research in the area of entanglement routing.
Submitted 19 July, 2022;
originally announced July 2022.
-
Temperature shift suppression scheme for two-photon two-color rubidium vapor clocks
Authors:
Tin Nghia Nguyen,
Thomas R. Schibli
Abstract:
We propose a new scheme for interrogating a warm rubidium vapor using two different clock lasers. Performance-wise, this approach is distinctly different from the recently proposed two-color two-photon rubidium clocks, as our scheme does not trade off the AC Stark suppression against an increased sensitivity to the cell temperature/pressure. Instead, our approach compensates for both the AC Stark shift and the temperature- and pressure-induced frequency shifts. The proposed scheme also makes use of the modulation transfer technique, which enables a two-orders-of-magnitude increase in the signal-to-noise ratio compared to traditional clocks that rely on fluorescence measurements.
Submitted 11 July, 2022;
originally announced July 2022.
-
Security-Reliability Trade-Off Analysis for SWIPT- and AF-Based IoT Networks with Friendly Jammers
Authors:
Tan N. Nguyen,
Dinh-Hieu Tran,
Trinh Van Chien,
Van-Duc Phan,
Miroslav Voznak,
Phu Tran Tin,
Symeon Chatzinotas,
Derrick Wing Kwan Ng,
H. Vincent Poor
Abstract:
Radio-frequency (RF) energy harvesting (EH) in wireless relaying networks has attracted considerable recent interest, especially for supplying energy to relay nodes in Internet-of-Things (IoT) systems to assist the information exchange between a source and a destination. Moreover, limited hardware, computational resources, and energy availability of IoT devices have raised various security challenges. To this end, physical layer security (PLS) has been proposed as an effective alternative to cryptographic methods for providing information security. In this study, we propose a PLS approach for simultaneous wireless information and power transfer (SWIPT)-based half-duplex (HD) amplify-and-forward (AF) relaying systems in the presence of an eavesdropper. Furthermore, we take into account both static power splitting relaying (SPSR) and dynamic power splitting relaying (DPSR) to thoroughly investigate the benefits of each one. To further enhance secure communication, we consider multiple friendly jammers to help prevent wiretapping attacks from the eavesdropper. More specifically, we provide a reliability and security analysis by deriving closed-form expressions of outage probability (OP) and intercept probability (IP), respectively, for both the SPSR and DPSR schemes. Then, simulations are also performed to validate our analysis and the effectiveness of the proposed schemes. Specifically, numerical results illustrate the non-trivial trade-off between reliability and security of the proposed system. In addition, we conclude from the simulation results that the proposed DPSR scheme outperforms the SPSR-based scheme in terms of OP and IP under the influences of different parameters on system performance.
Submitted 9 June, 2022;
originally announced June 2022.
-
HierarchyNet: Learning to Summarize Source Code with Heterogeneous Representations
Authors:
Minh Huynh Nguyen,
Nghi D. Q. Bui,
Truong Son Hy,
Long Tran-Thanh,
Tien N. Nguyen
Abstract:
We propose a novel method for code summarization utilizing Heterogeneous Code Representations (HCRs) and our specially designed HierarchyNet. HCRs effectively capture essential code features at lexical, syntactic, and semantic levels by abstracting coarse-grained code elements and incorporating fine-grained program elements in a hierarchical structure. Our HierarchyNet method processes each layer of the HCR separately through a unique combination of the Heterogeneous Graph Transformer, a Tree-based CNN, and a Transformer Encoder. This approach preserves dependencies between code elements and captures relations through a novel Hierarchical-Aware Cross Attention layer. Our method surpasses current state-of-the-art techniques, such as PA-Former, CAST, and NeuralCodeSum.
Submitted 9 May, 2023; v1 submitted 30 May, 2022;
originally announced May 2022.
-
DEAR: A Novel Deep Learning-based Approach for Automated Program Repair
Authors:
Yi Li,
Shaohua Wang,
Tien N. Nguyen
Abstract:
The existing deep learning (DL)-based automated program repair (APR) models are limited in fixing general software defects. We present DEAR, a DL-based approach that supports fixing general bugs that require dependent changes at once to one or multiple consecutive statements in one or multiple hunks of code. We first design a novel fault localization (FL) technique for multi-hunk, multi-statement fixes that combines traditional spectrum-based (SB) FL with deep learning and data-flow analysis. It takes the buggy statements returned by the SBFL model, detects the buggy hunks to be fixed at once, and expands a buggy statement $s$ in a hunk to include other suspicious statements around $s$. We design a two-tier, tree-based LSTM model that incorporates cycle training and uses a divide-and-conquer strategy to learn proper code transformations for fixing multiple statements in the suitable fixing context consisting of surrounding subtrees. We conducted several experiments to evaluate DEAR on three datasets: Defects4J (395 bugs), BigFix (+26k bugs), and CPatMiner (+44k bugs). On the Defects4J dataset, DEAR outperforms the baselines by 42% to 683% in terms of the number of auto-fixed bugs with only the top-1 patches. On the BigFix dataset, it fixes 31 to 145 more bugs than existing DL-based APR models with the top-1 patches. On the CPatMiner dataset, among 667 fixed bugs, there are 169 (25.3%) multi-hunk/multi-statement bugs. DEAR fixes 71 and 164 more bugs, including 52 and 61 more multi-hunk/multi-statement bugs, than the state-of-the-art DL-based APR models.
Submitted 3 May, 2022;
originally announced May 2022.
-
PediCXR: An open, large-scale chest radiograph dataset for interpretation of common thoracic diseases in children
Authors:
Hieu H. Pham,
Ngoc H. Nguyen,
Thanh T. Tran,
Tuan N. M. Nguyen,
Ha Q. Nguyen
Abstract:
The development of diagnostic models for detecting and diagnosing pediatric diseases in CXR scans is hindered by the lack of high-quality physician-annotated datasets. To overcome this challenge, we introduce and release PediCXR, a new pediatric CXR dataset of 9,125 studies retrospectively collected from a major pediatric hospital in Vietnam between 2020 and 2021. Each scan was manually annotated by a pediatric radiologist with more than ten years of experience. The dataset was labeled for the presence of 36 critical findings and 15 diseases. In particular, each abnormal finding was identified via a rectangular bounding box on the image. To the best of our knowledge, this is the first and largest pediatric CXR dataset containing lesion-level annotations and image-level labels for the detection of multiple findings and diseases. For algorithm development, the dataset was divided into a training set of 7,728 and a test set of 1,397. To encourage new advances in pediatric CXR interpretation using data-driven approaches, we provide a detailed description of the PediCXR data sample and make the dataset publicly available on https://physionet.org/content/pedicxr/1.0.0/
Submitted 20 March, 2023; v1 submitted 20 March, 2022;
originally announced March 2022.
-
NOMA-enabled Backscatter Communications for Green Transportation in Automotive-Industry 5.0
Authors:
Wali Ullah Khan,
Asim Ihsan,
Tu N. Nguyen,
Zain Ali,
Muhammad Awais Javed
Abstract:
Automotive-Industry 5.0 will use emerging 6G communications to provide robust, computationally intelligent, and energy-efficient data sharing among various onboard sensors, vehicles, and other Intelligent Transportation System (ITS) entities. Non-Orthogonal Multiple Access (NOMA) and backscatter communications are two key techniques of 6G communications for enhanced spectrum and energy efficiency. In this paper, we provide an introduction to green transportation and also discuss the advantages of using backscatter communications and NOMA in Automotive-Industry 5.0. We also briefly review the recent work in the area of NOMA-empowered backscatter communications. We discuss different use cases of backscatter communications in NOMA-enabled 6G vehicular networks. We also propose a multi-cell optimization framework to maximize the energy efficiency of the backscatter-enabled NOMA vehicular network. In particular, we jointly optimize the transmit power of the roadside unit and the reflection coefficient of the backscatter device in each cell, where several practical constraints are also taken into account. The energy efficiency problem is formulated as a nonconvex problem, which is hard to solve directly. Thus, we first adopt the Dinkelbach method to transform the objective function into a subtractive one and then decouple the problem into two subproblems. Second, we employ dual theory and KKT conditions to obtain efficient solutions. Finally, we highlight some open issues and future research opportunities related to NOMA-enabled backscatter communications in 6G vehicular networks.
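The Dinkelbach step mentioned in this abstract turns a ratio objective N(x)/D(x) into a sequence of subtractive problems N(x) - λD(x). The sketch below shows that generic iteration on a toy, single-variable energy-efficiency ratio searched over a finite grid. The rate and power models, the circuit-power constant, and all names are assumptions made for illustration; the paper's multi-cell problem and its dual/KKT-based subproblem solvers are not reproduced here.

```python
import math


def dinkelbach(numerator, denominator, candidates, tol=1e-9, max_iter=100):
    """Maximise numerator(x) / denominator(x) over a finite candidate set.

    Assumes denominator(x) > 0 for every candidate. Each iteration solves the
    subtractive problem max_x numerator(x) - lam * denominator(x) by exhaustive
    search and then updates lam to the ratio attained at that maximiser.
    """
    lam = 0.0
    best = candidates[0]
    for _ in range(max_iter):
        best = max(candidates, key=lambda x: numerator(x) - lam * denominator(x))
        gap = numerator(best) - lam * denominator(best)
        if abs(gap) < tol:
            break  # lam already equals the best achievable ratio (up to tol)
        lam = numerator(best) / denominator(best)
    return best, lam


# Toy model (hypothetical): rate in bit/s/Hz and total consumed power in watts.
def rate(p):
    return math.log2(1.0 + p)


def total_power(p):
    circuit_power = 1.0  # assumed constant overhead
    return p + circuit_power


powers = [i / 100.0 for i in range(1, 1001)]  # transmit power grid, 0.01 .. 10.0
best_power, best_efficiency = dinkelbach(rate, total_power, powers)
```

On a finite grid the update is monotone in λ and stops once the subtractive objective reaches zero, at which point λ is the maximum ratio over the grid.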
Submitted 9 March, 2022;
originally announced March 2022.
-
Learning-informed parameter identification in nonlinear time-dependent PDEs
Authors:
Christian Aarset,
Martin Holler,
Tram Thi Ngoc Nguyen
Abstract:
We introduce and analyze a method of learning-informed parameter identification for partial differential equations (PDEs) in an all-at-once framework. The underlying PDE model is formulated in a rather general setting with three unknowns: physical parameter, state and nonlinearity. Inspired by advances in machine learning, we approximate the nonlinearity via a neural network, whose parameters are learned from measurement data. The latter is assumed to be given as noisy observations of the unknown state, and both the state and the physical parameters are identified simultaneously with the parameters of the neural network. Moreover, diverging from the classical approach, the proposed all-at-once setting avoids constructing the parameter-to-state map by explicitly handling the state as an additional variable. The practical feasibility of the proposed method is confirmed with experiments using two different algorithmic settings: a function-space algorithm based on analytic adjoints as well as a purely discretized setting using standard machine learning algorithms.
Submitted 27 April, 2023; v1 submitted 22 February, 2022;
originally announced February 2022.
-
Physics-informed neural networks for non-Newtonian fluid thermo-mechanical problems: an application to rubber calendering process
Authors:
Thi Nguyen Khoa Nguyen,
Thibault Dairay,
Raphaël Meunier,
Mathilde Mougeot
Abstract:
Physics-Informed Neural Networks (PINNs) have gained much attention in various fields of engineering thanks to their capability of incorporating physical laws into the models. However, the assessment of PINNs in industrial applications involving coupling between mechanical and thermal fields is still an active research topic. In this work, we present an application of PINNs to a non-Newtonian fluid thermo-mechanical problem which is often considered in the rubber calendering process. We demonstrate the effectiveness of PINNs when dealing with inverse and ill-posed problems, which are impractical to be solved by classical numerical discretization methods. We study the impact of the placement of the sensors and the distribution of unsupervised points on the performance of PINNs in a problem of inferring hidden physical fields from some partial data. We also investigate the capability of PINNs to identify unknown physical parameters from the measurements captured by sensors. The effect of noisy measurements is also considered throughout this work. The results of this paper demonstrate that in the problem of identification, PINNs can successfully estimate the unknown parameters using only the measurements on the sensors. In ill-posed problems where boundary conditions are not completely defined, even though the placement of the sensors and the distribution of unsupervised points have a great impact on PINNs performance, we show that the algorithm is able to infer the hidden physics from local measurements.
Submitted 27 June, 2022; v1 submitted 31 January, 2022;
originally announced January 2022.
-
Polyphonic audio event detection: multi-label or multi-class multi-task classification problem?
Authors:
Huy Phan,
Thi Ngoc Tho Nguyen,
Philipp Koch,
Alfred Mertins
Abstract:
Polyphonic events are the main error source of audio event detection (AED) systems. In the deep-learning context, the most common approach to deal with event overlaps is to treat the AED task as a multi-label classification problem. By doing this, we inherently consider multiple one-vs.-rest classification problems, which are jointly solved by a single (i.e. shared) network. In this work, to better handle polyphonic mixtures, we propose to frame the task as a multi-class classification problem by considering each possible label combination as one class. To circumvent the large number of arising classes due to combinatorial explosion, we divide the event categories into multiple groups and construct a multi-task problem in a divide-and-conquer fashion, where each of the tasks is a multi-class classification problem. A network architecture is then devised for multi-class multi-task modelling. The network is composed of a backbone subnet and multiple task-specific subnets. The task-specific subnets are designed to learn time-frequency and channel attention masks to extract features for the task at hand from the common feature maps learned by the backbone. Experiments on the TUT-SED-Synthetic-2016 dataset with a high degree of event overlap show that the proposed approach results in more favorable performance than the common multi-label approach.
Submitted 29 January, 2022;
originally announced January 2022.
-
SpeedyIBL: A Comprehensive, Precise, and Fast Implementation of Instance-Based Learning Theory
Authors:
Thuy Ngoc Nguyen,
Duy Nhat Phan,
Cleotilde Gonzalez
Abstract:
Instance-Based Learning Theory (IBLT) is a comprehensive account of how humans make decisions from experience during dynamic tasks. Since it was first proposed almost two decades ago, multiple computational models have been constructed based on IBLT (i.e., IBL models). These models have been demonstrated to be very successful in explaining and predicting human decisions in multiple decision making contexts. However, as IBLT has evolved, the initial description of the theory has become less precise, and it is unclear how its demonstration can be expanded to more complex, dynamic, and multi-agent environments. This paper presents an updated version of the current theoretical components of IBLT in a comprehensive and precise form. It also provides an advanced implementation of the full set of theoretical mechanisms, SpeedyIBL, to unlock the capabilities of IBLT to handle a diverse taxonomy of individual and multi-agent decision-making problems. SpeedyIBL addresses a practical computational issue in past implementations of IBL models, the curse of exponential growth, that emerges from memory-based tabular computations. When more observations accumulate over time, there is an exponential growth of the memory of instances that leads directly to an exponential slow down of the computational time. Thus, SpeedyIBL leverages parallel computation with vectorization to speed up the execution time of IBL models. We evaluate the robustness of SpeedyIBL over an existing implementation of IBLT in decision games of increased complexity. The results not only demonstrate the applicability of IBLT through a wide range of decision making tasks, but also highlight the improvement of SpeedyIBL over its prior implementation as the complexity of decision features and number of agents increase. The library is open sourced for the use of the broad research community.
Submitted 5 April, 2022; v1 submitted 19 November, 2021;
originally announced November 2021.
-
SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays
Authors:
Thi Ngoc Tho Nguyen,
Douglas L. Jones,
Karn N. Watcharasupat,
Huy Phan,
Woon-Seng Gan
Abstract:
Polyphonic sound event localization and detection (SELD) has many practical applications in acoustic sensing and monitoring. However, the development of real-time SELD has been limited by the demanding computational requirement of most recent SELD systems. In this work, we introduce SALSA-Lite, a fast and effective feature for polyphonic SELD using microphone array inputs. SALSA-Lite is a lightweight variation of a previously proposed SALSA feature for polyphonic SELD. SALSA, which stands for Spatial Cue-Augmented Log-Spectrogram, consists of multichannel log-spectrograms stacked channelwise with the normalized principal eigenvectors of the spectrotemporally corresponding spatial covariance matrices. In contrast to SALSA, which uses eigenvector-based spatial features, SALSA-Lite uses normalized inter-channel phase differences as spatial features, allowing a 30-fold speedup compared to the original SALSA feature. Experimental results on the TAU-NIGENS Spatial Sound Events 2021 dataset showed that the SALSA-Lite feature achieved competitive performance compared to the full SALSA feature, and significantly outperformed the traditional feature set of multichannel log-mel spectrograms with generalized cross-correlation spectra. Specifically, using SALSA-Lite features increased localization-dependent F1 score and class-dependent localization recall by 15% and 5%, respectively, compared to using multichannel log-mel spectrograms with generalized cross-correlation spectra.
Submitted 4 May, 2022; v1 submitted 15 November, 2021;
originally announced November 2021.
-
What drives the accuracy of PV output forecasts?
Authors:
Thi Ngoc Nguyen,
Felix Müsgens
Abstract:
Due to the stochastic nature of photovoltaic (PV) power generation, there is high demand for forecasting PV output to better integrate PV generation into power grids. Systematic knowledge regarding the factors influencing forecast accuracy is crucially important, but still mostly unknown. In this paper, we review 180 papers on PV forecasts and extract a database of forecast errors for statistical analysis. We show that among the forecast models, hybrid models consistently outperform the others and will most likely be the future of PV output forecasting. The use of data processing techniques is positively correlated with the forecast quality, while the lengths of the forecast horizon and out-of-sample test set have negative effects on the forecast accuracy. We also found that the inclusion of numerical weather prediction variables, data normalization, and data resampling are the most effective data processing techniques. Furthermore, we found some evidence for cherry picking in reporting errors and recommend that the test sets be at least one year to better assess model performance. The paper also takes the first step towards establishing a benchmark for assessing PV output forecasts.
Submitted 3 November, 2021;
originally announced November 2021.
-
End-to-End Complex-Valued Multidilated Convolutional Neural Network for Joint Acoustic Echo Cancellation and Noise Suppression
Authors:
Karn N. Watcharasupat,
Thi Ngoc Tho Nguyen,
Woon-Seng Gan,
Shengkui Zhao,
Bin Ma
Abstract:
Echo and noise suppression is an integral part of a full-duplex communication system. Many recent acoustic echo cancellation (AEC) systems rely on a separate adaptive filtering module for linear echo suppression and a neural module for residual echo suppression. However, not only do adaptive filtering modules require convergence and remain susceptible to changes in acoustic environments, but this two-stage framework also often introduces unnecessary delays to the AEC system when neural modules are already capable of both linear and nonlinear echo suppression. In this paper, we exploit the offset-compensating ability of complex time-frequency masks and propose an end-to-end complex-valued neural network architecture. The building block of the proposed model is a pseudocomplex extension based on the densely-connected multidilated DenseNet (D3Net) building block, resulting in a very small network of only 354K parameters. The architecture utilized the multi-resolution nature of the D3Net building blocks to eliminate the need for pooling, allowing the network to extract features using large receptive fields without any loss of output resolution. We also propose a dual-mask technique for joint echo and noise suppression with simultaneous speech enhancement. Evaluation on both synthetic and real test sets demonstrated promising results across multiple energy-based metrics and perceptual proxies.
Submitted 22 January, 2022; v1 submitted 2 October, 2021;
originally announced October 2021.
-
SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection
Authors:
Thi Ngoc Tho Nguyen,
Karn N. Watcharasupat,
Ngoc Khanh Nguyen,
Douglas L. Jones,
Woon-Seng Gan
Abstract:
Sound event localization and detection (SELD) consists of two subtasks, which are sound event detection and direction-of-arrival estimation. While sound event detection mainly relies on time-frequency patterns to distinguish different sound classes, direction-of-arrival estimation uses amplitude and/or phase differences between microphones to estimate source directions. As a result, it is often difficult to jointly optimize these two subtasks. We propose a novel feature called Spatial cue-Augmented Log-SpectrogrAm (SALSA) with exact time-frequency mapping between the signal power and the source directional cues, which is crucial for resolving overlapping sound sources. The SALSA feature consists of multichannel log-spectrograms stacked along with the normalized principal eigenvector of the spatial covariance matrix at each corresponding time-frequency bin. Depending on the microphone array format, the principal eigenvector can be normalized differently to extract amplitude and/or phase differences between the microphones. As a result, SALSA features are applicable for different microphone array formats such as first-order ambisonics (FOA) and multichannel microphone array (MIC). Experimental results on the TAU-NIGENS Spatial Sound Events 2021 dataset with directional interferences showed that SALSA features outperformed other state-of-the-art features. Specifically, the use of SALSA features in the FOA format increased the F1 score and localization recall by 6% each, compared to the multichannel log-mel spectrograms with intensity vectors. For the MIC format, using SALSA features increased F1 score and localization recall by 16% and 7%, respectively, compared to using multichannel log-mel spectrograms with generalized cross-correlation spectra.
Submitted 6 June, 2022; v1 submitted 1 October, 2021;
originally announced October 2021.
-
Automated Generation of Accurate & Fluent Medical X-ray Reports
Authors:
Hoang T. N. Nguyen,
Dong Nie,
Taivanbat Badamdorj,
Yujie Liu,
Yingying Zhu,
Jason Truong,
Li Cheng
Abstract:
Our paper focuses on automating the generation of medical reports from chest X-ray image inputs, a critical yet time-consuming task for radiologists. Unlike existing medical report generation efforts that tend to produce human-readable reports, we aim to generate medical reports that are both fluent and clinically accurate. This is achieved by our fully differentiable and end-to-end paradigm containing three complementary modules: taking the chest X-ray images and clinical history document of patients as inputs, our classification module produces an internal checklist of disease-related topics, referred to as enriched disease embedding; the embedding representation is then passed to our transformer-based generator, giving rise to the medical reports; meanwhile, our generator also produces the weighted embedding representation, which is fed to our interpreter to ensure consistency with respect to disease-related topics. Our approach achieved promising results on commonly-used metrics concerning language fluency and clinical accuracy. Moreover, noticeable performance gains are consistently observed when additional input information is available, such as the clinical document and extra scans of different views.
Submitted 27 August, 2021;
originally announced August 2021.
-
Discretization of parameter identification in PDEs using Neural Networks
Authors:
Barbara Kaltenbacher,
Tram Thi Ngoc Nguyen
Abstract:
We consider the ill-posed inverse problem of identifying a nonlinearity in a time-dependent PDE model. The nonlinearity is approximated by a neural network, and needs to be determined alongside other unknown physical parameters and the unknown state. Hence, it is not possible to construct input-output data pairs to perform a supervised training process. Proposing an all-at-once approach, we bypass the need for training data and recover all the unknowns simultaneously. In the general case, the approximation via a neural network can be realized as a discretization scheme, and the training with noisy data can be viewed as an ill-posed inverse problem. Therefore, we study discretization of regularization in terms of Tikhonov and projected Landweber methods for discretization of inverse problems, and prove convergence when the discretization error (network approximation error) and the noise level tend to zero.
Submitted 3 August, 2022; v1 submitted 24 August, 2021;
originally announced August 2021.
-
Cybonto: Towards Human Cognitive Digital Twins for Cybersecurity
Authors:
Tam N. Nguyen
Abstract:
Cyber defense is reactive and slow. On average, the time-to-remedy is hundreds of times larger than the time-to-compromise. In response to the expanding ever-more-complex threat landscape, Digital Twins (DTs) and particularly Human Digital Twins (HDTs) offer the capability of running massive simulations across multiple knowledge domains. Simulated results may offer insights into adversaries' behaviors and tactics, resulting in better proactive cyber-defense strategies. For the first time, this paper solidifies the vision of DTs and HDTs for cybersecurity via the Cybonto conceptual framework proposal. The paper also contributes the Cybonto ontology, formally documenting 108 constructs and thousands of cognitive-related paths based on 20 time-tested psychology theories. Finally, the paper applied 20 network centrality algorithms in analyzing the 108 constructs. The identified top 10 constructs call for extensions of current digital cognitive architectures in preparation for the DT future.
Submitted 5 August, 2021; v1 submitted 1 August, 2021;
originally announced August 2021.
-
Improving Polyphonic Sound Event Detection on Multichannel Recordings with the Sørensen-Dice Coefficient Loss and Transfer Learning
Authors:
Karn N. Watcharasupat,
Thi Ngoc Tho Nguyen,
Ngoc Khanh Nguyen,
Zhen Jian Lee,
Douglas L. Jones,
Woon Seng Gan
Abstract:
The Sørensen-Dice Coefficient has recently seen rising popularity as a loss function (also known as Dice loss) due to its robustness in tasks where the number of negative samples significantly exceeds that of positive samples, such as semantic segmentation, natural language processing, and sound event detection. Conventional training of polyphonic sound event detection systems with binary cross-entropy loss often results in suboptimal detection performance as the training is often overwhelmed by updates from negative samples. In this paper, we investigated the effect of the Dice loss, intra- and inter-modal transfer learning, data augmentation, and recording formats, on the performance of polyphonic sound event detection systems with multichannel inputs. Our analysis showed that polyphonic sound event detection systems trained with Dice loss consistently outperformed those trained with cross-entropy loss across different training settings and recording formats in terms of F1 score and error rate. We achieved further performance gains via the use of transfer learning and an appropriate combination of different data augmentation techniques.
Submitted 2 October, 2021; v1 submitted 22 July, 2021;
originally announced July 2021.
-
What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis
Authors:
Thi Ngoc Tho Nguyen,
Karn N. Watcharasupat,
Zhen Jian Lee,
Ngoc Khanh Nguyen,
Douglas L. Jones,
Woon Seng Gan
Abstract:
Sound event localization and detection (SELD) is an emerging research topic that aims to unify the tasks of sound event detection and direction-of-arrival estimation. As a result, SELD inherits the challenges of both tasks, such as noise, reverberation, interference, polyphony, and non-stationarity of sound sources. Furthermore, SELD often faces an additional challenge of assigning correct correspondences between the detected sound classes and directions of arrival to multiple overlapping sound events. Previous studies have shown that unknown interferences in reverberant environments often cause major degradation in the performance of SELD systems. To further understand the challenges of the SELD task, we performed a detailed error analysis on two of our SELD systems, which both ranked second in the team category of DCASE SELD Challenge, one in 2020 and one in 2021. Experimental results indicate polyphony as the main challenge in SELD, due to the difficulty in detecting all sound events of interest. In addition, the SELD systems tend to make fewer errors for the polyphonic scenario that is dominant in the training set.
Submitted 2 October, 2021; v1 submitted 22 July, 2021;
originally announced July 2021.
-
DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection
Authors:
Thi Ngoc Tho Nguyen,
Karn Watcharasupat,
Ngoc Khanh Nguyen,
Douglas L. Jones,
Woon Seng Gan
Abstract:
Sound event localization and detection consists of two subtasks which are sound event detection and direction-of-arrival estimation. While sound event detection mainly relies on time-frequency patterns to distinguish different sound classes, direction-of-arrival estimation uses magnitude or phase differences between microphones to estimate source directions. Therefore, it is often difficult to jointly train these two subtasks simultaneously. We propose a novel feature called spatial cue-augmented log-spectrogram (SALSA) with exact time-frequency mapping between the signal power and the source direction-of-arrival. The feature includes multichannel log-spectrograms stacked along with the estimated direct-to-reverberant ratio and a normalized version of the principal eigenvector of the spatial covariance matrix at each time-frequency bin on the spectrograms. Experimental results on the DCASE 2021 dataset for sound event localization and detection with directional interference showed that the deep learning-based models trained on this new feature outperformed the DCASE challenge baseline by a large margin. We combined several models with slightly different architectures that were trained on the new feature to further improve the system performances for the DCASE sound event localization and detection challenge.
Submitted 29 June, 2021;
originally announced June 2021.
-
Vulnerability Detection with Fine-grained Interpretations
Authors:
Yi Li,
Shaohua Wang,
Tien N. Nguyen
Abstract:
Despite the successes of machine learning (ML) and deep learning (DL) based vulnerability detectors (VD), they are limited to providing only the decision on whether a given code is vulnerable or not, without details on what part of the code is relevant to the detected vulnerability. We present IVDetect, an interpretable vulnerability detector with the philosophy of using Artificial Intelligence (AI) to detect vulnerabilities, while using Intelligence Assistant (IA) via providing VD interpretations in terms of vulnerable statements.
For vulnerability detection, we separately consider the vulnerable statements and their surrounding contexts via data and control dependencies. This allows our model to better discriminate vulnerable statements than using the mixture of vulnerable code and contextual code as in existing approaches. In addition to the coarse-grained vulnerability detection result, we leverage interpretable AI to provide users with fine-grained interpretations that include the sub-graph in the Program Dependency Graph (PDG) with the crucial statements that are relevant to the detected vulnerability. Our empirical evaluation on vulnerability databases shows that IVDetect outperforms the existing DL-based approaches by 43%–84% and 105%–255% in top-10 nDCG and MAP ranking scores. IVDetect correctly points out the vulnerable statements relevant to the vulnerability via its interpretation in 67% of the cases with a top-5 ranked list. It improves over baseline interpretation models by 12.3%–400% and 9%–400% in accuracy.
Submitted 19 June, 2021;
originally announced June 2021.
-
ShortcutFusion: From Tensorflow to FPGA-based accelerator with reuse-aware memory allocation for shortcut data
Authors:
Duy Thanh Nguyen,
Hyeonseung Je,
Tuan Nghia Nguyen,
Soojung Ryu,
Kyujoong Lee,
Hyuk-Jae Lee
Abstract:
The residual block is a very common component in recent state-of-the-art CNNs such as EfficientNet or EfficientDet. Shortcut data accounts for nearly 40% of feature-maps access in ResNet152 [8]. Most previous DNN compilers and accelerators ignore the shortcut data optimization. This paper presents ShortcutFusion, an optimization tool for FPGA-based accelerator with a reuse-aware static memory allocation for shortcut data, to maximize on-chip data reuse given resource constraints. From TensorFlow DNN models, the proposed design generates instruction sets for a group of nodes which uses an optimized data reuse for each residual block. The accelerator design implemented on the Xilinx KCU1500 FPGA card is 2.8x faster and 9.9x more power efficient than NVIDIA RTX 2080 Ti for 256x256 input size. Compared to the result from baseline, in which the weights, inputs, and outputs are accessed from the off-chip memory exactly once per layer, ShortcutFusion reduces the DRAM access by 47.8-84.8% for RetinaNet, Yolov3, ResNet152, and EfficientNet. Given a similar buffer size to ShortcutMining [8], which also mines the shortcut data in hardware, the proposed work reduces off-chip access for feature-maps 5.27x while accessing weight from off-chip memory exactly once.
Submitted 13 February, 2022; v1 submitted 15 June, 2021;
originally announced June 2021.