-
PathAlign: A vision-language model for whole slide images in histopathology
Authors:
Faruk Ahmed,
Andrew Sellergren,
Lin Yang,
Shawn Xu,
Boris Babenko,
Abbi Ward,
Niels Olson,
Arash Mohtashamian,
Yossi Matias,
Greg S. Corrado,
Quang Duong,
Dale R. Webster,
Shravya Shetty,
Daniel Golden,
Yun Liu,
David F. Steiner,
Ellery Wulczyn
Abstract:
Microscopic interpretation of histopathology images underlies many important diagnostic and treatment decisions. While advances in vision-language modeling raise new opportunities for analysis of such images, the gigapixel-scale size of whole slide images (WSIs) introduces unique challenges. Additionally, pathology reports simultaneously highlight key findings from small regions while also aggregating interpretation across multiple slides, often making it difficult to create robust image-text pairs. As such, pathology reports remain a largely untapped source of supervision in computational pathology, with most efforts relying on region-of-interest annotations or self-supervision at the patch-level. In this work, we develop a vision-language model based on the BLIP-2 framework using WSIs paired with curated text from pathology reports. This enables applications utilizing a shared image-text embedding space, such as text or image retrieval for finding cases of interest, as well as integration of the WSI encoder with a frozen large language model (LLM) for WSI-based generative text capabilities such as report generation or AI-in-the-loop interactions. We utilize a de-identified dataset of over 350,000 WSIs and diagnostic text pairs, spanning a wide range of diagnoses, procedure types, and tissue types. We present pathologist evaluation of text generation and text retrieval using WSI embeddings, as well as results for WSI classification and workflow prioritization (slide-level triaging). Model-generated text for WSIs was rated by pathologists as accurate, without clinically significant error or omission, for 78% of WSIs on average. This work demonstrates exciting potential capabilities for language-aligned WSI embeddings.
Submitted 27 June, 2024;
originally announced June 2024.
-
What-if Analysis Framework for Digital Twins in 6G Wireless Network Management
Authors:
Elif Ak,
Berk Canberk,
Vishal Sharma,
Octavia A. Dobre,
Trung Q. Duong
Abstract:
This study explores implementing a digital twin network (DTN) for efficient 6G wireless network management, aligning with the fault, configuration, accounting, performance, and security (FCAPS) model. The DTN architecture comprises the Physical Twin Layer, implemented using NS-3, and the Service Layer, featuring machine learning and reinforcement learning for optimizing carrier sensitivity threshold and transmit power control in wireless networks. We introduce a robust "What-if Analysis" module, utilizing conditional tabular generative adversarial network (CTGAN) for synthetic data generation to mimic various network scenarios. These scenarios assess four network performance metrics: throughput, latency, packet loss, and coverage. Our findings demonstrate the efficiency of the proposed what-if analysis framework in managing complex network conditions, highlighting the importance of the scenario-maker step and the impact of twinning intervals on network performance.
Submitted 24 April, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
Multi-Tier Computing-Enabled Digital Twin in 6G Networks
Authors:
Kunlun Wang,
Yongyi Tang,
Trung Q. Duong,
Saeed R. Khosravirad,
Octavia A. Dobre,
George K. Karagiannidis
Abstract:
Digital twin (DT) is the recurrent and common feature in discussions about future technologies, bringing together advanced communication, computation, and artificial intelligence, to name a few. In the context of Industry 4.0, industries such as manufacturing, automotive, and healthcare are rapidly adopting DT-based development. The main challenges to date have been the high demands on communication and computing resources, as well as privacy and security concerns, arising from the large volumes of data exchanges. To achieve low latency and high security services in the emerging DT, multi-tier computing has been proposed by combining edge/fog computing and cloud computing. Specifically, low latency data transmission, efficient resource allocation, and validated security strategies of multi-tier computing systems are used to solve the operational problems of the DT system. In this paper, we introduce the architecture and applications of DT using examples from manufacturing, the Internet-of-Vehicles and healthcare. At the same time, the architecture and technology of multi-tier computing systems are studied to support DT. This paper will provide valuable reference and guidance for the theory, algorithms, and applications in collaborative multi-tier computing and DT.
Submitted 28 December, 2023;
originally announced December 2023.
-
Digital Twin-Enabled Intelligent DDoS Detection Mechanism for Autonomous Core Networks
Authors:
Yagmur Yigit,
Bahadir Bal,
Aytac Karameseoglu,
Trung Q. Duong,
Berk Canberk
Abstract:
Existing distributed denial of service attack (DDoS) solutions cannot handle highly aggregated data rates; thus, they are unsuitable for Internet service provider (ISP) core networks. This article proposes a digital twin-enabled intelligent DDoS detection mechanism using an online learning method for autonomous systems. Our contributions are three-fold: we first design a DDoS detection architecture based on the digital twin for ISP core networks. We implemented a Yet Another Next Generation (YANG) model and an automated feature selection (AutoFS) module to handle core network data. We used an online learning approach to update the model instantly and efficiently, improve the learning model quickly, and ensure accurate predictions. Finally, we reveal that our proposed solution successfully detects DDoS attacks and updates the feature selection method and learning model with a true classification rate of ninety-seven percent. Our proposed solution can estimate the attack within approximately fifteen minutes after the DDoS attack starts.
Submitted 25 October, 2023; v1 submitted 19 October, 2023;
originally announced October 2023.
-
TwinPot: Digital Twin-assisted Honeypot for Cyber-Secure Smart Seaports
Authors:
Yagmur Yigit,
Omer Kemal Kinaci,
Trung Q. Duong,
Berk Canberk
Abstract:
The idea of next-generation ports has become more apparent in the last ten years in response to the challenge posed by the rising demand for efficiency and the ever-increasing volume of goods. In this new era of intelligent infrastructure and facilities, it is evident that cyber-security has recently received the most significant attention from the seaport and maritime authorities, and it is a primary concern on the agenda of most ports. Traditional security solutions can be applied to safeguard IoT and Cyber-Physical Systems (CPS) from harmful entities. Nevertheless, security researchers can only watch, examine, and learn about the behaviors of attackers if these solutions operate more transparently. Herein, honeypots are potential solutions since they offer valuable information about the attackers. Honeypots can be virtual or physical. Virtual honeypots must be more realistic to entice attackers, necessitating higher fidelity. To this end, Digital Twin (DT) technology can be employed to increase the complexity and simulation fidelity of the honeypots. Seaports can be attacked from both their existing devices and external devices at the same time. Existing mechanisms are insufficient to detect external attacks; therefore, the current systems cannot handle attacks at the desired level. DT and honeypot technologies can be used together to tackle them. Consequently, we suggest a DT-assisted honeypot, called TwinPot, for external attacks in smart seaports. Moreover, we propose an intelligent attack detection mechanism to handle different attack types using DT for internal attacks. Finally, we build an extensive smart seaport dataset for internal and external attacks using the MANSIM tool and two existing datasets to test the performance of our system. We show that under simultaneous internal and external attacks on the system, our solution successfully detects internal and external attacks.
Submitted 25 October, 2023; v1 submitted 19 October, 2023;
originally announced October 2023.
-
An approach to extract information from academic transcripts of HUST
Authors:
Nguyen Quang Hieu,
Nguyen Le Quy Duong,
Le Quang Hoa,
Nguyen Quang Dat
Abstract:
In many Vietnamese schools, grades are still being inputted into the database manually, which is not only inefficient but also prone to human error. Thus, the automation of this process is highly necessary, which can only be achieved if we can extract information from academic transcripts. In this paper, we test our improved CRNN model in extracting information from 126 transcripts, with 1008 vertical lines, 3859 horizontal lines, and 2139 handwritten test scores. Then, this model is compared to the Baseline model. The results show that our model significantly outperforms the Baseline model with an accuracy of 99.6% in recognizing vertical lines, 100% in recognizing horizontal lines, and 96.11% in recognizing handwritten test scores.
Submitted 22 April, 2023;
originally announced April 2023.
-
Leveraging Semantic Representations Combined with Contextual Word Representations for Recognizing Textual Entailment in Vietnamese
Authors:
Quoc-Loc Duong,
Duc-Vu Nguyen,
Ngan Luu-Thuy Nguyen
Abstract:
RTE is a significant problem with a reasonably active research community. Proposed approaches to this problem are diverse and follow many different directions. For Vietnamese, the RTE problem is relatively new, but it plays a vital role in natural language understanding systems. Currently, methods based on contextual word representation learning models have given outstanding results. However, Vietnamese is a semantically rich language. Therefore, in this paper, we present an experiment combining semantic word representations obtained through the SRL task with the contextual representations of BERT-related models for the RTE problem. The experimental results give conclusions about the influence and role of semantic representation on Vietnamese in understanding natural language. The experimental results show that the semantic-aware contextual representation model performs about 1% better than the model that does not incorporate semantic representation. In addition, the effects on the data domain in Vietnamese are also higher than those in English. This result also shows the positive influence of SRL on the RTE problem in Vietnamese.
Submitted 1 January, 2023;
originally announced January 2023.
-
A Ferroelectric Tunnel Junction-based Integrate-and-Fire Neuron
Authors:
Paolo Gibertini,
Luca Fehlings,
Suzanne Lancaster,
Quang Duong,
Thomas Mikolajick,
Catherine Dubourdieu,
Stefan Slesazeck,
Erika Covi,
Veeresh Deshpande
Abstract:
Event-based neuromorphic systems provide a low-power solution by using artificial neurons and synapses to process data asynchronously in the form of spikes. Ferroelectric Tunnel Junctions (FTJs) are ultra low-power memory devices and are well-suited to be integrated in these systems. Here, we present a hybrid FTJ-CMOS Integrate-and-Fire neuron which constitutes a fundamental building block for new-generation neuromorphic networks for edge computing. We demonstrate electrically tunable neural dynamics achievable by tuning the switching of the FTJ device.
Submitted 4 November, 2022;
originally announced November 2022.
-
Improvement of FTJ on-current by work function engineering for massive parallel neuromorphic computing
Authors:
Suzanne Lancaster,
Quang T. Duong,
Erika Covi,
Thomas Mikolajick,
Stefan Slesazeck
Abstract:
HfO2-based ferroelectric tunnel junctions (FTJs) exhibit attractive properties for adoption in neuromorphic applications. The combination of ultra-low-power multi-level switching capability together with the low on-current density suggests the application in circuits for massive parallel computation. In this work, we discuss one example circuit of a differential synaptic cell featuring multiple parallel connected FTJ devices. Moreover, from the circuit requirements we deduce that the absolute difference in currents (Ion - Ioff) is a more critical figure of merit than the tunneling electroresistance ratio (TER). Based on this, we discuss the potential of FTJ device optimization by means of electrode work function engineering in bilayer HZO/Al2O3 FTJs.
Submitted 1 November, 2022; v1 submitted 21 September, 2022;
originally announced September 2022.
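The preceding abstract argues that the absolute current difference (Ion - Ioff) matters more for the circuit than the TER. The short sketch below illustrates why the two figures of merit can diverge. It is only an illustration: the current values are hypothetical, the TER is assumed here to be the simple ratio Ion/Ioff (other definitions exist), and the helper names are not taken from the paper.

```python
import math


def ter_ratio(i_on: float, i_off: float) -> float:
    """TER taken as the plain on/off current ratio (an assumed definition)."""
    return i_on / i_off


def current_window(i_on: float, i_off: float) -> float:
    """Absolute current difference Ion - Ioff, in amperes."""
    return i_on - i_off


def summed_window(i_on: float, i_off: float, n_devices: int) -> float:
    """Window of n identical devices connected in parallel (currents add)."""
    return n_devices * current_window(i_on, i_off)


# Hypothetical devices: identical ratio, very different absolute currents.
device_a = {"i_on": 1e-9, "i_off": 1e-10}
device_b = {"i_on": 1e-7, "i_off": 1e-8}

for name, dev in (("A", device_a), ("B", device_b)):
    print(
        f"device {name}: TER = {ter_ratio(**dev):.1f}, "
        f"Ion - Ioff = {current_window(**dev):.2e} A"
    )

# Both devices share the same ratio, yet B offers a far larger current window.
print(math.isclose(ter_ratio(**device_a), ter_ratio(**device_b)))
```

Under these assumed numbers the two devices are indistinguishable by ratio, while the readout circuit sees a window about two orders of magnitude larger for device B.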
-
An Unsupervised Learning Approach for Spectrum Allocation in Terahertz Communication Systems
Authors:
Akram Shafie,
Chunhui Li,
Nan Yang,
Xiangyun Zhou,
Trung Q. Duong
Abstract:
We propose a new spectrum allocation strategy, aided by unsupervised learning, for multiuser terahertz communication systems. In this strategy, adaptive sub-band bandwidth is considered such that the spectrum of interest can be divided into sub-bands with unequal bandwidths. This strategy reduces the variation in molecular absorption loss among the users, leading to improved data rate performance. We first formulate an optimization problem to determine the optimal sub-band bandwidth and transmit power, and then propose an unsupervised learning-based approach to obtain a near-optimal solution to this problem. In the proposed approach, we first train a deep neural network (DNN) while utilizing a loss function that is inspired by the Lagrangian of the formulated problem. Then, using the trained DNN, we approximate the near-optimal solutions. Numerical results demonstrate that, compared with existing approaches, our proposed unsupervised learning-based approach achieves a higher data rate, especially when the molecular absorption coefficient within the spectrum of interest varies in a highly non-linear manner.
Submitted 6 August, 2022;
originally announced August 2022.
-
A 120dB Programmable-Range On-Chip Pulse Generator for Characterizing Ferroelectric Devices
Authors:
Shyam Narayanan,
Erika Covi,
Viktor Havel,
Charlotte Frenkel,
Suzanne Lancaster,
Quang Duong,
Stefan Slesazeck,
Thomas Mikolajick,
Melika Payvand,
Giacomo Indiveri
Abstract:
Novel non-volatile memory devices based on ferroelectric thin films represent a promising emerging technology that is ideally suited for neuromorphic applications. The physical switching mechanism in such films is the nucleation and growth of ferroelectric domains. Since this has a strong dependence on both pulse width and voltage amplitude, it is important to use precise pulsing schemes for a thorough characterization of their behaviour. In this work, we present an on-chip 120 dB programmable range pulse generator, that can generate pulse widths ranging from 10ns to 10ms $\pm$2.5% which eliminates the RLC bottleneck in the device characterisation setup. We describe the pulse generator design and show how the pulse width can be tuned with high accuracy, using Digital to Analog converters. Finally, we present experimental results measured from the circuit, fabricated using a standard 180nm CMOS technology.
Submitted 8 February, 2022;
originally announced February 2022.
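As a quick consistency check on the figures in the preceding abstract, the pulse-width span from 10 ns to 10 ms covers six decades, which corresponds to 120 dB if the range is expressed with the 20·log10 convention. That convention is an assumption made here for illustration; the abstract does not state which one the authors use, and the function name is hypothetical.

```python
import math


def range_db(min_value: float, max_value: float) -> float:
    """Span between two positive values in dB, assuming the 20*log10 convention."""
    return 20.0 * math.log10(max_value / min_value)


shortest_pulse_s = 10e-9  # 10 ns, from the abstract
longest_pulse_s = 10e-3   # 10 ms, from the abstract

decades = math.log10(longest_pulse_s / shortest_pulse_s)
print(f"span: {decades:.1f} decades, {range_db(shortest_pulse_s, longest_pulse_s):.1f} dB")
```

With the 10·log10 (power) convention the same span would come out as 60 dB, so the quoted 120 dB is consistent only with the amplitude-style convention assumed above.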
-
TFW2V: An Enhanced Document Similarity Method for the Morphologically Rich Finnish Language
Authors:
Quan Duong,
Mika Hämäläinen,
Khalid Alnajjar
Abstract:
Measuring the semantic similarity of different texts has many important applications in Digital Humanities research such as information retrieval, document clustering and text summarization. The performance of different methods depends on the length of the text, the domain and the language. This study focuses on experimenting with some of the current approaches to Finnish, which is a morphologically rich language. At the same time, we propose a simple method, TFW2V, which shows high efficiency in handling both long text documents and limited amounts of data. Furthermore, we design an objective evaluation method which can be used as a framework for benchmarking text similarity approaches.
Submitted 23 December, 2021;
originally announced December 2021.
-
Wireless Powered Communication Networks with Non-Ideal Circuit Power Consumption
Authors:
Slavche Pejoski,
Zoran Hadzi-Velkov,
Trung Q. Duong,
Caijun Zhong
Abstract:
Assuming non-ideal circuit power consumption at the energy harvesting (EH) nodes, we propose two practical protocols that optimize the performance of the harvest-then-transmit wireless powered communication networks (WPCNs) under two different objectives: (1) proportional fair (PF) resource allocation, and (2) sum rate maximization. These objectives lead to optimal allocations for the transmit power by the base station (BS), which broadcasts RF radiation over the downlink, and optimal durations of the EH phase and the uplink information transmission phases within the dynamic time-division multiple access (TDMA) frame. Compared to the max-sum-rate protocol, the PF protocol attains a higher level of system fairness at the expense of the sum rate degradation. The PF protocol is advantageous over the max-sum-rate protocol in terms of system fairness regardless of the circuit power consumption, whereas the uplink sum rates of both protocols converge when this power consumption increases.
Submitted 14 August, 2021;
originally announced August 2021.
-
Deep Reinforcement Learning for Intelligent Reflecting Surface-assisted D2D Communications
Authors:
Khoi Khac Nguyen,
Antonino Masaracchia,
Cheng Yin,
Long D. Nguyen,
Octavia A. Dobre,
Trung Q. Duong
Abstract:
In this paper, we propose a deep reinforcement learning (DRL) approach for solving the optimisation problem of the network's sum-rate in device-to-device (D2D) communications supported by an intelligent reflecting surface (IRS). The IRS is deployed to mitigate the interference and enhance the signal between the D2D transmitter and the associated D2D receiver. Our objective is to jointly optimise the transmit power at the D2D transmitter and the phase shift matrix at the IRS to maximise the network sum-rate. We formulate a Markov decision process and then propose the proximal policy optimisation for solving the maximisation game. Simulation results show impressive performance in terms of the achievable rate and processing time.
Submitted 5 August, 2021;
originally announced August 2021.
-
RIS-assisted UAV Communications for IoT with Wireless Power Transfer Using Deep Reinforcement Learning
Authors:
Khoi Khac Nguyen,
Antonino Masaracchia,
Tan Do-Duy,
H. Vincent Poor,
Trung Q. Duong
Abstract:
Many of the devices used in Internet-of-Things (IoT) applications are energy-limited, and thus supplying energy while maintaining seamless connectivity for IoT devices is of considerable importance. In this context, we propose a simultaneous wireless power transfer and information transmission scheme for IoT devices with support from reconfigurable intelligent surface (RIS)-aided unmanned aerial vehicle (UAV) communications. In particular, in a first phase, IoT devices harvest energy from the UAV through wireless power transfer; and then in a second phase, the UAV collects data from the IoT devices through information transmission. To characterise the agility of the UAV, we consider two scenarios: a hovering UAV and a mobile UAV. Aiming at maximizing the total network sum-rate, we jointly optimize the trajectory of the UAV, the energy harvesting scheduling of IoT devices, and the phase shift matrix of the RIS. We formulate a Markov decision process and propose two deep reinforcement learning algorithms to solve the optimization problem of maximizing the total network sum-rate. Numerical results illustrate the effectiveness of the UAV's flying path optimization and the network throughput of our proposed techniques compared with other benchmark schemes. Given the strict requirements of the RIS and UAV, the significant improvement in processing time and throughput performance demonstrates that our proposed scheme is well suited to practical IoT applications.
Submitted 5 August, 2021;
originally announced August 2021.
-
A Simplified Framework for Air Route Clustering Based on ADS-B Data
Authors:
Quan Duong,
Tan Tran,
Duc-Thinh Pham,
An Mai
Abstract:
The volume of flight traffic keeps increasing over time, which makes strategic traffic flow management a challenging problem, since modeling the entire traffic data requires substantial computational resources. Meanwhile, Automatic Dependent Surveillance - Broadcast (ADS-B) technology is considered a promising data technology for providing both flight crews and ground control staff with the necessary information about the position and velocity of airplanes in a specific area, safely and efficiently. To tackle this problem, we present in this paper a simplified framework that helps detect the typical air routes between airports based on ADS-B data. Specifically, the flight traffic is classified into major groups based on similarity measures, which helps to reduce the number of flight paths between airports. Our framework can thus be used to reduce the computational cost of air flow optimization in practice and to evaluate operational performance. Finally, to illustrate the potential applications of our proposed framework, an experiment was performed using ADS-B flight traffic data for three different pairs of airports. The detected typical routes between each pair of airports show promising results, based on a combination of two indices for measuring clustering performance and human judgment through visual inspection.
Submitted 7 July, 2021;
originally announced July 2021.
-
Hub and Spoke Logistics Network Design for Urban Region with Clustering-Based Approach
Authors:
Quan Duong,
Dang Nguyen,
Quoc Nguyen
Abstract:
This study proposes an effective model and approach for designing a logistics network in an urban area, in order to offer an efficient flow distribution network as a competitive strategy in the logistics industry, where demand is sensitive to both price and time. A multi-stage approach is introduced to select the number of hubs, allocate spokes to the hubs for flow distribution, and detect the hubs' locations. Specifically, a fuzzy clustering model whose objective function minimizes the approximate transportation cost is employed first; the next phase focuses on balancing the demand capacity among the hubs with the help of domain experts; afterward, the facility location vehicle routing problem within the network is introduced. To demonstrate the approach's advantages, an experiment was performed on the designed network and its actual transportation cost using real operational data specific to the infrastructure conditions of Ho Chi Minh City. Additionally, we show the flexibility of the designed network in flow distribution, along with computational experiments that develop managerial insights contributing to the network design decision-making process.
Submitted 7 July, 2021;
originally announced July 2021.
-
FedFog: Network-Aware Optimization of Federated Learning over Wireless Fog-Cloud Systems
Authors:
Van-Dinh Nguyen,
Symeon Chatzinotas,
Bjorn Ottersten,
Trung Q. Duong
Abstract:
Federated learning (FL) is capable of performing large distributed machine learning tasks across multiple edge users by periodically aggregating trained local parameters. To address key challenges of enabling FL over a wireless fog-cloud system (e.g., non-i.i.d. data, users' heterogeneity), we first propose an efficient FL algorithm based on Federated Averaging (called FedFog) to perform the local aggregation of gradient parameters at fog servers and global training update at the cloud. Next, we employ FedFog in wireless fog-cloud systems by investigating a novel network-aware FL optimization problem that strikes the balance between the global loss and completion time. An iterative algorithm is then developed to obtain a precise measurement of the system performance, which helps design an efficient stopping criterion to output an appropriate number of global rounds. To mitigate the straggler effect, we propose a flexible user aggregation strategy that trains fast users first to obtain a certain level of accuracy before allowing slow users to join the global training updates. Extensive numerical results using several real-world FL tasks are provided to verify the theoretical convergence of FedFog. We also show that the proposed co-design of FL and communication is essential to substantially improve resource utilization while achieving comparable accuracy of the learning model.
Submitted 10 February, 2022; v1 submitted 4 July, 2021;
originally announced July 2021.
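The two-tier aggregation described in the FedFog abstract (local aggregation at fog servers, then a global update at the cloud) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the sample-count weighting, and the plain-list parameter vectors are all assumptions made for illustration, and the network-aware optimization and straggler handling are left out.

```python
from typing import Dict, List, Tuple

Vector = List[float]
# Each update is (parameter vector, number of local samples); both are hypothetical.
Update = Tuple[Vector, int]


def weighted_average(updates: List[Update]) -> Update:
    """Average parameter vectors weighted by sample count.

    Returns the averaged vector together with the total sample count,
    so the result can itself be fed into a higher aggregation tier.
    """
    total = sum(count for _, count in updates)
    dim = len(updates[0][0])
    avg = [sum(vec[i] * count for vec, count in updates) / total for i in range(dim)]
    return avg, total


def hierarchical_round(fog_groups: Dict[str, List[Update]]) -> Vector:
    """One aggregation round: users -> fog servers -> cloud."""
    # Tier 1: every fog server aggregates the updates of its own users.
    fog_results = [weighted_average(users) for users in fog_groups.values()]
    # Tier 2: the cloud aggregates the fog-level results.
    global_params, _ = weighted_average(fog_results)
    return global_params


if __name__ == "__main__":
    groups = {
        "fog_a": [([1.0, 2.0], 10), ([3.0, 4.0], 30)],
        "fog_b": [([5.0, 6.0], 60)],
    }
    print(hierarchical_round(groups))
```

Because each tier weights by sample count, the two-tier result in this sketch equals a single flat weighted average over all users; the benefit of the hierarchy is that only one aggregated vector per fog server has to travel to the cloud.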
-
Ferroelectric Tunneling Junctions for Edge Computing
Authors:
Erika Covi,
Quang T. Duong,
Suzanne Lancaster,
Viktor Havel,
Jean Coignus,
Justine Barbot,
Ole Richter,
Philip Klein,
Elisabetta Chicca,
Laurent Grenouillet,
Athanasios Dimoulas,
Thomas Mikolajick,
Stefan Slesazeck
Abstract:
Ferroelectric tunneling junctions (FTJ) are considered to be the intrinsically most energy efficient memristors. In this work, specific electrical features of ferroelectric hafnium-zirconium oxide based FTJ devices are investigated. Moreover, the impact on the design of FTJ-based circuits for edge computing applications is discussed by means of two example circuits.
Submitted 5 July, 2021;
originally announced July 2021.
-
3D UAV Trajectory and Data Collection Optimisation via Deep Reinforcement Learning
Authors:
Khoi Khac Nguyen,
Trung Q. Duong,
Tan Do-Duy,
Holger Claussen,
Lajos Hanzo
Abstract:
Unmanned aerial vehicles (UAVs) are now beginning to be deployed for enhancing the network performance and coverage in wireless communication. However, due to the limitation of their on-board power and flight time, it is challenging to obtain an optimal resource allocation scheme for the UAV-assisted Internet of Things (IoT). In this paper, we design a new UAV-assisted IoT system relying on the shortest flight path of the UAVs while maximising the amount of data collected from IoT devices. Then, a deep reinforcement learning-based technique is conceived for finding the optimal trajectory and throughput in a specific coverage area. After training, the UAV has the ability to autonomously collect all the data from user nodes at a significant total sum-rate improvement while minimising the associated resources used. Numerical results are provided to highlight how our techniques strike a balance between the throughput attained, trajectory, and the time spent. More explicitly, we characterise the attainable performance in terms of the UAV trajectory, the expected reward and the total sum-rate.
Submitted 6 June, 2021;
originally announced June 2021.
-
On the hidden treasure of dialog in video question answering
Authors:
Deniz Engin,
François Schnitzler,
Ngoc Q. K. Duong,
Yannis Avrithis
Abstract:
High-level understanding of stories in video such as movies and TV shows from raw data is extremely challenging. Modern video question answering (VideoQA) systems often use additional human-made sources like plot synopses, scripts, video descriptions or knowledge bases. In this work, we present a new approach to understand the whole story without such external sources. The secret lies in the dialog: unlike any prior work, we treat dialog as a noisy source to be converted into text description via dialog summarization, much like recent methods treat video. The input of each modality is encoded by transformers independently, and a simple fusion method combines all modalities, using soft temporal attention for localization over long inputs. Our model outperforms the state of the art on the KnowIT VQA dataset by a large margin, without using question-specific human annotation or human-made plot summaries. It even outperforms human evaluators who have never watched any whole episode before. Code is available at https://engindeniz.github.io/dialogsummary-videoqa
Submitted 19 August, 2021; v1 submitted 26 March, 2021;
originally announced March 2021.
-
An Unsupervised method for OCR Post-Correction and Spelling Normalisation for Finnish
Authors:
Quan Duong,
Mika Hämäläinen,
Simon Hengchen
Abstract:
Historical corpora are known to contain errors introduced by OCR (optical character recognition) methods used in the digitization process, often said to be degrading the performance of NLP systems. Correcting these errors manually is a time-consuming process and a great part of the automatic approaches have been relying on rules or supervised machine learning. We build on previous work on fully automatic unsupervised extraction of parallel data to train a character-based sequence-to-sequence NMT (neural machine translation) model to conduct OCR error correction designed for English, and adapt it to Finnish by proposing solutions that take the rich morphology of the language into account. Our new method shows increased performance while remaining fully unsupervised, with the added benefit of spelling normalisation. The source code and models are available on GitHub and Zenodo.
Submitted 6 November, 2020;
originally announced November 2020.
-
Self-Attention Generative Adversarial Network for Speech Enhancement
Authors:
Huy Phan,
Huy Le Nguyen,
Oliver Y. Chén,
Philipp Koch,
Ngoc Q. K. Duong,
Ian McLoughlin,
Alfred Mertins
Abstract:
Existing generative adversarial networks (GANs) for speech enhancement solely rely on the convolution operation, which may obscure temporal dependencies across the sequence input. To remedy this issue, we propose a self-attention layer adapted from non-local attention, coupled with the convolutional and deconvolutional layers of a speech enhancement GAN (SEGAN) using raw signal input. Further, we empirically study the effect of placing the self-attention layer at the (de)convolutional layers with varying layer indices as well as at all of them when memory allows. Our experiments show that introducing self-attention to SEGAN leads to consistent improvement across the objective evaluation metrics of enhancement performance. Furthermore, applying at different (de)convolutional layers does not significantly alter performance, suggesting that it can be conveniently applied at the highest-level (de)convolutional layer with the smallest memory overhead.
Submitted 6 February, 2021; v1 submitted 18 October, 2020;
originally announced October 2020.
-
On Multitask Loss Function for Audio Event Detection and Localization
Authors:
Huy Phan,
Lam Pham,
Philipp Koch,
Ngoc Q. K. Duong,
Ian McLoughlin,
Alfred Mertins
Abstract:
Audio event localization and detection (SELD) have been commonly tackled using multitask models. Such a model usually consists of a multi-label event classification branch with sigmoid cross-entropy loss for event activity detection and a regression branch with mean squared error loss for direction-of-arrival estimation. In this work, we propose a multitask regression model, in which both (multi-label) event detection and localization are formulated as regression problems and use the mean squared error loss homogeneously for model training. We show that the common combination of heterogeneous loss functions causes the network to underfit the data whereas the homogeneous mean squared error loss leads to better convergence and performance. Experiments on the development and validation sets of the DCASE 2020 SELD task demonstrate that the proposed system also outperforms the DCASE 2020 SELD baseline across all the detection and localization metrics, reducing the overall SELD error (the combined metric) by approximately 10% absolute.
Submitted 11 September, 2020;
originally announced September 2020.
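The contrast drawn in the multitask-loss abstract, between a heterogeneous loss (sigmoid cross-entropy for event activity plus mean squared error for direction of arrival) and a homogeneous loss (mean squared error on both branches), can be made concrete with a small sketch. The function names, the equal weighting of the two branches, and the flat-list inputs are assumptions for illustration; the paper's actual network and loss weighting are not given in the abstract.

```python
import math
from typing import List


def mse(pred: List[float], target: List[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def binary_cross_entropy(prob: List[float], target: List[float], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy; prob holds sigmoid outputs in (0, 1)."""
    return -sum(
        t * math.log(max(p, eps)) + (1.0 - t) * math.log(max(1.0 - p, eps))
        for p, t in zip(prob, target)
    ) / len(prob)


def heterogeneous_loss(act_prob, act_target, doa_pred, doa_target) -> float:
    """Cross-entropy on event activity plus MSE on direction of arrival."""
    return binary_cross_entropy(act_prob, act_target) + mse(doa_pred, doa_target)


def homogeneous_loss(act_pred, act_target, doa_pred, doa_target) -> float:
    """MSE on both branches, i.e. both tasks treated as regression."""
    return mse(act_pred, act_target) + mse(doa_pred, doa_target)


if __name__ == "__main__":
    activity = [0.5, 0.5]        # predicted activity for two event classes
    activity_ref = [1.0, 0.0]    # only the first class is active
    doa = [0.0, 1.0]             # predicted direction components
    doa_ref = [0.0, 0.0]
    print(heterogeneous_loss(activity, activity_ref, doa, doa_ref))
    print(homogeneous_loss(activity, activity_ref, doa, doa_ref))
```

With the homogeneous variant both terms live on the same squared-error scale, which is the property the abstract credits for better convergence; the sketch only shows how the two losses are assembled, not that effect.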
-
COVID-19 Image Data Collection: Prospective Predictions Are the Future
Authors:
Joseph Paul Cohen,
Paul Morrison,
Lan Dao,
Karsten Roth,
Tim Q Duong,
Marzyeh Ghassemi
Abstract:
Across the world's coronavirus disease 2019 (COVID-19) hot spots, the need to streamline patient diagnosis and management has become more pressing than ever. As one of the main imaging tools, chest X-rays (CXRs) are common, fast, non-invasive, relatively cheap, and potentially bedside to monitor the progression of the disease. This paper describes the first public COVID-19 image data collection as well as a preliminary exploration of possible use cases for the data. This dataset currently contains hundreds of frontal view X-rays and is the largest public resource for COVID-19 image and prognostic data, making it a necessary resource to develop and evaluate tools to aid in the treatment of COVID-19. It was manually aggregated from publication figures as well as various web based repositories into a machine learning (ML) friendly format with accompanying dataloader code. We collected frontal and lateral view imagery and metadata such as the time since first symptoms, intensive care unit (ICU) status, survival status, intubation status, or hospital location. We present multiple possible use cases for the data such as predicting the need for the ICU, predicting patient survival, and understanding a patient's trajectory during treatment. Data can be accessed here: https://github.com/ieee8023/covid-chestxray-dataset
Submitted 14 December, 2020; v1 submitted 21 June, 2020;
originally announced June 2020.
-
COVID-Net S: Towards computer-aided severity assessment via training and validation of deep neural networks for geographic extent and opacity extent scoring of chest X-rays for SARS-CoV-2 lung disease severity
Authors:
Alexander Wong,
Zhong Qiu Lin,
Linda Wang,
Audrey G. Chung,
Beiyi Shen,
Almas Abbasi,
Mahsa Hoshmand-Kochi,
Timothy Q. Duong
Abstract:
Background: A critical step in effective care and treatment planning for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of the COVID-19 pandemic, is the assessment of the severity of disease progression. Chest x-rays (CXRs) are often used to assess SARS-CoV-2 severity, with two important assessment metrics being extent of lung involvement and degree of opacity. In this proof-of-concept study, we assess the feasibility of computer-aided scoring of CXRs of SARS-CoV-2 lung disease severity using a deep learning system.
Materials and Methods: Data consisted of 396 CXRs from SARS-CoV-2 positive patient cases. Geographic extent and opacity extent were scored by two board-certified expert chest radiologists (with 20+ years of experience) and a 2nd-year radiology resident. The deep neural networks used in this study, which we name COVID-Net S, are based on a COVID-Net network architecture. 100 versions of the network were independently learned (50 to perform geographic extent scoring and 50 to perform opacity extent scoring) using random subsets of CXRs from the study, and we evaluated the networks using stratified Monte Carlo cross-validation experiments.
Findings: The COVID-Net S deep neural networks yielded R$^2$ of 0.664 $\pm$ 0.032 and 0.635 $\pm$ 0.044 between predicted scores and radiologist scores for geographic extent and opacity extent, respectively, in stratified Monte Carlo cross-validation experiments. The best performing networks achieved R$^2$ of 0.739 and 0.741 between predicted scores and radiologist scores for geographic extent and opacity extent, respectively.
Interpretation: The results are promising and suggest that the use of deep neural networks on CXRs could be an effective tool for computer-aided assessment of SARS-CoV-2 lung disease severity, although additional studies are needed before adoption for routine clinical use.
Submitted 16 April, 2021; v1 submitted 26 May, 2020;
originally announced May 2020.
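The R$^2$ values quoted in the COVID-Net S abstract measure agreement between predicted and radiologist scores. A minimal sketch of one common definition, the coefficient of determination, follows. Whether the paper uses this definition or the squared Pearson correlation is not stated in the abstract, so treat the formula choice as an assumption; the scores in the example are made-up illustrative numbers, not data from the study.

```python
from statistics import mean
from typing import Sequence


def r_squared(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    `reference` plays the role of the radiologist scores and
    `predicted` the role of the network outputs.
    """
    ref_mean = mean(reference)
    ss_res = sum((r - p) ** 2 for p, r in zip(predicted, reference))
    ss_tot = sum((r - ref_mean) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0, 4.0]   # illustrative reference scores
    predicted = [1.5, 2.5, 3.5, 4.5]   # illustrative model outputs
    print(r_squared(predicted, reference))
```

Under this definition a perfect predictor scores 1, and a predictor that always outputs the mean reference score scores 0. Reporting a mean and standard deviation over repeated random splits, as the abstract does, amounts to calling such a function once per split and summarizing the results.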
-
Predicting COVID-19 Pneumonia Severity on Chest X-ray with Deep Learning
Authors:
Joseph Paul Cohen,
Lan Dao,
Paul Morrison,
Karsten Roth,
Yoshua Bengio,
Beiyi Shen,
Almas Abbasi,
Mahsa Hoshmand-Kochi,
Marzyeh Ghassemi,
Haifang Li,
Tim Q Duong
Abstract:
Purpose: The need to streamline patient management for COVID-19 has become more pressing than ever. Chest X-rays provide a non-invasive (potentially bedside) tool to monitor the progression of the disease. In this study, we present a severity score prediction model for COVID-19 pneumonia for frontal chest X-ray images. Such a tool can gauge severity of COVID-19 lung infections (and pneumonia in general) that can be used for escalation or de-escalation of care as well as monitoring treatment efficacy, especially in the ICU.
Methods: Images from a public COVID-19 database were scored retrospectively by three blinded experts in terms of the extent of lung involvement as well as the degree of opacity. A neural network model that was pre-trained on large (non-COVID-19) chest X-ray datasets is used to construct features for COVID-19 images which are predictive for our task.
Results: This study finds that training a regression model on a subset of the outputs from this pre-trained chest X-ray model predicts our geographic extent score (range 0-8) with 1.14 mean absolute error (MAE) and our lung opacity score (range 0-6) with 0.78 MAE.
Conclusions: These results indicate that our model's ability to gauge severity of COVID-19 lung infections could be used for escalation or de-escalation of care as well as monitoring treatment efficacy, especially in the intensive care unit (ICU). A proper clinical trial is needed to evaluate efficacy. To enable this we make our code, labels, and data available online at https://github.com/mlmed/torchxrayvision/tree/master/scripts/covid-severity and https://github.com/ieee8023/covid-chestxray-dataset
Submitted 30 June, 2020; v1 submitted 24 May, 2020;
originally announced May 2020.
-
Downlink Spectral Efficiency of Cell-Free Massive MIMO Systems with Multi-antenna Users
Authors:
Trang C. Mai,
Hien Quoc Ngo,
Trung Q. Duong
Abstract:
This paper studies a cell-free massive multiple-input multiple-output (MIMO) system where its access points (APs) and users are equipped with multiple antennas. Two transmission protocols are considered. In the first transmission protocol, there are no downlink pilots, while in the second transmission protocol, downlink pilots are proposed in order to improve the system performance. In both transmission protocols, the users use the minimum mean-squared error-based successive interference cancellation (MMSE-SIC) scheme to detect the desired signals. For the analysis, we first derive a general spectral efficiency formula with arbitrary side information at the users. Then analytical expressions for the spectral efficiency of different transmission protocols are derived. To improve the spectral efficiency (SE) of the system, max-min fairness power control (PC) is applied for the first protocol by using the closed-form expression of its SE. Due to the computational complexity of deriving the closed-form performance expression of SE for the second protocol, we apply the optimal power coefficients of the first protocol to the second protocol. Numerical results show that two protocols combining with multi-antenna users are prerequisites to achieve the suboptimal SE regardless of the number of users in the system.
Submitted 24 April, 2020;
originally announced April 2020.
-
Full-Duplex MIMO-OFDM Communication with Self-Energy Recycling
Authors:
Ali A. Nasir,
H. D. Tuan,
T. Q. Duong,
H. V. Poor
Abstract:
This paper focuses on energy recycling in full-duplex (FD) relaying multiple-input-multiple-output orthogonal frequency division multiplexing (OFDM) communication. The loop self-interference (SI) due to full-duplexing is seen as an opportunity for the energy-constrained relay node to replenish its energy requirement through wireless power transfer. In forwarding the source information to the destination, the FD relay can simultaneously harvest energy from the source wireless transmission and also through energy recycling from its own transmission. The objective is to maximize the overall spectral efficiency by designing the optimal power allocation over OFDM sub-carriers and transmit antennas. Due to a large number of sub-carriers, this design problem poses a large-scale nonconvex optimization problem involving a few thousand variables of power allocation, which is very computationally challenging. A new path-following algorithm is proposed, which converges to an optimal solution. This algorithm is very efficient since it is based on \textit{closed-form} calculations. Numerical results for a practical simulation setting show promising results by achieving high spectral efficiency.
Submitted 24 March, 2019;
originally announced March 2019.
-
Discriminate natural versus loudspeaker emitted speech
Authors:
Thanh-Ha Le,
Philippe Gilberton,
Ngoc Q. K. Duong
Abstract:
In this work, we address a novel, but potentially emerging, problem of discriminating the natural human voices and those played back by any kind of audio devices in the context of interactions with in-house voice user interface. The tackled problem may find relevant applications in (1) the far-field voice interactions of vocal interfaces such as Amazon Echo, Google Home, Facebook Portal, etc., and (2) the replay spoofing attack detection. The detection of loudspeaker emitted speech will help avoid false wake-ups or unintended interactions with the devices in the first application, while eliminating attacks that involve the replay of recordings collected from enrolled speakers in the second one. At first we collect a real-world dataset under well-controlled conditions containing two classes: recorded speeches directly spoken by numerous people (considered as the natural speech), and recorded speeches played back from various loudspeakers (considered as the loudspeaker emitted speech). Then from this dataset, we build prediction models based on Deep Neural Network (DNN) for which different combinations of audio features have been considered. Experiment results confirm the feasibility of the task where the combination of audio embeddings extracted from SoundNet and VGGish network yields the classification accuracy up to about 90%.
Submitted 17 February, 2019; v1 submitted 31 January, 2019;
originally announced January 2019.
-
VideoMem: Constructing, Analyzing, Predicting Short-term and Long-term Video Memorability
Authors:
Romain Cohendet,
Claire-Hélène Demarty,
Ngoc Q. K. Duong,
Martin Engilberge
Abstract:
Humans share a strong tendency to memorize/forget some of the visual information they encounter. This paper focuses on providing computational models for the prediction of the intrinsic memorability of visual content. To address this new challenge, we introduce a large scale dataset (VideoMem) composed of 10,000 videos annotated with memorability scores. In contrast to previous work on image memorability -- where memorability was measured a few minutes after memorization -- memory performance is measured twice: a few minutes after memorization and again 24-72 hours later. Hence, the dataset comes with short-term and long-term memorability annotations. After an in-depth analysis of the dataset, we investigate several deep neural network based models for the prediction of video memorability. Our best model using a ranking loss achieves a Spearman's rank correlation of 0.494 for short-term memorability prediction, while our proposed model with attention mechanism provides insights into what makes content memorable. The VideoMem dataset with pre-extracted features is publicly available.
Submitted 5 December, 2018;
originally announced December 2018.
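The VideoMem abstract evaluates memorability prediction with Spearman's rank correlation, which is the Pearson correlation computed on ranks rather than raw values. A minimal pure-Python sketch follows, with tied values sharing their average rank. The function names and the example scores are hypothetical and not taken from the paper or its dataset.

```python
import math
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    predicted = [0.2, 0.9, 0.5, 0.7]    # hypothetical model scores
    annotated = [0.3, 0.8, 0.6, 0.65]   # hypothetical ground-truth scores
    print(spearman(predicted, annotated))
```

Because only the ordering matters, any monotonically increasing rescaling of the predictions leaves the score unchanged, which is why the metric suits models trained with a ranking loss.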
-
UAV-Empowered Disaster-Resilient Edge Architecture for Delay-Sensitive Communication
Authors:
Zeeshan Kaleem,
Muhammad Yousaf,
Aamir Qamar,
Ayaz Ahmad,
Trung Q. Duong,
Wan Choi,
Abbas Jamalipour
Abstract:
The fifth-generation (5G) communication systems will enable enhanced mobile broadband, ultra-reliable low latency, and massive connectivity services. The broadband and low-latency services are indispensable to public safety (PS) communication during natural or man-made disasters. Recently, the third generation partnership project long term evolution (3GPPLTE) has emerged as a promising candidate to enable broadband PS communications. In this article, first we present six major PS-LTE enabling services and the current status of PS-LTE in 3GPP releases. Then, we discuss the spectrum bands allocated for PS-LTE in major countries by international telecommunication union (ITU). Finally, we propose a disaster resilient three-layered architecture for PS-LTE (DR-PSLTE). This architecture consists of a software-defined network (SDN) layer to provide centralized control, an unmanned air vehicle (UAV) cloudlet layer to facilitate edge computing or to enable emergency communication link, and a radio access layer. The proposed architecture is flexible and combines the benefits of SDNs and edge computing to efficiently meet the delay requirements of various PS-LTE services. Numerical results verified that under the proposed DR-PSLTE architecture, delay is reduced by 20% as compared with the conventional centralized computing architecture.
Submitted 28 January, 2019; v1 submitted 26 September, 2018;
originally announced September 2018.
-
Real-time Optimal Resource Allocation for Embedded UAV Communication Systems
Authors:
Minh-Nghia Nguyen,
Long D. Nguyen,
Trung Q. Duong,
Hoang Duong Tuan
Abstract:
We consider device-to-device (D2D) wireless information and power transfer systems using an unmanned aerial vehicle (UAV) as a relay-assisted node. As the energy capacity and flight time of UAVs are limited, a significant issue in deploying UAVs is to manage energy consumption in real-time applications, which is proportional to the UAV transmit power. To tackle this important issue, we develop a real-time resource allocation algorithm for maximizing the energy efficiency by jointly optimizing the energy-harvesting time and power control for the considered D2D communication embedded with UAV. We demonstrate the effectiveness of the proposed algorithm, whose running time is on the order of milliseconds.
Submitted 5 September, 2018;
originally announced September 2018.
-
UAV-Enabled Communication Using NOMA
Authors:
Ali A. Nasir,
Hoang D. Tuan,
Trung Q. Duong,
H. Vincent Poor
Abstract:
Unmanned aerial vehicles (UAVs) can be deployed as flying base stations (BSs) to leverage the strength of line-of-sight connections and effectively support the coverage and throughput of wireless communication. This paper considers a multiuser communication system, in which a single-antenna UAV-BS serves a large number of ground users by employing non-orthogonal multiple access (NOMA). The max-min rate optimization problem is formulated under total power, total bandwidth, UAV altitude, and antenna beamwidth constraints. The objective of max-min rate optimization is non-convex in all optimization variables, i.e. UAV altitude, transmit antenna beamwidth, power allocation and bandwidth allocation for multiple users. A path-following algorithm is proposed to solve the formulated problem. Next, orthogonal multiple access (OMA) and dirty paper coding (DPC)-based max-min rate optimization problems are formulated and respective path-following algorithms are developed to solve them. Numerical results show that NOMA outperforms OMA and achieves rates similar to those attained by DPC. In addition, a clear rate gain is observed by jointly optimizing all the parameters rather than optimizing a subset of parameters, which confirms the desirability of their joint optimization.
Submitted 10 June, 2018;
originally announced June 2018.
-
Cell-free Massive MIMO Networks: Optimal Power Control against Active Eavesdropping
Authors:
Tiep M. Hoang,
Hien Quoc Ngo,
Trung Q. Duong,
Hoang D. Tuan,
Alan Marshall
Abstract:
This paper studies the security aspect of a recently introduced network ("cell-free massive MIMO") under a pilot spoofing attack. Firstly, a simple method to recognize the presence of this type of active eavesdropping attack on a particular user is shown. In order to deal with this attack, we consider the problem of maximizing the achievable data rate of the attacked user or its achievable secrecy rate. The corresponding problems of minimizing the consumption power subject to security constraints are also considered in parallel. Path-following algorithms are developed to solve the posed optimization problems under different power allocation to access points (APs). Under equal-power allocation to APs, these optimization problems admit closed-form solutions. Numerical results show their efficiencies.
Submitted 11 May, 2018;
originally announced May 2018.
-
Weakly Supervised Representation Learning for Unsynchronized Audio-Visual Events
Authors:
Sanjeel Parekh,
Slim Essid,
Alexey Ozerov,
Ngoc Q. K. Duong,
Patrick Pérez,
Gaël Richard
Abstract:
Audio-visual representation learning is an important task from the perspective of designing machines with the ability to understand complex events. To this end, we propose a novel multimodal framework that instantiates multiple instance learning. We show that the learnt representations are useful for classifying events and localizing their characteristic audio-visual elements. The system is trained using only video-level event labels without any timing information. An important feature of our method is its capacity to learn from unsynchronized audio-visual events. We achieve state-of-the-art results on a large-scale dataset of weakly-labeled audio event videos. Visualizations of localized visual regions and audio segments substantiate our system's efficacy, especially when dealing with noisy situations where modality-specific cues appear asynchronously.
Submitted 9 July, 2018; v1 submitted 19 April, 2018;
originally announced April 2018.
-
Structural inpainting
Authors:
Huy V. Vo,
Ngoc Q. K. Duong,
Patrick Perez
Abstract:
Scene-agnostic visual inpainting remains very challenging despite progress in patch-based methods. Recently, Pathak et al. 2016 have introduced convolutional "context encoders" (CEs) for unsupervised feature learning through image completion tasks. With the additional help of adversarial training, CEs turned out to be a promising tool to complete complex structures in real inpainting problems. In the present paper we propose to push further this key ability by relying on perceptual reconstruction losses at training time. We show on a wide variety of visual scenes the merit of the approach for structural inpainting, and confirm it through a user study. Combined with the optimization-based refinement of Yang et al. 2016 with neural patches, our context encoder opens up new opportunities for prior-free visual inpainting.
Submitted 27 March, 2018;
originally announced March 2018.
-
NOMA for throughput and EE maximization in Energy Harvesting Enabled Networks
Authors:
A. A. Nasir,
H. D. Tuan,
T. Q. Duong,
M. Debbah
Abstract:
Wireless power transfer via radio-frequency (RF) radiation is regarded as a potential solution to energize energy-constrained users, who are deployed close to the base stations (near-by users). However, energy transfer requires much more transmit power than normal information transfer, which makes it very challenging to provide the quality of service in terms of throughput for all near-by users and cell-edge users. Thus, it is of practical interest to employ non-orthogonal multiple access (NOMA) to improve the throughput of all network users, while fulfilling the energy harvesting requirements of the near-by users. To realize both energy harvesting and information decoding, we consider a transmit time-switching (transmit-TS) protocol. We formulate two important beamforming problems of users' max-min throughput optimization and energy efficiency maximization under power constraint and energy harvesting thresholds at the nearly-located users. For these problems, the optimization objective and energy harvesting are non-convex in beamforming vectors. Thus, we develop efficient path-following algorithms to solve them. In addition, we also consider conventional power splitting (PS)-based energy harvesting receiver. Our numerical results confirm that the proposed transmit-TS based algorithms clearly outperform PS-based algorithms in terms of both throughput and energy efficiency.
Submitted 23 June, 2018; v1 submitted 25 March, 2018;
originally announced March 2018.
-
Effects of CSI Knowledge on Secrecy of Threshold-Selection Decode-and-Forward Relaying
Authors:
Chinmoy Kundu,
Sarbani Ghose,
Telex M. N. Ngatched,
Octavia A. Dobre,
Trung Q. Duong,
Ranjan Bose
Abstract:
This paper considers secrecy of a three node cooperative wireless system in the presence of a passive eavesdropper. The threshold-selection decode-and-forward (DF) relay is considered, which can decode the source message correctly only if a predefined signal-to-noise ratio (SNR) is achieved. The effects of channel state information (CSI) availability on secrecy outage probability (SOP) and ergodic secrecy rate (ESR) are investigated, and closed-form expressions are derived. Diversity is achieved from the direct and relaying paths both at the destination and at the eavesdropper by combinations of maximal-ratio combining (MRC) and selection combining (SC) schemes. An asymptotic analysis is provided when each hop SNR is the same in the balanced case and when it is different in the unbalanced case. The analysis shows that both hops can be a bottleneck for secure communication; however, they do not affect the secrecy identically. While it is observed that CSI knowledge can improve secrecy, the amount of improvement for SOP is more when the required rate is low and for ESR when the operating SNR is also low. It is also shown that the source to eavesdropper link SNR is more crucial for secure communication.
Submitted 1 March, 2018;
originally announced March 2018.
-
Optimal Beamforming for Physical Layer Security in MISO Wireless Networks
Authors:
Zhichao Sheng,
Hoang Duong Tuan,
Trung Q. Duong,
H. Vincent Poor
Abstract:
A wireless network of multiple transmitter-user pairs overheard by an eavesdropper, where the transmitters are equipped with multiple antennas while the users and eavesdropper are equipped with a single antenna, is considered. At different levels of wireless channel knowledge, the problem of interest is beamforming to optimize the users' quality-of-service (QoS) in terms of their secrecy throughputs or maximize the network's energy efficiency under users' QoS. All these problems are seen as very difficult optimization problems with many nonconvex constraints and nonlinear equality constraints in beamforming vectors. The paper develops path-following computational procedures of low-complexity and rapid convergence for the optimal beamforming solution. Their practicability is demonstrated through numerical examples.
Submitted 19 February, 2018;
originally announced February 2018.
-
Low-Latency Multiuser Two-Way Wireless Relaying for Spectral and Energy Efficiencies
Authors:
Zhichao Sheng,
Hoang Duong Tuan,
Trung Q. Duong,
H. Vincent Poor,
Yong Fang
Abstract:
The paper considers two possible approaches, which enable multiple pairs of users to exchange information via multiple multi-antenna relays within one time-slot to save the communication bandwidth in low-latency communications. The first approach is to deploy full-duplexes for both users and relays to make their simultaneous signal transmission and reception possible. In the second approach the users use a fraction of a time slot to send their information to the relays and the relays use the remaining complementary fraction of the time slot to send the beamformed signals to the users. The inherent loop self-interference in the duplexes and inter-full-duplexing-user interference in the first approach are absent in the second approach. Under both these approaches, the joint users' power allocation and relays' beamformers to either optimize the users' exchange of information or maximize the energy-efficiency subject to user quality-of-service (QoS) in terms of the exchanging information throughput thresholds lead to complex nonconvex optimization problems. Path-following algorithms are developed for their computational solutions. The provided numerical examples show the advantages of the second approach over the first approach.
Submitted 11 December, 2017;
originally announced December 2017.
-
Multi-cell Massive MIMO Beamforming in Assuring QoS for Large Numbers of Users
Authors:
Long D. Nguyen,
Hoang D. Tuan,
Trung Q. Duong,
H. Vincent Poor
Abstract:
Massive multi-input multi-output (MIMO) uses a very large number of low-power transmit antennas to serve much smaller numbers of users. The most widely proposed type of massive MIMO transmit beamforming is zero-forcing, which is based on the right inverse of the overall MIMO channel matrix to force the inter-user interference to zero. The performance of massive MIMO is then analyzed based on the throughput of cell-edge users. This paper reassesses this beamforming philosophy, to instead consider the maximization of the energy efficiency of massive MIMO systems in assuring the quality-of-service (QoS) for as many users as possible. The bottleneck of serving small numbers of users by a large number of transmit antennas is unblocked by a new time-fraction-wise beamforming technique, which focuses signal transmission in fractions of a time slot. Accordingly, massive MIMO can deliver better quality-of-experience (QoE) in assuring QoS for much larger numbers of users. The provided simulations show that the numbers of users served by massive MIMO with the required QoS may be twice or more than the number of its transmit antennas.
Submitted 10 December, 2017;
originally announced December 2017.
-
Power Allocation for Energy Efficiency and Secrecy of Interference Wireless Networks
Authors:
Zhichao Sheng,
Hoang Duong Tuan,
Ali Arshad Nasir,
Trung Q. Duong,
H. Vincent Poor
Abstract:
Considering a multi-user interference network with an eavesdropper, this paper aims at the power allocation to optimize the worst secrecy throughput among the network links or the secure energy efficiency in terms of achieved secrecy throughput per Joule under link security requirements. Three scenarios for the access of channel state information are considered: the perfect channel state information, partial channel state information with channels from the transmitters to the eavesdropper exponentially distributed, and not perfectly known channels between the transmitters and the users with exponentially distributed errors. The paper develops various path-following procedures of low complexity and rapid convergence for the optimal power allocation. Their effectiveness and viability are illustrated through numerical examples. The power allocation schemes are shown to achieve both high secrecy throughput and energy efficiency.
Submitted 24 August, 2017;
originally announced August 2017.
-
Securing Wireless Communications of the Internet of Things from the Physical Layer, An Overview
Authors:
Junqing Zhang,
Trung Q. Duong,
Roger Woods,
Alan Marshall
Abstract:
The security of the Internet of Things (IoT) is receiving considerable interest as the low power constraints and complexity features of many IoT devices are limiting the use of conventional cryptographic techniques. This article provides an overview of recent research efforts on alternative approaches for securing IoT wireless communications at the physical layer, specifically the key topics of key generation and physical layer encryption. These schemes can be implemented and are lightweight, and thus offer practical solutions for providing effective IoT wireless security. Future research to make IoT-based physical layer security more robust and pervasive is also covered.
Submitted 16 August, 2017;
originally announced August 2017.
-
Enhancing PHY Security of Cooperative Cognitive Radio Multicast Communications
Authors:
Van-Dinh Nguyen,
Trung Q. Duong,
Oh-Soon Shin,
Arumugam Nallanathan,
George K. Karagiannidis
Abstract:
In this paper, we propose a cooperative approach to improve the security of both primary and secondary systems in cognitive radio multicast communications. During their access to the frequency spectrum licensed to the primary users, the secondary unlicensed users assist the primary system in fortifying security by sending a jamming noise to the eavesdroppers, while simultaneously protecting themselves from eavesdropping. The main objective of this work is to maximize the secrecy rate of the secondary system, while adhering to all individual primary users' secrecy rate constraints. In the case of active eavesdroppers and perfect channel state information (CSI) at the transceivers, the utility function of interest is nonconcave and the involved constraints are nonconvex, and thus, the optimal solutions are troublesome. To solve this problem, we propose an iterative algorithm to arrive at least at a local optimum of the original nonconvex problem. This algorithm is guaranteed to achieve a Karush-Kuhn-Tucker solution. Then, we extend the optimization approach to the case of passive eavesdroppers and imperfect CSI knowledge at the transceivers, where the constraints are transformed into a linear matrix inequality and convex constraints, in order to facilitate the optimal solution.
Submitted 7 July, 2017;
originally announced July 2017.
-
Joint Fractional Time Allocation and Beamforming for Downlink Multiuser MISO Systems
Authors:
Van-Dinh Nguyen,
Hoang Duong Tuan,
Trung Q. Duong,
Oh-Soon Shin,
H. Vincent Poor
Abstract:
It is well-known that the traditional transmit beamforming at a base station (BS) to manage interference in serving multiple users is effective only when the number of users is less than the number of transmit antennas at the BS. Non-orthogonal multiple access (NOMA) can improve the throughput of users with poorer channel conditions by compromising their own privacy because other users with better channel conditions can decode the information of users in poorer channel state. NOMA still prefers that the number of users is less than the number of antennas at the BS transmitter. This paper resolves such issues by allocating separate fractional time slots for serving the users with similar channel conditions. This enables the BS to serve more users within the time unit while the privacy of each user is preserved. The fractional times and beamforming vectors are jointly optimized to maximize the system's throughput. An efficient path-following algorithm, which invokes a simple convex quadratic program at each iteration, is proposed for the solution of this challenging optimization problem. Numerical results confirm its versatility.
Submitted 27 August, 2017; v1 submitted 6 June, 2017;
originally announced June 2017.
-
Precoder Design for Signal Superposition in MIMO-NOMA Multicell Networks
Authors:
Van-Dinh Nguyen,
Hoang Duong Tuan,
Trung Q. Duong,
H. Vincent Poor,
Oh-Soon Shin
Abstract:
The throughput of users with poor channel conditions, such as those at a cell edge, is a bottleneck in wireless systems. A major part of the power budget must be allocated to serve these users in guaranteeing their quality-of-service (QoS) requirement, hampering QoS for other users and thus compromising the system reliability. In nonorthogonal multiple access (NOMA), the message intended for a user with a poor channel condition is decoded by itself and by another user with a better channel condition. The message intended for the latter is then successively decoded by itself after canceling the interference of the former. The overall information throughput is thus improved by this particular successive decoding and interference cancellation. This paper aims to design linear precoders/beamformers for signal superposition at the base stations of NOMA multi-input multi-output multi-cellular systems to maximize the overall sum throughput subject to the users' QoS requirements, which are imposed independently on the users' channel condition. This design problem is formulated as the maximization of a highly nonlinear and nonsmooth function subject to nonconvex constraints, which is very computationally challenging. Path-following algorithms for its solution, which invoke only a simple convex problem of moderate dimension at each iteration, are developed. Generating a sequence of improved points, these algorithms converge at least to a local optimum. Extensive numerical simulations are then provided to demonstrate their merit.
Submitted 6 June, 2017;
originally announced June 2017.
-
Energy Efficiency in Cell-Free Massive MIMO with Zero-Forcing Precoding Design
Authors:
L. D. Nguyen,
T. Q. Duong,
H. Q. Ngo,
K. Tourki
Abstract:
We consider the downlink of a cell-free massive multiple-input multiple-output (MIMO) network where numerous distributed access points (APs) serve a smaller number of users under time division duplex operation. An important issue in deploying cell-free networks is high power consumption, which is proportional to the number of APs. This issue has raised the question as to their suitability for green communications in terms of the total energy efficiency (bits/Joule). To tackle this, we develop a novel low-complexity power control technique with zero-forcing precoding design to maximize the energy efficiency of cell-free massive MIMO taking into account the backhaul power consumption and the imperfect channel state information.
Submitted 11 April, 2017;
originally announced April 2017.
-
How to Scale Up the Spectral Efficiency of Multi-way Massive MIMO Relaying?
Authors:
Chung Duc Ho,
Hien Quoc Ngo,
Michail Matthaiou,
Trung Q. Duong
Abstract:
This paper considers a decode-and-forward (DF) multi-way massive multiple-input multiple-output (MIMO) relay system where many users exchange their data with the aid of a relay station equipped with a massive antenna array. We propose a new transmission protocol which leverages successive cancelation decoding and zero-forcing (ZF) at the users. By using properties of massive MIMO, a tight analytical approximation of the spectral efficiency is derived. We show that our proposed scheme uses only half of the time-slots required in the conventional scheme (in which the number of time-slots is equal to the number of users [1]), to exchange data across different users. As a result, the sum spectral efficiency of our proposed scheme is nearly double the one of the conventional scheme, thereby boosting the performance of multi-way massive MIMO to unprecedented levels.
Submitted 30 March, 2017;
originally announced March 2017.
-
Robust Beamforming for Secrecy Rate in Cooperative Cognitive Radio Multicast Communications
Authors:
Van-Dinh Nguyen,
Trung Q. Duong,
Oh-Soon Shin,
Arumugam Nallanathan,
George K. Karagiannidis
Abstract:
In this paper, we propose a cooperative approach to improve the security of both primary and secondary systems in cognitive radio multicast communications. During their access to the frequency spectrum licensed to the primary users, the secondary unlicensed users assist the primary system in fortifying security by sending a jamming noise to the eavesdroppers, while simultaneously protecting themselves from eavesdropping. The main objective of this work is to maximize the secrecy rate of the secondary system, while adhering to all individual primary users' secrecy rate constraints. In the case of passive eavesdroppers and imperfect channel state information knowledge at the transceivers, the utility function of interest is nonconcave and the involved constraints are nonconvex, and thus, the optimal solutions are troublesome. To address this problem, we propose an iterative algorithm to arrive at a local optimum of the considered problem. The proposed iterative algorithm is guaranteed to achieve a Karush-Kuhn-Tucker solution.
Submitted 1 March, 2017;
originally announced March 2017.