-
RS-GPT4V: A Unified Multimodal Instruction-Following Dataset for Remote Sensing Image Understanding
Authors:
Linrui Xu,
Ling Zhao,
Wang Guo,
Qiujun Li,
Kewang Long,
Kaiqi Zou,
Yuhan Wang,
Haifeng Li
Abstract:
Intelligent understanding of remote sensing images (RSI) is undergoing a profound paradigm shift driven by multi-modal large language models (MLLMs): from the paradigm of learning a domain model (LaDM) to the paradigm of learning a pre-trained general foundation model followed by an adaptive domain model (LaGD). Under the new LaGD paradigm, the old datasets, which have led to advances in RSI intelligence understanding in the last decade, are no longer suitable for the new tasks. We argue that a new dataset must be designed to support tasks with the following features: 1) Generalization: training a model to learn shared knowledge among tasks and to adapt to different tasks; 2) Understanding complex scenes: training a model to understand the fine-grained attributes of the objects of interest and to describe the scene in natural language; 3) Reasoning: training a model to perform high-level visual reasoning. In this paper, we design a high-quality, diversified, and unified multimodal instruction-following dataset for RSI understanding, produced from GPT-4V and existing datasets, which we call RS-GPT4V. To achieve generalization, we use (Question, Answer) pairs derived from GPT-4V via instruction-following to unify tasks such as captioning and localization. To handle complex scenes, we propose a hierarchical instruction description with a local strategy, in which the fine-grained attributes of the objects and their spatial relationships are described, and a global strategy, in which all the local information is integrated to yield a detailed instruction description. To achieve reasoning, we design multi-turn QA pairs that give a model reasoning ability. The empirical results show that MLLMs fine-tuned on RS-GPT4V can describe fine-grained information. The dataset is available at: https://github.com/GeoX-Lab/RS-GPT4V.
Submitted 18 June, 2024;
originally announced June 2024.
-
Crossfusor: A Cross-Attention Transformer Enhanced Conditional Diffusion Model for Car-Following Trajectory Prediction
Authors:
Junwei You,
Haotian Shi,
Keshu Wu,
Keke Long,
Sicheng Fu,
Sikai Chen,
Bin Ran
Abstract:
Vehicle trajectory prediction is crucial for advancing autonomous driving and advanced driver assistance systems (ADAS), enhancing road safety and traffic efficiency. While traditional methods have laid foundational work, modern deep learning techniques, particularly transformer-based models and generative approaches, have significantly improved prediction accuracy by capturing complex and non-linear patterns in vehicle motion and traffic interactions. However, these models often overlook the detailed car-following behaviors and inter-vehicle interactions essential for real-world driving scenarios. This study introduces a Cross-Attention Transformer Enhanced Conditional Diffusion Model (Crossfusor) specifically designed for car-following trajectory prediction. Crossfusor integrates detailed inter-vehicular interactions and car-following dynamics into a robust diffusion framework, improving both the accuracy and realism of predicted trajectories. The model leverages a novel temporal feature encoding framework combining GRU, location-based attention mechanisms, and Fourier embedding to capture historical vehicle dynamics. It employs noise scaled by these encoded historical features in the forward diffusion process, and uses a cross-attention transformer to model intricate inter-vehicle dependencies in the reverse denoising process. Experimental results on the NGSIM dataset demonstrate that Crossfusor outperforms state-of-the-art models, particularly in long-term predictions, showcasing its potential for enhancing the predictive capabilities of autonomous driving systems.
Submitted 17 June, 2024;
originally announced June 2024.
-
Sensor-Based Distributionally Robust Control for Safe Robot Navigation in Dynamic Environments
Authors:
Kehan Long,
Yinzhuang Yi,
Zhirui Dai,
Sylvia Herbert,
Jorge Cortés,
Nikolay Atanasov
Abstract:
We introduce a novel method for safe mobile robot navigation in dynamic, unknown environments, utilizing onboard sensing to impose safety constraints without the need for accurate map reconstruction. Traditional methods typically rely on detailed map information to synthesize safe stabilizing controls for mobile robots, which can be computationally demanding and less effective, particularly in dynamic operational conditions. By leveraging recent advances in distributionally robust optimization, we develop a distributionally robust control barrier function (DR-CBF) constraint that directly processes range sensor data to impose safety constraints. Coupling this with a control Lyapunov function (CLF) for path tracking, we demonstrate that our CLF-DR-CBF control synthesis method achieves safe, efficient, and robust navigation in uncertain dynamic environments. We demonstrate the effectiveness of our approach in simulated and real autonomous robot navigation experiments, marking a substantial advancement in real-time safety guarantees for mobile robots.
Submitted 28 May, 2024;
originally announced May 2024.
-
Disentangling Instructive Information from Ranked Multiple Candidates for Multi-Document Scientific Summarization
Authors:
Pancheng Wang,
Shasha Li,
Dong Li,
Kehan Long,
Jintao Tang,
Ting Wang
Abstract:
Automatically condensing multiple topic-related scientific papers into a succinct and concise summary is referred to as Multi-Document Scientific Summarization (MDSS). Currently, while commonly used abstractive MDSS methods can generate flexible and coherent summaries, the difficulty in handling global information and the lack of guidance during decoding still make it challenging to generate better summaries. To alleviate these two shortcomings, this paper introduces summary candidates into MDSS, utilizing the global information of the document set and additional guidance from the summary candidates to guide the decoding process. Our insights are twofold: Firstly, summary candidates can provide instructive information from both positive and negative perspectives, and secondly, selecting higher-quality candidates from multiple options contributes to producing better summaries. Drawing on the insights, we propose a summary candidates fusion framework -- Disentangling Instructive information from Ranked candidates (DIR) for MDSS. Specifically, DIR first uses a specialized pairwise comparison method towards multiple candidates to pick out those of higher quality. Then DIR disentangles the instructive information of summary candidates into positive and negative latent variables with Conditional Variational Autoencoder. These variables are further incorporated into the decoder to guide generation. We evaluate our approach with three different types of Transformer-based models and three different types of candidates, and consistently observe noticeable performance improvements according to automatic and human evaluation. More analyses further demonstrate the effectiveness of our model in handling global information and enhancing decoding controllability.
Submitted 16 April, 2024;
originally announced April 2024.
-
Distributionally Robust Policy and Lyapunov-Certificate Learning
Authors:
Kehan Long,
Jorge Cortes,
Nikolay Atanasov
Abstract:
This article presents novel methods for synthesizing distributionally robust stabilizing neural controllers and certificates for control systems under model uncertainty. A key challenge in designing controllers with stability guarantees for uncertain systems is the accurate determination of and adaptation to shifts in model parametric uncertainty during online deployment. We tackle this with a novel distributionally robust formulation of the Lyapunov derivative chance constraint ensuring a monotonic decrease of the Lyapunov certificate. To avoid the computational complexity involved in dealing with the space of probability measures, we identify a sufficient condition in the form of deterministic convex constraints that ensures the Lyapunov derivative constraint is satisfied. We integrate this condition into a loss function for training a neural network-based controller and show that, for the resulting closed-loop system, the global asymptotic stability of its equilibrium can be certified with high confidence, even with Out-of-Distribution (OoD) model uncertainties. To demonstrate the efficacy and efficiency of the proposed methodology, we compare it with an uncertainty-agnostic baseline approach and several reinforcement learning approaches in two control problems in simulation.
Submitted 3 April, 2024;
originally announced April 2024.
-
Recommending Missed Citations Identified by Reviewers: A New Task, Dataset and Baselines
Authors:
Kehan Long,
Shasha Li,
Pancheng Wang,
Chenlong Bao,
Jintao Tang,
Ting Wang
Abstract:
Citing comprehensively and appropriately has become a challenging task with the explosive growth of scientific publications. Current citation recommendation systems aim to recommend a list of scientific papers for a given text context or a draft paper. However, none of the existing work focuses on already included citations of full papers, which are imperfect and still have much room for improvement. In the scenario of peer reviewing, it is a common phenomenon that submissions are identified as missing vital citations by reviewers. This may lead to a negative impact on the credibility and validity of the research presented. To help improve citations of full papers, we first define a novel task of Recommending Missed Citations Identified by Reviewers (RMC) and construct a corresponding expert-labeled dataset called CitationR. We conduct an extensive evaluation of several state-of-the-art methods on CitationR. Furthermore, we propose a new framework RMCNet with an Attentive Reference Encoder module mining the relevance between papers, already-made citations, and missed citations. Empirical results prove that RMC is challenging, with the proposed architecture outperforming previous methods in all metrics. We release our dataset and benchmark models to motivate future research on this challenging new task.
Submitted 4 March, 2024;
originally announced March 2024.
-
Online Physical Enhanced Residual Learning for Connected Autonomous Vehicles Platoon Centralized Control
Authors:
Hang Zhou,
Heye Huang,
Peng Zhang,
Haotian Shi,
Keke Long,
Xiaopeng Li
Abstract:
This paper introduces an online physical enhanced residual learning (PERL) framework for Connected Autonomous Vehicle (CAV) platoons, aimed at addressing the challenges posed by the dynamic and unpredictable nature of traffic environments. The proposed framework combines a physical model, represented by Model Predictive Control (MPC), with data-driven online Q-learning. The MPC controller, enhanced for centralized CAV platoons, employs vehicle velocity as a control input and focuses on multi-objective cooperative optimization. The learning-based residual controller enriches the MPC with prior knowledge and corrects residuals caused by traffic disturbances. The PERL framework not only retains the interpretability and transparency of physics-based models but also significantly improves computational efficiency and control accuracy in real-world scenarios. The experimental results show that the online Q-learning PERL controller exhibits significantly smaller position and velocity errors than both the MPC controller and the PERL controller with a neural network. Specifically, across four tests with different reference trajectories and errors, the PERL's cumulative absolute position and velocity errors are, on average, 86.73% and 55.28% lower than the MPC's, and 12.82% and 18.83% lower than the neural network-based PERL's. The results demonstrate the framework's superior accuracy and quick convergence, indicating its effectiveness in maintaining platoon stability under diverse conditions.
Submitted 18 February, 2024;
originally announced February 2024.
-
A Survey for Foundation Models in Autonomous Driving
Authors:
Haoxiang Gao,
Yaqian Li,
Kaiwen Long,
Ming Yang,
Yiqing Shen
Abstract:
The advent of foundation models has revolutionized the fields of natural language processing and computer vision, paving the way for their application in autonomous driving (AD). This survey presents a comprehensive review of more than 40 research papers, demonstrating the role of foundation models in enhancing AD. Large language models contribute to planning and simulation in AD, particularly through their proficiency in reasoning, code generation and translation. In parallel, vision foundation models are increasingly adapted for critical tasks such as 3D object detection and tracking, as well as creating realistic driving scenarios for simulation and testing. Multi-modal foundation models, integrating diverse inputs, exhibit exceptional visual understanding and spatial reasoning, crucial for end-to-end AD. This survey not only provides a structured taxonomy, categorizing foundation models based on their modalities and functionalities within the AD domain but also delves into the methods employed in current research. It identifies the gaps between existing foundation models and cutting-edge AD approaches, thereby charting future research directions and proposing a roadmap for bridging these gaps.
Submitted 1 February, 2024;
originally announced February 2024.
-
Safe Stabilizing Control for Polygonal Robots in Dynamic Elliptical Environments
Authors:
Kehan Long,
Khoa Tran,
Melvin Leok,
Nikolay Atanasov
Abstract:
This paper addresses the challenge of safe navigation for rigid-body mobile robots in dynamic environments. We introduce an analytic approach to compute the distance between a polygon and an ellipse, and employ it to construct a control barrier function (CBF) for safe control synthesis. Existing CBF design methods for mobile robot obstacle avoidance usually assume point or circular robots, preventing their applicability to more realistic robot body geometries. Our work enables CBF designs that capture complex robot and obstacle shapes. We demonstrate the effectiveness of our approach in simulations highlighting real-time obstacle avoidance in constrained and dynamic environments for both mobile robots and multi-joint robot arms.
Submitted 30 April, 2024; v1 submitted 30 September, 2023;
originally announced October 2023.
-
A Physics Enhanced Residual Learning (PERL) Framework for Vehicle Trajectory Prediction
Authors:
Keke Long,
Zihao Sheng,
Haotian Shi,
Xiaopeng Li,
Sikai Chen,
Sue Ahn
Abstract:
In vehicle trajectory prediction, physics models and data-driven models are the two predominant methodologies. However, each approach presents its own set of challenges: physics models fall short in predictability, while data-driven models lack interpretability. To address these shortcomings, this paper proposes a novel framework, the Physics-Enhanced Residual Learning (PERL) model. PERL integrates the strengths of physics-based and data-driven methods for traffic state prediction. PERL contains a physics model and a residual learning model. Its prediction is the sum of the physics model result and a predicted residual that serves as a correction to it. It preserves the interpretability inherent to physics-based models and has reduced data requirements compared to data-driven methods. Experiments were conducted using a real-world vehicle trajectory dataset. We propose a PERL model with the Intelligent Driver Model (IDM) as its physics car-following model and Long Short-Term Memory (LSTM) as its residual learning model. We compare this PERL model with the physics car-following model, a data-driven model, and other physics-informed neural network (PINN) models. First, the results show that PERL achieves better prediction with a small dataset than the physics model, the data-driven model, and the PINN model. Second, the PERL model converges faster during training, offering comparable performance with fewer training samples than the data-driven model and the PINN model. Sensitivity analysis also shows comparable performance of PERL when using another residual learning model and another physics car-following model.
Submitted 21 March, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
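The additive structure described in the PERL abstract above (prediction = physics model output + learned residual) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the IDM parameter defaults are illustrative assumptions, and `MeanResidual` is a hypothetical constant-bias stand-in for the LSTM residual learner named in the abstract.

```python
import math
from statistics import fmean


def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, a_max=1.0, b=2.0, s0=2.0, delta=4):
    """Intelligent Driver Model acceleration in m/s^2.

    v: follower speed, gap: bumper-to-bumper gap, dv: v - leader speed.
    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    # Desired dynamic gap; the interaction term is clipped at zero.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


class MeanResidual:
    """Hypothetical stand-in for the residual learner (the paper uses an LSTM)."""

    def __init__(self):
        self.bias = 0.0

    def fit(self, states, observed):
        # Learn the average gap between observations and the physics prediction.
        self.bias = fmean(obs - idm_acceleration(*s) for s, obs in zip(states, observed))

    def predict(self, state):
        return self.bias


def perl_predict(state, residual_model):
    """PERL-style prediction: physics output plus a predicted residual correction."""
    return idm_acceleration(*state) + residual_model.predict(state)
```

Because the residual model only has to learn the correction to the physics output rather than the full mapping, it is plausible that it needs less data, which is the motivation the abstract gives.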
-
When Monte-Carlo Dropout Meets Multi-Exit: Optimizing Bayesian Neural Networks on FPGA
Authors:
Hongxiang Fan,
Hao Chen,
Liam Castelli,
Zhiqiang Que,
He Li,
Kenneth Long,
Wayne Luk
Abstract:
Bayesian Neural Networks (BayesNNs) have demonstrated their capability of providing calibrated prediction for safety-critical applications such as medical imaging and autonomous driving. However, the high algorithmic complexity and the poor hardware performance of BayesNNs hinder their deployment in real-life applications. To bridge this gap, this paper proposes a novel multi-exit Monte-Carlo Dropout (MCD)-based BayesNN that achieves well-calibrated predictions with low algorithmic complexity. To further reduce the barrier to adopting BayesNNs, we propose a transformation framework that can generate FPGA-based accelerators for multi-exit MCD-based BayesNNs. Several novel optimization techniques are introduced to improve hardware performance. Our experiments demonstrate that our auto-generated accelerator achieves higher energy efficiency than CPU, GPU, and other state-of-the-art hardware implementations.
Submitted 13 August, 2023;
originally announced August 2023.
-
Phase Correction using Deep Learning for Satellite-to-Ground CV-QKD
Authors:
Nathan K. Long,
Robert Malaney,
Kenneth J. Grant
Abstract:
Coherent measurement of quantum signals used for continuous-variable (CV) quantum key distribution (QKD) across satellite-to-ground channels requires compensation of phase wavefront distortions caused by atmospheric turbulence. One compensation technique involves multiplexing classical reference pulses (RPs) and the quantum signal, with direct phase measurements on the RPs then used to modulate a real local oscillator (RLO) on the ground - a solution that also removes some known attacks on CV-QKD. However, this is a cumbersome task in practice - requiring substantial complexity in equipment requirements and deployment. As an alternative to this traditional practice, here we introduce a new method for estimating phase corrections for an RLO by using only intensity measurements from RPs as input to a convolutional neural network, mitigating completely the necessity to measure phase wavefronts directly. Conventional wisdom dictates such an approach would likely be fruitless. However, we show that the phase correction accuracy needed to provide for non-zero secure key rates through satellite-to-ground channels is achieved by our intensity-only measurements. Our work shows, for the first time, how artificial intelligence algorithms can replace phase-measuring equipment in the context of CV-QKD delivered from space, thereby delivering an alternate deployment paradigm for this global quantum-communication application.
Submitted 30 May, 2023;
originally announced May 2023.
-
Distributionally Robust Lyapunov Function Search Under Uncertainty
Authors:
Kehan Long,
Yinzhuang Yi,
Jorge Cortes,
Nikolay Atanasov
Abstract:
This paper develops methods for proving Lyapunov stability of dynamical systems subject to disturbances with an unknown distribution. We assume only a finite set of disturbance samples is available and that the true online disturbance realization may be drawn from a different distribution than the given samples. We formulate an optimization problem to search for a sum-of-squares (SOS) Lyapunov function and introduce a distributionally robust version of the Lyapunov function derivative constraint. We show that this constraint may be reformulated as several SOS constraints, ensuring that the search for a Lyapunov function remains in the class of SOS polynomial optimization problems. For general systems, we provide a distributionally robust chance-constrained formulation for neural network Lyapunov function search. Simulations demonstrate the validity and efficiency of either formulation on non-linear uncertain dynamical systems.
Submitted 30 April, 2024; v1 submitted 3 December, 2022;
originally announced December 2022.
-
Safe and Stable Control Synthesis for Uncertain System Models via Distributionally Robust Optimization
Authors:
Kehan Long,
Yinzhuang Yi,
Jorge Cortes,
Nikolay Atanasov
Abstract:
This paper considers enforcing safety and stability of dynamical systems in the presence of model uncertainty. Safety and stability constraints may be specified using a control barrier function (CBF) and a control Lyapunov function (CLF), respectively. To take model uncertainty into account, robust and chance formulations of the constraints are commonly considered. However, this requires known error bounds or a known distribution for the model uncertainty, and the resulting formulations may suffer from over-conservatism or over-confidence. In this paper, we assume that only a finite set of model parametric uncertainty samples is available and formulate a distributionally robust chance-constrained program (DRCCP) for control synthesis with CBF safety and CLF stability guarantees. To facilitate efficient computation of control inputs during online execution, we present a reformulation of the DRCCP as a second-order cone program (SOCP). Our formulation is evaluated in an adaptive cruise control example in comparison to 1) a baseline CLF-CBF quadratic programming approach, 2) a robust approach that assumes known error bounds of the system uncertainty, and 3) a chance-constrained approach that assumes a known Gaussian Process distribution of the uncertainty.
Submitted 16 March, 2023; v1 submitted 3 October, 2022;
originally announced October 2022.
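For orientation, the baseline CLF-CBF quadratic program mentioned in the abstract above is commonly written as below for a control-affine system $\dot{x} = f(x) + g(x)u$. This is the standard textbook form, given as a sketch; the symbols $V$ (CLF), $h$ (CBF), the class-$\mathcal{K}$ functions $\alpha_V, \alpha_h$, the slack variable $\delta$, and the weight $\lambda$ are notation assumed here, not taken from the paper.

```latex
\begin{aligned}
\min_{u,\,\delta} \quad & \|u\|^2 + \lambda \delta^2 \\
\text{s.t.} \quad & L_f V(x) + L_g V(x)\,u + \alpha_V\big(V(x)\big) \le \delta, \\
                  & L_f h(x) + L_g h(x)\,u + \alpha_h\big(h(x)\big) \ge 0 .
\end{aligned}
```

The slack $\delta$ relaxes the stability constraint so that the safety constraint takes priority when the two conflict. The distributionally robust formulation described in the abstract replaces such deterministic constraints with chance constraints that must hold over a set of distributions around the uncertainty samples.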
-
Application of Data Encryption in Chinese Named Entity Recognition
Authors:
Kaifang Long,
Jikun Dong,
Shengyu Fan,
Yanfang Geng,
Yang Cao,
Han Zhao,
Hui Yu,
Weizhi Xu
Abstract:
Recently, with the continuous development of deep learning, the performance of named entity recognition tasks has been dramatically improved. However, the privacy and the confidentiality of data in some specific fields, such as biomedical and military, cause insufficient data to support the training of deep neural networks. In this paper, we propose an encryption learning framework to address the problems of data leakage and inconvenient disclosure of sensitive data in certain domains. We introduce multiple encryption algorithms to encrypt training data in the named entity recognition task for the first time. In other words, we train the deep neural network using the encrypted data. We conduct experiments on six Chinese datasets, three of which are constructed by ourselves. The experimental results show that the encryption method achieves satisfactory results. The performance of some models trained with encrypted data even exceeds the performance of the unencrypted method, which verifies the effectiveness of the introduced encryption method and solves the problem of data leakage to a certain extent.
Submitted 31 August, 2022;
originally announced August 2022.
-
Response Component Analysis for Sea State Estimation Using Artificial Neural Networks and Vessel Response Spectral Data
Authors:
Nathan K. Long,
Daniel Sgarioto,
Matthew Garratt,
Karl Sammut
Abstract:
The use of the 'ship as a wave buoy analogy' (SAWB) provides a novel means to estimate sea states, where relationships are established between causal wave properties and vessel motion response information. This study focuses on a model-free machine learning approach to SAWB-based sea state estimation (SSE), using neural networks (NNs) to map vessel response spectral data to statistical wave properties for a small uninhabited surface vessel.
Results showed a strong correlation between heave responses and significant wave height estimates, whilst the accuracy of mean wave period and wave heading predictions was observed to improve considerably when data from multiple vessel degrees of freedom (DOFs) were utilized. Overall, 3-DOF (heave, pitch and roll) NNs for SSE were shown to perform well when compared to existing SSE approaches that use similar simulation setups. One advantage of using small vessels for SAWB was demonstrated: SSE accuracy was reasonable even when motion responses were low (in high-frequency, low wave height sea states). Given the information-dense statistical representation of vessel motion responses in spectral form, as well as the ability of NNs to effectively model complex relationships between variables, the designed SSE method shows promise for future adaptation to mobile SSE systems using the SAWB approach.
Submitted 12 August, 2022; v1 submitted 4 May, 2022;
originally announced May 2022.
-
TJ4DRadSet: A 4D Radar Dataset for Autonomous Driving
Authors:
Lianqing Zheng,
Zhixiong Ma,
Xichan Zhu,
Bin Tan,
Sen Li,
Kai Long,
Weiqi Sun,
Sihan Chen,
Lu Zhang,
Mengyue Wan,
Libo Huang,
Jie Bai
Abstract:
The next-generation high-resolution automotive radar (4D radar) can provide additional elevation measurement and denser point clouds, which has great potential for 3D sensing in autonomous driving. In this paper, we introduce a dataset named TJ4DRadSet with 4D radar points for autonomous driving research. The dataset was collected in various driving scenarios, with a total of 7757 synchronized frames in 44 consecutive sequences, which are well annotated with 3D bounding boxes and track ids. We provide a 4D radar-based 3D object detection baseline for our dataset to demonstrate the effectiveness of deep learning methods for 4D radar point clouds. The dataset can be accessed via the following link: https://github.com/TJRadarLab/TJ4DRadSet.
Submitted 27 July, 2022; v1 submitted 28 April, 2022;
originally announced April 2022.
-
AI-aided Traffic Control Scheme for M2M Communications in the Internet of Vehicles
Authors:
Haijun Zhang,
Minghui Jiang,
Xiangnan Liu,
Keping Long,
Victor C. M. Leung
Abstract:
Due to the rapid growth of data transmissions in the internet of vehicles (IoV), finding schemes that can effectively alleviate access congestion has become an important issue. Recently, many traffic control schemes have been studied. Nevertheless, most existing studies do not consider the dynamics of traffic and the heterogeneous requirements of different IoV applications, both of which are significant for random access resource allocation. In this paper, we consider a hybrid traffic control scheme and use the proximal policy optimization (PPO) method to tackle it. Firstly, IoV devices are divided into various classes based on delay characteristics. The target of maximizing the successful transmission of packets with the success rate constraint is established. Then, the optimization objective is transformed into a Markov decision process (MDP) model. Finally, the access class barring (ACB) factors are obtained based on the PPO method to maximize the number of successful access devices. The performance of the proposed algorithm in terms of successful events and delay, compared to existing schemes, is verified by simulations.
Submitted 12 April, 2022; v1 submitted 5 March, 2022;
originally announced April 2022.
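The access class barring (ACB) check mentioned in the abstract above can be illustrated with a minimal sketch of one random-access slot. This is not the paper's method: the barring factor is fixed here rather than learned with PPO, and the function name `acb_slot`, the preamble count, and the collision rule (a device succeeds only if it alone picked its preamble) are assumptions made for illustration.

```python
import random
from collections import Counter


def acb_slot(num_devices, acb_factor, num_preambles, rng):
    """Simulate one random-access slot under access class barring.

    Hypothetical helper for illustration; returns a tuple
    (number of admitted devices, number of successful devices).
    """
    # Each device passes the barring check with probability acb_factor.
    admitted = [d for d in range(num_devices) if rng.random() < acb_factor]
    # Each admitted device picks one preamble uniformly at random.
    choices = {d: rng.randrange(num_preambles) for d in admitted}
    counts = Counter(choices.values())
    # A device succeeds only if no other device picked the same preamble.
    succeeded = [d for d, p in choices.items() if counts[p] == 1]
    return len(admitted), len(succeeded)


if __name__ == "__main__":
    rng = random.Random(0)
    for factor in (0.1, 0.5, 1.0):
        admitted, succeeded = acb_slot(100, factor, 54, rng)
        print(f"factor={factor}: admitted={admitted}, succeeded={succeeded}")
```

Lowering the barring factor admits fewer devices per slot, which reduces preamble collisions at the cost of added delay. That trade-off is what a learned policy would tune.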
-
End-to-end multi-particle reconstruction in high occupancy imaging calorimeters with graph neural networks
Authors:
Shah Rukh Qasim,
Nadezda Chernyavskaya,
Jan Kieseler,
Kenneth Long,
Oleksandr Viazlo,
Maurizio Pierini,
Raheel Nawaz
Abstract:
We present an end-to-end reconstruction algorithm to build particle candidates from detector hits in next-generation granular calorimeters similar to that foreseen for the high-luminosity upgrade of the CMS detector. The algorithm exploits a distance-weighted graph neural network, trained with object condensation, a graph segmentation technique. Through a single-shot approach, the reconstruction task is paired with energy regression. We describe the reconstruction performance in terms of efficiency as well as in terms of energy resolution. In addition, we show the jet reconstruction performance of our method and discuss its inference computational cost. To our knowledge, this work is the first-ever example of single-shot calorimetric reconstruction of ${\cal O}(1000)$ particles in high-luminosity conditions with 200 pileup.
Submitted 30 September, 2022; v1 submitted 4 April, 2022;
originally announced April 2022.
-
Calibrating constitutive models with full-field data via physics informed neural networks
Authors:
Craig M. Hamel,
Kevin N. Long,
Sharlotte L. B. Kramer
Abstract:
The calibration of solid constitutive models with full-field experimental data is a long-standing challenge, especially in materials which undergo large deformation. In this paper, we propose a physics-informed deep-learning framework for the discovery of constitutive model parameterizations given full-field displacement data and global force-displacement data. Contrary to the majority of recent literature in this field, we work with the weak form of the governing equations rather than the strong form to impose physical constraints upon the neural network predictions. The approach presented in this paper is computationally efficient, suitable for irregular geometric domains, and readily ingests displacement data without the need for interpolation onto a computational grid. A selection of canonical hyperelastic materials models suitable for different material classes is considered including the Neo-Hookean, Gent, and Blatz-Ko constitutive models as exemplars for general hyperelastic behavior, polymer behavior with lock-up, and compressible foam behavior respectively. We demonstrate that physics informed machine learning is an enabling technology and may shift the paradigm of how full-field experimental data is utilized to calibrate constitutive models under finite deformations.
Submitted 30 March, 2022;
originally announced March 2022.
-
Proximal Policy Optimization-based Transmit Beamforming and Phase-shift Design in an IRS-aided ISAC System for the THz Band
Authors:
Xiangnan Liu,
Haijun Zhang,
Keping Long,
Mingyu Zhou,
Yonghui Li,
H. Vincent Poor
Abstract:
In this paper, an IRS-aided integrated sensing and communications (ISAC) system operating in the terahertz (THz) band is proposed to maximize the system capacity. Transmit beamforming and phase-shift design are transformed into a universal optimization problem with ergodic constraints. Then the joint optimization of transmit beamforming and phase-shift design is achieved by gradient-based, primal-dual proximal policy optimization (PPO) in the multi-user multiple-input single-output (MISO) scenario. Specifically, the actor part generates continuous transmit beamforming and the critic part takes charge of discrete phase shift design. Based on the MISO scenario, we investigate a distributed PPO (DPPO) framework with the concept of multi-threading learning in the multi-user multiple-input multiple-output (MIMO) scenario. Simulation results demonstrate the effectiveness of the primal-dual PPO algorithm and its multi-threading version in terms of transmit beamforming and phase-shift design.
Submitted 21 March, 2022;
originally announced March 2022.
-
Safe Control Synthesis with Uncertain Dynamics and Constraints
Authors:
Kehan Long,
Vikas Dhiman,
Melvin Leok,
Jorge Cortés,
Nikolay Atanasov
Abstract:
This paper considers safe control synthesis for dynamical systems with either probabilistic or worst-case uncertainty in both the dynamics model and the safety constraints. We formulate novel probabilistic and robust (worst-case) control Lyapunov function (CLF) and control barrier function (CBF) constraints that take into account the effect of uncertainty in either case. We show that either the probabilistic or the robust (worst-case) formulation leads to a second-order cone program (SOCP), which enables efficient safe and stable control synthesis. We evaluate our approach in PyBullet simulations of an autonomous robot navigating in unknown environments and compare the performance with a baseline CLF-CBF quadratic programming approach.
Submitted 30 September, 2022; v1 submitted 19 February, 2022;
originally announced February 2022.
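The control barrier function (CBF) constraint behind the baseline mentioned in the abstract above can be illustrated on a toy problem. The sketch below is not the paper's SOCP formulation and ignores uncertainty entirely. It assumes a one-dimensional single integrator dx/dt = u, a barrier h(x) = x_max - x, and a linear class-K term alpha * h, for which the safety-filter quadratic program has a closed-form solution. The names `cbf_filter` and `simulate` are hypothetical.

```python
def cbf_filter(x, u_nom, x_max, alpha):
    """Closed-form CBF safety filter for the toy system dx/dt = u.

    With h(x) = x_max - x, the CBF condition dh/dt + alpha * h >= 0
    becomes u <= alpha * (x_max - x). Minimizing (u - u_nom)**2 under
    that single bound gives a clipped input.
    """
    return min(u_nom, alpha * (x_max - x))


def simulate(x0, u_nom, x_max, alpha, dt, steps):
    """Euler-integrate the filtered system and return the trajectory."""
    xs = [x0]
    for _ in range(steps):
        u = cbf_filter(xs[-1], u_nom, x_max, alpha)
        xs.append(xs[-1] + dt * u)
    return xs


if __name__ == "__main__":
    # Assumed values: nominal input pushes toward the boundary x_max = 1.
    trajectory = simulate(x0=0.0, u_nom=1.0, x_max=1.0, alpha=2.0, dt=0.01, steps=1000)
    print(f"final state: {trajectory[-1]:.6f}, max state: {max(trajectory):.6f}")
```

The nominal input is applied unchanged while the state is far from the boundary and is scaled back as the state approaches x_max, so the trajectory stays in the safe set as long as alpha * dt <= 1 in this discretization.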
-
GearV: A Two-Gear Hypervisor for Mixed-Criticality IoT Systems
Authors:
Kaiwen Long,
Chong Xing,
Yuebin Qi,
Pei Zhang,
Changsong Wu,
Wenxiao Fang,
Jing Tan,
Jie Chen,
Shiming Zhang,
Zuosheng Wang,
Zuanmin Liu,
Cao Liang,
Jiaxiang Xu
Abstract:
This paper presents GearV, a two-gear lightweight hypervisor architecture to address some known challenges. By dividing the hypervisor into partitions, and dividing scheduling policies into Gear1 and Gear2 respectively, GearV creates a consolidated platform to run a best-effort system and a safety-critical system simultaneously with managed engineering effort. The two-gear architecture also simplifies retrofitting virtualization systems. We believe that GearV can serve as a reasonable hypervisor architecture for mixed-criticality IoT systems.
Submitted 5 June, 2021;
originally announced June 2021.
-
Lesion-Inspired Denoising Network: Connecting Medical Image Denoising and Lesion Detection
Authors:
Kecheng Chen,
Kun Long,
Yazhou Ren,
Jiayu Sun,
Xiaorong Pu
Abstract:
Deep learning has achieved notable performance in the denoising task of low-quality medical images and the detection task of lesions, respectively. However, existing low-quality medical image denoising approaches are disconnected from the detection task of lesions. Intuitively, the quality of denoised images will influence the lesion detection accuracy, which in turn can be used to affect the denoising performance. To this end, we propose a play-and-plug medical image denoising framework, namely Lesion-Inspired Denoising Network (LIDnet), to collaboratively improve both denoising performance and detection accuracy of denoised medical images. Specifically, we propose to insert the feedback of the downstream detection task into an existing denoising framework by jointly learning a multi-loss objective. Instead of using perceptual loss calculated on the entire feature map, a novel region-of-interest (ROI) perceptual loss induced by the lesion detection task is proposed to further connect these two tasks. To achieve better optimization for the overall framework, we propose a customized collaborative training strategy for LIDnet. In consideration of clinical usability and imaging characteristics, three low-dose CT image datasets are used to evaluate the effectiveness of the proposed LIDnet. Experiments show that, by equipping baseline methods with LIDnet, both their denoising and lesion detection performance can be significantly improved.
Submitted 18 April, 2021;
originally announced April 2021.
-
Learning Barrier Functions with Memory for Robust Safe Navigation
Authors:
Kehan Long,
Cheng Qian,
Jorge Cortés,
Nikolay Atanasov
Abstract:
Control barrier functions are widely used to enforce safety properties in robot motion planning and control. However, the problem of constructing barrier functions online and synthesizing safe controllers that can deal with the associated uncertainty has received little attention. This paper investigates safe navigation in unknown environments, using onboard range sensing to construct control barrier functions online. To represent different objects in the environment, we use the distance measurements to train neural network approximations of the signed distance functions incrementally with replay memory. This allows us to formulate a novel robust control barrier safety constraint which takes into account the error in the estimated distance fields and its gradient. Our formulation leads to a second-order cone program, enabling safe and stable control synthesis in a priori unknown environments.
Submitted 10 February, 2021; v1 submitted 3 November, 2020;
originally announced November 2020.
-
Optimal Wireless Streaming of Multi-Quality 360 VR Video by Exploiting Natural, Relative Smoothness-enabled and Transcoding-enabled Multicast Opportunities
Authors:
Kaixuan Long,
Ying Cui,
Chencheng Ye,
Zhi Liu
Abstract:
In this paper, we would like to investigate optimal wireless streaming of a multi-quality tiled 360 virtual reality (VR) video from a server to multiple users. To this end, we propose to maximally exploit potential multicast opportunities by effectively utilizing characteristics of multi-quality tiled 360 VR videos and computation resources at the users' side. In particular, we consider two requirements for quality variation in one field-of-view (FoV), i.e., the absolute smoothness requirement and the relative smoothness requirement, and two video playback modes, i.e., the direct-playback mode (without user transcoding) and transcode-playback mode (with user transcoding). Besides natural multicast opportunities, we introduce two new types of multicast opportunities, namely, relative smoothness-enabled multicast opportunities, which allow flexible tradeoff between viewing quality and communications resource consumption, and transcoding-enabled multicast opportunities, which allow flexible tradeoff between computation and communications resource consumptions. Then, we establish a novel mathematical model that reflects the impacts of natural, relative smoothness-enabled and transcoding-enabled multicast opportunities on the average transmission energy and transcoding energy. Based on this model, we optimize the transmission resource allocation, playback quality level selection and transmission quality level selection to minimize the energy consumption in the four cases with different requirements for quality variation and video playback modes. By comparing the optimal values in the four cases, we prove that the energy consumption reduces when more multicast opportunities can be utilized. Finally, numerical results show substantial gains of the proposed solutions over existing schemes, and demonstrate the importance of effective exploitation of the three types of multicast opportunities.
Submitted 2 September, 2020;
originally announced September 2020.
-
AI Tax: The Hidden Cost of AI Data Center Applications
Authors:
Daniel Richins,
Dharmisha Doshi,
Matthew Blackmore,
Aswathy Thulaseedharan Nair,
Neha Pathapati,
Ankit Patel,
Brainard Daguman,
Daniel Dobrijalowski,
Ramesh Illikkal,
Kevin Long,
David Zimmerman,
Vijay Janapa Reddi
Abstract:
Artificial intelligence and machine learning are experiencing widespread adoption in industry and academia. This has been driven by rapid advances in the applications and accuracy of AI through increasingly complex algorithms and models; this, in turn, has spurred research into specialized hardware AI accelerators. Given the rapid pace of advances, it is easy to forget that they are often developed and evaluated in a vacuum without considering the full application environment. This paper emphasizes the need for a holistic, end-to-end analysis of AI workloads and reveals the "AI tax." We deploy and characterize Face Recognition in an edge data center. The application is an AI-centric edge video analytics application built using popular open source infrastructure and ML tools. Despite using state-of-the-art AI and ML algorithms, the application relies heavily on pre- and post-processing code. As AI-centric applications benefit from the acceleration promised by accelerators, we find they impose stresses on the hardware and software infrastructure: storage and network bandwidth become major bottlenecks with increasing AI acceleration. By specializing for AI applications, we show that a purpose-built edge data center can be designed for the stresses of accelerated AI at 15% lower TCO than one derived from homogeneous servers and infrastructure.
Submitted 20 July, 2020;
originally announced July 2020.
-
Deep Learning based Radio Resource Management in NOMA Networks: User Association, Subchannel and Power Allocation
Authors:
Haijun Zhang,
Haisen Zhang,
Keping Long,
George K. Karagiannidis
Abstract:
With the rapid development of future wireless communication, the combination of NOMA technology and millimeter-wave (mmWave) technology has become a research hotspot. The application of NOMA in mmWave heterogeneous networks can meet the diverse needs of users in different applications and scenarios in future communications. In this paper, we propose a machine learning framework to deal with the user association, subchannel and power allocation problems in such a complex scenario. We focus on maximizing the energy efficiency (EE) of the system under the constraints of quality of service (QoS), interference limitation, and power limitation. Specifically, user association is solved through the Lagrange dual decomposition method, while semi-supervised learning and deep neural network (DNN) are used for the subchannel and power allocation, respectively. In particular, unlabeled samples are introduced to improve approximation and generalization ability for subchannel allocation. The simulation indicates that the proposed scheme can achieve higher EE with lower complexity.
Submitted 20 June, 2020;
originally announced June 2020.
-
Energy Efficiency Optimization for NOMA UAV Network with Imperfect CSI
Authors:
Haijun Zhang,
Jianmin Zhang,
Keping Long
Abstract:
Unmanned aerial vehicles (UAVs) are developing rapidly owing to flexible deployment and access services as air base stations. However, the channel errors of low-altitude communication links formed by mobile deployment of UAVs cannot be ignored, and the energy efficiency of UAV communication with imperfect channel state information (CSI) has not been well studied yet. Therefore, we focus on system performance optimization in a non-orthogonal multiple access (NOMA) UAV network considering imperfect CSI between the UAV and users. A suboptimal resource allocation scheme including user scheduling and power allocation is designed for maximizing energy efficiency. Because of the nonconvexity of the optimization function with a probability constraint for imperfect CSI, the original problem is converted into a non-probability problem and then decoupled into two convex subproblems. First, a user scheduling method is applied in the two-side matching of users and subchannels by the difference of convex programming. Then, based on user scheduling, the energy efficiency in UAV cells is optimized through a suboptimal power allocation algorithm by the successive convex approximation method. The simulation results prove that the proposed algorithm is effective compared with existing resource allocation schemes.
Submitted 5 May, 2020;
originally announced May 2020.
-
Optimal Transmission of Multi-Quality Tiled 360 VR Video by Exploiting Multicast Opportunities
Authors:
Kaixuan Long,
Ying Cui,
Chencheng Ye,
Zhi Liu
Abstract:
In this paper, we would like to investigate fundamental impacts of multicast opportunities on efficient transmission of a 360 VR video to multiple users in the cases with and without transcoding at each user. We establish a novel mathematical model that reflects the impacts of multicast opportunities on the average transmission energy in both cases and the transcoding energy in the case with user transcoding, and facilitates the optimal exploitation of transcoding-enabled multicast opportunities. In the case without user transcoding, we optimize the transmission resource allocation to minimize the average transmission energy by exploiting natural multicast opportunities. The problem is nonconvex. We transform it to an equivalent convex problem and obtain an optimal solution using standard convex optimization techniques. In the case with user transcoding, we optimize the transmission resource allocation and the transmission quality level selection to minimize the weighted sum of the average transmission energy and the transcoding energy by exploiting both natural and transcoding-enabled multicast opportunities. The problem is a challenging mixed discrete-continuous optimization problem. We transform it to a Difference of Convex (DC) programming problem and obtain a suboptimal solution using a DC algorithm. Finally, numerical results demonstrate the importance of effective exploitation of transcoding-enabled multicast opportunities in the case with user transcoding.
Submitted 7 January, 2020;
originally announced January 2020.
-
A Comprehensive Review of Shepherding as a Bio-inspired Swarm-Robotics Guidance Approach
Authors:
Nathan K Long,
Karl Sammut,
Daniel Sgarioto,
Matthew Garratt,
Hussein Abbass
Abstract:
The simultaneous control of multiple coordinated robotic agents represents an elaborate problem. If solved, however, the interaction between the agents can lead to solutions to sophisticated problems. The concept of swarming, inspired by nature, can be described as the emergence of complex system-level behaviors from the interactions of relatively elementary agents. Due to the effectiveness of solutions found in nature, bio-inspired swarming-based control techniques are receiving a lot of attention in robotics. One method, known as swarm shepherding, is founded on the sheep herding behavior exhibited by sheepdogs, where a swarm of relatively simple agents are governed by a shepherd (or shepherds) which is responsible for high-level guidance and planning. Many studies have been conducted on shepherding as a control technique, ranging from the replication of sheep herding via simulation, to the control of uninhabited vehicles and robots for a variety of applications. We present a comprehensive review of the literature on swarm shepherding to reveal the advantages and potential of the approach to be applied to a plethora of robotic systems in the future.
Submitted 30 April, 2020; v1 submitted 16 December, 2019;
originally announced December 2019.
-
Neurlux: Dynamic Malware Analysis Without Feature Engineering
Authors:
Chani Jindal,
Christopher Salls,
Hojjat Aghakhani,
Keith Long,
Christopher Kruegel,
Giovanni Vigna
Abstract:
Malware detection plays a vital role in computer security. Modern machine learning approaches have been centered around domain knowledge for extracting malicious features. However, many potential features can be used, and it is time consuming and difficult to manually identify the best features, especially given the diverse nature of malware.
In this paper, we propose Neurlux, a neural network for malware detection. Neurlux does not rely on any feature engineering, rather it learns automatically from dynamic analysis reports that detail behavioral information. Our model borrows ideas from the field of document classification, using word sequences present in the reports to predict if a report is from a malicious binary or not. We investigate the learned features of our model and show which components of the reports it tends to give the highest importance. Then, we evaluate our approach on two different datasets and report formats, showing that Neurlux improves on the state of the art and can effectively learn from the dynamic analysis reports. Furthermore, we show that our approach is portable to other malware analysis environments and generalizes to different datasets.
Submitted 24 October, 2019;
originally announced October 2019.
-
Deep Comprehensive Correlation Mining for Image Clustering
Authors:
Jianlong Wu,
Keyu Long,
Fei Wang,
Chen Qian,
Cheng Li,
Zhouchen Lin,
Hongbin Zha
Abstract:
Recently developed deep unsupervised methods allow us to jointly learn representation and cluster unlabelled data. These deep clustering methods mainly focus on the correlation among samples, e.g., selecting high precision pairs to gradually tune the feature representation, which neglects other useful correlations. In this paper, we propose a novel clustering framework, named deep comprehensive correlation mining (DCCM), for exploring and taking full advantage of various kinds of correlations behind the unlabeled data from three aspects: 1) Instead of only using pair-wise information, pseudo-label supervision is proposed to investigate category information and learn discriminative features. 2) The features' robustness to image transformation of input space is fully explored, which benefits the network learning and significantly improves the performance. 3) The triplet mutual information among features is presented for clustering problem to lift the recently discovered instance-level deep mutual information to a triplet-level formation, which further helps to learn more discriminative features. Extensive experiments on several challenging datasets show that our method achieves good performance, e.g., attaining $62.3\%$ clustering accuracy on CIFAR-10, which is $10.1\%$ higher than the state-of-the-art results.
Submitted 12 August, 2019; v1 submitted 15 April, 2019;
originally announced April 2019.
-
Optimal Multi-Quality Multicast for 360 Virtual Reality Video
Authors:
Kaixuan Long,
Chencheng Ye,
Ying Cui,
Zhi Liu
Abstract:
A 360 virtual reality (VR) video, recording a scene of interest in every direction, provides VR users with immersive viewing experience. However, transmission of a 360 VR video which is of a much larger size than a traditional video to mobile users brings a heavy burden to a wireless network. In this paper, we consider multi-quality multicast of a 360 VR video from a single server to multiple user…
▽ More
A 360 virtual reality (VR) video, recording a scene of interest in every direction, provides VR users with immersive viewing experience. However, transmission of a 360 VR video which is of a much larger size than a traditional video to mobile users brings a heavy burden to a wireless network. In this paper, we consider multi-quality multicast of a 360 VR video from a single server to multiple users using time division multiple access (TDMA). To improve transmission efficiency, tiling is adopted, and each tile is pre-encoded into multiple representations with different qualities. We optimize the quality level selection, transmission time allocation and transmission power allocation to maximize the total utility of all users under the transmission time and power allocation constraints as well as the quality smoothness constraints for mixed-quality tiles. The problem is a challenging mixed discrete-continuous opti-mization problem. We propose two low-complexity algorithms to obtain two suboptimal solutions, using continuous relaxation and DC programming, respectively. Finally, numerical results demonstrate the advantage of the proposed solutions.
Submitted 8 January, 2019;
originally announced January 2019.
-
Learning to Communicate: A Machine Learning Framework for Heterogeneous Multi-Agent Robotic Systems
Authors:
Hyung-Jin Yoon,
Huaiyu Chen,
Kehan Long,
Heling Zhang,
Aditya Gahlawat,
Donghwan Lee,
Naira Hovakimyan
Abstract:
We present a machine learning framework for multi-agent systems to learn both the optimal policy for maximizing the rewards and the encoding of the high dimensional visual observation. The encoding is useful for sharing local visual observations with other agents under communication resource constraints. The actor-encoder encodes the raw images and chooses an action based on local observations and messages sent by the other agents. The machine learning agent generates not only an actuator command to the physical device, but also a communication message to the other agents. We formulate a reinforcement learning problem, which extends the action space to consider the communication action as well. The feasibility of the reinforcement learning framework is demonstrated using a 3D simulation environment with two collaborating agents. The environment provides realistic visual observations to be used and shared between the two agents.
Submitted 12 December, 2018;
originally announced December 2018.
-
LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild
Authors:
Shuang Yang,
Yuanhang Zhang,
Dalu Feng,
Mingmin Yang,
Chenhao Wang,
Jingyun Xiao,
Keyu Long,
Shiguang Shan,
Xilin Chen
Abstract:
Large-scale datasets have successively proven their fundamental importance in several research fields, especially for early progress in some emerging topics. In this paper, we focus on the problem of visual speech recognition, also known as lipreading, which has received increasing interest in recent years. We present a naturally-distributed large-scale benchmark for lip reading in the wild, named LRW-1000, which contains 1,000 classes with 718,018 samples from more than 2,000 individual speakers. Each class corresponds to the syllables of a Mandarin word composed of one or several Chinese characters. To the best of our knowledge, it is currently the largest word-level lipreading dataset and also the only public large-scale Mandarin lipreading dataset. This dataset aims at covering a "natural" variability over different speech modes and imaging conditions to incorporate challenges encountered in practical applications. It has shown a large variation in this benchmark in several aspects, including the number of samples in each class, video resolution, lighting conditions, and speakers' attributes such as pose, age, gender, and make-up. Besides providing a detailed description of the dataset and its collection pipeline, we evaluate several typical popular lipreading methods and perform a thorough analysis of the results from several aspects. The results demonstrate the consistency and challenges of our dataset, which may open up some new promising directions for future work.
Submitted 23 April, 2019; v1 submitted 16 October, 2018;
originally announced October 2018.
-
Circular-shift Linear Network Codes with Arbitrary Odd Block Lengths
Authors:
Qifu Tyler Sun,
Hanqi Tang,
Zongpeng Li,
Xiaolong Yang,
Keping Long
Abstract:
Circular-shift linear network coding (LNC) is a class of vector LNC with low encoding and decoding complexities, and with local encoding kernels chosen from cyclic permutation matrices. When $L$ is a prime with primitive root $2$, it was recently shown that a scalar linear solution over GF($2^{L-1}$) induces an $L$-dimensional circular-shift linear solution at rate $(L-1)/L$. In this work, we prove that for arbitrary odd $L$, every scalar linear solution over GF($2^{m_L}$), where $m_L$ refers to the multiplicative order of $2$ modulo $L$, can induce an $L$-dimensional circular-shift linear solution at a certain rate. Based on the generalized connection, we further prove that for such $L$ with $m_L$ beyond a threshold, every multicast network has an $L$-dimensional circular-shift linear solution at rate $φ(L)/L$, where $φ(L)$ is Euler's totient function of $L$. An efficient algorithm for constructing such a solution is designed. Finally, we prove that every multicast network is asymptotically circular-shift linearly solvable.
Submitted 2 January, 2019; v1 submitted 12 June, 2018;
originally announced June 2018.
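The two number-theoretic quantities named in the abstract above, the multiplicative order $m_L$ of $2$ modulo $L$ and Euler's totient $φ(L)$, can be computed directly for small odd $L$. The following is a minimal sketch for illustration only; the function names are hypothetical and are not taken from the paper, and the threshold on $m_L$ mentioned in the abstract is not modelled.

```python
from math import gcd


def multiplicative_order_of_2(L: int) -> int:
    """Smallest m >= 1 with 2**m == 1 (mod L); defined for odd L >= 3."""
    if L < 3 or L % 2 == 0:
        raise ValueError("L must be an odd integer >= 3")
    order, power = 1, 2 % L
    while power != 1:
        power = (power * 2) % L
        order += 1
    return order


def euler_totient(L: int) -> int:
    """Number of integers in 1..L that are coprime to L (brute force)."""
    return sum(1 for k in range(1, L + 1) if gcd(k, L) == 1)


def summarize(L: int) -> tuple:
    """Return (m_L, phi(L), phi(L)/L) for an odd block length L."""
    phi = euler_totient(L)
    return multiplicative_order_of_2(L), phi, phi / L


if __name__ == "__main__":
    for L in (5, 7, 9, 15):
        m_L, phi, rate = summarize(L)
        print(f"L={L}: m_L={m_L}, phi(L)={phi}, phi(L)/L={rate:.3f}")
```

When $L$ is a prime with primitive root $2$ (for example $L = 5$), $m_L = L - 1 = φ(L)$, so the rate $φ(L)/L$ reduces to the $(L-1)/L$ quoted for the earlier result.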
-
Resource Allocation in NOMA based Fog Radio Access Networks
Authors:
H. Zhang,
Y. Qiu,
K. Long,
G. K. Karagiannidis,
X. Wang,
A. Nallanathan
Abstract:
In the wake of growth in intelligent mobile devices and wide usage of bandwidth-hungry mobile Internet applications, the demand for wireless data traffic and ubiquitous mobile broadband is rapidly increasing. On account of these developments, research on fifth generation (5G) networks is accelerating on a global scale. Edge computing has drawn much attention for reducing the time delay and improving the Quality of Service of networks. Meanwhile, the fog radio access network (F-RAN) is an emerging architecture that makes full use of the edge computing and distributed storage capabilities of edge devices. In this article, we propose an architecture of non-orthogonal multiple access (NOMA) based F-RANs, which has a strong capability of edge computing and can meet the heterogeneous requirements in 5G systems. NOMA with successive interference cancellation (SIC) is regarded as a critical multi-user access technology. In NOMA, more than one user can access the same time, code domain, and frequency resources. By assigning different power levels to multiple users and implementing SIC, multi-user detection can be achieved. In this article, we provide a description of the NOMA based F-RAN architecture and discuss resource allocation within it. We focus on power and subchannel allocation in consideration of NOMA and edge caching. Simulation results show that the proposed NOMA based F-RAN architecture and the resource management mechanisms can achieve high net utility for the RANs.
Submitted 15 March, 2018;
originally announced March 2018.
-
Energy-Efficient Resource Allocation in NOMA Heterogeneous Networks
Authors:
Haijun Zhang,
Fang Fang,
Julian Cheng,
Keping Long,
Wei Wang,
Victor C. M. Leung
Abstract:
Non-orthogonal multiple access (NOMA) has attracted much recent attention owing to its capability for improving the system spectral efficiency in wireless communications. Deploying NOMA in heterogeneous network can satisfy users' explosive data traffic requirements, and NOMA will likely play an important role in the fifth-generation (5G) mobile communication networks. However, NOMA brings new technical challenges on resource allocation due to the mutual cross-tier interference in heterogeneous networks. In this article, to study the tradeoff between data rate performance and energy consumption in NOMA, we examine the problem of energy-efficient user scheduling and power optimization in 5G NOMA heterogeneous networks. The energy-efficient user scheduling and power allocation schemes are introduced for the downlink 5G NOMA heterogeneous network for perfect and imperfect channel state information (CSI) respectively. Simulation results show that the resource allocation schemes can significantly increase the energy efficiency of 5G NOMA heterogeneous network for both cases of perfect CSI and imperfect CSI.
Submitted 14 January, 2018;
originally announced January 2018.
-
Secure Communications in NOMA System: Subcarrier Assignment and Power Allocation
Authors:
Haijun Zhang,
Ning Yang,
Keping Long,
Miao Pan,
George K. Karagiannidis,
Victor C. M. Leung
Abstract:
Secure communication is a promising technology for wireless networks because it ensures secure transmission of information. In this paper, we investigate the joint subcarrier (SC) assignment and power allocation problem for non-orthogonal multiple access (NOMA) amplify-and-forward two-way relay wireless networks, in the presence of eavesdroppers. By exploiting cooperative jamming (CJ) to enhance the security of the communication link, we aim to maximize the achievable secrecy energy efficiency by jointly designing the SC assignment, user pair scheduling and power allocation. Assuming perfect knowledge of the channel state information (CSI) at the relay station, we propose a low-complexity subcarrier assignment scheme (SCAS-1), which is equivalent to many-to-many matching games, and then SCAS-2 is formulated as a secrecy energy efficiency maximization problem. The secure power allocation problem is modeled as a convex geometric programming problem, and then solved by interior point methods. Simulation results demonstrate the effectiveness of the proposed SSPA algorithms under scenarios of using and not using CJ, respectively.
Submitted 13 January, 2018;
originally announced January 2018.
-
Fog Radio Access Networks: Mobility Management, Interference Mitigation and Resource Optimization
Authors:
Haijun Zhang,
Yu Qiu,
Xiaoli Chu,
Keping Long,
Victor C. M. Leung
Abstract:
In order to make Internet connections ubiquitous and autonomous in our daily lives, maximizing the utilization of radio resources and social information is one of the major research topics in future mobile communication technologies. Fog radio access network (FRAN) is regarded as a promising paradigm for the fifth generation (5G) of mobile networks. FRAN integrates fog computing with RAN and makes full use of the edge of networks. FRAN would be different in networking, computing, storage and control as compared with conventional radio access networks (RAN) and the emerging cloud RAN. In this article, we provide a description of the FRAN architecture, and discuss how the distinctive characteristics of FRAN make it possible to efficiently alleviate the burden on the fronthaul, backhaul and backbone networks, as well as reduce content delivery latencies. We will focus on the mobility management, interference mitigation, and resource optimization in FRAN. Our simulation results show that the proposed FRAN architecture and the associated mobility and resource management mechanisms can reduce the signaling cost and increase the net utility for the RAN.
Submitted 20 July, 2017;
originally announced July 2017.
-
Circular-shift Linear Network Coding
Authors:
Hanqi Tang,
Qifu Tyler Sun,
Zongpeng Li,
Xiaolong Yang,
Keping Long
Abstract:
We study a class of linear network coding (LNC) schemes, called circular-shift LNC, whose encoding operations consist of only circular-shifts and bit-wise additions (XOR). Formulated as a special vector linear code over GF($2$), an $L$-dimensional circular-shift linear code of degree $δ$ restricts its local encoding kernels to be the summation of at most $δ$ cyclic permutation matrices of size $L$. We show that on a general network, for a certain block length $L$, every scalar linear solution over GF($2^{L-1}$) can induce an $L$-dimensional circular-shift linear solution with 1-bit redundancy per-edge transmission. Consequently, specific to a multicast network, such a circular-shift linear solution of an arbitrary degree $δ$ can be efficiently constructed, which has an interesting complexity tradeoff between encoding and decoding with different choices of $δ$. By further proving that circular-shift LNC is insufficient to achieve the exact capacity of certain multicast networks, we show the optimality of the efficiently constructed circular-shift linear solution in the sense that its 1-bit redundancy is inevitable. Finally, both theoretical and numerical analysis imply that with increasing $L$, a randomly constructed circular-shift linear code has linear solvability behavior comparable to a randomly constructed permutation-based linear code, but has shorter overheads.
Submitted 25 April, 2018; v1 submitted 7 July, 2017;
originally announced July 2017.
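The abstract above describes encoding that uses only circular shifts and bit-wise XOR, with each local encoding kernel being a sum of at most $δ$ cyclic permutation matrices. A minimal sketch of applying one such kernel to an $L$-bit data unit might look as follows. The function names and the right-shift convention are assumptions made for illustration; this is not the authors' construction of a linear solution, only the elementary operation it builds on.

```python
def circular_shift(bits, s):
    """Rotate a list of bits to the right by s positions (assumed convention)."""
    L = len(bits)
    s %= L
    return bits[-s:] + bits[:-s] if s else bits[:]


def apply_kernel(bits, shifts):
    """Apply a kernel given as a set of shift amounts.

    Over GF(2), multiplying by a sum of cyclic permutation matrices is the
    XOR of the correspondingly shifted copies of the input vector. The
    number of distinct shifts plays the role of the degree.
    """
    out = [0] * len(bits)
    for s in set(shifts):
        shifted = circular_shift(bits, s)
        out = [a ^ b for a, b in zip(out, shifted)]
    return out


def encode_edge(incoming, kernels):
    """XOR together the kernel outputs of all incoming data units."""
    out = [0] * len(incoming[0])
    for bits, shifts in zip(incoming, kernels):
        out = [a ^ b for a, b in zip(out, apply_kernel(bits, shifts))]
    return out


if __name__ == "__main__":
    unit = [1, 0, 0, 0, 0]               # L = 5
    print(apply_kernel(unit, [0, 2]))    # degree-2 kernel: identity plus shift by 2
```

Because each kernel needs only shifts and XORs, no finite-field multiplication is required at intermediate nodes, which is the source of the low encoding complexity the abstract refers to.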
-
Network Slicing Based 5G and Future Mobile Networks: Mobility, Resource Management, and Challenges
Authors:
H. Zhang,
N. Liu,
X. Chu,
K. Long,
A. Aghvami,
V. C. M. Leung
Abstract:
The fifth-generation (5G) networks are expected to be able to satisfy users' different quality-of-service (QoS) requirements. Network slicing is a promising technology for 5G networks to provide services tailored for users' specific QoS demands. Driven by the increased massive wireless data traffic from different application scenarios, efficient resource allocation schemes should be exploited to improve the flexibility of network resource allocation and capacity of 5G networks based on network slicing. Due to the diversity of 5G application scenarios, new mobility management schemes are greatly needed to guarantee seamless handover in network slicing based 5G systems. In this article, we introduce a logical architecture for network slicing based 5G systems, and present a scheme for managing mobility between different access networks, as well as a joint power and subchannel allocation scheme in spectrum-sharing two-tier systems based on network slicing, where both the co-tier interference and cross-tier interference are taken into account. Simulation results demonstrate that the proposed resource allocation scheme can flexibly allocate network resources between different slices in 5G systems. Finally, several open issues and challenges in network slicing based 5G networks are discussed, including network reconstruction, network slicing management and cooperation with other 5G technologies.
Submitted 24 April, 2017;
originally announced April 2017.
-
Energy Efficient User Association and Power Allocation in Millimeter Wave Based Ultra Dense Networks with Energy Harvesting Base Stations
Authors:
H. Zhang,
S. Huang,
C. Jiang,
K. Long,
V. C. M. Leung,
H. Vincent Poor
Abstract:
Millimeter wave (mmWave) communication technologies have recently emerged as an attractive solution to meet the exponentially increasing demand on mobile data traffic. Moreover, ultra dense networks (UDNs) combined with mmWave technology are expected to increase both energy efficiency and spectral efficiency. In this paper, user association and power allocation in mmWave based UDNs is considered with attention to load balance constraints, energy harvesting by base stations, user quality of service requirements, energy efficiency, and cross-tier interference limits. The joint user association and power optimization problem is modeled as a mixed-integer programming problem, which is then transformed into a convex optimization problem by relaxing the user association indicator and solved by Lagrangian dual decomposition. An iterative gradient user association and power allocation algorithm is proposed and shown to converge rapidly to an optimal point. The complexity of the proposed algorithm is analyzed and the effectiveness of the proposed scheme compared with existing methods is verified by simulations.
Submitted 24 April, 2017;
originally announced April 2017.
-
On Vector Linear Solvability of Multicast Networks
Authors:
Qifu Tyler Sun,
Xiaolong Yang,
Keping Long,
Xunrui Yin,
Zongpeng Li
Abstract:
Vector linear network coding (LNC) is a generalization of the conventional scalar LNC, such that the data unit transmitted on every edge is an $L$-dimensional vector of data symbols over a base field GF($q$). Vector LNC enriches the choices of coding operations at intermediate nodes, and there is a popular conjecture on the benefit of vector LNC over scalar LNC in terms of alphabet size of data units: there exist (single-source) multicast networks that are vector linearly solvable of dimension $L$ over GF($q$) but not scalar linearly solvable over any field of size $q' \leq q^L$. This paper introduces a systematic way to construct such multicast networks, and subsequently establish explicit instances to affirm the positive answer of this conjecture for \emph{infinitely many} alphabet sizes $p^L$ with respect to an \emph{arbitrary} prime $p$. On the other hand, this paper also presents explicit instances with the special property that they do not have a vector linear solution of dimension $L$ over GF(2) but have scalar linear solutions over GF($q'$) for some $q' < 2^L$, where $q'$ can be odd or even. This discovery also unveils that over a given base field, a multicast network that has a vector linear solution of dimension $L$ does not necessarily have a vector linear solution of dimension $L' > L$.
Submitted 9 May, 2016;
originally announced May 2016.
-
Cognitive Internet of Things: A New Paradigm beyond Connection
Authors:
Qihui Wu,
Guoru Ding,
Yuhua Xu,
Shuo Feng,
Zhiyong Du,
Jinlong Wang,
Keping Long
Abstract:
Current research on Internet of Things (IoT) mainly focuses on how to enable general objects to see, hear, and smell the physical world for themselves, and make them connected to share the observations. In this paper, we argue that only connected is not enough, beyond that, general objects should have the capability to learn, think, and understand both physical and social worlds by themselves. This practical need impels us to develop a new paradigm, named Cognitive Internet of Things (CIoT), to empower the current IoT with a `brain' for high-level intelligence. Specifically, we first present a comprehensive definition for CIoT, primarily inspired by the effectiveness of human cognition. Then, we propose an operational framework of CIoT, which mainly characterizes the interactions among five fundamental cognitive tasks: perception-action cycle, massive data analytics, semantic derivation and knowledge discovery, intelligent decision-making, and on-demand service provisioning. Furthermore, we provide a systematic tutorial on key enabling techniques involved in the cognitive tasks. In addition, we also discuss the design of proper performance metrics on evaluating the enabling techniques. Last but not least, we present the research challenges and open issues ahead. Building on the present work and potentially fruitful future studies, CIoT has the capability to bridge the physical world (with objects, resources, etc.) and the social world (with human demand, social behavior, etc.), and enhance smart resource allocation, automatic network operation, and intelligent service provisioning.
Submitted 11 March, 2014;
originally announced March 2014.
-
Multicast Network Coding and Field Sizes
Authors:
Qifu Sun,
Xunrui Yin,
Zongpeng Li,
Keping Long
Abstract:
In an acyclic multicast network, it is well known that a linear network coding solution over GF($q$) exists when $q$ is sufficiently large. In particular, for each prime power $q$ no smaller than the number of receivers, a linear solution over GF($q$) can be efficiently constructed. In this work, we reveal that a linear solution over a given finite field does \emph{not} necessarily imply the existence of a linear solution over all larger finite fields. Specifically, we prove by construction that: (i) For every source dimension no smaller than 3, there is a multicast network linearly solvable over GF(7) but not over GF(8), and another multicast network linearly solvable over GF(16) but not over GF(17); (ii) There is a multicast network linearly solvable over GF(5) but not over such GF($q$) that $q > 5$ is a Mersenne prime plus 1, which can be extremely large; (iii) A multicast network linearly solvable over GF($q^{m_1}$) and over GF($q^{m_2}$) is \emph{not} necessarily linearly solvable over GF($q^{m_1+m_2}$); (iv) There exists a class of multicast networks with a set $T$ of receivers such that the minimum field size $q_{min}$ for a linear solution over GF($q_{min}$) is lower bounded by $Θ(\sqrt{|T|})$, but not every larger field than GF($q_{min}$) suffices to yield a linear solution. The insight brought from this work is that not only the field size, but also the order of subgroups in the multiplicative group of a finite field affects the linear solvability of a multicast network.
Submitted 13 February, 2015; v1 submitted 14 January, 2014;
originally announced January 2014.