Search | arXiv e-print repository

DDM-NET: End-to-end learning of keypoint feature Detection, Description and Matching for 3D localization

Authors: Xiangyu Xu, Li Guan, Enrique Dunn, Haoxiang Li, Gang Hua

Abstract: In this paper, we propose an end-to-end framework that jointly learns keypoint detection, descriptor representation and cross-frame matching for the task of image-based 3D localization. Prior art has tackled each of these components individually, purportedly aiming to alleviate difficulties in effectively train a holistic network. We design a self-supervised image warping correspondence loss for both feature detection and matching, a weakly-supervised epipolar constraints loss on relative camera pose learning, and a directional matching scheme that detects key-point features in a source image and performs coarse-to-fine correspondence search on the target image. We leverage this framework to enforce cycle consistency in our matching module. In addition, we propose a new loss to robustly handle both definite inlier/outlier matches and less-certain matches. The integration of these learning mechanisms enables end-to-end training of a single network performing all three localization components. Bench-marking our approach on public data-sets, exemplifies how such an end-to-end framework is able to yield more accurate localization that out-performs both traditional methods as well as state-of-the-art weakly supervised methods.

Submitted 1 February, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

arXiv:2212.03110 [pdf, other]

High Rate Studies of the ATLAS sTGC Detector and Optimization of the Filter Circuit on the Input of the Front-End Amplifier

Authors: Siyuan Sun, Luca Moleri, Gerardo Vasquez, Peter Teterin, Sabrina Corsetti, Liang Guan, Benoit Lefebvre, Enrique Kajomovitz, Lorne Levinson, Nachman Lupu, Rob McPherson, Alexander Vdovin, Rongkun Wang, Bing Zhou, Junjie Zhu

Abstract: The Large Hadron Collider (LHC) at CERN is expected to be upgraded to the High-Luminosity LHC (HL-LHC) by 2029 and achieve instantaneous luminosity around 5 - 7.5 $\times$ 10$^{34}$cm$^{-2}$ s$^{-1}$. This represents a more than 3-4 fold increase in the instantaneous luminosity compared to what has been achieved in Run 2. The New Small Wheel (NSW) upgrade is designed to be able to operate efficiently in this high background rate environment. In this article, we summarize multiple performance studies of the small-strip Thin Gap Chamber (sTGC) at high rate using nearly final front-end electronics. We demonstrate that the efficiency versus rate distribution can be well described by an exponential decay with electronics dead-time being the primary cause of loss of efficiency at high rate. We then demonstrate several methods that can decrease the electronics dead-time and therefore minimize efficiency loss. One such method is to install either a pi-network input filter or pull-up resistor to minimize the charge input into the amplifier. We optimized the pi-network capacitance and pull-up resistor resistance using the results from our measurements. The results shown here were not only critical to finalizing the components on the front-end board, but also are critical for setting the optimal operating parameters of the sTGC detector and electronics in the ATLAS cavern.

Submitted 17 April, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

Comments: to be submitted

arXiv:2211.12699 [pdf, other]

doi 10.1016/j.physletb.2023.137698

Search for an axion-like particle in radiative $J/ψ$ decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, V. Batozskaya, D. Becker, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (576 additional authors not shown)

Abstract: We search for an axion-like particle (ALP) $a$ through the process $ψ(3686)\rightarrowπ^+π^-J/ψ$, $J/ψ\rightarrowγa$, $a\rightarrowγγ$ in a data sample of $(2.71\pm0.01)\times10^9$ $ψ(3686)$ events collected by the BESIII detector. No significant ALP signal is observed over the expected background, and the upper limits on the branching fraction of the decay $J/ψ\rightarrowγa$ and the ALP-photon coupling constant $g_{aγγ}$ are set at 95% confidence level in the mass range of $0.165\leq m_a\leq2.84\,\mbox{GeV}/c^2$. The limits on $B(J/ψ\rightarrowγa)$ range from $8.3\times10^{-8}$ to $1.8\times10^{-6}$ over the search region, and the constraints on the ALP-photon coupling are the most stringent to date for $0.165\leq m_a\leq1.468\,\mbox{GeV}/c^2$.

Submitted 30 March, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

Comments: 10 pages, 6 figures

Journal ref: Physics Letters B 838 (2023) 137698

arXiv:2211.10755 [pdf, ps, other]

doi 10.1103/PhysRevD.107.112001

Measurement of $e^+e^-\rightarrowΛ\barΛη$ from 3.5106 to 4.6988 GeV and study of $Λ\barΛ$ mass threshold enhancement

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (587 additional authors not shown)

Abstract: Using data samples with a total integrated luminosity of approximately 18 fb$^{-1}$ collected by the BESIII detector operating at the BEPCII, the process $e^+e^-\rightarrowΛ\barΛ η$ is studied at center-of-mass energies between 3.5106 and 4.6988 GeV. The Born cross section for the process $e^+e^-\rightarrowΛ\barΛη$ is measured. No significant structure is observed in the Born cross section line shape. An enhancement near the $Λ\barΛ$ mass threshold is observed for the first time in the process. The structure can be described by an $S$-wave Breit-Wigner function. Neglecting contribution of excited $Λ$ states and potential interferences, the mass and width are determined to be ($2356\pm 7\pm17$) MeV/$c^2$ and ($304\pm28\pm54$) MeV, respectively, where the first uncertainties are statistical and the second are systematic.

Submitted 22 November, 2022; v1 submitted 19 November, 2022; originally announced November 2022.

Journal ref: Phys. Rev. D 107, 112001(2023)

arXiv:2211.02541 [pdf]

Generation of Chinese classical poetry based on pre-trained model

Authors: Ziyao Wang, Lu** Guan, Guanyu Liu

Abstract: In order to test whether artificial intelligence can create qualified classical poetry like humans, the author proposes a study of Chinese classical poetry generation based on a pre-trained model. This paper mainly tries to use BART and other pre training models, proposes FS2TEXT and RR2TEXT to generate metrical poetry text and even specific style poetry text, and solves the problem that the user's writing intention gradually reduces the relevance of the generated poetry text. In order to test the model's results, the authors selected ancient poets, by combining it with BART's poetic model work, developed a set of AI poetry Turing problems, it was reviewed by a group of poets and poetry writing researchers. There were more than 600 participants, and the final results showed that, high-level poetry lovers can't distinguish between AI activity and human activity, this indicates that the author's working methods are not significantly different from human activities. The model of poetry generation studied by the author generalizes works that cannot be distinguished from those of advanced scholars. The number of modern Chinese poets has reached 5 million. However, many modern Chinese poets lack language ability and skills as a result of their childhood learning. However, many modern poets have no creative inspiration, and the author's model can help them. They can look at this model when they choose words and phrases and they can write works based on the poems they already have, and they can write their own poems.
The importance of poetry lies in the author's thoughts and reflections. It doesn't matter how good AI poetry is. The only thing that matters is for people to see and inspire them.

Submitted 4 November, 2022; originally announced November 2022.

Comments: 8 pages, 2 figures

ACM Class: J.5; I.2.7

arXiv:2210.16007 [pdf, ps, other]

Design of Protograph LDPC-Coded MIMO-VLC Systems with Generalized Spatial Modulation

Authors: Lin Dai, Yi Fang, Yong Liang Guan, Mohsen Guizani

Abstract: This paper investigates the bit-interleaved coded generalized spatial modulation (BICGSM) with iterative decoding (BICGSM-ID) for multiple-input multiple-output (MIMO) visible light communications (VLC). In the BICGSM-ID scheme, the information bits conveyed by the signal-domain (SiD) symbols and the spatial-domain (SpD) light emitting diode (LED)-index patterns are coded by a protograph low-density parity-check (P-LDPC) code. Specifically, we propose a signal-domain symbol expanding and re-allocating (SSER) method for constructing a type of novel generalized spatial modulation (GSM) constellations, referred to as SSERGSM constellations, so as to boost the performance of the BICGSM-ID MIMO-VLC systems. Moreover, by applying a modified PEXIT (MPEXIT) algorithm, we further design a family of rate-compatible P-LDPC codes, referred to as enhanced accumulate-repeat-accumulate (EARA) codes, which possess both excellent decoding thresholds and linear-minimum-distance-growth property. Both analysis and simulation results illustrate that the proposed SSERGSM constellations and P-LDPC codes can remarkably improve the convergence and decoding performance of MIMO-VLC systems. Therefore, the proposed P-LDPC-coded SSERGSM-mapped BICGSM-ID configuration is envisioned as a promising transmission solution to satisfy the high-throughput requirement of MIMO-VLC applications.

Submitted 28 October, 2022; originally announced October 2022.

arXiv:2210.15906 [pdf, other]

Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences

Authors: Lin Guan, Karthik Valmeekam, Subbarao Kambhampati

Abstract: Generating complex behaviors that satisfy the preferences of non-expert users is a crucial requirement for AI agents. Interactive reward learning from trajectory comparisons (a.k.a. RLHF) is one way to allow non-expert users to convey complex objectives by expressing preferences over short clips of agent behaviors. Even though this parametric method can encode complex tacit knowledge present in the underlying tasks, it implicitly assumes that the human is unable to provide richer feedback than binary preference labels, leading to intolerably high feedback complexity and poor user experience. While providing a detailed symbolic closed-form specification of the objectives might be tempting, it is not always feasible even for an expert user. However, in most cases, humans are aware of how the agent should change its behavior along meaningful axes to fulfill their underlying purpose, even if they are not able to fully specify task objectives symbolically. Using this as motivation, we introduce the notion of Relative Behavioral Attributes, which allows the users to tweak the agent behavior through symbolic concepts (e.g., increasing the softness or speed of agents' movement). We propose two practical methods that can learn to model any kind of behavioral attributes from ordered behavior clips. We demonstrate the effectiveness of our methods on four tasks with nine different behavioral attributes, showing that once the attributes are learned, end users can produce desirable agent behaviors relatively effortlessly, by providing feedback just around ten times.
This is over an order of magnitude less than that required by the popular learning-from-human-preferences baselines. The supplementary video and source code are available at: https://guansuns.github.io/pages/rba.

Submitted 27 February, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

Comments: ICLR 2023 Camera Ready

arXiv:2210.15096 [pdf, other]

Towards customizable reinforcement learning agents: Enabling preference specification through online vocabulary expansion

Authors: Utkarsh Soni, Nupur Thakur, Sarath Sreedharan, Lin Guan, Mudit Verma, Matthew Marquez, Subbarao Kambhampati

Abstract: There is a growing interest in developing automated agents that can work alongside humans. In addition to completing the assigned task, such an agent will undoubtedly be expected to behave in a manner that is preferred by the human. This requires the human to communicate their preferences to the agent. To achieve this, the current approaches either require the users to specify the reward function or the preference is interactively learned from queries that ask the user to compare behavior. The former approach can be challenging if the internal representation used by the agent is inscrutable to the human while the latter is unnecessarily cumbersome for the user if their preference can be specified more easily in symbolic terms. In this work, we propose PRESCA (PREference Specification through Concept Acquisition), a system that allows users to specify their preferences in terms of concepts that they understand. PRESCA maintains a set of such concepts in a shared vocabulary. If the relevant concept is not in the shared vocabulary, then it is learned. To make learning a new concept more feedback efficient, PRESCA leverages causal associations between the target concept and concepts that are already known. In addition, we use a novel data augmentation approach to further reduce required feedback. We evaluate PRESCA by using it on a Minecraft environment and show that it can effectively align the agent with the user's preference.

Submitted 31 January, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

arXiv:2210.06988 [pdf, ps, other]

doi 10.1103/PhysRevD.107.072003

Measurement of $e^+ e^- \rightarrow φη^{\prime}$ cross sections at center-of-mass energies between 3.508 and 4.600 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (579 additional authors not shown)

Abstract: We present a measurement of the dressed cross sections for $e^+ e^- \rightarrow φη^{\prime}$ at different center-of-mass energies between 3.508 and 4.600 GeV based on 15.1 fb$^{-1}$ of $e^+ e^-$ annihilation data collected with the BESIII detector operating at the BEPCII collider. In addition, a search for the decay $Y(4230) \to φη^{\prime}$ is performed. No clear signal is observed and the corresponding upper limit is provided.

Submitted 25 April, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

Journal ref: Phys. Rev. D 107, 7, 072003 (2023)

arXiv:2209.13027 [pdf, other]

A GPU-accelerated Algorithm for Distinct Discriminant Canonical Correlation Network

Authors: Kai Liu, Lei Gao, Ling Guan

Abstract: Currently, deep neural networks (DNNs)-based models have drawn enormous attention and have been utilized to different domains widely. However, due to the data-driven nature, the DNN models may generate unsatisfying performance on the small scale data sets. To address this problem, a distinct discriminant canonical correlation network (DDCCANet) is proposed to generate the deep-level feature representation, producing improved performance on image classification. However, the DDCCANet model was originally implemented on a CPU with computing time on par with state-of-the-art DNN models running on GPUs. In this paper, a GPU-based accelerated algorithm is proposed to further optimize the DDCCANet algorithm. As a result, not only is the performance of DDCCANet guaranteed, but also greatly shortens the calculation time, making the model more applicable in real tasks. To demonstrate the effectiveness of the proposed accelerated algorithm, we conduct experiments on three database with different scales. Experimental results validate the superiority of the proposed accelerated algorithm on given examples.

Submitted 26 September, 2022; originally announced September 2022.

arXiv:2209.11171 [pdf, other]

A Cooperative Deception Strategy for Covert Communication in Presence of a Multi-antenna Adversary

Authors: Jiangbo Si, Zizhen Liu, Zan Li, Hang Hu, Lei Guan, Chao Wang, Naofal Al-Dhahir

Abstract: Covert transmission is investigated for a cooperative deception strategy, where a cooperative jammer (Jammer) tries to attract a multi-antenna adversary (Willie) and degrade the adversary's reception ability for the signal from a transmitter (Alice). For this strategy, we formulate an optimization problem to maximize the covert rate when three different types of channel state information (CSI) are available. The total power is optimally allocated between Alice and Jammer subject to Kullback-Leibler (KL) divergence constraint. Different from the existing literature, in our proposed strategy, we also determine the optimal transmission power at the jammer when Alice is silent, while existing works always assume that the jammer's power is fixed. Specifically, we apply the S-procedure to convert infinite constraints into linear-matrix-inequalities (LMI) constraints. When statistical CSI at Willie is available, we convert double integration to single integration using asymptotic approximation and substitution method. In addition, the transmission strategy without jammer deception is studied as a benchmark. Finally, our simulation results show that for the proposed strategy, the covert rate is increased with the number of antennas at Willie. Moreover, compared to the benchmark, our proposed strategy is more robust in face of imperfect CSI.

Submitted 7 September, 2022; originally announced September 2022.

Comments: 33 pages, 8 Figures

MSC Class: 14J60 (Primary) 14F05; 14J26 (Secondary) ACM Class: F.2.2; I.2.7

arXiv:2208.07833 [pdf, other]

What Your Firmware Tells You Is Not How You Should Emulate It: A Specification-Guided Approach for Firmware Emulation (Extended Version)

Authors: Wei Zhou, Lan Zhang, Le Guan, Peng Liu, Yuqing Zhang

Abstract: Emulating firmware of microcontrollers is challenging due to the lack of peripheral models. Existing work finds out how to respond to peripheral read operations by analyzing the target firmware. This is problematic because the firmware sometimes does not contain enough clues to support the emulation or even contains misleading information (e.g. buggy firmware). In this work, we propose a new approach that builds peripheral models from the peripheral specification. Using NLP, we translate peripheral behaviors in human language (documented in chip manuals) into a set of structured condition-action rules. By checking, executing, and chaining them at runtime, we can dynamically synthesize a peripheral model for each firmware execution. The extracted condition-action rules might not be complete or even be wrong. We, therefore, propose incorporating symbolic execution to quickly pinpoint the root cause. This assists us in the manual correction of the problematic rules. We have implemented our idea for five popular MCU boards spanning three different chip vendors. Using a new edit-distance-based algorithm to calculate trace differences, our evaluation against a large firmware corpus confirmed that our prototype achieves much higher fidelity compared with state-of-the-art solutions. Benefiting from the accurate emulation, our emulator effectively avoids false positives observed in existing fuzzing work. We also designed a new dynamic analysis method to perform driver code compliance checks against the specification.
We found some non-compliance which we later confirmed to be bugs caused by race conditions.

Submitted 11 October, 2022; v1 submitted 16 August, 2022; originally announced August 2022.

Comments: Wei Zhou and Lan Zhang contributed equally to this work

arXiv:2206.12134 [pdf, other]

Capacity Optimal Coded Generalized MU-MIMO

Authors: Yuhao Chi, Lei Liu, Guanghui Song, Ying Li, Yong Liang Guan, Chau Yuen

Abstract: With the complication of future communication scenarios, most conventional signal processing technologies of multi-user multiple-input multiple-output (MU-MIMO) become unreliable, which are designed based on ideal assumptions, such as Gaussian signaling and independent identically distributed (IID) channel matrices. As a result, this paper considers a generalized MU-MIMO (GMU-MIMO) system with more general assumptions, i.e., arbitrarily fixed input distributions, and general unitarily-invariant channel matrices. However, there is still no accurate capacity analysis and capacity optimal transceiver with practical complexity for GMU-MIMO under the constraint of coding. To address these issues, inspired by the replica method, the constrained sum capacity of coded GMU-MIMO with fixed input distribution is calculated by using the celebrated mutual information and minimum mean-square error (MMSE) lemma and the MMSE optimality of orthogonal/vector approximate message passing (OAMP/VAMP). Then, a capacity optimal multiuser OAMP/VAMP receiver is proposed, whose achievable rate is proved to be equal to the constrained sum capacity. Moreover, a design principle of multi-user codes is presented for the multiuser OAMP/VAMP, based on which a kind of practical multi-user low-density parity-check (MU-LDPC) code is designed. Numerical results show that finite-length performances of the proposed MU-LDPC codes with multi-user OAMP/VAMP are about 2 dB away from the constrained sum capacity and outperform those of the existing state-of-art methods.

Submitted 24 June, 2022; originally announced June 2022.

Comments: Accepted by the 2022 IEEE International Symposium on Information Theory (ISIT). arXiv admin note: substantial text overlap with arXiv:2111.11061

arXiv:2205.10791 [pdf, other]

Federated Spectrum Learning for Reconfigurable Intelligent Surfaces-Aided Wireless Edge Networks

Authors: Bo Yang, Xuelin Cao, Chongwen Huang, Chau Yuen, Marco Di Renzo, Yong Liang Guan, Dusit Niyato, Lijun Qian, Merouane Debbah

Abstract: Increasing concerns on intelligent spectrum sensing call for efficient training and inference technologies. In this paper, we propose a novel federated learning (FL) framework, dubbed federated spectrum learning (FSL), which exploits the benefits of reconfigurable intelligent surfaces (RISs) and overcomes the unfavorable impact of deep fading channels. Distinguishingly, we endow conventional RISs with spectrum learning capabilities by leveraging a fully-trained convolutional neural network (CNN) model at each RIS controller, thereby helping the base station to cooperatively infer the users who request to participate in FL at the beginning of each training iteration. To fully exploit the potential of FL and RISs, we address three technical challenges: RISs phase shifts configuration, user-RIS association, and wireless bandwidth allocation. The resulting joint learning, wireless resource allocation, and user-RIS association design is formulated as an optimization problem whose objective is to maximize the system utility while considering the impact of FL prediction accuracy. In this context, the accuracy of FL prediction interplays with the performance of resource optimization. In particular, if the accuracy of the trained CNN model deteriorates, the performance of resource allocation worsens.
The proposed FSL framework is tested by using real radio frequency (RF) traces and numerical results demonstrate its advantages in terms of spectrum prediction accuracy and system utility: a better CNN prediction accuracy and FL system utility can be achieved with a larger number of RISs and reflecting elements.

Submitted 22 May, 2022; originally announced May 2022.

arXiv:2202.08877 [pdf, other]

Frequency dependence of near-surface oceanic kinetic energy from drifter observations and global high-resolution models

Authors: Brian K. Arbic, Shane Elipot, Jonathan M. Brasch, Dimitris Menemenlis, Aurelien L. Ponte, Jay F. Shriver, Xiaolong Yu, Edward D. Zaron, Matthew H. Alford, Maarten C. Buijsman, Ryan Abernathey, Daniel Garcia, Lingxiao Guan, Paige E. Martin, Arin D. Nelson

Abstract: The geographical variability, frequency content, and vertical structure of near-surface oceanic kinetic energy (KE) are important for air-sea interaction, marine ecosystems, operational oceanography, pollutant tracking, and interpreting remotely sensed velocity measurements. Here, KE in high-resolution global simulations (HYbrid Coordinate Ocean Model; HYCOM, and Massachusetts Institute of Technology general circulation model; MITgcm), at the sea surface (0 m) and 15 m, are respectively compared with KE from undrogued and drogued surface drifters. Global maps and zonal averages are computed for low-frequency ($<$ 0.5 cpd), near-inertial, diurnal, and semi-diurnal bands. Both models exhibit low-frequency equatorial KE that is low relative to drifter values. HYCOM near-inertial KE is higher than in MITgcm, and closer to drifter values, probably due to more frequently updated atmospheric forcing. HYCOM semi-diurnal KE is lower than in MITgcm, and closer to drifter values, likely due to inclusion of a parameterized topographic internal wave drag. A concurrent tidal harmonic analysis in the diurnal band demonstrates that much of the diurnal flow is non-tidal. We compute a simple proxy of near-surface vertical structure, the ratio of 0 m KE to 0 m KE plus 15 m KE in model outputs, and undrogued KE to undrogued KE plus drogued KE in drifter observations. Over most latitudes and frequency bands, model ratios track the drifter ratios to within error bars. Values of this ratio demonstrate significant vertical structure in all frequency bands except the semidiurnal band. Latitudinal dependence in the ratio is greatest in diurnal and low-frequency bands.
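
As a rough illustration of the vertical-structure proxy described in this abstract, the ratio of 0 m KE to 0 m KE plus 15 m KE can be sketched in Python. The function name and the sample values are hypothetical and do not come from the paper; the same formula would apply to undrogued and drogued drifter KE.

```python
def surface_ke_ratio(ke_shallow, ke_deep):
    """Return ke_shallow / (ke_shallow + ke_deep).

    ke_shallow: KE at 0 m (models) or from undrogued drifters.
    ke_deep:    KE at 15 m (models) or from drogued drifters.
    """
    total = ke_shallow + ke_deep
    if total <= 0:
        raise ValueError("total kinetic energy must be positive")
    return ke_shallow / total


# Hypothetical values, in arbitrary but matching units.
print(surface_ke_ratio(2.0, 2.0))  # equal KE at both depths
print(surface_ke_ratio(3.0, 1.0))  # more KE at the surface
```

Under this definition a value of 0.5 means equal KE at both depths, and values above 0.5 mean more KE at the surface than at 15 m.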

Submitted 22 July, 2022; v1 submitted 17 February, 2022; originally announced February 2022.

Comments: revised for AGU JGR: Oceans

arXiv:2202.03013 [pdf, other]

doi 10.1145/3510003.3510208

$μ$AFL: Non-intrusive Feedback-driven Fuzzing for Microcontroller Firmware

Authors: Wenqiang Li, Jiameng Shi, Fengjun Li, Jingqiang Lin, Wei Wang, Le Guan

Abstract: Fuzzing is one of the most effective approaches to finding software flaws. However, applying it to microcontroller firmware incurs many challenges. For example, rehosting-based solutions cannot accurately model peripheral behaviors and thus cannot be used to fuzz the corresponding driver code. In this work, we present $μ$AFL, a hardware-in-the-loop approach to fuzzing microcontroller firmware. It leverages debugging tools in existing embedded system development to construct an AFL-compatible fuzzing framework. Specifically, we use the debug dongle to bridge the fuzzing environment on the PC and the target firmware on the microcontroller device. To collect code coverage information without costly code instrumentation, $μ$AFL relies on the ARM ETM hardware debugging feature, which transparently collects the instruction trace and streams the results to the PC. However, the raw ETM data is obscure and needs enormous computing resources to recover the actual instruction flow. We therefore propose an alternative representation of code coverage, which retains the same path sensitivity as the original AFL algorithm, but can directly work on the raw ETM data without matching them with disassembled instructions. To further reduce the workload, we use the DWT hardware feature to selectively collect runtime information of interest. We evaluated $μ$AFL on two real evaluation boards from two major vendors: NXP and STMicroelectronics. With our prototype, we discovered ten zero-day bugs in the driver code shipped with the SDK of STMicroelectronics and three zero-day bugs in the SDK of NXP. Eight CVEs have been allocated for them. Considering the wide adoption of vendor SDKs in real products, our results are alarming.

Submitted 19 April, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

Comments: 44th International Conference on Software Engineering (ICSE 2022)

arXiv:2202.02886 [pdf, other]

Leveraging Approximate Symbolic Models for Reinforcement Learning via Skill Diversity

Authors: Lin Guan, Sarath Sreedharan, Subbarao Kambhampati

Abstract: Creating reinforcement learning (RL) agents that are capable of accepting and leveraging task-specific knowledge from humans has been long identified as a possible strategy for developing scalable approaches for solving long-horizon problems. While previous works have looked at the possibility of using symbolic models along with RL approaches, they tend to assume that the high-level action models are executable at low level and the fluents can exclusively characterize all desirable MDP states. Symbolic models of real world tasks are however often incomplete. To this end, we introduce Approximate Symbolic-Model Guided Reinforcement Learning, wherein we will formalize the relationship between the symbolic model and the underlying MDP that will allow us to characterize the incompleteness of the symbolic model. We will use these models to extract high-level landmarks that will be used to decompose the task. At the low level, we learn a set of diverse policies for each possible task subgoal identified by the landmark, which are then stitched together. We evaluate our system by testing on three different benchmark domains and show how even with incomplete symbolic model information, our approach is able to discover the task structure and efficiently guide the RL agent towards the goal.

Submitted 17 June, 2022; v1 submitted 6 February, 2022; originally announced February 2022.

arXiv:2201.07021 [pdf, other]

MuSCLe: A Multi-Strategy Contrastive Learning Framework for Weakly Supervised Semantic Segmentation

Authors: Kunhao Yuan, Gerald Schaefer, Yu-Kun Lai, Yifan Wang, Xiyao Liu, Lin Guan, Hui Fang

Abstract: Weakly supervised semantic segmentation (WSSS) has gained significant popularity since it relies only on weak labels such as image level annotations rather than pixel level annotations required by supervised semantic segmentation (SSS) methods. Despite drastically reduced annotation costs, typical feature representations learned from WSSS are only representative of some salient parts of objects and less reliable compared to SSS due to the weak guidance during training. In this paper, we propose a novel Multi-Strategy Contrastive Learning (MuSCLe) framework to obtain enhanced feature representations and improve WSSS performance by exploiting similarity and dissimilarity of contrastive sample pairs at image, region, pixel and object boundary levels. Extensive experiments demonstrate the effectiveness of our method and show that MuSCLe outperforms the current state-of-the-art on the widely used PASCAL VOC 2012 dataset.

Submitted 18 January, 2022; originally announced January 2022.

arXiv:2201.05289 [pdf, other]

$\ell_1$-norm constrained multi-block sparse canonical correlation analysis via proximal gradient descent

Authors: Leying Guan

Abstract: Multi-block CCA constructs linear relationships explaining coherent variations across multiple blocks of data. We view the multi-block CCA problem as finding leading generalized eigenvectors and propose to solve it via a proximal gradient descent algorithm with $\ell_1$ constraint for high dimensional data. In particular, we use a decaying sequence of constraints over proximal iterations, and show that the resulting estimate is rate-optimal under suitable assumptions. Although several previous works have demonstrated such optimality for the $\ell_0$ constrained problem using iterative approaches, the same level of theoretical understanding for the $\ell_1$ constrained formulation is still lacking. We also describe an easy-to-implement deflation procedure to estimate multiple eigenvectors sequentially. We compare our proposals to several existing methods whose implementations are available on R CRAN, and the proposed methods show competitive performances in both simulations and a real data example.

Submitted 13 January, 2022; originally announced January 2022.

Comments: Main paper: 21 pages; Supplements: 39 pages

arXiv:2201.00785 [pdf, other]

Implicit Autoencoder for Point-Cloud Self-Supervised Representation Learning

Authors: Siming Yan, Zhenpei Yang, Haoxiang Li, Chen Song, Li Guan, Hao Kang, Gang Hua, Qixing Huang

Abstract: This paper advocates the use of implicit surface representation in autoencoder-based self-supervised 3D representation learning. The most popular and accessible 3D representation, i.e., point clouds, involves discrete samples of the underlying continuous 3D surface. This discretization process introduces sampling variations on the 3D shape, making it challenging to develop transferable knowledge of the true 3D geometry. In the standard autoencoding paradigm, the encoder is compelled to encode not only the 3D geometry but also information on the specific discrete sampling of the 3D shape into the latent code. This is because the point cloud reconstructed by the decoder is considered unacceptable unless there is a perfect mapping between the original and the reconstructed point clouds. This paper introduces the Implicit AutoEncoder (IAE), a simple yet effective method that addresses the sampling variation issue by replacing the commonly-used point-cloud decoder with an implicit decoder. The implicit decoder reconstructs a continuous representation of the 3D shape, independent of the imperfections in the discrete samples. Extensive experiments demonstrate that the proposed IAE achieves state-of-the-art performance across various self-supervised learning benchmarks.

Submitted 27 August, 2023; v1 submitted 3 January, 2022; originally announced January 2022.

Comments: Published in ICCV 2023. The code is available at https://github.com/SimingYan/IAE

arXiv:2112.13219 [pdf, other]

Measurement of $e^{+}e^{-}\toφπ^{+}π^{-}$ cross sections at center-of-mass energies from 2.00 to 3.08 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (589 additional authors not shown)

Abstract: Using data corresponding to an integrated luminosity of $651~\mathrm{pb}^{-1}$ accumulated at 22 center-of-mass energies from 2.00 to 3.08 GeV by the BESIII experiment, the process $e^{+}e^{-}\toφπ^{+}π^{-}$ is studied. The cross sections for $e^{+}e^{-}\toφπ^{+}π^{-}$ are consistent with previous results, but with improved precision. To measure the mass and width of the structure observed in the cross section line shape, a combined fit is performed after enhancing the contribution from $φf_{0}(980)$. The fit reveals a structure with the mass of $M=2178\pm20\pm5~{\rm MeV}/c^2$ and the width of $\varGamma=140\pm36\pm16~{\rm MeV}$, where the first uncertainties are statistical and the second ones are systematic.

Submitted 17 August, 2023; v1 submitted 25 December, 2021; originally announced December 2021.

Comments: 11 pages, 5 figures

arXiv:2112.09417 [pdf, other]

Concatenated Code Design for Constrained DNA Data Storage with Asymmetric Errors

Authors: Yixin Wang, Li Deng, Md. Noor-A-Rahim, Erry Gunawan, Yong L. Guan, Zhi P. Shi, Chueh L. Poh

Abstract: DNA data storage has recently attracted much attention due to its durable preservation and extremely high information density (bits per gram) properties. In this work, we propose a hybrid coding strategy comprising generalized constrained codes to tackle the homopolymer (run-length) limit and a protograph based low-density parity-check (LDPC) code to correct asymmetric nucleotide level (i.e., A/T/C/G) substitution errors that may occur in the process of DNA sequencing. Two sequencing techniques, namely the Nanopore sequencer and the Illumina sequencer, are analyzed with their equivalent channel models and capacities. A coding architecture is proposed to potentially eliminate the catastrophic errors caused by the error-propagation in the constrained decoding while enabling high coding potential. We also show the log likelihood ratio (LLR) calculation method for the belief propagation decoding with this coding architecture. The simulation results and the theoretical analysis show that the proposed coding scheme exhibits good bit-error rate (BER) performance and high coding potential ($\sim1.98$ bits per nucleotide).
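
To make the homopolymer (run-length) limit mentioned in this abstract concrete, the following is a minimal Python sketch that measures the longest run of identical nucleotides in a strand and checks it against a limit. It is not the paper's constrained code, only a checker for the constraint such a code has to satisfy; the function names, the example strand, and the limit value are hypothetical.

```python
from itertools import groupby


def max_homopolymer_run(seq):
    """Length of the longest run of identical symbols in seq (0 if empty)."""
    return max((sum(1 for _ in group) for _, group in groupby(seq)), default=0)


def satisfies_run_limit(seq, limit):
    """True if no homopolymer run in seq is longer than limit."""
    return max_homopolymer_run(seq) <= limit


# Hypothetical strand: the longest run is "CCC".
strand = "AACCCGT"
print(max_homopolymer_run(strand))
print(satisfies_run_limit(strand, 3))
print(satisfies_run_limit(strand, 2))
```

A constrained encoder would only emit strands for which such a check passes; the limit itself depends on the sequencing technology and is not specified here.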

Submitted 17 December, 2021; originally announced December 2021.

arXiv:2112.08557 [pdf, ps, other]

Protograph Bit-Interleaved Coded Modulation: A Bandwidth-Efficient Design Paradigm for 6G Wireless Communications

Authors: Yi Fang, Pingping Chen, Yong Liang Guan, Francis C. M. Lau, Yonghui Li, Guanrong Chen

Abstract: Bit-interleaved coded modulation (BICM) has attracted considerable attention from the research community in the past three decades, because it can achieve desirable error performance with relatively low implementation complexity for a large number of communication and storage systems. By exploiting the iterative demapping and decoding (ID), the BICM is able to approach capacity limits of coded modulation over various channels. In recent years, protograph low-density parity-check (PLDPC) codes and their spatially-coupled (SC) variants have emerged to be a pragmatic forward-error-correction (FEC) solution for BICM systems due to their tremendous error-correction capability and simple structures, and found widespread applications such as deep-space communication, satellite communication, wireless communication, optical communication, and data storage. This article offers a comprehensive survey on the state-of-the-art development of PLDPC-BICM and its innovative SC variants over a variety of channel models, e.g., additive white Gaussian noise (AWGN) channels, fading channels, Poisson pulse position modulation (PPM) channels, and flash-memory channels. Of particular interest is code construction, constellation shaping, as well as bit-mapper design, where the receiver is formulated as a serially-concatenated decoding framework consisting of a soft-decision demapper and a belief-propagation decoder. Finally, several promising research directions are discussed, which have not been adequately addressed in the current literature.

Submitted 27 October, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

arXiv:2112.03487 [pdf, other]

Enhanced Exploration in Neural Feature Selection for Deep Click-Through Rate Prediction Models via Ensemble of Gating Layers

Authors: Lin Guan, Xia Xiao, Ming Chen, Youlong Cheng

Abstract: Feature selection has been an essential step in developing industry-scale deep Click-Through Rate (CTR) prediction systems. The goal of neural feature selection (NFS) is to choose a relatively small subset of features with the best explanatory power as a means to remove redundant features and reduce computational cost. Inspired by gradient-based neural architecture search (NAS) and network pruning methods, people have tackled the NFS problem with Gating approach that inserts a set of differentiable binary gates to drop less informative features. The binary gates are optimized along with the network parameters in an efficient end-to-end manner. In this paper, we analyze the gradient-based solution from an exploration-exploitation perspective and use empirical results to show that Gating approach might suffer from insufficient exploration. To improve the exploration capacity of gradient-based solutions, we propose a simple but effective ensemble learning approach, named Ensemble Gating. We choose two public datasets, namely Avazu and Criteo, to evaluate this approach. Our experiments show that, without adding any computational overhead or introducing any hyper-parameter (except the size of the ensemble), our method is able to consistently improve Gating approach and find a better subset of features on the two datasets with three different underlying deep CTR prediction models.

Submitted 6 December, 2021; originally announced December 2021.

arXiv:2112.02692 [pdf, other]

Faster Content Delivery using RSU Caching and Vehicular Pre-caching in Vehicular Networks

Authors: R. S. Pereira, L. Guan, M. Ye, Z. Zhang

Abstract: Most non-safety applications deployed in Vehicular Ad-hoc Network (VANET) use vehicle-to-infrastructure (V2I) and I2V communications to receive various forms of content such as periodic traffic updates, advertisements from adjacent road-side units (RSUs). In case of heavy traffic on highways and urban areas, content delivery time (CDT) can be significantly affected. Increase in CDT can be attributed to high load on the RSU or high volume of broadcasted content which can flood the network. Therefore, this paper suggests a novel caching strategy to improve CDT in high traffic areas and three major contributions have been made: (1) Design and simulation of a caching strategy to decrease the average content delivery time; (2) Evaluation and comparison of caching performance in both urban scenario and highway scenario; (3) Evaluation and comparison of caching performance in single RSU and multiple RSUs. The simulation results show that caching effectively reduces the CDT by 50% in urban scenario and 60-70% in highway scenario.

Submitted 5 December, 2021; originally announced December 2021.

Comments: 6 pages

arXiv:2111.11061 [pdf, other]

Capacity Optimal Generalized Multi-User MIMO: A Theoretical and Practical Framework

Authors: Yuhao Chi, Lei Liu, Guanghui Song, Ying Li, Yong Liang Guan, Chau Yuen

Abstract: Conventional multi-user multiple-input multiple-output (MU-MIMO) mainly focused on Gaussian signaling, independent and identically distributed (IID) channels, and a limited number of users. It will be laborious to cope with the heterogeneous requirements in next-generation wireless communications, such as various transmission data, complicated communication scenarios, and massive user access. Therefore, this paper studies a generalized MU-MIMO (GMU-MIMO) system with more practical constraints, i.e., non-Gaussian signaling, non-IID channel, and massive users and antennas. These generalized assumptions bring new challenges in theory and practice. For example, there is no accurate capacity analysis for GMU-MIMO. In addition, it is unclear how to achieve the capacity optimal performance with practical complexity. To address these challenges, a unified framework is proposed to derive the GMU-MIMO capacity and design a capacity optimal transceiver, which jointly considers encoding, modulation, detection, and decoding. Group asymmetry is developed to make a tradeoff between user rate allocation and implementation complexity. Specifically, the capacity region of group asymmetric GMU-MIMO is characterized by using the celebrated mutual information and minimum mean-square error (MMSE) lemma and the MMSE optimality of orthogonal approximate message passing (OAMP)/vector AMP (VAMP). Furthermore, a theoretically optimal multi-user OAMP/VAMP receiver and practical multi-user low-density parity-check (MU-LDPC) codes are proposed to achieve the capacity region of group asymmetric GMU-MIMO. Numerical results verify that the gaps between theoretical detection thresholds of the proposed framework with optimized MU-LDPC codes and QPSK modulation and the sum capacity of GMU-MIMO are about 0.2 dB. Moreover, their finite-length performances are about 1~2 dB away from the associated sum capacity.

Submitted 19 September, 2022; v1 submitted 22 November, 2021; originally announced November 2021.

arXiv:2110.14830 [pdf, ps, other]

ODMTCNet: An Interpretable Multi-view Deep Neural Network Architecture for Image Feature Representation

Authors: Lei Gao, Zheng Guo, Ling Guan

Abstract: This work proposes an interpretable multi-view deep neural network architecture, namely optimal discriminant multi-view tensor convolutional network (ODMTCNet), by integrating statistical machine learning (SML) principles with the deep neural network (DNN) architecture.

Submitted 27 October, 2021; originally announced October 2021.

Comments: Submitted to IEEE TPAMI

arXiv:2110.12469 [pdf]

doi 10.1016/j.nima.2022.166326

Design and testing of an sTGC ASIC interface board for the ATLAS New Small Wheel upgrade

Authors: Xu Wang, Liang Guan, Siyuan Sun, Bing Zhou, Junjie Zhu, Ge Jin

Abstract: The ATLAS experiment will replace the present Small Wheel (SW) detector with a New Small Wheel detector (NSW) aiming to improve the performance of muon triggering and precision tracking in the endcap region at the High-Luminosity LHC. Small-strip Thin Gap Chamber (sTGC) is one of the two new detector technologies used in this upgrade. A few custom-designed ASICs are needed for the sTGC detector. We designed an sTGC ASIC interface board to test ASIC-to-ASIC communication and validate the functionality of the entire system. A test platform with the final readout system is set up and the whole sTGC readout chain is demonstrated for the first time. Key parameters in the readout chain are discussed and the results are shown.

Submitted 24 October, 2021; originally announced October 2021.

Comments: 11 pages, 7 figures

arXiv:2110.06541 [pdf, ps, other]

Collaborative Radio SLAM for Multiple Robots based on WiFi Fingerprint Similarity

Authors: Ran Liu, Zhenghong Qin, Hua Zhang, Billy Pik Lik Lau, Khairuldanial Ismail, Achala Athukorala, Chau Yuen, Yong Liang Guan, U-Xuan Tan

Abstract: Simultaneous Localization and Mapping (SLAM) enables autonomous robots to navigate and execute their tasks through unknown environments. However, performing SLAM in large environments with a single robot is not efficient, and visual or LiDAR-based SLAM requires feature extraction and matching algorithms, which are computationally expensive. In this paper, we present a collaborative SLAM approach with multiple robots using the pervasive WiFi radio signals. A centralized solution is proposed to optimize the trajectory based on the odometry and radio fingerprints collected from multiple robots. To improve the localization accuracy, a novel similarity model is introduced that combines received signal strength (RSS) and detection likelihood of an access point (AP). We perform extensive experiments to demonstrate the effectiveness of the proposed similarity model and collaborative SLAM framework.

Submitted 19 October, 2021; v1 submitted 13 October, 2021; originally announced October 2021.

Comments: Accepted by 2021 IEEE International Conference on Robotics and Biomimetics, Sanya, China

arXiv:2110.05286 [pdf, other]

Learning from Ambiguous Demonstrations with Self-Explanation Guided Reinforcement Learning

Authors: Yantian Zha, Lin Guan, Subbarao Kambhampati

Abstract: Our work aims at efficiently leveraging ambiguous demonstrations for the training of a reinforcement learning (RL) agent. An ambiguous demonstration can usually be interpreted in multiple ways, which severely hinders the RL-Agent from learning stably and efficiently. Since an optimal demonstration may also suffer from being ambiguous, previous works that combine RL and learning from demonstration (RLfD works) may not work well. Inspired by how humans handle such situations, we propose to use self-explanation (an agent generates explanations for itself) to recognize valuable high-level relational features as an interpretation of why a successful trajectory is successful. This way, the agent can provide some guidance for its RL learning. Our main contribution is to propose the Self-Explanation for RL from Demonstrations (SERLfD) framework, which can overcome the limitations of traditional RLfD works. Our experimental results show that an RLfD model can be improved by using our SERLfD framework in terms of training stability and performance.

Submitted 7 February, 2024; v1 submitted 11 October, 2021; originally announced October 2021.

arXiv:2109.12209 [pdf, other]

Finding Taint-Style Vulnerabilities in Linux-based Embedded Firmware with SSE-based Alias Analysis

Authors: Kai Cheng, Tao Liu, Le Guan, Peng Liu, Hong Li, Hongsong Zhu, Limin Sun

Abstract: Although the importance of using static analysis to detect taint-style vulnerabilities in Linux-based embedded firmware is widely recognized, existing approaches are plagued by three major limitations. (a) Approaches based on symbolic execution may miss alias information and therefore suffer from a high false-negative rate. (b) Approaches based on VSA (value set analysis) often provide an over-approximate pointer range. As a result, many false positives could be produced. (c) Existing work for detecting taint-style vulnerability does not consider indirect call resolution, whereas indirect calls are frequently used in Internet-facing embedded devices. As a result, many false negatives could be produced. In this work, we propose a precise demand-driven flow-, context- and field-sensitive alias analysis approach. Based on this new approach, we also design a novel indirect call resolution scheme. Combined with sanitization rule checking, our solution discovers taint-style vulnerabilities by static taint analysis. We implemented our idea with a prototype called EmTaint and evaluated it against 35 real-world embedded firmware samples from six popular vendors. EmTaint discovered at least 192 bugs, including 41 n-day bugs and 151 0-day bugs. At least 115 CVE/PSV numbers have been allocated from a subset of the reported vulnerabilities at the time of writing. Compared to state-of-the-art tools such as KARONTE and SaTC, EmTaint found significantly more bugs on the same dataset in less time.

Submitted 24 September, 2021; originally announced September 2021.

Comments: 14 pages, 4 figures

arXiv:2109.09904 [pdf, other]

Symbols as a Lingua Franca for Bridging Human-AI Chasm for Explainable and Advisable AI Systems

Authors: Subbarao Kambhampati, Sarath Sreedharan, Mudit Verma, Yantian Zha, Lin Guan

Abstract: Despite the surprising power of many modern AI systems that often learn their own representations, there is significant discontent about their inscrutability and the attendant problems in their ability to interact with humans. While alternatives such as neuro-symbolic approaches have been proposed, there is a lack of consensus on what they are about. There are often two independent motivations (i) symbols as a lingua franca for human-AI interaction and (ii) symbols as system-produced abstractions used by the AI system in its internal reasoning. The jury is still out on whether AI systems will need to use symbols in their internal reasoning to achieve general intelligence capabilities. Whatever the answer there is, the need for (human-understandable) symbols in human-AI interaction seems quite compelling. Symbols, like emotions, may well not be sine qua non for intelligence per se, but they will be crucial for AI systems to interact with us humans -- as we can neither turn off our emotions nor get by without our symbols. In particular, in many human-designed domains, humans would be interested in providing explicit (symbolic) knowledge and advice -- and expect machine explanations in kind. This alone requires AI systems to maintain a symbolic interface for interaction with humans. In this blue sky paper, we argue this point of view, and discuss research directions that need to be pursued to allow for this type of human-AI interaction.

Submitted 9 December, 2021; v1 submitted 20 September, 2021; originally announced September 2021.

arXiv:2109.06667 [pdf, other]

A Blockchain based Federated Learning for Message Dissemination in Vehicular Networks

Authors: Ferheen Ayaz, Zhengguo Sheng, Daxin Tian, Yong Liang Guan

Abstract: Message exchange among vehicles plays an important role in ensuring road safety. Emergency message dissemination is usually carried out by broadcasting. However, high vehicle density and mobility usually lead to challenges in message dissemination such as broadcasting storm and low probability of packet reception. This paper proposes a federated learning based blockchain-assisted message dissemination solution. Similar to the incentive-based Proof-of-Work consensus in blockchain, vehicles compete to become a relay node (miner) by processing the proposed Proof-of-Federated-Learning (PoFL) consensus which is embedded in the smart contract of blockchain. Both theoretical and practical analysis of the proposed solution are provided. Specifically, the proposed blockchain based federated learning results in more vehicles uploading their models in a given time, which can potentially lead to a more accurate model in less time as compared to the same solution without using blockchain. It also outperforms the other blockchain approaches for message dissemination by reducing the time delay in consensus by 65.2%, improving the message delivery rate by at least 8.2% and preserving the privacy of neighboring vehicles more efficiently. The economic model to incentivize vehicles participating in federated learning and message dissemination is further analyzed using a Stackelberg game model.

Submitted 11 September, 2021; originally announced September 2021.

Comments: Submitted to IEEE Transactions on Vehicular Technology

arXiv:2109.01287 [pdf, other]

Spectrum Learning-Aided Reconfigurable Intelligent Surfaces for 'Green' 6G Networks

Authors: Bo Yang, Xuelin Cao, Chongwen Huang, Yong Liang Guan, Chau Yuen, Marco Di Renzo, Dusit Niyato, Merouane Debbah, Lajos Hanzo

Abstract: In the sixth-generation (6G) era, emerging large-scale computing based applications (for example processing enormous amounts of images in real-time in autonomous driving) tend to lead to excessive energy consumption for the end users, whose devices are usually energy-constrained. In this context, energy-efficiency becomes a critical challenge to be solved for harnessing these promising applications to realize 'green' 6G networks. As a remedy, reconfigurable intelligent surfaces (RIS) have been proposed for improving the energy efficiency by beneficially reconfiguring the wireless propagation environment. In conventional RIS solutions, however, the received signal-to-interference-plus-noise ratio (SINR) sometimes may even become degraded. This is because the signals impinging upon an RIS are typically contaminated by interfering signals which are usually dynamic and unknown. To address this issue, 'learning' the properties of the surrounding spectral environment is a promising solution, motivating the convergence of artificial intelligence and spectrum sensing, termed here as spectrum learning (SL). Inspired by this, we develop an SL-aided RIS framework for intelligently exploiting the inherent characteristics of the radio frequency (RF) spectrum for green 6G networks. Given the proposed framework, the RIS controller becomes capable of intelligently deciding ('think-and-decide') whether or not to reflect the incident signals. Therefore, the received SINR can be improved by dynamically configuring the binary ON-OFF status of the RIS elements. The energy-efficiency benefits attained are validated with the aid of a specific case study. Finally, we conclude with a list of promising future research directions.

Submitted 2 September, 2021; originally announced September 2021.

arXiv:2107.12867 [pdf, other]

doi 10.14722/ndss.2021.24308

From Library Portability to Para-rehosting: Natively Executing Microcontroller Software on Commodity Hardware

Authors: Wenqiang Li, Le Guan, Jingqiang Lin, Jiameng Shi, Fengjun Li

Abstract: Finding bugs in microcontroller (MCU) firmware is challenging, even for device manufacturers who own the source code. The MCU runs different instruction sets than x86 and exposes a very different development environment. This invalidates many existing sophisticated software testing tools on x86. To maintain a unified developing and testing environment, a straightforward way is to re-compile the source code into the native executable for a commodity machine (called rehosting). However, ad-hoc re-hosting is a daunting and tedious task and subject to many issues (library-dependence, kernel-dependence and hardware-dependence). In this work, we systematically explore the portability problem of MCU software and propose para-rehosting to ease the porting process. Specifically, we abstract and implement a portable MCU (PMCU) using the POSIX interface. It models common functions of the MCU cores. For peripheral specific logic, we propose HAL-based peripheral function replacement, in which high-level hardware functions are replaced with an equivalent backend driver on the host. These backend drivers are invoked by well-designed para-APIs and can be reused across many MCU OSs. We categorize common HAL functions into four types and implement templates for quick backend development. Using the proposed approach, we have successfully rehosted nine MCU OSs including the widely deployed Amazon FreeRTOS, ARM Mbed OS, Zephyr and LiteOS. To demonstrate the superiority of our approach in terms of security testing, we used off-the-shelf dynamic analysis tools (AFL and ASAN) against the rehosted programs and discovered 28 previously-unknown bugs, among which 5 were confirmed by CVE and the other 19 were confirmed by vendors at the time of writing.

Submitted 4 July, 2021; originally announced July 2021.

Comments: 18 pages, 4 figures, Network and Distributed Systems Security (NDSS) Symposium 2021

arXiv:2107.09869 [pdf, other]

ECG Heartbeat Classification Using Multimodal Fusion

Authors: Zeeshan Ahmad, Anika Tabassum, Ling Guan, Naimul Khan

Abstract: Electrocardiogram (ECG) is an authoritative source to diagnose and counter critical cardiovascular syndromes such as arrhythmia and myocardial infarction (MI). Current machine learning techniques either depend on manually extracted features or large and complex deep learning networks which merely utilize the 1D ECG signal directly. Since intelligent multimodal fusion can perform at the state-of-the-art level with an efficient deep network, we propose in this paper two computationally efficient multimodal fusion frameworks for ECG heart beat classification called Multimodal Image Fusion (MIF) and Multimodal Feature Fusion (MFF). At the input of these frameworks, we convert the raw ECG data into three different images using Gramian Angular Field (GAF), Recurrence Plot (RP) and Markov Transition Field (MTF). In MIF, we first perform image fusion by combining three imaging modalities to create a single image modality which serves as input to the Convolutional Neural Network (CNN). In MFF, we extract features from the penultimate layer of the CNNs and fuse them to get unique and interdependent information necessary for better performance of the classifier. These informational features are finally used to train a Support Vector Machine (SVM) classifier for ECG heart-beat classification. We demonstrate the superiority of the proposed fusion models by performing experiments on PhysioNet's MIT-BIH dataset for five distinct conditions of arrhythmias which are consistent with the AAMI EC57 protocols and on the PTB diagnostics dataset for Myocardial Infarction (MI) classification. We achieved classification accuracy of 99.7% and 99.2% on arrhythmia and MI classification, respectively.

Submitted 20 July, 2021; originally announced July 2021.
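
Note on the entry above: the abstract says raw ECG data is converted into images with a Gramian Angular Field (GAF) but does not give the construction. The block below is a minimal sketch of the commonly used summation variant, assuming min-max rescaling to [-1, 1]; it is not taken from the paper, and the function name `gramian_angular_field` and the sample values in `beat` are made up for illustration.

```python
import math


def gramian_angular_field(series):
    """Return the Gramian Angular Summation Field of a 1D sequence.

    Hypothetical helper for illustration; the paper's exact preprocessing
    (segmentation, resampling, image size) is not stated in the abstract.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so every sample is a valid cosine value.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp to guard against floating-point drift just outside [-1, 1].
    scaled = [max(-1.0, min(1.0, s)) for s in scaled]
    # Encode each sample as an angle, then take pairwise angular sums.
    phi = [math.acos(s) for s in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


# Made-up samples standing in for one heartbeat segment (not real ECG data).
beat = [0.0, 0.2, 1.0, 0.1, -0.3]
gaf = gramian_angular_field(beat)
```

The result is a symmetric square matrix with one row and column per sample, which is what allows a 1D segment to be treated as a single-channel image.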

arXiv:2107.07759 [pdf, other]

Automatic Firmware Emulation through Invalidity-guided Knowledge Inference (Extended Version)

Authors: Wei Zhou, Le Guan, Peng Liu, Yuqing Zhang

Abstract: Emulating firmware for microcontrollers is challenging due to the tight coupling between the hardware and firmware. This has greatly impeded the application of dynamic analysis tools to firmware analysis. The state-of-the-art work automatically models unknown peripherals by observing their access patterns, and then leverages heuristics to calculate the appropriate responses when unknown peripheral registers are accessed. However, we empirically found that this approach and the corresponding heuristics are frequently insufficient to emulate firmware. In this work, we propose a new approach called uEmu to emulate firmware with unknown peripherals. Unlike existing work that attempts to build a general model for each peripheral, our approach learns how to correctly emulate firmware execution at individual peripheral access points. It takes the image as input and symbolically executes it by representing unknown peripheral registers as symbols. During symbolic execution, it infers the rules to respond to unknown peripheral accesses. These rules are stored in a knowledge base, which is referred to during the dynamic firmware analysis. uEmu achieved a passing rate of 95% in a set of unit tests for peripheral drivers without any manual assistance. We also evaluated uEmu with real-world firmware samples and new bugs were discovered.

Submitted 27 July, 2021; v1 submitted 16 July, 2021; originally announced July 2021.

Comments: Extended version of Usenix'21 paper

arXiv:2106.08460 [pdf, other]

Localized Conformal Prediction: A Generalized Inference Framework for Conformal Prediction

Authors: Leying Guan

Abstract: We propose a new inference framework called localized conformal prediction. It generalizes the framework of conformal prediction by offering a single-test-sample adaptive construction that emphasizes a local region around this test sample, and can be combined with different conformal score constructions. The proposed framework enjoys an assumption-free finite sample marginal coverage guarantee, and it also offers additional local coverage guarantees under suitable assumptions. We demonstrate how to change from conformal prediction to localized conformal prediction using several conformal scores, and we illustrate a potential gain via numerical examples.

Submitted 28 February, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: This paper is based on the results on localized conformal prediction under the i.i.d settings from arXiv:1908.08558, with strengthened theoretical results, new and more efficient algorithms, and additional empirical studies
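
Note on the entry above: the abstract describes localized conformal prediction as a generalization of conformal prediction. For orientation only, the block below sketches the standard (non-localized) split conformal interval with absolute-residual scores, which is the baseline being generalized. It is not the paper's localized method; the function name `split_conformal_interval` and all numbers are hypothetical.

```python
import math


def split_conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    """Standard split conformal interval around one test prediction.

    Uses absolute residuals on a held-out calibration set as conformal
    scores. Illustrative baseline only, not the localized variant.
    """
    scores = sorted(abs(y - p) for p, y in zip(cal_pred, cal_true))
    n = len(scores)
    # Rank of the calibration score used as the interval half-width.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for the requested coverage level.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (test_pred - q, test_pred + q)


# Made-up calibration data: a model that always predicts 0.
cal_pred = [0.0] * 9
cal_true = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
interval = split_conformal_interval(cal_pred, cal_true, test_pred=2.0, alpha=0.5)
```

Every calibration score carries equal weight here. Going by the abstract, the localized framework instead emphasizes calibration samples in a local region around the test sample.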

arXiv:2105.13536 [pdf, other]

ECG Heart-beat Classification Using Multimodal Image Fusion

Authors: Zeeshan Ahmad, Anika Tabassum, Naimul Khan, Ling Guan

Abstract: In this paper, we present a novel Image Fusion Model (IFM) for ECG heart-beat classification to overcome the weaknesses of existing machine learning techniques that rely either on manual feature extraction or direct utilization of 1D raw ECG signal. At the input of IFM, we first convert the heart beats of ECG into three different images using Gramian Angular Field (GAF), Recurrence Plot (RP) and Markov Transition Field (MTF) and then fuse these images to create a single imaging modality. We use AlexNet for feature extraction and classification and thus employ end-to-end deep learning. We perform experiments on the PhysioNet MIT-BIH dataset for five different arrhythmias in accordance with the AAMI EC57 standard and on the PTB diagnostics dataset for myocardial infarction (MI) classification. We achieve state-of-the-art results in terms of prediction accuracy, precision and recall.

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2104.05184 [pdf, other]

Smooth and probabilistic PARAFAC model with auxiliary covariates

Authors: Leying Guan

Abstract: In immunological and clinical studies, matrix-valued time-series data clustering is increasingly popular. Researchers are interested in finding low-dimensional embedding of subjects based on potentially high-dimensional longitudinal features and investigating relationships between static clinical covariates and the embedding. These studies are often challenging due to high dimensionality, as well as the sparse and irregular nature of sample collection along the time dimension. We propose a smoothed probabilistic PARAFAC model with covariates (SPACO) to tackle these two problems while utilizing auxiliary covariates of interest. We provide intensive simulations to test different aspects of SPACO and demonstrate its use on an immunological data set from patients with SARS-CoV-2 infection.

Submitted 26 August, 2022; v1 submitted 11 April, 2021; originally announced April 2021.

arXiv:2104.00878 [pdf, other]

Contrastively Learning Visual Attention as Affordance Cues from Demonstrations for Robotic Grasping

Authors: Yantian Zha, Siddhant Bhambri, Lin Guan

Abstract: Conventional works that learn grasping affordance from demonstrations need to explicitly predict grasping configurations, such as gripper approaching angles or grasping preshapes. Classic motion planners could then sample trajectories by using such predicted configurations. In this work, our goal is instead to fill the gap between affordance discovery and affordance-based policy learning by integrating the two objectives in an end-to-end imitation learning framework based on deep neural networks. From a psychological perspective, there is a close association between attention and affordance. Therefore, with an end-to-end neural network, we propose to learn affordance cues as visual attention that serves as a useful indicating signal of how a demonstrator accomplishes tasks, instead of explicitly modeling affordances. To achieve this, we propose a contrastive learning framework that consists of a Siamese encoder and a trajectory decoder. We further introduce a coupled triplet loss to encourage the discovered affordance cues to be more affordance-relevant. Our experimental results demonstrate that our model with the coupled triplet loss achieves the highest grasping success rate in a simulated robot environment. Our project website can be accessed at https://sites.google.com/asu.edu/affordance-aware-imitation/project.

Submitted 13 August, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

arXiv:2103.12926 [pdf, other]

Beyond Visual Attractiveness: Physically Plausible Single Image HDR Reconstruction for Spherical Panoramas

Authors: Wei Wei, Li Guan, Yue Liu, Hao Kang, Haoxiang Li, Ying Wu, Gang Hua

Abstract: HDR reconstruction is an important task in computer vision with many industrial needs. The traditional approaches merge multiple exposure shots to generate HDRs that correspond to the physical quantity of illuminance of the scene. However, the tedious capturing process makes such multi-shot approaches inconvenient in practice. In contrast, recent single-shot methods predict a visually appealing HDR from a single LDR image through deep learning. But it is not clear whether the previously mentioned physical properties would still hold, without training the network to explicitly model them. In this paper, we introduce the physical illuminance constraints to our single-shot HDR reconstruction framework, with a focus on spherical panoramas. By the proposed physical regularization, our method can generate HDRs which are not only visually appealing but also physically plausible. For evaluation, we collect a large dataset of LDR and HDR images with ground truth illuminance measures. Extensive experiments show that our HDR images not only maintain high visual quality but also top all baseline methods in illuminance prediction accuracy.

Submitted 23 March, 2021; originally announced March 2021.

arXiv:2103.05597 [pdf, ps, other]

A Discriminative Vectorial Framework for Multi-modal Feature Representation

Authors: Lei Gao, Ling Guan

Abstract: Due to the rapid advancements of sensory and computing technology, multi-modal data sources that represent the same pattern or phenomenon have attracted growing attention. As a result, finding means to explore useful information from these multi-modal data sources has quickly become a necessity. In this paper, a discriminative vectorial framework is proposed for multi-modal feature representation in knowledge discovery by employing multi-modal hashing (MH) and discriminative correlation maximization (DCM) analysis. Specifically, the proposed framework is capable of minimizing the semantic similarity among different modalities by MH and extracting intrinsic discriminative representations across multiple data sources by DCM analysis jointly, enabling a novel vectorial framework of multi-modal feature representation. Moreover, the proposed feature representation strategy is analyzed and further optimized based on canonical and non-canonical cases, respectively. Consequently, the generated feature representation leads to effective utilization of the input data sources of high quality, producing improved, sometimes quite impressive, results in various applications. The effectiveness and generality of the proposed framework are demonstrated by utilizing classical features and deep neural network (DNN) based features with applications to image and multimedia analysis and recognition tasks, including data visualization, face recognition, object recognition, cross-modal (text-image) recognition and audio emotion recognition. Experimental results show that the proposed solutions are superior to state-of-the-art statistical machine learning (SML) and DNN algorithms.

Submitted 9 March, 2021; originally announced March 2021.

Comments: Accepted

Journal ref: IEEE Transactions on Multimedia, 2021

arXiv:2103.00367 [pdf, ps, other]

doi 10.1109/LSP.2020.3028006

A Complete Discriminative Tensor Representation Learning for Two-Dimensional Correlation Analysis

Authors: Lei Gao, Ling Guan

Abstract: As an effective tool for two-dimensional data analysis, two-dimensional canonical correlation analysis (2DCCA) is not only capable of preserving the intrinsic structural information of original two-dimensional (2D) data, but also reduces the computational complexity effectively. However, due to its unsupervised nature, 2DCCA is incapable of extracting sufficient discriminatory representations, resulting in unsatisfying performance. In this letter, we propose a complete discriminative tensor representation learning (CDTRL) method based on linear correlation analysis for analyzing 2D signals (e.g. images). This letter shows that the introduction of the complete discriminatory tensor representation strategy provides an effective vehicle for revealing and extracting the discriminant representations across the 2D data sets, leading to improved results. Experimental results show that the proposed CDTRL outperforms state-of-the-art methods on the evaluated data sets.

Submitted 27 February, 2021; originally announced March 2021.

Journal ref: IEEE Signal Processing Letters, 2020

arXiv:2103.00365 [pdf, ps, other]

doi 10.1109/LSP.2021.3050052

The Property of Frequency Shift in 2D-FRFT Domain with Application to Image Encryption

Authors: Lei Gao, Lin Qi, Ling Guan

Abstract: The Fractional Fourier Transform (FRFT) has been playing a unique and increasingly important role in signal and image processing. In this letter, we investigate the property of frequency shift in two-dimensional FRFT (2D-FRFT) domain. It is shown that the magnitude of image reconstruction from phase information is frequency shift-invariant in 2D-FRFT domain, enhancing the robustness of image encryption, an important multimedia security task. Experiments are conducted to demonstrate the effectiveness of this property against the frequency shift attack, improving the robustness of image encryption.

Submitted 27 February, 2021; originally announced March 2021.

Comments: IEEE Signal Processing Letters, 2021

arXiv:2103.00361 [pdf, ps, other]

doi 10.1109/TIP.2017.2765820

Discriminative Multiple Canonical Correlation Analysis for Information Fusion

Authors: Lei Gao, Lin Qi, Enqing Chen, Ling Guan

Abstract: In this paper, we propose the Discriminative Multiple Canonical Correlation Analysis (DMCCA) for multimodal information analysis and fusion. DMCCA is capable of extracting more discriminative characteristics from multimodal information representations. Specifically, it finds the projected directions which simultaneously maximize the within-class correlation and minimize the between-class correlation, leading to better utilization of the multimodal information. In the process, we analytically demonstrate that the optimally projected dimension by DMCCA can be quite accurately predicted, leading to both superior performance and substantial reduction in computational cost. We further verify that Canonical Correlation Analysis (CCA), Multiple Canonical Correlation Analysis (MCCA) and Discriminative Canonical Correlation Analysis (DCCA) are special cases of DMCCA, thus establishing a unified framework for Canonical Correlation Analysis. We implement a prototype of DMCCA to demonstrate its performance in handwritten digit recognition and human emotion recognition. Extensive experiments show that DMCCA outperforms the traditional methods of serial fusion, CCA, MCCA and DCCA.

Submitted 27 February, 2021; originally announced March 2021.

Journal ref: IEEE Transactions on Image Processing, 2018

arXiv:2103.00359 [pdf, ps, other]

doi 10.1109/TMM.2018.2859590

The Labeled Multiple Canonical Correlation Analysis for Information Fusion

Authors: Lei Gao, Rui Zhang, Lin Qi, Enqing Chen, Ling Guan

Abstract: The objective of multimodal information fusion is to mathematically analyze information carried in different sources and create a new representation which will be more effectively utilized in pattern recognition and other multimedia information processing tasks. In this paper, we introduce a new method for multimodal information fusion and representation based on the Labeled Multiple Canonical Correlation Analysis (LMCCA). By incorporating class label information of the training samples, the proposed LMCCA ensures that the fused features carry discriminative characteristics of the multimodal information representations, and are capable of providing superior recognition performance. We implement a prototype of LMCCA to demonstrate its effectiveness on handwritten digit recognition, face recognition and object recognition utilizing multiple features, and bimodal human emotion recognition involving information from both audio and visual domains. The generic nature of LMCCA allows it to take as input features extracted by any means, including those by deep learning (DL) methods. Experimental results show that the proposed method enhanced the performance of both statistical machine learning (SML) methods and methods based on DL.

Submitted 27 February, 2021; originally announced March 2021.

Journal ref: IEEE Transactions on Multimedia, 2019

arXiv:2103.00356 [pdf, other]

doi 10.1109/MIS.2016.26

Online Behavioral Analysis with Application to Emotion State Identification

Authors: Lei Gao, Lin Qi, Ling Guan

Abstract: In this paper, we propose a novel discriminative model for online behavioral analysis with application to emotion state identification. The proposed model is able to extract more discriminative characteristics from behavioral data effectively and find the direction of optimal projection efficiently to satisfy requirements of online data analysis, leading to better utilization of the behavioral information to produce more accurate recognition results.

Submitted 27 February, 2021; originally announced March 2021.

Journal ref: IEEE Intelligent Systems, 2016

arXiv:2102.00953 [pdf]

QoS-aware Link Scheduling Strategy for Data Transmission in SDVN

Authors: Yong Zhang, Mao Ye, Lin Guan

Abstract: The vehicular ad-hoc network (VANET) based on dedicated short-range communication (DSRC) is a distributed communication system, in which all the nodes share the wireless channel with the carrier sense multiple access/collision avoidance (CSMA/CA) protocol. However, the backoff mechanism of CSMA/CA in the channel contention might cause uncertain transmission delay and impede a certain quality of service (QoS) of applications. Moreover, there still exists a possibility of parlous data-packet collisions, especially for broadcast or non-acknowledgement (NACK) transmissions. The original contributions of this paper are summarized as follows: (1) Model the packet collision probability of broadcast or NACK transmission in VANET with the combination theory and investigate the potential influence of the miss my packets (MMP) problem. (2) Based on the software-defined vehicular network (SDVN) framework and QoS requirement, a novel link-level scheduling strategy, which determines the start-sending time for each connection, is proposed to maximize packets delivery ratio (PDR). Alternatively, maximizing PDR has been converted to the overlap minimization among transmission durations. (3) Meanwhile, an innovative transmission scheduling greedy search (TSGS) algorithm is originally proposed to mitigate computational complexity. Extensive simulations have been done in a unified platform Veins combining SUMO and OMNET++. Numerous results show that the proposed algorithm can effectively improve the PDR by at least 15%, enhance the collision-avoidance performance by almost 40%, and reduce the MMP ratio by about 3% compared with random transmitting, while meeting the QoS requirement.

Submitted 1 February, 2021; originally announced February 2021.

Comments: 24 pages, 15 figures. arXiv admin note: text overlap with arXiv:2101.06522

arXiv:2101.06522 [pdf]

doi 10.1109/ISNCC52172.2021.9615863

Overlap-Minimization Scheduling Strategy for Data Transmission in VANET

Authors: Yong Zhang, Mao Ye, Lin Guan

Abstract: The vehicular ad-hoc network (VANET) based on dedicated short-range communication (DSRC) is a distributed communication system in which all nodes share the wireless channel using the carrier sense multiple access with collision avoidance (CSMA/CA) protocol. However, the competition and backoff mechanisms of CSMA/CA often introduce additional delays and data packet collisions, making it hard to meet QoS requirements in terms of delay and packet delivery ratio (PDR). Moreover, because of the way security information is distributed in broadcast mode, the sender cannot know whether the receivers have received the information successfully. The same problem exists in no-acknowledge (non-ACK) transmissions of VANET. Therefore, the probability of packet collisions should be considered in broadcast or non-ACK working modes. This paper presents a connection-level scheduling algorithm overlaid on CSMA/CA to schedule the start-sending time of each transmission. By converting the objective of reducing collision probability into minimizing the overlap of the transmission durations of connections, the probability of backoff activation can be greatly decreased. The delay and the probability of packet collisions can then also be decreased. Numerical simulations have been conducted in our unified platform containing SUMO, Veins and OMNeT++. The results show that the proposed algorithm can effectively improve the PDR and reduce packet collisions in VANET.

Submitted 16 January, 2021; originally announced January 2021.

Comments: 6 pages, 7 figures

Showing 151–200 of 340 results for author: Guan, L