Search | arXiv e-print repository

arXiv:2406.18100 [pdf, other]

Natural Language but Omitted? On the Ineffectiveness of Large Language Models' privacy policy from End-users' Perspective

Authors: Shuning Zhang, Haobin Xing, Xin Yi, Hewu Li

Abstract: LLM-driven products are increasingly prevalent in our daily lives. With a natural-language-based interaction style, people may leak their personal private information. Thus, privacy policies and user agreements play an important role in regulating and alerting people. However, no prior work has examined how users read LLMs' privacy policies. Thus, we conducted the first user study in which participants read the privacy policy and user agreement in two different styles (cursory and detailed). We found that users miss important information upon cursory reading and even upon detailed reading. Moreover, their privacy concerns were not resolved even upon detailed reading. We provide four design implications based on the findings.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.11569 [pdf, other]

Pre-Training and Personalized Fine-Tuning via Over-the-Air Federated Meta-Learning: Convergence-Generalization Trade-Offs

Authors: Haifeng Wen, Hong Xing, Osvaldo Simeone

Abstract: For modern artificial intelligence (AI) applications such as large language models (LLMs), the training paradigm has recently shifted to pre-training followed by fine-tuning. Furthermore, owing to dwindling open repositories of data and thanks to efforts to democratize access to AI models, pre-training is expected to increasingly migrate from the current centralized deployments to federated learning (FL) implementations. Meta-learning provides a general framework in which pre-training and fine-tuning can be formalized. Meta-learning-based personalized FL (meta-pFL) moves beyond basic personalization by targeting generalization to new agents and tasks. This paper studies the generalization performance of meta-pFL for a wireless setting in which the agents participating in the pre-training phase, i.e., meta-learning, are connected via a shared wireless channel to the server. Adopting over-the-air computing, we study the trade-off between generalization to new agents and tasks, on the one hand, and convergence, on the other hand. The trade-off arises from the fact that channel impairments may enhance generalization, while degrading convergence. Extensive numerical results validate the theory.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 37 pages, 7 figures, submitted for possible journal publication

arXiv:2406.08754 [pdf, other]

StructuralSleight: Automated Jailbreak Attacks on Large Language Models Utilizing Uncommon Text-Encoded Structure

Authors: Bangxin Li, Hengrui Xing, Chao Huang, Jin Qian, Huangqing Xiao, Linfeng Feng, Cong Tian

Abstract: Large Language Models (LLMs) are widely used in natural language processing but face the risk of jailbreak attacks that maliciously induce them to generate harmful content. Existing jailbreak attacks, including character-level and context-level attacks, mainly focus on the plain-text prompt without specifically exploring the significant influence of its structure. In this paper, we study how prompt structure contributes to the jailbreak attack. We introduce a novel structure-level attack method based on tail structures that are rarely used during LLM training, which we refer to as Uncommon Text-Encoded Structure (UTES). We extensively study 12 UTES templates and 6 obfuscation methods to build an effective automated jailbreak tool named StructuralSleight that contains three escalating attack strategies: Structural Attack, Structural and Character/Context Obfuscation Attack, and Fully Obfuscated Structural Attack. Extensive experiments on existing LLMs show that StructuralSleight significantly outperforms baseline methods. In particular, the attack success rate reaches 94.62% on GPT-4o, which has not been addressed by state-of-the-art techniques.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 12 pages, 4 figures

arXiv:2406.02266 [pdf]

Enhancing Retrieval-Augmented LMs with a Two-stage Consistency Learning Compressor

Authors: Chuankai Xu, Dongming Zhao, Bo Wang, Hanwen Xing

Abstract: Despite the prevalence of retrieval-augmented language models (RALMs), the seamless integration of these models with retrieval mechanisms to enhance performance in document-based tasks remains challenging. While some post-retrieval processing Retrieval-Augmented Generation (RAG) methods have achieved success, most still lack the ability to distinguish pertinent from extraneous information, leading to potential inconsistencies and reduced precision in the generated output, which subsequently affects the truthfulness of the language model's responses. To address these limitations, this work proposes a novel two-stage consistency learning approach for retrieved information compression in retrieval-augmented language models to enhance performance. By incorporating consistency learning, the aim is to generate summaries that maintain coherence and alignment with the intended semantic representations of a teacher model while improving faithfulness to the original retrieved documents. The proposed method is empirically validated across multiple datasets, demonstrating notable enhancements in precision and efficiency for question-answering tasks. It outperforms existing baselines and showcases the synergistic effects of combining contrastive and consistency learning paradigms within the retrieval-augmented generation framework.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.14709 [pdf, other]

OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance

Authors: Shuheng Ge, Haoyu Xing, Li Zhang, Xiangqian Wu

Abstract: Creating realistic, natural, and lip-readable talking face videos remains a formidable challenge. Previous research primarily concentrated on generating and aligning single-frame images while overlooking the smoothness of frame-to-frame transitions and temporal dependencies. This often compromised visual quality and effects in practical settings, particularly when handling complex facial data and audio content, which frequently led to semantically incongruent visual illusions. Specifically, synthesized videos commonly featured disorganized lip movements, making them difficult to understand and recognize. To overcome these limitations, this paper introduces the application of optical flow to guide facial image generation, enhancing inter-frame continuity and semantic consistency. We propose "OpFlowTalker", a novel approach that utilizes predicted optical flow changes from audio inputs rather than direct image predictions. This method smooths image transitions and aligns changes with semantic content. Moreover, it employs a sequence fusion technique to replace the independent generation of single frames, thus preserving contextual information and maintaining temporal coherence. We also developed an optical flow synchronization module that regulates both full-face and lip movements, optimizing visual synthesis by balancing regional dynamics. Furthermore, we introduce a Visual Text Consistency Score (VTCS) that accurately measures lip-readability in synthesized videos. Extensive empirical evidence validates the effectiveness of our approach.

Submitted 28 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.00736 [pdf, other]

Joint Signal Detection and Automatic Modulation Classification via Deep Learning

Authors: Huijun Xing, Xuhui Zhang, Shuo Chang, Jinke Ren, Zixun Zhang, Jie Xu, Shuguang Cui

Abstract: Signal detection and modulation classification are two crucial tasks in various wireless communication systems. Different from prior works that investigate them independently, this paper studies the joint signal detection and automatic modulation classification (AMC) by considering a realistic and complex scenario, in which multiple signals with different modulation schemes coexist at different carrier frequencies. We first generate a coexisting RADIOML dataset (CRML23) to facilitate the joint design. Different from the publicly available AMC dataset ignoring the signal detection step and containing only one signal, our synthetic dataset covers the more realistic multiple-signal coexisting scenario. Then, we present a joint framework for detection and classification (JDM) for such a multiple-signal coexisting environment, which consists of two modules for signal detection and AMC, respectively. In particular, these two modules are interconnected using a designated data structure called "proposal". Finally, we conduct extensive simulations over the newly developed dataset, which demonstrate the effectiveness of our designs. Our code and dataset are now available as open-source (https://github.com/Singingkettle/ChangShuoRadioData).

Submitted 29 April, 2024; originally announced May 2024.

arXiv:2404.16891 [pdf, other]

Attacks on Third-Party APIs of Large Language Models

Authors: Wanru Zhao, Vidit Khazanchi, Haodi Xing, Xuanli He, Qiongkai Xu, Nicholas Donald Lane

Abstract: Large language model (LLM) services have recently begun offering a plugin ecosystem to interact with third-party API services. This innovation enhances the capabilities of LLMs, but it also introduces risks, as these plugins developed by various third parties cannot be easily trusted. This paper proposes a new attacking framework to examine security and safety vulnerabilities within LLM platforms that incorporate third-party services. Applying our framework specifically to widely used LLMs, we identify real-world malicious attacks across various domains on third-party APIs that can imperceptibly modify LLM outputs. The paper discusses the unique challenges posed by third-party API integration and offers strategic possibilities to improve the security and safety of LLM ecosystems moving forward. Our code is released at https://github.com/vk0812/Third-Party-Attacks-on-LLMs.

Submitted 24 April, 2024; originally announced April 2024.

Comments: ICLR 2024 Workshop on Secure and Trustworthy Large Language Models

arXiv:2401.14656 [pdf, other]

Scientific Large Language Models: A Survey on Biological & Chemical Domains

Authors: Qiang Zhang, Keyang Ding, Tianwen Lyv, Xinda Wang, Qingyu Yin, Yiwen Zhang, Jing Yu, Yuhao Wang, Xiaotong Li, Zhuoyi Xiang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Mengyao Zhang, Jinlu Zhang, Jiyu Cui, Renjun Xu, Hongyang Chen, Xiaohui Fan, Huabin Xing, Huajun Chen

Abstract: Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension, representing a significant stride toward artificial general intelligence. The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines. This growing interest has led to the advent of scientific LLMs, a novel subclass specifically engineered for facilitating scientific discovery. As a burgeoning area in the community of AI for Science, scientific LLMs warrant comprehensive exploration. However, a systematic and up-to-date survey introducing them is currently lacking. In this paper, we endeavor to methodically delineate the concept of "scientific language", whilst providing a thorough review of the latest advancements in scientific LLMs. Given the expansive realm of scientific disciplines, our analysis adopts a focused lens, concentrating on the biological and chemical domains. This includes an in-depth examination of LLMs for textual knowledge, small molecules, macromolecular proteins, genomic sequences, and their combinations, analyzing them in terms of model architectures, capabilities, datasets, and evaluation. Finally, we critically examine the prevailing challenges and point out promising research directions along with the advances of LLMs. By offering a comprehensive overview of technical developments in this field, this survey aspires to be an invaluable resource for researchers navigating the intricate landscape of scientific LLMs.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.01522 [pdf, other]

LORE++: Logical Location Regression Network for Table Structure Recognition with Pre-training

Authors: Rujiao Long, Hangdi Xing, Zhibo Yang, Qi Zheng, Zhi Yu, Cong Yao, Fei Huang

Abstract: Table structure recognition (TSR) aims at extracting tables in images into machine-understandable formats. Recent methods solve this problem by predicting the adjacency relations of detected cell boxes or learning to directly generate the corresponding markup sequences from the table images. However, existing approaches either count on additional heuristic rules to recover the table structures, or face challenges in capturing long-range dependencies within tables, resulting in increased complexity. In this paper, we propose an alternative paradigm. We model TSR as a logical location regression problem and propose a new TSR framework called LORE, standing for LOgical location REgression network, which for the first time regresses logical location as well as spatial location of table cells in a unified network. Our proposed LORE is conceptually simpler, easier to train, and more accurate than other paradigms of TSR. Moreover, inspired by the persuasive success of pre-trained models on a number of computer vision and natural language processing tasks, we propose two pre-training tasks to enrich the spatial and logical representations at the feature level of LORE, resulting in an upgraded version called LORE++. The incorporation of pre-training in LORE++ has proven to enjoy significant advantages, leading to a substantial enhancement in terms of accuracy, generalization, and few-shot capability compared to its predecessor. Experiments on standard benchmarks against methods of previous paradigms demonstrate the superiority of LORE++, which highlights the potential and promising prospect of the logical location regression paradigm for TSR.

Submitted 2 January, 2024; originally announced January 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2303.03730

arXiv:2310.16606 [pdf, ps, other]

AirFL-Mem: Improving Communication-Learning Trade-Off by Long-Term Memory

Authors: Haifeng Wen, Hong Xing, Osvaldo Simeone

Abstract: Addressing the communication bottleneck inherent in federated learning (FL), over-the-air FL (AirFL) has emerged as a promising solution, which is, however, hampered by deep fading conditions. In this paper, we propose AirFL-Mem, a novel scheme designed to mitigate the impact of deep fading by implementing a long-term memory mechanism. Convergence bounds are provided that account for long-term memory, as well as for existing AirFL variants with short-term memory, for general non-convex objectives. The theory demonstrates that AirFL-Mem exhibits the same convergence rate as federated averaging (FedAvg) with ideal communication, while the performance of existing schemes is generally limited by error floors. The theoretical results are also leveraged to propose a novel convex optimization strategy for the truncation threshold used for power control in the presence of Rayleigh fading channels. Experimental results validate the analysis, confirming the advantages of a long-term memory mechanism for the mitigation of deep fading.

Submitted 27 October, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

Comments: 8 pages, 3 figures, submitted for possible publication

arXiv:2310.09533 [pdf, other]

Towards End-to-End Unsupervised Saliency Detection with Self-Supervised Top-Down Context

Authors: Yicheng Song, Shuyong Gao, Haozhe Xing, Yiting Cheng, Yan Wang, Wenqiang Zhang

Abstract: Unsupervised salient object detection aims to detect salient objects without using supervision signals, eliminating the tedious task of manually labeling salient objects. To improve training efficiency, end-to-end methods for USOD have been proposed as a promising alternative. However, current solutions rely heavily on noisy handcraft labels and fail to mine rich semantic information from deep features. In this paper, we propose a self-supervised end-to-end salient object detection framework via top-down context. Specifically, motivated by contrastive learning, we exploit the self-localization from the deepest feature to construct the location maps which are then leveraged to learn the most instructive segmentation guidance. Further considering the lack of detailed information in deepest features, we exploit the detail-boosting refiner module to enrich the location labels with details. Moreover, we observe that due to lack of supervision, current unsupervised saliency models tend to detect non-salient objects that are salient in some other samples of corresponding scenarios. To address this widespread issue, we design a novel Unsupervised Non-Salient Suppression (UNSS) method developing the ability to ignore non-salient objects. Extensive experiments on benchmark datasets demonstrate that our method achieves leading performance among the recent end-to-end methods and most of the multi-stage solutions. The code is available.

Submitted 14 October, 2023; originally announced October 2023.

Comments: accepted by ACM MM 2023

arXiv:2307.05357 [pdf, other]

Over-the-Air Computation in OFDM Systems with Imperfect Channel State Information

Authors: Yilong Chen, Huijun Xing, Jie Xu, Lexi Xu, Shuguang Cui

Abstract: This paper studies the over-the-air computation (AirComp) in an orthogonal frequency division multiplexing (OFDM) system with imperfect channel state information (CSI), in which multiple single-antenna wireless devices (WDs) simultaneously send uncoded signals to a multi-antenna access point (AP) for distributed functional computation over multiple subcarriers. In particular, we consider two scenarios with best-effort and error-constrained computation tasks, with the objectives of minimizing the average computation mean squared error (MSE) and the computation outage probability over the multiple subcarriers, respectively. Towards this end, we jointly optimize the transmit coefficients at the WDs and the receive beamforming vectors at the AP over subcarriers, subject to the maximum transmit power constraints at individual WDs. First, for the special case with a single receive antenna at the AP, we propose the semi-closed-form globally optimal solutions to the two problems using the Lagrange-duality method. It is shown that at each subcarrier, the WDs' optimized power control policy for average MSE minimization follows a regularized channel inversion structure, while that for computation outage probability minimization follows an on-off regularized channel inversion, with the regularization dependent on the transmit power budget and channel estimation error. Next, for the general case with multiple receive antennas at the AP, we present efficient algorithms based on alternating optimization and convex optimization to find converged solutions to both problems.

Submitted 7 July, 2023; originally announced July 2023.

Comments: 13 pages, 6 figures

arXiv:2306.14109 [pdf, other]

When SAM Meets Sonar Images

Authors: Lin Wang, Xiufen Ye, Liqiang Zhu, Weijie Wu, Jianguo Zhang, Huiming Xing, Chao Hu

Abstract: Segment Anything Model (SAM) has revolutionized the way of segmentation. However, SAM's performance may decline when applied to tasks involving domains that differ from natural images. Nonetheless, by employing fine-tuning techniques, SAM exhibits promising capabilities in specific domains, such as medicine and planetary science. Notably, there is a lack of research on the application of SAM to sonar imaging. In this paper, we aim to address this gap by conducting a comprehensive investigation of SAM's performance on sonar images. Specifically, we evaluate SAM using various settings on sonar images. Additionally, we fine-tune SAM using effective methods both with prompts and for semantic segmentation, thereby expanding its applicability to tasks requiring automated segmentation. Experimental results demonstrate a significant improvement in the performance of the fine-tuned SAM.

Submitted 24 June, 2023; originally announced June 2023.

Comments: 12 pages, 3 figures

arXiv:2306.06603 [pdf, ps, other]

Task-Oriented Integrated Sensing, Computation and Communication for Wireless Edge AI

Authors: Hong Xing, Guangxu Zhu, Dongzhu Liu, Haifeng Wen, Kaibin Huang, Kaishun Wu

Abstract: With the advent of emerging IoT applications such as autonomous driving, digital twins, and the metaverse, which feature massive data sensing, analysis, and inference as well as critical latency requirements in beyond-5G (B5G) networks, edge artificial intelligence (AI) has been proposed to bring the high-performance computation of a conventional cloud down to the network edge. Recently, the convergence of wireless sensing, computation and communication (SC${}^2$) for specific edge AI tasks has prompted a paradigm shift by enabling (partial) sharing of the radio-frequency (RF) transceivers and information processing pipelines among these three fundamental functionalities of IoT. However, most existing design frameworks separate these designs, incurring unnecessary signaling overhead and waste of energy, and it is therefore of paramount importance to advance fully integrated sensing, computation and communication (ISCC) to achieve ultra-reliable and low-latency edge intelligence acquisition. In this article, we provide an overview of the principles of enabling ISCC technologies, followed by two concrete use cases of edge AI tasks demonstrating the advantage of task-oriented ISCC, and point out some practical challenges in edge AI design with advanced ISCC solutions.

Submitted 11 June, 2023; originally announced June 2023.

Comments: 18 pages, 6 figures, submitted for possible journal publication

arXiv:2305.11135 [pdf, other]

Convergence Analysis of Over-the-Air FL with Compression and Power Control via Clipping

Authors: Haifeng Wen, Hong Xing, Osvaldo Simeone

Abstract: One of the key challenges towards the deployment of over-the-air federated learning (AirFL) is the design of mechanisms that can comply with the power and bandwidth constraints of the shared channel, while causing minimum deterioration to the learning performance as compared to baseline noiseless implementations. For additive white Gaussian noise (AWGN) channels with instantaneous per-device power constraints, prior work has demonstrated the optimality of a power control mechanism based on norm clipping. This was done through the minimization of an upper bound on the optimality gap for smooth learning objectives satisfying the Polyak-Łojasiewicz (PL) condition. In this paper, we make two contributions to the development of AirFL based on norm clipping, which we refer to as AirFL-Clip. First, we provide a convergence bound for AirFL-Clip that applies to general smooth and non-convex learning objectives. Unlike existing results, the derived bound is free from run-specific parameters, thus supporting an offline evaluation. Second, we extend AirFL-Clip to include Top-k sparsification and linear compression. For this generalized protocol, referred to as AirFL-Clip-Comp, we derive a convergence bound for general smooth and non-convex learning objectives. We argue, and demonstrate via experiments, that the only time-varying quantities present in the bound can be efficiently estimated offline by leveraging the well-studied properties of sparse recovery algorithms.

Submitted 18 May, 2023; originally announced May 2023.

Comments: 6 pages, 3 figures, submitted for possible publication

arXiv:2305.08468 [pdf, other]

doi 10.1145/3589785

PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba

Authors: Jianying Wang, Tongliang Li, Haoze Song, Xinjun Yang, Wenchao Zhou, Feifei Li, Baoyue Yan, Qianqian Wu, Yukun Liang, Chengjun Ying, Yujie Wang, Baokai Chen, Chang Cai, Yubin Ruan, Xiaoyi Weng, Shibin Chen, Liang Yin, Chengzhong Yang, Xin Cai, Hongyan Xing, Nanlong Yu, Xiaofei Chen, Dapeng Huang, Jianling Sun

Abstract: Cloud-native databases have become the de-facto choice for mission-critical applications on the cloud due to the need for high availability, resource elasticity, and cost efficiency. Meanwhile, driven by the increasing connectivity between data generation and analysis, users prefer a single database to efficiently process both OLTP and OLAP workloads, which enhances data freshness and reduces the complexity of data synchronization and the overall business cost. In this paper, we summarize five crucial design goals for a cloud-native HTAP database based on our experience and customers' feedback, i.e., transparency, competitive OLAP performance, minimal perturbation on OLTP workloads, high data freshness, and excellent resource elasticity. As our solution to realize these goals, we present PolarDB-IMCI, a cloud-native HTAP database system designed and deployed at Alibaba Cloud. Our evaluation results show that PolarDB-IMCI is able to handle HTAP efficiently on both experimental and production workloads; notably, it speeds up analytical queries by up to $149\times$ on TPC-H (100 GB). PolarDB-IMCI introduces low visibility delay and little performance perturbation on OLTP workloads (< 5%), and resource elasticity can be achieved by scaling out in tens of seconds.

Submitted 15 May, 2023; originally announced May 2023.

Comments: 14 pages, 16 figures, to be published in ACM SIGMOD 2023

arXiv:2303.03730 [pdf, other]

LORE: Logical Location Regression Network for Table Structure Recognition

Authors: Hangdi Xing, Feiyu Gao, Rujiao Long, Jiajun Bu, Qi Zheng, Liangcheng Li, Cong Yao, Zhi Yu

Abstract: Table structure recognition (TSR) aims at extracting tables in images into machine-understandable formats. Recent methods solve this problem by predicting the adjacency relations of detected cell boxes, or learning to generate the corresponding markup sequences from the table images. However, they either count on additional heuristic rules to recover the table structures, or require a huge amount of training data and time-consuming sequential decoders. In this paper, we propose an alternative paradigm. We model TSR as a logical location regression problem and propose a new TSR framework called LORE, standing for LOgical location REgression network, which for the first time combines logical location regression together with spatial location regression of table cells. Our proposed LORE is conceptually simpler, easier to train and more accurate than previous TSR models of other paradigms. Experiments on standard benchmarks demonstrate that LORE consistently outperforms prior art. Code is available at https://github.com/AlibabaResearch/AdvancedLiterateMachinery/tree/main/DocumentUnderstanding/LORE-TSR.

Submitted 7 March, 2023; originally announced March 2023.

arXiv:2301.13546 [pdf, ps, other]

Joint Task Offloading and Cache Placement for Energy-Efficient Mobile Edge Computing Systems

Authors: Jingxuan Liang, Hong Xing, Feng Wang, Vincent K. N. Lau

Abstract: This letter investigates a cache-enabled multiuser mobile edge computing (MEC) system with dynamic task arrivals, taking into account the impact of proactive cache placement on the system's overall energy consumption. We consider that an access point (AP) schedules a wireless device (WD) to offload computational tasks while executing the tasks of a finite library in the \emph{task caching} phase, such that the nearby WDs with the same task request arriving later can directly download the task results in the \emph{task arrival and execution} phase. We aim to minimize the system's weighted-sum energy over a finite-time horizon, by jointly optimizing the task caching decision and the MEC execution of the AP, and local computing as well as task offloading of the WDs at each time slot, subject to caching capacity, task causality, and completion deadline constraints. The formulated design problem is a mixed-integer nonlinear program. Under the assumption of fully predictable task arrivals, we first propose a branch-and-bound (BnB) based method to obtain the optimal offline solution. Next, we propose two low-complexity schemes based on convex relaxation and task-popularity, respectively. Finally, numerical results show the benefit of the proposed schemes over existing benchmark schemes.

Submitted 31 January, 2023; originally announced January 2023.

Comments: 5 pages, 3 figures, accepted for publication in WCL

arXiv:2208.02989 [pdf, other]

Covariant-Contravariant Refinement Modal $μ$-calculus

Authors: Huili Xing

Abstract: The notion of covariant-contravariant refinement (CC-refinement, for short) is a generalization of the notions of bisimulation, simulation and refinement. This paper introduces CC-refinement modal $μ$-calculus (CCRML$^μ$) obtained from the modal $μ$-calculus system K$^μ$ by adding CC-refinement quantifiers, establishes an axiom system for CCRML$^μ$ and explores the important properties: soundness, completeness and decidability of this axiom system. The language of CCRML$^μ$ may be considered as a specification language for describing the properties of a system referring to reactive and generative actions. It may be used to formalize some interesting problems in the field of formal methods.

Submitted 5 August, 2022; originally announced August 2022.

arXiv:2207.07795 [pdf, other]

doi 10.1145/3503161.3548344

RCRN: Real-world Character Image Restoration Network via Skeleton Extraction

Authors: Daqian Shi, Xiaolei Diao, Hao Tang, Xiaomin Li, Hao Xing, Hao Xu

Abstract: Constructing high-quality character image datasets is challenging because real-world images are often affected by image degradation. There are limitations when applying current image restoration methods to such real-world character images, since (i) the categories of noise in character images are different from those in general images; (ii) real-world character images usually contain more complex image degradation, e.g., mixed noise at different noise levels. To address these problems, we propose a real-world character restoration network (RCRN) to effectively restore degraded character images, where character skeleton information and scale-ensemble feature extraction are utilized to obtain better restoration performance. The proposed method consists of a skeleton extractor (SENet) and a character image restorer (CiRNet). SENet aims to preserve the structural consistency of the character and normalize complex noise. Then, CiRNet reconstructs clean images from degraded character images and their skeletons. Due to the lack of benchmarks for real-world character image restoration, we constructed a dataset containing 1,606 character images with real-world degradation to evaluate the validity of the proposed method. The experimental results demonstrate that RCRN outperforms state-of-the-art methods quantitatively and qualitatively.

Submitted 19 July, 2022; v1 submitted 15 July, 2022; originally announced July 2022.

Comments: Accepted to ACM MM 2022

arXiv:2207.07564 [pdf, other]

Rethinking Attention Mechanism in Time Series Classification

Authors: Bowen Zhao, Huanlai Xing, Xinhan Wang, Fuhong Song, Zhiwen Xiao

Abstract: Attention-based models have been widely used in many areas, such as computer vision and natural language processing. However, relevant applications in time series classification (TSC) have not been explored deeply yet, so a significant number of TSC algorithms still suffer from general problems of the attention mechanism, like quadratic complexity. In this paper, we promote the efficiency and performance of the attention mechanism by proposing our flexible multi-head linear attention (FMLA), which enhances locality awareness by layer-wise interactions with deformable convolutional blocks and online knowledge distillation. What's more, we propose a simple but effective mask mechanism that helps reduce the noise influence in time series and decrease the redundancy of the proposed FMLA by masking some positions of each given series proportionally. To stabilize this mechanism, samples are forwarded through the model with random mask layers several times and their outputs are aggregated to teach the same model with regular mask layers. We conduct extensive experiments on 85 UCR2018 datasets to compare our algorithm with 11 well-known ones and the results show that our algorithm has comparable performance in terms of top-1 accuracy. We also compare our model with three Transformer-based models with respect to the floating-point operations per second and number of parameters and find that our algorithm achieves significantly better efficiency with lower complexity.

Submitted 14 July, 2022; originally announced July 2022.
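
The proportional masking and output aggregation mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' FMLA implementation: the function names, the choice of 0.0 as the masked value, and the use of a plain mean for aggregation are all assumptions made for illustration, and `model` stands for any hypothetical callable that maps a series to a list of outputs.

```python
import random


def mask_proportionally(series, ratio, rng):
    """Return a copy of `series` with round(ratio * len(series)) positions zeroed.

    Zero as the masked value is an assumption; the abstract does not specify it.
    """
    n_masked = round(ratio * len(series))
    masked_idx = set(rng.sample(range(len(series)), n_masked))
    return [0.0 if i in masked_idx else x for i, x in enumerate(series)]


def aggregate_masked_outputs(model, series, ratio, n_passes, seed=0):
    """Forward several randomly masked copies through `model` and average the outputs.

    Averaging is an assumed aggregation rule, used here only as an example.
    """
    rng = random.Random(seed)
    outputs = [model(mask_proportionally(series, ratio, rng)) for _ in range(n_passes)]
    n_out = len(outputs[0])
    return [sum(o[k] for o in outputs) / n_passes for k in range(n_out)]
```

In a distillation setup such as the one the abstract describes, the aggregated output would serve as the teaching signal for the regularly masked model; that training loop is outside the scope of this sketch.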

arXiv:2207.07269 [pdf, other]

Weakly Supervised Video Salient Object Detection via Point Supervision

Authors: Shuyong Gao, Haozhe Xing, Wei Zhang, Yan Wang, Qianyu Guo, Wenqiang Zhang

Abstract: Video salient object detection models trained on pixel-wise dense annotation have achieved excellent performance, yet obtaining pixel-by-pixel annotated datasets is laborious. Several works attempt to use scribble annotations to mitigate this problem, but point supervision, a more labor-saving annotation method (even the most labor-saving method among manual annotation methods for dense prediction), has not been explored. In this paper, we propose a strong baseline model based on point supervision. To infer saliency maps with temporal information, we mine inter-frame complementary information from short-term and long-term perspectives, respectively. Specifically, we propose a hybrid token attention module, which mixes optical flow and image information from orthogonal directions, adaptively highlighting critical optical flow information (channel dimension) and critical token information (spatial dimension). To exploit long-term cues, we develop the Long-term Cross-Frame Attention module (LCFA), which assists the current frame in inferring salient objects based on multi-frame tokens. Furthermore, we label two point-supervised datasets, P-DAVIS and P-DAVSOD, by relabeling the DAVIS and the DAVSOD dataset. Experiments on the six benchmark datasets illustrate that our method outperforms the previous state-of-the-art weakly supervised methods and is even comparable with some fully supervised approaches. Source code and datasets are available.

Submitted 14 July, 2022; originally announced July 2022.

Comments: accepted by ACM MM 2022

arXiv:2207.06351 [pdf, other]

doi 10.1016/j.patcog.2022.108806

Joint Prediction of Monocular Depth and Structure using Planar and Parallax Geometry

Authors: Hao Xing, Yifan Cao, Maximilian Biber, Mingchuan Zhou, Darius Burschka

Abstract: Supervised learning depth estimation methods can achieve good performance when trained on high-quality ground-truth, like LiDAR data. However, LiDAR can only generate sparse 3D maps, which causes information loss. High-quality per-pixel ground-truth depth data is difficult to acquire. In order to overcome this limitation, we propose a novel approach combining structure information from a promising Plane and Parallax geometry pipeline with depth information into a U-Net supervised learning network, which results in quantitative and qualitative improvement compared to existing popular learning-based methods. In particular, the model is evaluated on two large-scale and challenging datasets: KITTI Vision Benchmark and Cityscapes dataset and achieves the best performance in terms of relative error. Compared with pure depth supervision models, our model has impressive performance on depth prediction of thin objects and edges, and compared to structure prediction baseline, our model performs more robustly.

Submitted 13 July, 2022; originally announced July 2022.

Comments: Pattern Recognition, May 2022

arXiv:2207.05493 [pdf, other]

Skeletal Human Action Recognition using Hybrid Attention based Graph Convolutional Network

Authors: Hao Xing, Darius Burschka

Abstract: In skeleton-based action recognition, Graph Convolutional Networks model human skeletal joints as vertices and connect them through an adjacency matrix, which can be seen as a local attention mask. However, in most existing Graph Convolutional Networks, the local attention mask is defined based on natural connections of human skeleton joints and ignores the dynamic relations, for example between head, hands and feet joints. In addition, the attention mechanism has been proven effective in Natural Language Processing and image description, but is rarely investigated in existing methods. In this work, we propose a new adaptive spatial attention layer that extends the local attention map to a global one based on relative distance and relative angle information. Moreover, we design a new initial graph adjacency matrix that connects head, hands and feet, which shows visible improvement in terms of action recognition accuracy. The proposed model is evaluated on two large-scale and challenging datasets in the field of human activities in daily life: NTU-RGB+D and Kinetics skeleton. The results demonstrate that our model has strong performance on both datasets.

Submitted 12 July, 2022; originally announced July 2022.

Comments: 26th International Conference on Pattern Recognition, 2022

arXiv:2206.14506 [pdf, other]

An extension of process calculus for asynchronous communications between agents with epistemic states

Authors: Huili Xing

Abstract: Modeling an agent's epistemic state and its change plays a central role in intelligent agent systems. Asynchrony plays a key role in distributed systems, in which the messages transmitted may not be received instantly by the agents. To characterize asynchronous communications, asynchronous announcement logic (AAL) has been presented, which focuses on the logic laws of the change of epistemic state after receiving information. However, AAL does not involve the interactive behaviours between an agent and its environment. Through enriching the well-known pi-calculus by adding the operators for passing basic facts and applying the well-known action model logic to describe agents' epistemic states, this paper presents the e-calculus to model epistemic interactions between agents with epistemic states. The e-calculus can be adopted to characterize synchronous and asynchronous communications between agents. To capture the asynchrony, a buffer pool is constructed to store the basic facts announced and each agent reads these facts from this buffer pool in some order. Based on the transmission of link names, the e-calculus is able to realize reading from this buffer pool in different orders. This paper gives two examples: one is to read in the order in which the announced basic facts are sent (First-in-first-out, FIFO), and the other is in an arbitrary order.

Submitted 24 February, 2023; v1 submitted 29 June, 2022; originally announced June 2022.

Comments: 22 pages and 2 figures
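
The two reading orders named in the abstract above (FIFO and arbitrary) can be illustrated with a small sketch. This is an ordinary Python model and not the e-calculus itself: the `BufferPool` class, its method names, and the simplification that a single reader removes each fact it reads are assumptions made for illustration.

```python
import random
from collections import deque


class BufferPool:
    """Hypothetical pool of announced basic facts (single-reader simplification)."""

    def __init__(self):
        self._facts = deque()

    def announce(self, fact):
        # Announced facts are stored in the order they were sent.
        self._facts.append(fact)

    def read_fifo(self):
        # First-in-first-out: return the oldest announced fact.
        return self._facts.popleft()

    def read_arbitrary(self, rng):
        # Arbitrary order: return any stored fact, chosen here at random.
        idx = rng.randrange(len(self._facts))
        fact = self._facts[idx]
        del self._facts[idx]
        return fact

    def __len__(self):
        return len(self._facts)
```

With facts announced as `p`, `q`, `r`, a FIFO reader always sees `p` first, while an arbitrary-order reader may see any of the three.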

arXiv:2205.09925 [pdf, other]

On Jointly Optimizing Partial Offloading and SFC Mapping: A Cooperative Dual-agent Deep Reinforcement Learning Approach

Authors: Xinhan Wang, Huanlai Xing, Fuhong Song, Shouxi Luo, Penglin Dai, Bowen Zhao

Abstract: Multi-access edge computing (MEC) and network function virtualization (NFV) are promising technologies to support emerging IoT applications, especially those that are computation-intensive. In an NFV-enabled MEC environment, a service function chain (SFC), i.e., a set of ordered virtual network functions (VNFs), can be mapped on MEC servers. Mobile devices (MDs) can offload computation-intensive applications, which can be represented by SFCs, fully or partially to MEC servers for remote execution. This paper studies the partial offloading and SFC mapping joint optimization (POSMJO) problem in an NFV-enabled MEC system, where an incoming task can be partitioned into two parts, one for local execution and the other for remote execution. The objective is to minimize the average cost in the long term, which is a combination of execution delay, MD's energy consumption, and usage charge for edge computing. This problem consists of two closely related decision-making steps, namely task partition and VNF placement, which is highly complex and quite challenging. To address this, we propose a cooperative dual-agent deep reinforcement learning (CDADRL) algorithm, where we design a framework enabling interaction between two agents. Simulation results show that the proposed algorithm outperforms three combinations of deep reinforcement learning algorithms in terms of cumulative and average episodic rewards and it outperforms a number of baseline algorithms with respect to execution delay, energy consumption, and usage charge.

Submitted 19 May, 2022; originally announced May 2022.

arXiv:2202.12028 [pdf, other]

Evolutionary Multi-Objective Reinforcement Learning Based Trajectory Control and Task Offloading in UAV-Assisted Mobile Edge Computing

Authors: Fuhong Song, Huanlai Xing, Xinhan Wang, Shouxi Luo, Penglin Dai, Zhiwen Xiao, Bowen Zhao

Abstract: This paper studies the trajectory control and task offloading (TCTO) problem in an unmanned aerial vehicle (UAV)-assisted mobile edge computing system, where a UAV flies along a planned trajectory to collect computation tasks from smart devices (SDs). We consider a scenario in which SDs are not directly connected to the base station (BS) and the UAV has two roles to play: MEC server or wireless relay. The UAV makes task offloading decisions online, in which the collected tasks can be executed locally on the UAV or offloaded to the BS for remote processing. The TCTO problem involves multi-objective optimization as its objectives are to minimize the task delay and the UAV's energy consumption, and maximize the number of tasks collected by the UAV, simultaneously. This problem is challenging because the three objectives conflict with each other. The existing reinforcement learning (RL) algorithms, either single-objective RLs or single-policy multi-objective RLs, cannot address the problem well since they cannot output multiple policies for various preferences (i.e., weights) across objectives in a single run. This paper adapts the evolutionary multi-objective RL (EMORL), a multi-policy multi-objective RL, to the TCTO problem. This algorithm can output multiple optimal policies in just one run, each optimizing a certain preference. The simulation results demonstrate that the proposed algorithm can obtain better nondominated policies by striking a balance between the three objectives regarding policy quality, compared with two evolutionary and two multi-policy RL algorithms.

Submitted 24 February, 2022; originally announced February 2022.

arXiv:2201.00011 [pdf, other]

An Efficient Federated Distillation Learning System for Multi-task Time Series Classification

Authors: Huanlai Xing, Zhiwen Xiao, Rong Qu, Zonghai Zhu, Bowen Zhao

Abstract: This paper proposes an efficient federated distillation learning system (EFDLS) for multi-task time series classification (TSC). EFDLS consists of a central server and multiple mobile users, where different users may run different TSC tasks. EFDLS has two novel components, namely a feature-based student-teacher (FBST) framework and a distance-based weights matching (DBWM) scheme. Within each user, the FBST framework transfers knowledge from its teacher's hidden layers to its student's hidden layers via knowledge distillation, with the teacher and student having identical network structure. For each connected user, its student model's hidden layers' weights are uploaded to the EFDLS server periodically. The DBWM scheme is deployed on the server, with the least square distance used to measure the similarity between the weights of two given models. This scheme finds a partner for each connected user such that the user's and its partner's weights are the closest among all the weights uploaded. The server exchanges and sends back the user's and its partner's weights to these two users which then load the received weights to their teachers' hidden layers. Experimental results show that the proposed EFDLS achieves excellent performance on a set of selected UCR2018 datasets regarding top-1 accuracy.

Submitted 30 December, 2021; originally announced January 2022.

Comments: 11 pages
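
The partner-matching step of the DBWM scheme described in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, each model's hidden-layer weights are assumed to be flattened into a plain list of floats of equal length, and the least square distance is taken to be the sum of squared element-wise differences.

```python
def squared_distance(w_a, w_b):
    """Sum of squared differences between two equally long weight vectors."""
    return sum((a - b) ** 2 for a, b in zip(w_a, w_b))


def find_partners(uploaded):
    """Map each user id to the other user whose uploaded weights are closest.

    `uploaded` is a dict {user_id: flattened_weights}; at least two users
    are assumed to be connected.
    """
    partners = {}
    for user, weights in uploaded.items():
        candidates = [
            (squared_distance(weights, other_w), other)
            for other, other_w in uploaded.items()
            if other != user
        ]
        # min() picks the smallest distance; ties fall back to the user id.
        partners[user] = min(candidates)[1]
    return partners
```

Note that matching under this rule need not be symmetric: a user's closest partner may itself be closer to a third user.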

arXiv:2112.14893 [pdf]

doi 10.1039/D0CP06378A

Reversible Upper Confidence Bound Algorithm to Generate Diverse Optimized Candidates

Authors: Bin Chong, Yingguang Yang, Zi-Le Wang, Hang Xing, Zhirong Liu

Abstract: Most algorithms for the multi-armed bandit problem in reinforcement learning aim to maximize the expected reward, and are thus useful in searching for the optimized candidate with the highest reward (function value) for diverse applications (e.g., AlphaGo). However, in some typical application scenarios such as drug discovery, the aim is to search for a diverse set of candidates with high reward. Here we propose a reversible upper confidence bound (rUCB) algorithm for such a purpose, and demonstrate its application in virtual screening upon intrinsically disordered proteins (IDPs). It is shown that rUCB greatly reduces the query times while achieving both high accuracy and low performance loss. The rUCB may have potential application in multipoint optimization and other reinforcement-learning cases.

Submitted 29 December, 2021; originally announced December 2021.

Comments: 10 pages, 10 figures

Journal ref: Phys. Chem. Chem. Phys. 23 (11), 6800-6806 (2021)

arXiv:2110.04039 [pdf, other]

Global Context Enhanced Social Recommendation with Hierarchical Graph Neural Networks

Authors: Huance Xu, Chao Huang, Yong Xu, Lianghao Xia, Hao Xing, Dawei Yin

Abstract: Social recommendation aims to leverage social connections among users to enhance the recommendation performance. With the revival of deep learning techniques, many efforts have been devoted to developing various neural network-based social recommender systems, such as attention mechanisms and graph-based message passing frameworks. However, two important challenges have not been well addressed yet: (i) Most existing social recommendation models fail to fully explore the multi-type user-item interactive behavior as well as the underlying cross-relational inter-dependencies. (ii) While the learned social state vector is able to model pair-wise user dependencies, it still has limited representation capacity in capturing the global social context across users. To tackle these limitations, we propose a new Social Recommendation framework with Hierarchical Graph Neural Networks (SR-HGNN). In particular, we first design a relation-aware reconstructed graph neural network to inject the cross-type collaborative semantics into the recommendation framework. In addition, we further augment SR-HGNN with a social relation encoder based on the mutual information learning paradigm between low-level user embeddings and high-level global representation, which endows SR-HGNN with the capability of capturing the global social contextual signals. Empirical results on three public benchmarks demonstrate that SR-HGNN significantly outperforms state-of-the-art recommendation methods. Source codes are available at: https://github.com/xhcdream/SR-HGNN.

Submitted 8 October, 2021; originally announced October 2021.

Comments: Published as a full paper at ICDM 2020

arXiv:2110.03987 [pdf, other]

Knowledge-aware Coupled Graph Neural Network for Social Recommendation

Authors: Chao Huang, Huance Xu, Yong Xu, Peng Dai, Lianghao Xia, Mengyin Lu, Liefeng Bo, Hao Xing, Xiaoping Lai, Yanfang Ye

Abstract: The social recommendation task aims to predict users' preferences over items with the incorporation of social connections among users, so as to alleviate the sparsity issue of collaborative filtering. While many recent efforts show the effectiveness of neural network-based social recommender systems, several important challenges have not been well addressed yet: (i) The majority of models only consider users' social connections, while ignoring the inter-dependent knowledge across items; (ii) Most existing solutions are designed for a single type of user-item interactions, making them unable to capture the interaction heterogeneity; (iii) The dynamic nature of user-item interactions has been less explored in many social-aware recommendation techniques. To tackle the above challenges, this work proposes a Knowledge-aware Coupled Graph Neural Network (KCGN) that jointly injects the inter-dependent knowledge across items and users into the recommendation framework. KCGN enables the high-order user- and item-wise relation encoding by exploiting the mutual information for global graph structure awareness. Additionally, we further augment KCGN with the capability of capturing dynamic multi-typed user-item interactive patterns. Experimental studies on real-world datasets show the effectiveness of our method against many strong baselines in a variety of settings. Source codes are available at: https://github.com/xhcdream/KCGN.

Submitted 8 October, 2021; originally announced October 2021.

Comments: Published as a paper at AAAI 2021

arXiv:2109.02376 [pdf, ps, other]

Robust Event Detection based on Spatio-Temporal Latent Action Unit using Skeletal Information

Authors: Hao Xing, Yuxuan Xue, Mingchuan Zhou, Darius Burschka

Abstract: This paper proposes a novel dictionary learning approach to detect event action using skeletal information extracted from RGBD video. The event action is represented as several latent atoms and composed of latent spatial and temporal attributes. We demonstrate the method on the example of fall event detection. The skeleton frames are clustered by an initial K-means method. Each skeleton frame is assigned a varying weight parameter and fed into our Gradual Online Dictionary Learning (GODL) algorithm. During the training process, outlier frames will be gradually filtered by reducing the weight that is inversely proportional to a cost. In order to strictly distinguish the event action from similar actions and robustly acquire its action unit, we build a latent unit temporal structure for each sub-action. We evaluate the proposed method on parts of the NTURGB+D dataset, which includes 209 fall videos, 405 ground-lift videos, 420 sit-down videos, and 280 videos of 46 other actions. We present the experimental validation of the achieved accuracy, recall and precision. Our approach achieves the best performance on precision and accuracy of human fall event detection, compared with other existing dictionary learning methods. With increasing noise ratio, our method retains the highest accuracy and the lowest variance.

Submitted 1 October, 2021; v1 submitted 6 September, 2021; originally announced September 2021.

Comments: 2021 IROS

ACM Class: I.5.1; I.5.2; I.5.3

arXiv:2109.01164 [pdf, other]

Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development

Authors: Mingkuan Liu, Chi Zhang, Hua Xing, Chao Feng, Monchu Chen, Judith Bishop, Grace Ngapo

Abstract: This paper introduces a human-in-the-loop (HITL) data annotation pipeline to generate high-quality, large-scale speech datasets. The pipeline combines human and machine advantages to more quickly, accurately, and cost-effectively annotate datasets with machine pre-labeling and fully manual auditing. Quality control mechanisms such as blind testing, behavior monitoring, and data validation have been adopted in the annotation pipeline to mitigate potential bias introduced by machine-generated labels. Our A/B testing and pilot results demonstrated the HITL pipeline can improve annotation speed and capacity by at least 80% and quality is comparable to or higher than manual double pass annotation. We are leveraging this scalable pipeline to create and continuously grow ultra-high volume off-the-shelf (UHV-OTS) speech corpora for multiple languages, with the capability to expand to 10,000+ hours per language annually. Customized datasets can be produced from the UHV-OTS corpora using dynamic packaging. UHV-OTS is a long-term Appen project to support commercial and academic research data needs in speech processing. Appen will donate a number of free speech datasets from the UHV-OTS each year to support academic and open source community research under the CC-BY-SA license. We are also releasing the code of the data pre-processing and pre-tagging pipeline under the Apache 2.0 license to allow reproduction of the results reported in the paper.

Submitted 1 September, 2021; originally announced September 2021.

Comments: Submitted to NeurIPS 2021 Datasets and Benchmarks Track (Round 2)

arXiv:2108.04682 [pdf, other]

ChemiRise: a data-driven retrosynthesis engine

Authors: Xiangyan Sun, Ke Liu, Yuquan Lin, Lingjie Wu, Haoming Xing, Minghong Gao, Ji Liu, Suocheng Tan, Zekun Ni, Qi Han, Junqiu Wu, Jie Fan

Abstract: We have developed an end-to-end retrosynthesis system, named ChemiRise, that can propose complete retrosynthesis routes for organic compounds rapidly and reliably. The system was trained on a processed patent database of over 3 million organic reactions. Experimental reactions were atom-mapped, clustered, and extracted into reaction templates. We then trained a graph convolutional neural network-based one-step reaction proposer using template embeddings and developed a guiding algorithm on the directed acyclic graph (DAG) of chemical compounds to find the best candidate to explore. The atom-mapping algorithm and the one-step reaction proposer were benchmarked against previous studies and showed better results. The final product was demonstrated by retrosynthesis routes reviewed and rated by human experts, showing satisfying functionality and a potential productivity boost in real-life use cases.

Submitted 9 August, 2021; originally announced August 2021.

arXiv:2103.11220 [pdf, ps, other]

Joint Resource Allocation and Cache Placement for Location-Aware Multi-User Mobile Edge Computing

Authors: Jiechen Chen, Hong Xing, Xiaohui Lin, Arumugam Nallanathan, Suzhi Bi

Abstract: With the growing demand for latency-critical and computation-intensive Internet of Things (IoT) services, the IoT-oriented network architecture, mobile edge computing (MEC), has emerged as a promising technique to reinforce the computation capability of the resource-constrained IoT devices. To exploit the cloud-like functions at the network edge, service caching has been implemented to reuse the computation task input/output data, thus effectively reducing the delay incurred by data retransmissions and repeated execution of the same task. In a multi-user cache-assisted MEC system, users' preferences for different types of services, possibly dependent on their locations, play an important role in joint design of communication, computation and service caching. In this paper, we consider multiple representative locations, where users at the same location share the same preference profile for a given set of services. Specifically, by exploiting the location-aware users' preference profiles, we propose joint optimization of the binary cache placement, the edge computation resource and the bandwidth allocation to minimize the expected sum-energy consumption, subject to the bandwidth and the computation limitations as well as the service latency constraints. To effectively solve the mixed-integer non-convex problem, we propose a deep learning (DL)-based offline cache placement scheme using a novel stochastic quantization based discrete-action generation method. The proposed hybrid learning framework advocates both benefits from the model-free DL approach and the model-based optimization. The simulations verify that the proposed DL-based scheme saves roughly 33% and 6.69% of energy consumption compared with the greedy caching and the popular caching, respectively, while achieving up to 99.01% of the optimal performance.

Submitted 6 August, 2022; v1 submitted 20 March, 2021; originally announced March 2021.

Comments: 32 pages, 9 figures, accepted to IEEE Internet of Things Journal

arXiv:2103.04162 [pdf, other]

Molecular modeling with machine-learned universal potential functions

Authors: Ke Liu, Zekun Ni, Zhenyu Zhou, Suocheng Tan, Xun Zou, Haoming Xing, Xiangyan Sun, Qi Han, Junqiu Wu, Jie Fan

Abstract: Molecular modeling is an important topic in drug discovery. Decades of research have led to the development of high quality scalable molecular force fields. In this paper, we show that neural networks can be used to train a universal approximator for energy potential functions. By incorporating a fully automated training process we have been able to train smooth, differentiable, and predictive potential functions on large-scale crystal structures. A variety of tests have also been performed to show the superiority and versatility of the machine-learned model.

Submitted 19 April, 2021; v1 submitted 6 March, 2021; originally announced March 2021.

arXiv:2101.12704 [pdf, ps, other]

doi 10.1109/JSAC.2021.3118400

Federated Learning over Wireless Device-to-Device Networks: Algorithms and Convergence Analysis

Authors: Hong Xing, Osvaldo Simeone, Suzhi Bi

Abstract: The proliferation of Internet-of-Things (IoT) devices and cloud-computing applications over siloed data centers is motivating renewed interest in the collaborative training of a shared model by multiple individual clients via federated learning (FL). To improve the communication efficiency of FL implementations in wireless systems, recent works have proposed compression and dimension reduction mechanisms, along with digital and analog transmission schemes that account for channel noise, fading, and interference. The prior art has mainly focused on star topologies consisting of distributed clients and a central server. In contrast, this paper studies FL over wireless device-to-device (D2D) networks by providing theoretical insights into the performance of digital and analog implementations of decentralized stochastic gradient descent (DSGD). First, we introduce generic digital and analog wireless implementations of communication-efficient DSGD algorithms, leveraging random linear coding (RLC) for compression and over-the-air computation (AirComp) for simultaneous analog transmissions. Next, under the assumptions of convexity and connectivity, we provide convergence bounds for both implementations. The results demonstrate the dependence of the optimality gap on the connectivity and on the signal-to-noise ratio (SNR) levels in the network. The analysis is corroborated by experiments on an image-classification task.

Submitted 12 October, 2021; v1 submitted 29 January, 2021; originally announced January 2021.

Comments: 46 pages, 9 figures, to appear in IEEE J. Sel. Areas Commun

arXiv:2011.11829 [pdf, other]

RTFN: A Robust Temporal Feature Network for Time Series Classification

Authors: Zhiwen Xiao, Xin Xu, Huanlai Xing, Shouxi Luo, Penglin Dai, Dawei Zhan

Abstract: Time series data usually contains local and global patterns. Most of the existing feature networks pay more attention to local features rather than the relationships among them. The latter is, however, also important yet more difficult to explore. To obtain sufficient representations by a feature network is still challenging. To this end, we propose a novel robust temporal feature network (RTFN) for feature extraction in time series classification, containing a temporal feature network (TFN) and an LSTM-based attention network (LSTMaN). TFN is a residual structure with multiple convolutional layers. It functions as a local-feature extraction network to mine sufficient local features from data. LSTMaN is composed of two identical layers, where attention and long short-term memory (LSTM) networks are hybridized. This network acts as a relation extraction network to discover the intrinsic relationships among the extracted features at different positions in sequential data. In experiments, we embed RTFN into a supervised structure as a feature extractor and into an unsupervised structure as an encoder, respectively. The results show that the RTFN-based structures achieve excellent supervised and unsupervised performance on a large number of UCR2018 and UEA2018 datasets.

Submitted 28 December, 2020; v1 submitted 23 November, 2020; originally announced November 2020.

Comments: 41 pages, 7 figures, Revised Paper

arXiv:2008.07707 [pdf, other]

RTFN: Robust Temporal Feature Network

Authors: Zhiwen Xiao, Xin Xu, Huanlai Xing, Juan Chen

Abstract: Time series analysis plays a vital role in various applications, for instance, healthcare, weather prediction, disaster forecast, etc. However, to obtain sufficient shapelets by a feature network is still challenging. To this end, we propose a novel robust temporal feature network (RTFN) that contains temporal feature networks and attentional LSTM networks. The temporal feature networks are built to extract basic features from input data while the attentional LSTM networks are devised to capture complicated shapelets and relationships to enrich features. In experiments, we embed RTFN into supervised structure as a feature extraction network and into unsupervised clustering as an encoder, respectively. The results show that the RTFN-based supervised structure is a winner of 40 out of 85 datasets and the RTFN-based unsupervised clustering performs the best on 4 out of 11 datasets in the UCR2018 archive.

Submitted 28 December, 2020; v1 submitted 17 August, 2020; originally announced August 2020.

Comments: 10 pages, 6 figures

arXiv:2004.14319 [pdf, ps, other]

Real-Time Resource Allocation for Wireless Powered Multiuser Mobile Edge Computing With Energy and Task Causality

Authors: Feng Wang, Hong Xing, Jie Xu

Abstract: This paper considers a wireless powered multiuser mobile edge computing (MEC) system, in which a multi-antenna hybrid access point (AP) wirelessly charges multiple users, and each user relies on the harvested energy to execute computation tasks. We jointly optimize the energy beamforming and remote task execution at the AP, as well as the local computing and task offloading, aiming to minimize the total system energy consumption over a finite time horizon, subject to causality constraints for both energy harvesting and task arrival at the users. In particular, we consider a practical scenario with causal task state information (TSI) and channel state information (CSI), i.e., only the current and previous TSI and CSI are available, but the future TSI and CSI can only be predicted subject to certain errors. To solve this real-time resource allocation problem, we propose an offline-optimization inspired online design approach. First, we consider the offline optimization case by assuming that the TSI and CSI are perfectly known a priori. In this case, the energy minimization problem corresponds to a convex problem, for which the semi-closed-form optimal solution is obtained via the Lagrange duality method. Next, inspired by the optimal offline solution, we propose a sliding-window based online resource allocation design in practical cases by integrating with the sequential optimization. Finally, numerical results show that the proposed joint wireless powered MEC designs significantly improve the system's energy efficiency, as compared with the benchmark schemes that consider a sliding window of size one or without such joint optimization.

Submitted 24 July, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

Comments: This paper mainly addresses offline/online joint-WPT-MEC designs over time by considering dynamic task arrivals at multiple users; 15 pages, 7 figures, full paper, and accepted for publication in IEEE TCOM

arXiv:2004.09783 [pdf]

STDPG: A Spatio-Temporal Deterministic Policy Gradient Agent for Dynamic Routing in SDN

Authors: Juan Chen, Zhiwen Xiao, Huanlai Xing, Penglin Dai, Shouxi Luo, Muhammad Azhar Iqbal

Abstract: Dynamic routing in software-defined networking (SDN) can be viewed as a centralized decision-making problem. Most of the existing deep reinforcement learning (DRL) agents can address it, thanks to the deep neural network (DNN) incorporated. However, fully-connected feed-forward neural network (FFNN) is usually adopted, where spatial correlation and temporal variation of traffic flows are ignored. This drawback usually leads to significantly high computational complexity due to the large number of training parameters. To overcome this problem, we propose a novel model-free framework for dynamic routing in SDN, which is referred to as spatio-temporal deterministic policy gradient (STDPG) agent. Both the actor and critic networks are based on identical DNN structure, where a combination of convolutional neural network (CNN) and long short-term memory network (LSTM) with temporal attention mechanism, CNN-LSTM-TAM, is devised. By efficiently exploiting spatial and temporal features, CNN-LSTM-TAM helps the STDPG agent learn better from the experience transitions. Furthermore, we employ the prioritized experience replay (PER) method to accelerate the convergence of model training. The experimental results show that STDPG can automatically adapt to the current network environment and achieve robust convergence. Compared with a number of state-of-the-art DRL agents, STDPG achieves better routing solutions in terms of the average end-to-end delay.

Submitted 21 April, 2020; originally announced April 2020.

Comments: 6 pages, 5 figures, accepted by IEEE ICC 2020

arXiv:2002.12507 [pdf, ps, other]

Decentralized Federated Learning via SGD over Wireless D2D Networks

Authors: Hong Xing, Osvaldo Simeone, Suzhi Bi

Abstract: Federated Learning (FL), an emerging paradigm for fast intelligent acquisition at the network edge, enables joint training of a machine learning model over distributed data sets and computing resources with limited disclosure of local data. Communication is a critical enabler of large-scale FL due to significant amount of model information exchanged among edge devices. In this paper, we consider a network of wireless devices sharing a common fading wireless channel for the deployment of FL. Each device holds a generally distinct training set, and communication typically takes place in a Device-to-Device (D2D) manner. In the ideal case in which all devices within communication range can communicate simultaneously and noiselessly, a standard protocol that is guaranteed to converge to an optimal solution of the global empirical risk minimization problem under convexity and connectivity assumptions is Decentralized Stochastic Gradient Descent (DSGD). DSGD integrates local SGD steps with periodic consensus averages that require communication between neighboring devices. In this paper, wireless protocols are proposed that implement DSGD by accounting for the presence of path loss, fading, blockages, and mutual interference. The proposed protocols are based on graph coloring for scheduling and on both digital and analog transmission strategies at the physical layer, with the latter leveraging over-the-air computing via sparsity-based recovery.

Submitted 27 February, 2020; originally announced February 2020.

Comments: 5 pages, 3 figures, submitted for possible conference publication

arXiv:1908.09334 [pdf, other]

Collaborative Computation Offloading in Wireless Powered Mobile-Edge Computing Systems

Authors: Binqi He, Suzhi Bi, Hong Xing, Xiaohui Lin

Abstract: This paper studies a novel user cooperation model in a wireless powered mobile edge computing system where two wireless users harvest wireless power transferred by one energy node and can offload part of their computation tasks to an edge server (ES) for remote execution. In particular, we consider that the direct communication link between one user and the ES is blocked, such that the other user acts as a relay to forward its offloading data to the server. Meanwhile, instead of forwarding all the received task data, we also allow the helping user to compute part of the received task locally to reduce the potentially high energy and time cost on task offloading to the ES. Our aim is to maximize the amount of data that can be processed within a given time frame of the two users by jointly optimizing the amount of task data computed at each device (users and ES), the system time allocation, the transmit power and CPU frequency of the users. We propose an efficient method to find the optimal solution and show that the proposed user cooperation can effectively enhance the computation performance of the system compared to other representative benchmark methods under different scenarios.

Submitted 4 September, 2019; v1 submitted 25 August, 2019; originally announced August 2019.

Comments: The paper is accepted for publication by IEEE GLOBECOM 2019, at Waikoloa, HI, USA, in Dec. 2019

arXiv:1908.06334 [pdf, ps, other]

Energy-Efficient Proactive Caching for Fog Computing with Correlated Task Arrivals

Authors: Hong Xing, Jingjing Cui, Yansha Deng, Arumugam Nallanathan

Abstract: With the proliferation of latency-critical applications, fog-radio network (FRAN) has been envisioned as a paradigm shift enabling distributed deployment of cloud-clone facilities at the network edge. In this paper, we consider proactive caching for a one-user one-access point (AP) fog computing system over a finite time horizon, in which consecutive tasks of the same type of application are temporally correlated. Under the assumption of predictable length of the task-input bits, we formulate a long-term weighted-sum energy minimization problem with three-slot correlation to jointly optimize computation offloading policies and caching decisions subject to stringent per-slot deadline constraints. The formulated problem is hard to solve due to the mixed-integer non-convexity. To tackle this challenge, first, we assume that task-related information is perfectly known a priori, and provide an offline solution leveraging the technique of semi-definite relaxation (SDR), thereby serving as theoretical upper bound. Next, based on the offline solution, we propose a sliding-window based online algorithm under arbitrarily distributed prediction error. Finally, the advantage of computation caching as well as the proposed algorithm is verified by numerical examples by comparison with several benchmarks.

Submitted 17 August, 2019; originally announced August 2019.

Comments: 5 pages, pre-print version for IEEE SPAWC 2019

arXiv:1907.11384 [pdf, other]

Product Image Recognition with Guidance Learning and Noisy Supervision

Authors: Qing Li, Xiaojiang Peng, Liangliang Cao, Wenbin Du, Hao Xing, Yu Qiao

Abstract: This paper considers recognizing products from daily photos, which is an important problem in real-world applications but also challenging due to background clutters, category diversities, noisy labels, etc. We address this problem by two contributions. First, we introduce a novel large-scale product image dataset, termed as Product-90. Instead of collecting product images by labor- and time-intensive image capturing, we take advantage of the web and download images from the reviews of several e-commerce websites where the images are casually captured by consumers. Labels are assigned automatically by the categories of e-commerce websites. In total, Product-90 consists of more than 140K images with 90 categories. Due to the fact that consumers may upload unrelated images, it is inevitable that our Product-90 introduces noisy labels. As the second contribution, we develop a simple yet efficient guidance learning (GL) method for training convolutional neural networks (CNNs) with noisy supervision. The GL method first trains an initial teacher network with the full noisy dataset, and then trains a target/student network with both large-scale noisy set and small manually-verified clean set in a multi-task manner. Specifically, in the stage of student network training, the large-scale noisy data is supervised by its guidance knowledge which is the combination of its given noisy label and the softened label from the teacher network. We conduct extensive experiments on our Product-90 and public datasets, namely Food101, Food-101N, and Clothing1M. Our guidance learning method achieves performance superior to state-of-the-art methods on these datasets.

Submitted 26 July, 2019; originally announced July 2019.

Comments: 10 pages

arXiv:1902.10017 [pdf, ps, other]

Joint Task Assignment and Resource Allocation for D2D-Enabled Mobile-Edge Computing

Authors: Hong Xing, Liang Liu, Jie Xu, Arumugam Nallanathan

Abstract: With the proliferation of computation-extensive and latency-critical applications in the 5G and beyond networks, mobile-edge computing (MEC) or fog computing, which provides cloud-like computation and/or storage capabilities at the network edge, is envisioned to reduce computation latency as well as to conserve energy for wireless devices (WDs). This paper studies a novel device-to-device (D2D)-enabled multi-helper MEC system, in which a local user solicits its nearby WDs serving as helpers for cooperative computation. We assume a time division multiple access (TDMA) transmission protocol, under which the local user offloads the tasks to multiple helpers and downloads the results from them over orthogonal pre-scheduled time slots. Under this setup, we minimize the computation latency by optimizing the local user's task assignment jointly with the time and rate for task offloading and results downloading, as well as the computation frequency for task execution, subject to individual energy and computation capacity constraints at the local user and the helpers. However, the formulated problem is a mixed-integer non-linear program (MINLP) that is difficult to solve. To tackle this challenge, we propose an efficient algorithm by first relaxing the original problem into a convex one, and then constructing a suboptimal task assignment solution based on the obtained optimal one. Next, we consider a benchmark scheme that endows the WDs with their maximum computation capacities. To further reduce the implementation complexity, we also develop a heuristic scheme based on the greedy task assignment. Finally, numerical results validate the effectiveness of our proposed algorithm, as compared against the heuristic scheme and other benchmark ones without either joint optimization of radio and computation resources or task assignment design.

Submitted 26 February, 2019; originally announced February 2019.

Comments: 32 pages, 8 figures, accepted by IEEE Transactions on Communications

arXiv:1902.08779 [pdf, ps, other]

Optimal Resource Allocation for Wireless Powered Mobile Edge Computing with Dynamic Task Arrivals

Authors: Feng Wang, Hong Xing, Jie Xu

Abstract: This paper considers a wireless powered multiuser mobile edge computing (MEC) system, where a multi-antenna access point (AP) employs the radio-frequency (RF) signal based wireless power transfer (WPT) to charge a number of distributed users, and each user utilizes the harvested energy to execute computation tasks via local computing and task offloading. We consider the frequency division multiple access (FDMA) protocol to support simultaneous task offloading from multiple users to the AP. Different from previous works that considered one-shot optimization with static task models, we study the joint computation and wireless resource allocation optimization with dynamic task arrivals over a finite time horizon consisting of multiple slots. Under this setup, our objective is to minimize the system energy consumption including the AP's transmission energy and the MEC server's computing energy over the whole horizon, by jointly optimizing the transmit energy beamforming at the AP, and the local computing and task offloading strategies at the users over different time slots. To characterize the fundamental performance limit of such systems, we focus on the offline optimization by assuming the task and channel information are known a priori at the AP. In this case, the energy minimization problem corresponds to a convex optimization problem. Leveraging the Lagrange duality method, we obtain the optimal solution to this problem in a well-structured form. It is shown that in order to maximize the system energy efficiency, the optimal number of task input-bits at each user and the AP are monotonically increasing over time, and the offloading strategies at different users depend on both the wireless channel conditions and the task load at the AP. Numerical results demonstrate the benefit of the proposed joint-WPT-MEC design over alternative benchmark schemes without such joint design.

Submitted 23 February, 2019; originally announced February 2019.

Comments: 7 pages, 3 figures, accepted by IEEE ICC 2019, Shanghai, China

arXiv:1809.00966 [pdf, ps, other]

Energy-Efficient Mobile-Edge Computation Offloading for Applications with Shared Data

Authors: Xiangyu He, Hong Xing, Yue Chen, Arumugam Nallanathan

Abstract: Mobile-edge computation offloading (MECO) has been recognized as a promising solution to alleviate the burden of resource-limited Internet of Things (IoT) devices by offloading computation tasks to the edge of cellular networks (also known as cloudlet). Specifically, latency-critical applications such as virtual reality (VR) and augmented reality (AR) have inherent collaborative properties since part of the input/output data are shared by different users in proximity. In this paper, we consider a multi-user fog computing system, in which multiple single-antenna mobile users running applications featuring shared data can choose between (partially) offloading their individual tasks to a nearby single-antenna cloudlet for remote execution and performing pure local computation. The mobile users' energy minimization is formulated as a convex problem, subject to the total computing latency constraint, the total energy constraints for individual data downloading, and the computing frequency constraints for local computing, for which classical Lagrangian duality can be applied to find the optimal solution. Based upon the semi-closed form solution, the shared data proves to be transmitted by only one of the mobile users instead of multiple ones. Besides, compared to those baseline algorithms without considering the shared data property or the mobile users' local computing capabilities, the proposed joint computation offloading and communications resource allocation provides significant energy saving.

Submitted 4 September, 2018; originally announced September 2018.

Comments: 6 pages, 3 figures, accepted by IEEE Globecom 2018

arXiv:1803.06236 [pdf]

Chemi-net: a graph convolutional network for accurate drug property prediction

Authors: Ke Liu, Xiangyan Sun, Lei Jia, Jun Ma, Haoming Xing, Junqiu Wu, Hua Gao, Yax Sun, Florian Boulnois, Jie Fan

Abstract: Absorption, distribution, metabolism, and excretion (ADME) studies are critical for drug discovery. Conventionally, these tasks, together with other chemical property predictions, rely on domain-specific feature descriptors, or fingerprints. Following the recent success of neural networks, we developed Chemi-Net, a completely data-driven, domain knowledge-free, deep learning method for ADME property prediction. To compare the relative performance of Chemi-Net with Cubist, one of the popular machine learning programs used by Amgen, a large-scale ADME property prediction study was performed on-site at Amgen. The results showed that our deep neural network method improved current methods by a large margin. We foresee that the significantly increased accuracy of ADME prediction seen with Chemi-Net over Cubist will greatly accelerate drug discovery.

Submitted 21 March, 2018; v1 submitted 16 March, 2018; originally announced March 2018.

arXiv:1802.06862 [pdf, ps, other]

Joint Task Assignment and Wireless Resource Allocation for Cooperative Mobile-Edge Computing

Authors: Hong Xing, Liang Liu, Jie Xu, Arumugam Nallanathan

Abstract: This paper studies a multi-user cooperative mobile-edge computing (MEC) system, in which a local mobile user can offload intensive computation tasks to multiple nearby edge devices serving as helpers for remote execution. We focus on the scenario where the local user has a number of independent tasks that can be executed in parallel but cannot be further partitioned. We consider a time division multiple access (TDMA) communication protocol, in which the local user can offload computation tasks to the helpers and download results from them over pre-scheduled time slots. Under this setup, we minimize the local user's computation latency by optimizing the task assignment jointly with the time and power allocations, subject to individual energy constraints at the local user and the helpers. However, the joint task assignment and wireless resource allocation problem is a mixed-integer non-linear program (MINLP) that is hard to solve optimally. To tackle this challenge, we first relax it into a convex problem, and then propose an efficient suboptimal solution based on the optimal solution to the relaxed convex problem. Finally, numerical results show that our proposed joint design significantly reduces the local user's computation latency, as compared against other benchmark schemes that design the task assignment separately from the offloading/downloading resource allocations and local execution.

Submitted 7 February, 2018; originally announced February 2018.

Comments: 6 pages, 4 figures, accepted by IEEE International Conference on Communications (ICC), Kansas City, MO, USA, 2018

Showing 1–50 of 64 results for author: Xing, H