-
Navigating High-Degree Heterogeneity: Federated Learning in Aerial and Space Networks
Authors:
Fan Dong,
Henry Leung,
Steve Drew
Abstract:
Federated learning offers a compelling solution to the challenges of networking and data privacy within aerial and space networks by utilizing vast private edge data and computing capabilities accessible through drones, balloons, and satellites. While current research has focused on optimizing the learning process, computing efficiency, and minimizing communication overhead, the issue of heterogeneity and class imbalance remains a significant barrier to rapid model convergence. In our study, we explore the influence of heterogeneity on class imbalance, which diminishes performance in ASN-based federated learning. We illustrate the correlation between heterogeneity and class imbalance within grouped data and show how constraints such as battery life exacerbate the class imbalance challenge. Our findings indicate that ASN-based FL faces heightened class imbalance issues even with similar levels of heterogeneity compared to other scenarios. Finally, we analyze the impact of varying degrees of heterogeneity on FL training and evaluate the efficacy of current state-of-the-art algorithms under these conditions. Our results reveal that the heterogeneity challenge is more pronounced in ASN-based federated learning and that prevailing algorithms often fail to effectively address high levels of heterogeneity.
Submitted 25 June, 2024;
originally announced June 2024.
-
Progressive Confident Masking Attention Network for Audio-Visual Segmentation
Authors:
Yuxuan Wang,
Feng Dong,
**chao Zhu
Abstract:
Audio and visual signals typically occur simultaneously, and humans possess an innate ability to correlate and synchronize information from these two modalities. Recently, a challenging problem known as Audio-Visual Segmentation (AVS) has emerged, intending to produce segmentation maps for sounding objects within a scene. However, the methods proposed so far have not sufficiently integrated audio and visual information, and the computational costs have been extremely high. Additionally, the outputs of different stages have not been fully utilized. To facilitate this research, we introduce a novel Progressive Confident Masking Attention Network (PMCANet). It leverages attention mechanisms to uncover the intrinsic correlations between audio signals and visual frames. Furthermore, we design an efficient and effective cross-attention module to enhance semantic perception by selecting query tokens. This selection is determined through confidence-driven units based on the network's multi-stage predictive outputs. Experiments demonstrate that our network outperforms other AVS methods while requiring less computational resources.
Submitted 4 June, 2024;
originally announced June 2024.
-
Tight Characterizations for Preprocessing against Cryptographic Salting
Authors:
Fangqi Dong,
Qipeng Liu,
Kewen Wu
Abstract:
Cryptography often considers the strongest yet plausible attacks in the real world. Preprocessing (a.k.a. non-uniform attack) plays an important role in both theory and practice: an efficient online attacker can take advantage of advice prepared by a time-consuming preprocessing stage.
Salting is a heuristic strategy to counter preprocessing attacks by feeding a small amount of randomness to the cryptographic primitive. We present general and tight characterizations of preprocessing against cryptographic salting, with upper bounds matching the advantages of the most intuitive attack. Our result quantitatively strengthens the previous work by Coretti, Dodis, Guo, and Steinberger (EUROCRYPT'18). Our proof exploits a novel connection between the non-uniform security of salted games and direct product theorems for memoryless algorithms.
For quantum adversaries, we give similar characterizations for property finding games, resolving an open problem of the quantum non-uniform security of salted collision resistant hash by Chung, Guo, Liu, and Qian (FOCS'20). Our proof extends the compressed oracle framework of Zhandry (CRYPTO'19) to prove quantum strong direct product theorems for property finding games in the average-case hardness.
Submitted 30 May, 2024;
originally announced May 2024.
-
SimiSketch: Efficiently Estimating Similarity of streaming Multisets
Authors:
Fenghao Dong,
Yang He,
Yutong Liang,
Zirui Liu,
Yuhan Wu,
Peiqing Chen,
Tong Yang
Abstract:
The challenge of estimating similarity between sets has been a significant concern in data science, finding diverse applications across various domains. However, previous approaches, such as MinHash, have predominantly centered around hashing techniques, which are well-suited for sets but less naturally adaptable to multisets, a common occurrence in scenarios like network streams and text data. Moreover, with the increasing prevalence of data arriving in streaming patterns, many existing methods struggle to handle cases where set items are presented in a continuous stream. Consequently, our focus in this paper is on the challenging scenario of multisets with item streams. To address this, we propose SimiSketch, a sketching algorithm designed to tackle this specific problem. The paper begins by presenting two simpler versions that employ intuitive sketches for similarity estimation. Subsequently, we formally introduce SimiSketch and leverage SALSA to enhance accuracy. To validate our algorithms, we conduct extensive testing on synthetic datasets, real-world network traffic, and text articles. Our experiment shows that compared with the state-of-the-art, SimiSketch can improve the accuracy by up to 42 times, and increase the throughput by up to 360 times. The complete source code is open-sourced and available on GitHub for reference.
Submitted 30 May, 2024;
originally announced May 2024.
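As a point of reference for the SimiSketch entry above, the following minimal sketch computes an exact similarity between two multisets built from item streams. It assumes the generalized (weighted) Jaccard measure, that is, the sum of per-item minimum counts divided by the sum of per-item maximum counts; the abstract does not state which similarity SimiSketch estimates, and the function name `multiset_jaccard` is hypothetical. This is not the SimiSketch algorithm, only the kind of exact baseline that a sketch would approximate with less memory.

```python
from collections import Counter
from typing import Hashable, Iterable


def multiset_jaccard(stream_a: Iterable[Hashable], stream_b: Iterable[Hashable]) -> float:
    """Exact generalized Jaccard similarity of two multisets (assumed measure)."""
    counts_a = Counter(stream_a)  # item -> multiplicity in stream A
    counts_b = Counter(stream_b)  # item -> multiplicity in stream B
    # Counter & Counter keeps the per-item minimum count,
    # Counter | Counter keeps the per-item maximum count.
    intersection = sum((counts_a & counts_b).values())
    union = sum((counts_a | counts_b).values())
    # Two empty multisets are treated as identical by convention.
    return intersection / union if union else 1.0


if __name__ == "__main__":
    print(multiset_jaccard(["a", "a", "b", "c"], ["a", "b", "b", "d"]))
```

The exact computation stores one counter per distinct item, which is what becomes impractical for long streams and motivates a fixed-size sketch.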
-
FedGreen: Carbon-aware Federated Learning with Model Size Adaptation
Authors:
Ali Abbasi,
Fan Dong,
Xin Wang,
Henry Leung,
Jiayu Zhou,
Steve Drew
Abstract:
Federated learning (FL) provides a promising collaborative framework to build a model from distributed clients, and this work investigates the carbon emission of the FL process. Cloud and edge servers hosting FL clients may exhibit diverse carbon footprints influenced by their geographical locations with varying power sources, offering opportunities to reduce carbon emissions by training local models with adaptive computations and communications. In this paper, we propose FedGreen, a carbon-aware FL approach to efficiently train models by adopting adaptive model sizes shared with clients based on their carbon profiles and locations using ordered dropout as a model compression technique. We theoretically analyze the trade-offs between the produced carbon emissions and the convergence accuracy, considering the carbon intensity discrepancy across countries to choose the parameters optimally. Empirical studies show that FedGreen can substantially reduce the carbon footprints of FL compared to the state-of-the-art while maintaining competitive model accuracy.
Submitted 23 April, 2024;
originally announced April 2024.
-
Fundamental Limits of Communication-Assisted Sensing in ISAC Systems
Authors:
Fuwang Dong,
Fan Liu,
Shihang Liu,
Yifeng Xiong,
Weijie Yuan,
Yuanhao Cui
Abstract:
In this paper, we introduce a novel communication-assisted sensing (CAS) framework that explores the potential coordination gains offered by the integrated sensing and communication technique. The CAS system endows users with beyond-line-of-the-sight sensing capabilities, supported by a dual-functional base station that enables simultaneous sensing and communication. To delve into the system's fundamental limits, we characterize the information-theoretic framework of the CAS system in terms of rate-distortion theory. We reveal the achievable overall distortion between the target's state and the reconstructions at the end-user, referred to as the sensing quality of service, within a special case where the distortion metric is separable for sensing and communication processes. As a case study, we employ a typical application to demonstrate distortion minimization under the ISAC signaling strategy, showcasing the potential of CAS in enhancing sensing capabilities.
Submitted 23 April, 2024; v1 submitted 11 April, 2024;
originally announced April 2024.
-
Msmsfnet: a multi-stream and multi-scale fusion net for edge detection
Authors:
Chenguang Liu,
Chisheng Wang,
Feifei Dong,
Xin Su,
Chuanhua Zhu,
De** Zhang,
Qingquan Li
Abstract:
Edge detection is a long-standing problem in computer vision. Recent deep learning based algorithms achieve state-of-the-art performance on publicly available datasets. Despite the efficiency of these algorithms, their performance relies heavily on the pretrained weights of the backbone network on the ImageNet dataset. This heavily limits the design space of deep learning based edge detectors. Whenever we want to devise a new model, we have to train this new model on the ImageNet dataset first, and then fine-tune the model using the edge detection datasets. The comparison would be unfair otherwise. However, it is usually not feasible for many researchers to train a model on the ImageNet dataset due to limited computation resources. In this work, we study the performance that can be achieved by state-of-the-art deep learning based edge detectors on publicly available datasets when they are trained from scratch, and devise a new network architecture, the multi-stream and multi-scale fusion net (msmsfnet), for edge detection. We show in our experiments that by training all models from scratch to ensure the fairness of comparison, our model outperforms state-of-the-art deep learning based edge detectors on three publicly available datasets.
Submitted 7 April, 2024;
originally announced April 2024.
-
NODLINK: An Online System for Fine-Grained APT Attack Detection and Investigation
Authors:
Shaofei Li,
Feng Dong,
Xusheng Xiao,
Haoyu Wang,
Fei Shao,
Jiedong Chen,
Yao Guo,
Xiangqun Chen,
Ding Li
Abstract:
Advanced Persistent Threats (APT) attacks have plagued modern enterprises, causing significant financial losses. To counter these attacks, researchers propose techniques that capture the complex and stealthy scenarios of APT attacks by using provenance graphs to model system entities and their dependencies. Particularly, to accelerate attack detection and reduce financial losses, online provenance-based detection systems that detect and investigate APT attacks under the constraints of timeliness and limited resources are in dire need. Unfortunately, existing online systems usually sacrifice detection granularity to reduce computational complexity and produce provenance graphs with more than 100,000 nodes, posing challenges for security admins to interpret the detection results. In this paper, we design and implement NodLink, the first online detection system that maintains high detection accuracy without sacrificing detection granularity. Our insight is that the APT attack detection process in online provenance-based detection systems can be modeled as a Steiner Tree Problem (STP), which has efficient online approximation algorithms that recover concise attack-related provenance graphs with a theoretically bounded error. To utilize STP approximation algorithm frameworks for APT attack detection, we propose a novel design of in-memory cache, an efficient attack screening method, and a new STP approximation algorithm that is more efficient than the conventional one in APT attack detection while maintaining the same complexity. We evaluate NodLink in a production environment. The open-world experiment shows that NodLink outperforms two state-of-the-art (SOTA) online provenance analysis systems by achieving magnitudes higher detection and investigation accuracy while having the same or higher throughput.
Submitted 4 November, 2023;
originally announced November 2023.
-
Random ISAC Signals Deserve Dedicated Precoding
Authors:
Shihang Lu,
Fan Liu,
Fuwang Dong,
Yifeng Xiong,
Jie Xu,
Ya-Feng Liu,
Shi **
Abstract:
Radar systems typically employ well-designed deterministic signals for target sensing, while integrated sensing and communications (ISAC) systems have to adopt random signals to convey useful information. This paper analyzes the sensing and ISAC performance relying on random signaling in a multi-antenna system. Towards this end, we define a new sensing performance metric, namely, ergodic linear minimum mean square error (ELMMSE), which characterizes the estimation error averaged over random ISAC signals. Then, we investigate a data-dependent precoding (DDP) scheme to minimize the ELMMSE in sensing-only scenarios, which attains the optimized performance at the cost of high implementation overhead. To reduce the cost, we present an alternative data-independent precoding (DIP) scheme by stochastic gradient projection (SGP). Moreover, we shed light on the optimal structures of both sensing-only DDP and DIP precoders. As a further step, we extend the proposed DDP and DIP approaches to ISAC scenarios, which are solved via a tailored penalty-based alternating optimization algorithm. Our numerical results demonstrate that the proposed DDP and DIP methods achieve substantial performance gains over conventional ISAC signaling schemes that treat the signal sample covariance matrix as deterministic, which proves that random ISAC signals deserve dedicated precoding designs.
Submitted 31 March, 2024; v1 submitted 3 November, 2023;
originally announced November 2023.
-
MPAI-EEV: Standardization Efforts of Artificial Intelligence based End-to-End Video Coding
Authors:
Chuanmin Jia,
Feng Ye,
Fanke Dong,
Kai Lin,
Leonardo Chiariglione,
Siwei Ma,
Huifang Sun,
Wen Gao
Abstract:
The rapid advancement of artificial intelligence (AI) technology has led to the prioritization of standardizing the processing, coding, and transmission of video using neural networks. To address this priority area, the Moving Picture, Audio, and Data Coding by Artificial Intelligence (MPAI) group is developing a suite of standards called MPAI-EEV for "end-to-end optimized neural video coding." The aim of this AI-based video standard project is to compress the number of bits required to represent high-fidelity video data by utilizing data-trained neural coding technologies. This approach is not constrained by how data coding has traditionally been applied in the context of a hybrid framework. This paper presents an overview of recent and ongoing standardization efforts in this area and highlights the key technologies and design philosophy of EEV. It also provides a comparison and report on some primary efforts such as the coding efficiency of the reference model. Additionally, it discusses emerging activities such as learned Unmanned-Aerial-Vehicles (UAVs) video coding which are currently planned, under development, or in the exploration phase. With a focus on UAV video signals, this paper addresses the current status of these preliminary efforts. It also indicates development timelines, summarizes the main technical details, and provides pointers to further points of reference. The exploration experiment shows that the EEV model performs better than the state-of-the-art video coding standard H.266/VVC in terms of perceptual evaluation metric.
Submitted 14 September, 2023;
originally announced September 2023.
-
Are we there yet? An Industrial Viewpoint on Provenance-based Endpoint Detection and Response Tools
Authors:
Feng Dong,
Shaofei Li,
Peng Jiang,
Ding Li,
Haoyu Wang,
Liangyi Huang,
Xusheng Xiao,
Jiedong Chen,
Xiapu Luo,
Yao Guo,
Xiangqun Chen
Abstract:
Provenance-Based Endpoint Detection and Response (P-EDR) systems are deemed crucial for future APT defenses. Although numerous new techniques to improve P-EDR systems have been proposed in academia, it is still unclear whether the industry will adopt P-EDR systems and what improvements the industry desires for P-EDR systems. To this end, we conduct the first set of systematic studies on the effectiveness and the limitations of P-EDR systems. Our study consists of four components: a one-to-one interview, an online questionnaire study, a survey of the relevant literature, and a systematic measurement study. Our research indicates that all industry experts consider P-EDR systems to be more effective than conventional Endpoint Detection and Response (EDR) systems. However, industry experts are concerned about the operating cost of P-EDR systems. In addition, our research reveals three significant gaps between academia and industry: (1) overlooking client-side overhead; (2) imbalanced alarm triage cost and interpretation cost; and (3) excessive server-side memory consumption. This paper's findings provide objective data on the effectiveness of P-EDR systems and how much improvement is needed to adopt P-EDR systems in industry.
Submitted 17 July, 2023;
originally announced July 2023.
-
Federated Learning on Non-iid Data via Local and Global Distillation
Authors:
Xiaolin Zheng,
Senci Ying,
Fei Zheng,
Jianwei Yin,
Longfei Zheng,
Chaochao Chen,
Fengqin Dong
Abstract:
Most existing federated learning algorithms are based on the vanilla FedAvg scheme. However, with the increase of data complexity and the number of model parameters, the amount of communication traffic and the number of iteration rounds for training such algorithms increases significantly, especially in non-independently and homogeneously distributed scenarios, where they do not achieve satisfactory performance. In this work, we propose FedND: federated learning with noise distillation. The main idea is to use knowledge distillation to optimize the model training process. In the client, we propose a self-distillation method to train the local model. In the server, we generate noisy samples for each client and use them to distill other clients. Finally, the global model is obtained by the aggregation of local models. Experimental results show that the algorithm achieves the best performance and is more communication-efficient than state-of-the-art methods.
Submitted 26 June, 2023;
originally announced June 2023.
-
Federated Learning Model Aggregation in Heterogenous Aerial and Space Networks
Authors:
Fan Dong,
Ali Abbasi,
Henry Leung,
Xin Wang,
Jiayu Zhou,
Steve Drew
Abstract:
Federated learning offers a promising approach under the networking and data privacy constraints of aerial and space networks (ASNs), utilizing large-scale private edge data from drones, balloons, and satellites. Existing research has extensively studied the optimization of the learning process, computing efficiency, and communication overhead. An important yet often overlooked aspect is that participants contribute predictive knowledge of varying diversity, affecting the quality of the learned federated models. In this paper, we propose a novel approach to address this issue by introducing a Weighted Averaging and Client Selection (WeiAvgCS) framework that emphasizes updates from high-diversity clients and diminishes the influence of those from low-diversity clients. Direct sharing of the data distribution may be prohibitive due to the additional private information that is sent from the clients. As such, we introduce an estimation for the diversity using a projection-based method. Extensive experiments have been performed to show WeiAvgCS's effectiveness. WeiAvgCS could converge 46% faster on FashionMNIST and 38% faster on CIFAR10 than its benchmarks on average in our experiments.
Submitted 16 April, 2024; v1 submitted 24 May, 2023;
originally announced May 2023.
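The weighted-averaging idea in the WeiAvgCS entry above can be illustrated with a small sketch. It assumes each client already has a non-negative scalar diversity score (the abstract estimates diversity with a projection-based method, which is not reproduced here). The function name `weighted_average` and the normalization of scores by their sum are illustrative assumptions, not the authors' implementation, and client selection is omitted.

```python
from typing import List, Sequence


def weighted_average(updates: Sequence[Sequence[float]], scores: Sequence[float]) -> List[float]:
    """Aggregate client updates, weighting each by its (assumed) diversity score."""
    if not updates or len(updates) != len(scores):
        raise ValueError("need one score per client update")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    dim = len(updates[0])
    aggregated = [0.0] * dim
    for update, score in zip(updates, scores):
        if len(update) != dim:
            raise ValueError("all updates must have the same length")
        weight = score / total  # higher-diversity clients contribute more
        for i, value in enumerate(update):
            aggregated[i] += weight * value
    return aggregated


if __name__ == "__main__":
    # Two hypothetical clients; the second has three times the diversity score.
    print(weighted_average([[1.0, 2.0], [3.0, 4.0]], [1.0, 3.0]))
```

With equal scores this reduces to a plain unweighted mean of the client updates.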
-
Integrated Sensing and Communications: Recent Advances and Ten Open Challenges
Authors:
Shihang Lu,
Fan Liu,
Yunxin Li,
Kecheng Zhang,
Hongjia Huang,
Jiaqi Zou,
Xinyu Li,
Yuxiang Dong,
Fuwang Dong,
Jia Zhu,
Yifeng Xiong,
Weijie Yuan,
Yuanhao Cui,
Lajos Hanzo
Abstract:
It is anticipated that integrated sensing and communications (ISAC) would be one of the key enablers of next-generation wireless networks (such as beyond 5G (B5G) and 6G) for supporting a variety of emerging applications. In this paper, we provide a comprehensive review of the recent advances in ISAC systems, with a particular focus on their foundations, system design, networking aspects and ISAC applications. Furthermore, we discuss the corresponding open questions of the above that emerged in each issue. Hence, we commence with the information theory of sensing and communications (S$\&$C), followed by the information-theoretic limits of ISAC systems by shedding light on the fundamental performance metrics. Next, we discuss their clock synchronization and phase offset problems, the associated Pareto-optimal signaling strategies, as well as the associated super-resolution ISAC system design. Moreover, we envision that ISAC ushers in a paradigm shift for the future cellular networks relying on network sensing, transforming the classic cellular architecture, cross-layer resource management methods, and transmission protocols. In ISAC applications, we further highlight the security and privacy issues of wireless sensing. Finally, we close by studying the recent advances in a representative ISAC use case, namely the multi-object multi-task (MOMT) recognition problem using wireless signals.
Submitted 17 December, 2023; v1 submitted 29 April, 2023;
originally announced May 2023.
-
Segment Anything Model for Medical Images?
Authors:
Yuhao Huang,
Xin Yang,
Lian Liu,
Han Zhou,
Ao Chang,
Xinrui Zhou,
Rusi Chen,
Junxuan Yu,
Jiongquan Chen,
Chaoyu Chen,
Si**g Liu,
Haozhe Chi,
Xindi Hu,
Kejuan Yue,
Lei Li,
Vicente Grau,
Deng-** Fan,
Fa** Dong,
Dong Ni
Abstract:
The Segment Anything Model (SAM) is the first foundation model for general image segmentation. It has achieved impressive results on various natural image segmentation tasks. However, medical image segmentation (MIS) is more challenging because of the complex modalities, fine anatomical structures, uncertain and complex object boundaries, and wide-range object scales. To fully validate SAM's performance on medical data, we collected and sorted 53 open-source datasets and built a large medical segmentation dataset with 18 modalities, 84 objects, 125 object-modality paired targets, 1050K 2D images, and 6033K masks. We comprehensively analyzed different models and strategies on the so-called COSMOS 1050K dataset. Our findings mainly include the following: 1) SAM showed remarkable performance in some specific objects but was unstable, imperfect, or even totally failed in other situations. 2) SAM with the large ViT-H showed better overall performance than that with the small ViT-B. 3) SAM performed better with manual hints, especially box, than the Everything mode. 4) SAM could help human annotation with high labeling quality and less time. 5) SAM was sensitive to the randomness in the center point and tight box prompts, and may suffer from a serious performance drop. 6) SAM performed better than interactive methods with one or a few points, but will be outpaced as the number of points increases. 7) SAM's performance correlated to different factors, including boundary complexity, intensity differences, etc. 8) Finetuning the SAM on specific medical tasks could improve its average DICE performance by 4.39% and 6.68% for ViT-B and ViT-H, respectively. We hope that this comprehensive report can help researchers explore the potential of SAM applications in MIS, and guide how to appropriately use and develop SAM.
Submitted 17 January, 2024; v1 submitted 28 April, 2023;
originally announced April 2023.
-
Underwater Camouflage Object Detection Dataset
Authors:
Feng Dong,
**chao Zhu
Abstract:
We have built a dataset for camouflage object detection, mainly for complex seabed scenes, and named it UnderWater RGB&Sonar, or UW-RS for short. The UW-RS dataset contains a total of 1972 images. It consists of two parts: an underwater optical data part (the UW-R dataset) and an underwater sonar data part (the UW-S dataset).
Submitted 1 March, 2023;
originally announced March 2023.
-
Label Information Enhanced Fraud Detection against Low Homophily in Graphs
Authors:
Yuchen Wang,
**ghui Zhang,
Zhengjie Huang,
Weibin Li,
Shikun Feng,
Ziheng Ma,
Yu Sun,
Dianhai Yu,
Fang Dong,
Jiahui **,
Beilun Wang,
Junzhou Luo
Abstract:
Node classification is a substantial problem in graph-based fraud detection. Many existing works adopt Graph Neural Networks (GNNs) to enhance fraud detectors. While promising, most current GNN-based fraud detectors fail to generalize to the low homophily setting. Besides, label utilization has been shown to be a significant factor in node classification, but we find that label utilization methods are less effective in fraud detection tasks due to the low homophily in graphs. In this work, we propose GAGA, a novel Group AGgregation enhanced TrAnsformer, to tackle the above challenges. Specifically, the group aggregation provides a portable method to cope with the low homophily issue. Such an aggregation explicitly integrates the label information to generate distinguishable neighborhood information. Along with group aggregation, an attempt towards end-to-end trainable group encoding is proposed which augments the original feature space with the class labels. Meanwhile, we devise two additional learnable encodings to recognize the structural and relational context. Then, we combine the group aggregation and the learnable encodings into a Transformer encoder to capture the semantic information. Experimental results clearly show that GAGA outperforms other competitive graph-based fraud detectors by up to 24.39% on two trending public datasets and a real-world industrial dataset from Anonymous. Furthermore, the group aggregation is demonstrated to outperform other label utilization methods (e.g., C&S, BoT/UniMP) in the low homophily setting.
Submitted 20 February, 2023;
originally announced February 2023.
-
A Privacy-Preserving Hybrid Federated Learning Framework for Financial Crime Detection
Authors:
Haobo Zhang,
Junyuan Hong,
Fan Dong,
Steve Drew,
Liangjie Xue,
Jiayu Zhou
Abstract:
The recent decade witnessed a surge in financial crimes across the public and private sectors, with an average cost of scams of $102m to financial institutions in 2022. Developing a mechanism for battling financial crimes is an impending task that requires in-depth collaboration from multiple institutions, and yet such collaboration imposes significant technical challenges due to the privacy and security requirements of distributed financial data. For example, consider the modern payment network systems, which can generate millions of transactions per day across a large number of global institutions. Training a detection model of fraudulent transactions requires not only secured transactions but also the private account activities of those involved in each transaction from corresponding bank systems. The distributed nature of both samples and features prevents most existing learning systems from being directly adopted to handle the data mining task. In this paper, we collectively address these challenges by proposing a hybrid federated learning system that offers secure and privacy-aware learning and inference for financial crime detection. We conduct extensive empirical studies to evaluate the proposed framework's detection performance and privacy-protection capability, evaluating its robustness against common malicious attacks of collaborative learning. We release our source code at https://github.com/illidanlab/HyFL .
Submitted 18 April, 2023; v1 submitted 7 February, 2023;
originally announced February 2023.
-
Topology-aware Federated Learning in Edge Computing: A Comprehensive Survey
Authors:
Jiajun Wu,
Steve Drew,
Fan Dong,
Zhuangdi Zhu,
Jiayu Zhou
Abstract:
The ultra-low latency requirements of 5G/6G applications and privacy constraints call for distributed machine learning systems to be deployed at the edge. With its simple yet effective approach, federated learning (FL) is a natural solution for massive user-owned devices in edge computing with distributed and private training data. FL methods based on FedAvg typically follow a naive star topology, ignoring the heterogeneity and hierarchy of the volatile edge computing architectures and topologies in reality. Several other network topologies exist and can address the limitations and bottlenecks of the star topology. This motivates us to survey network topology-related FL solutions. In this paper, we conduct a comprehensive survey of the existing FL works focusing on network topologies. After a brief overview of FL and edge computing networks, we discuss various edge network topologies as well as their advantages and disadvantages. Lastly, we discuss the remaining challenges and future works for applying FL to topology-specific edge networks.
Submitted 17 June, 2024; v1 submitted 6 February, 2023;
originally announced February 2023.
-
Interactive Context-Aware Network for RGB-T Salient Object Detection
Authors:
Yuxuan Wang,
Feng Dong,
Jinchao Zhu
Abstract:
Salient object detection (SOD) focuses on distinguishing the most conspicuous objects in the scene. However, most related works are based on RGB images, which lose a large amount of useful information. Accordingly, with the maturity of thermal technology, RGB-T (RGB-Thermal) multi-modality tasks are attracting more and more attention. Thermal infrared images carry important information which can be used to improve the accuracy of SOD prediction. To accomplish this, methods to integrate multi-modal information and suppress noise are critical. In this paper, we propose a novel network called Interactive Context-Aware Network (ICANet). It contains three modules that can effectively perform the cross-modal and cross-scale fusions. We design a Hybrid Feature Fusion (HFF) module to integrate the features of two modalities, which utilizes two types of feature extraction. The Multi-Scale Attention Reinforcement (MSAR) and Upper Fusion (UF) blocks are responsible for the cross-scale fusion that converges different levels of features and generates the prediction maps. We also introduce a novel Context-Aware Multi-Supervised Network (CAMSNet) to calculate the content loss between the prediction and the ground truth (GT). Experiments show that our network performs favorably against the state-of-the-art RGB-T SOD methods.
Submitted 11 November, 2022;
originally announced November 2022.
-
Causality Learning With Wasserstein Generative Adversarial Networks
Authors:
Hristo Petkov,
Colin Hanley,
Feng Dong
Abstract:
Conventional methods for causal structure learning from data face significant challenges due to combinatorial search space. Recently, the problem has been formulated into a continuous optimization framework with an acyclicity constraint to learn Directed Acyclic Graphs (DAGs). Such a framework allows the utilization of deep generative models for causal structure learning to better capture the relations between data sample distributions and DAGs. However, so far no study has experimented with the use of Wasserstein distance in the context of causal structure learning. Our model named DAG-WGAN combines the Wasserstein-based adversarial loss with an acyclicity constraint in an auto-encoder architecture. It simultaneously learns causal structures while improving its data generation capability. We compare the performance of DAG-WGAN with other models that do not involve the Wasserstein metric in order to identify its contribution to causal structure learning. Our model performs better with high cardinality data according to our experiments.
Submitted 3 June, 2022;
originally announced June 2022.
-
Hyper-Learning for Gradient-Based Batch Size Adaptation
Authors:
Calum Robert MacLellan,
Feng Dong
Abstract:
Scheduling the batch size to increase is an effective strategy to control gradient noise when training deep neural networks. Current approaches implement scheduling heuristics that neglect structure within the optimization procedure, limiting their flexibility to the training dynamics and capacity to discern the impact of their adaptations on generalization. We introduce Arbiter as a new hyperparameter optimization algorithm to perform batch size adaptations for learnable scheduling heuristics using gradients from a meta-objective function, which overcomes previous heuristic constraints by enforcing a novel learning process called hyper-learning. With hyper-learning, Arbiter formulates a neural network agent to generate optimal batch size samples for an inner deep network by learning an adaptive heuristic through observing concomitant responses over T inner descent steps. Arbiter avoids unrolled optimization, and does not require hypernetworks to facilitate gradients, making it reasonably cheap, simple to implement, and versatile to different tasks. We demonstrate Arbiter's effectiveness in several illustrative experiments: to act as a stand-alone batch size scheduler; to complement fixed batch size schedules with greater flexibility; and to promote variance reduction during stochastic meta-optimization of the learning rate.
Submitted 17 May, 2022;
originally announced May 2022.
-
DAG-WGAN: Causal Structure Learning With Wasserstein Generative Adversarial Networks
Authors:
Hristo Petkov,
Colin Hanley,
Feng Dong
Abstract:
The combinatorial search space presents a significant challenge to learning causality from data. Recently, the problem has been formulated into a continuous optimization framework with an acyclicity constraint, allowing for the exploration of deep generative models to better capture data sample distributions and support the discovery of Directed Acyclic Graphs (DAGs) that faithfully represent the underlying data distribution. However, so far no study has investigated the use of Wasserstein distance for causal structure learning via generative models. This paper proposes a new model named DAG-WGAN, which combines the Wasserstein-based adversarial loss, an auto-encoder architecture together with an acyclicity constraint. DAG-WGAN simultaneously learns causal structures and improves its data generation capability by leveraging the strength from the Wasserstein distance metric. Compared with other models, it scales well and handles both continuous and discrete data. Our experiments have evaluated DAG-WGAN against the state-of-the-art and demonstrated its good performance.
Submitted 1 April, 2022;
originally announced April 2022.
-
Roadmap on Signal Processing for Next Generation Measurement Systems
Authors:
D. K. Iakovidis,
M. Ooi,
Y. C. Kuang,
S. Demidenko,
A. Shestakov,
V. Sinitsin,
M. Henry,
A. Sciacchitano,
A. Discetti,
S. Donati,
M. Norgia,
A. Menychtas,
I. Maglogiannis,
S. C. Wriessnegger,
L. A. Barradas Chacon,
G. Dimas,
D. Filos,
A. H. Aletras,
J. Töger,
F. Dong,
S. Ren,
A. Uhl,
J. Paziewski,
J. Geng,
F. Fioranelli
, et al. (9 additional authors not shown)
Abstract:
Signal processing is a fundamental component of almost any sensor-enabled system, with a wide range of applications across different scientific disciplines. Time series data, images, and video sequences comprise representative forms of signals that can be enhanced and analysed for information extraction and quantification. The recent advances in artificial intelligence and machine learning are shifting the research attention towards intelligent, data-driven, signal processing. This roadmap presents a critical overview of the state-of-the-art methods and applications aiming to highlight future challenges and research opportunities towards next generation measurement systems. It covers a broad spectrum of topics ranging from basic to industrial research, organized in concise thematic sections that reflect the trends and the impacts of current and future developments per research field. Furthermore, it offers guidance to researchers and funding agencies in identifying new prospects.
Submitted 28 January, 2022; v1 submitted 3 November, 2021;
originally announced November 2021.
-
Modal-Adaptive Gated Recoding Network for RGB-D Salient Object Detection
Authors:
Jinchao Zhu,
Xiaoyu Zhang,
Xian Fang,
Feng Dong,
Qiu Yu
Abstract:
The multi-modal salient object detection model based on RGB-D information has better robustness in the real world. However, it remains nontrivial to adaptively balance effective multi-modal information in the feature fusion phase. In this letter, we propose a novel gated recoding network (GRNet) to evaluate the information validity of the two modes, and balance their influence. Our framework is divided into three phases: perception phase, recoding mixing phase and feature integration phase. First, a perception encoder is adopted to extract multi-level single-modal features, which lays the foundation for multi-modal semantic comparative analysis. Then, a modal-adaptive gate unit (MGU) is proposed to suppress the invalid information and transfer the effective modal features to the recoding mixer and the hybrid branch decoder. The recoding mixer is responsible for recoding and mixing the balanced multi-modal information. Finally, the hybrid branch decoder completes the multi-level feature integration under the guidance of an optional edge guidance stream (OEGS). Experiments and analysis on eight popular benchmarks verify that our framework performs favorably against 9 state-of-the-art methods.
Submitted 9 November, 2021; v1 submitted 13 August, 2021;
originally announced August 2021.
-
Perception-and-Regulation Network for Salient Object Detection
Authors:
Jinchao Zhu,
Xiaoyu Zhang,
Xian Fang,
Feng Dong,
Li Yuehua,
Junnan Liu
Abstract:
Effective fusion of different types of features is the key to salient object detection. The majority of existing network structure designs are based on the subjective experience of scholars, and the process of feature fusion does not consider the relationship between the fused features and highest-level features. In this paper, we focus on the feature relationship and propose a novel global attention unit, which we term the "perception-and-regulation" (PR) block, that adaptively regulates the feature fusion process by explicitly modeling interdependencies between features. The perception part uses the structure of fully-connected layers in classification networks to learn the size and shape of objects. The regulation part selectively strengthens and weakens the features to be fused. An imitating eye observation module (IEO) is further employed for improving the global perception ability of the network. The imitation of foveal vision and peripheral vision enables IEO to scrutinize highly detailed objects and to organize the broad spatial scene to better segment objects. Extensive experiments conducted on SOD datasets demonstrate that the proposed method performs favorably against 22 state-of-the-art methods.
Submitted 10 February, 2022; v1 submitted 26 July, 2021;
originally announced July 2021.
-
Federated Evaluation and Tuning for On-Device Personalization: System Design & Applications
Authors:
Matthias Paulik,
Matt Seigel,
Henry Mason,
Dominic Telaar,
Joris Kluivers,
Rogier van Dalen,
Chi Wai Lau,
Luke Carlson,
Filip Granqvist,
Chris Vandevelde,
Sudeep Agarwal,
Julien Freudiger,
Andrew Byde,
Abhishek Bhowmick,
Gaurav Kapoor,
Si Beaumont,
Áine Cahill,
Dominic Hughes,
Omid Javidbakht,
Fei Dong,
Rehan Rishi,
Stanley Hung
Abstract:
We describe the design of our federated task processing system. Originally, the system was created to support two specific federated tasks: evaluation and tuning of on-device ML systems, primarily for the purpose of personalizing these systems. In recent years, support for an additional federated task has been added: federated learning (FL) of deep neural networks. To our knowledge, only one other system has been described in literature that supports FL at scale. We include comparisons to that system to help discuss design decisions and attached trade-offs. Finally, we describe two specific large scale personalization use cases in detail to showcase the applicability of federated tuning to on-device personalization and to highlight application specific solutions.
Submitted 16 February, 2021;
originally announced February 2021.
-
Deep Attention-based Representation Learning for Heart Sound Classification
Authors:
Zhao Ren,
Kun Qian,
Fengquan Dong,
Zhenyu Dai,
Yoshiharu Yamamoto,
Björn W. Schuller
Abstract:
Cardiovascular diseases are the leading cause of death and severely threaten human health in daily life. On the one hand, there have been dramatically increasing demands from both the clinical practice and the smart home application for monitoring the heart status of subjects suffering from chronic cardiovascular diseases. On the other hand, experienced physicians who can perform an efficient auscultation are still lacking in terms of number. Automatic heart sound classification leveraging the power of advanced signal processing and machine learning technologies has shown encouraging results. Nevertheless, human hand-crafted features are expensive and time-consuming. To this end, we propose a novel deep representation learning method with an attention mechanism for heart sound classification. In this paradigm, high-level representations are learnt automatically from the recorded heart sound data. Particularly, a global attention pooling layer improves the performance of the learnt representations by estimating the contribution of each unit in feature maps. The Heart Sounds Shenzhen (HSS) corpus (170 subjects involved) is used to validate the proposed method. Experimental results show that our approach achieves an unweighted average recall of 51.2% for classifying three categories of heart sounds, i.e., normal, mild, and moderate/severe, annotated by cardiologists with the help of echocardiography.
Submitted 13 January, 2021;
originally announced January 2021.
-
Learning under Concept Drift: A Review
Authors:
Jie Lu,
Anjin Liu,
Fan Dong,
Feng Gu,
Joao Gama,
Guangquan Zhang
Abstract:
Concept drift describes unforeseeable changes in the underlying distribution of streaming data over time. Concept drift research involves the development of methodologies and techniques for drift detection, understanding and adaptation. Data analysis has revealed that machine learning in a concept drift environment will result in poor learning results if the drift is not addressed. To help researchers identify which research topics are significant and how to apply related techniques in data analysis tasks, it is necessary that a high quality, instructive review of current research developments and trends in the concept drift field is conducted. In addition, due to the rapid development of concept drift in recent years, the methodologies of learning under concept drift have become noticeably systematic, unveiling a framework which has not been mentioned in literature. This paper reviews over 130 high quality publications in concept drift related research areas, analyzes up-to-date developments in methodologies and techniques, and establishes a framework of learning under concept drift including three main components: concept drift detection, concept drift understanding, and concept drift adaptation. This paper lists and discusses 10 popular synthetic datasets and 14 publicly available benchmark datasets used for evaluating the performance of learning algorithms aiming at handling concept drift. Also, concept drift related research directions are covered and discussed. By providing state-of-the-art knowledge, this survey will directly support researchers in their understanding of research developments in the field of learning under concept drift.
Submitted 13 April, 2020;
originally announced April 2020.
-
Contextual-Bandit Based Personalized Recommendation with Time-Varying User Interests
Authors:
Xiao Xu,
Fang Dong,
Yanghua Li,
Shaojian He,
Xin Li
Abstract:
A contextual bandit problem is studied in a highly non-stationary environment, which is ubiquitous in various recommender systems due to the time-varying interests of users. Two models with disjoint and hybrid payoffs are considered to characterize the phenomenon that users' preferences towards different items vary differently over time. In the disjoint payoff model, the reward of playing an arm is determined by an arm-specific preference vector, which is piecewise-stationary with asynchronous and distinct changes across different arms. An efficient learning algorithm that is adaptive to abrupt reward changes is proposed and theoretical regret analysis is provided to show that a sublinear scaling of regret in the time length $T$ is achieved. The algorithm is further extended to a more general setting with hybrid payoffs where the reward of playing an arm is determined by both an arm-specific preference vector and a joint coefficient vector shared by all arms. Empirical experiments are conducted on real-world datasets to verify the advantages of the proposed learning algorithms against baseline ones in both settings.
Submitted 29 February, 2020;
originally announced March 2020.
-
MadDroid: Characterising and Detecting Devious Ad Content for Android Apps
Authors:
Tianming Liu,
Haoyu Wang,
Li Li,
Xiapu Luo,
Feng Dong,
Yao Guo,
Liu Wang,
Tegawendé F. Bissyandé,
Jacques Klein
Abstract:
Advertisement drives the economy of the mobile app ecosystem. As a key component in the mobile ad business model, mobile ad content has been overlooked by the research community, which poses a number of threats, e.g., propagating malware and undesirable contents. To understand the practice of these devious ad behaviors, we perform a large-scale study on the app contents harvested through automated app testing. In this work, we first provide a comprehensive categorization of devious ad contents, including five kinds of behaviors belonging to two categories: ad loading content and ad clicking content. Then, we propose MadDroid, a framework for automated detection of devious ad contents. MadDroid leverages an automated app testing framework with a sophisticated ad view exploration strategy for effectively collecting ad-related network traffic and subsequently extracting ad contents. We then integrate dedicated approaches into the framework to identify devious ad contents. We have applied MadDroid to 40,000 Android apps and found that roughly 6% of apps deliver devious ad contents, e.g., distributing malicious apps that cannot be downloaded via traditional app markets. Experiment results indicate that devious ad contents are prevalent, suggesting that our community should invest more effort into the detection and mitigation of devious ads towards building a trustworthy mobile advertising ecosystem.
Submitted 5 February, 2020;
originally announced February 2020.
-
An Action Recognition network for specific target based on rMC and RPN
Authors:
Mingjie Li,
Youqian Feng,
Zhonghai Yin,
Cheng Zhou,
Fanghao Dong,
Yuan Lin,
Yuhao Dong
Abstract:
The traditional methods of action recognition are not specific to the operator, so results are easily disturbed when other actions are performed in the videos. A network based on a mixed convolutional ResNet and RPN is proposed in this paper. The rMC is tested on the UCF-101 data set and compared with the R3D method. The result shows that its accuracy reaches 71.07%. Meanwhile, the action recognition network is tested on our gesture and body posture data sets for specific targets. The simulation achieves good performance, with the running speed reaching 200 FPS. Finally, our model is improved by introducing the regression block and performs better, which shows the great potential of this model.
Submitted 19 June, 2019;
originally announced June 2019.
-
Impoved RPN for Single Targets Detection based on the Anchor Mask Net
Authors:
Mingjie Li,
Youqian Feng,
Zhonghai Yin,
Cheng Zhou,
Fanghao Dong
Abstract:
Common target detection is usually based on single-frame images, which makes it vulnerable to similar targets in the image and not applicable to video. In this paper, an anchor mask is proposed to add prior knowledge for target detection, and an anchor mask net is designed to improve the RPN performance for single-target detection. Tested on VOT2016, the model performs better.
Submitted 18 June, 2019;
originally announced June 2019.
-
FraudDroid: Automated Ad Fraud Detection for Android Apps
Authors:
Feng Dong,
Haoyu Wang,
Li Li,
Yao Guo,
Tegawende F. Bissyande,
Tianming Liu,
Guoai Xu,
Jacques Klein
Abstract:
Although mobile ad frauds have been widespread, state-of-the-art approaches in the literature have mainly focused on detecting the so-called static placement frauds, where only a single UI state is involved and can be identified based on static information such as the size or location of ad views. Other types of fraud exist that involve multiple UI states and are performed dynamically while users interact with the app. Such dynamic interaction frauds, although now widely spread in apps, have not yet been explored or addressed in the literature. In this work, we investigate a wide range of mobile ad frauds to provide a comprehensive taxonomy to the research community. We then propose FraudDroid, a novel hybrid approach to detect ad frauds in mobile Android apps. FraudDroid analyses apps dynamically to build UI state transition graphs and collects their associated runtime network traffic, which are then leveraged to check against a set of heuristic-based rules for identifying ad fraudulent behaviours. We show empirically that FraudDroid detects ad frauds with a high precision (93%) and recall (92%). Experimental results further show that FraudDroid is capable of detecting ad frauds across the spectrum of fraud types. By analysing 12,000 ad-supported Android apps, FraudDroid identified 335 cases of fraud associated with 20 ad networks that are further confirmed to be true positive results and are shared with our fellow researchers to promote advanced ad fraud detection.
Submitted 13 June, 2018; v1 submitted 4 September, 2017;
originally announced September 2017.
-
Neural Reranking for Named Entity Recognition
Authors:
Jie Yang,
Yue Zhang,
Fei Dong
Abstract:
We propose a neural reranking system for named entity recognition (NER). The basic idea is to leverage recurrent neural network models to learn sentence-level patterns that involve named entity mentions. In particular, given an output sentence produced by a baseline NER model, we replace all entity mentions, such as Barack Obama, with their entity types, such as PER. The resulting sentence patterns contain direct output information, yet are less sparse without specific named entities. For example, "PER was born in LOC" can be such a pattern. LSTM and CNN structures are utilised for learning deep representations of such sentences for reranking. Results show that our system can significantly improve the NER accuracies over two different baselines, giving the best reported results on a standard benchmark.
Submitted 17 July, 2017;
originally announced July 2017.
-
Neural Word Segmentation with Rich Pretraining
Authors:
Jie Yang,
Yue Zhang,
Fei Dong
Abstract:
Neural word segmentation research has benefited from large-scale raw texts by leveraging them for pretraining character and word embeddings. On the other hand, statistical segmentation research has exploited richer sources of external information, such as punctuation, automatic segmentation and POS. We investigate the effectiveness of a range of external training sources for neural word segmentation by building a modular segmentation model and pretraining its most important submodule using rich external sources. Results show that such pretraining significantly improves the model, leading to accuracies competitive with the best methods on six benchmarks.
Submitted 28 April, 2017;
originally announced April 2017.
-
Threshold for the Outbreak of Cascading Failures in Degree-degree Uncorrelated Networks
Authors:
Junbiao Liu,
Xinyu **,
Lurong Jiang,
Yongxiang Xia,
Bo Ouyang,
Fang Dong,
Yicong Lang,
Wen** Zhang
Abstract:
In complex networks, the failure of one or very few nodes may cause cascading failures. When this dynamical process stops in steady state, the size of the giant component formed by the remaining un-failed nodes can be used to measure the severity of cascading failures, which is critically important for estimating the robustness of networks. In this paper, we provide a cascading overload failure model with a local load-sharing mechanism, and then explore the threshold of node capacity at which large-scale cascading failures happen and un-failed nodes in steady state cannot connect to each other to form a large connected sub-network. We derive this threshold theoretically for degree-degree uncorrelated networks and validate it in simulation. This threshold provides guidance for improving network robustness under limited capacity resources when creating a network and assigning load. Therefore, this threshold is useful and important for analyzing the robustness of networks.
Submitted 27 June, 2015;
originally announced June 2015.
-
Eigenstructure of Maximum Likelihood from Counts Data
Authors:
Fanghu Dong
Abstract:
The MLE (Maximum Likelihood Estimate) for a multinomial model is proportional to the data. We call such an estimate an eigenestimate and its relationship to the data the eigenstructure. When the multinomial model is generalized to deal with data arising from incomplete or censored categorical counts, we would naturally look for this eigenstructure between MLE and data. The paper finds the algebraic representation of the eigenstructure (put as Eqn (2.1)), with which the intuition is visualized geometrically (Figures 2.2 and 4.3) and elaborated in a theory (Section 4). The eigenestimate constructed from the eigenstructure must be a stationary point of the likelihood, a result proved in Theorem 4.42. On the bridge between the algebraic definition of Eqn (2.1) and the proof of Theorem 4.42, we have exploited an elementary inequality (Lemma 3.1) that governs the primitive cases, defined the thick objects of fragment and slice, which can be assembled like mechanical parts (Definition 4.1), proved a few intermediary results that help build up the intuition (Section 4), conjectured the universal existence of an eigenestimate (Conjecture 4.32), established a criterion for boundary regularity (Criterion 4.37), and paved the way (the Trivial Slicing Algorithm (TSA)) for the derivation of the Weaver algorithms (Section 5), which find the eigenestimate by using it to reconstruct the observed counts through the eigenstructure; the reconstruction is iterative but derivative-free and matrix-inversion-free. As a new addition to the current body of algorithmic methods, the Weaver algorithms craftily tighten threads that are woven on a rectangular grid (Figure 2.3), and are one incarnation of the TSA. Finally, we put our method in the context of some existing methods (Section 6). The software is pseudocoded and available online. Visit http://hku.hk/jdong/eigenstruct2013a.html for demonstrations and download.
Submitted 31 December, 2017; v1 submitted 15 January, 2013;
originally announced January 2013.
-
Ant Colony Algorithm for the Weighted Item Layout Optimization Problem
Authors:
Yi-Chun Xu,
Fang-Min Dong,
Yong Liu,
Ren-Bin Xiao,
Martyn Amos
Abstract:
This paper discusses the problem of placing weighted items in a circular container in two-dimensional space. This problem is of great practical significance in various mechanical engineering domains, such as the design of communication satellites. Two constructive heuristics are proposed, one for packing circular items and the other for packing rectangular items. These work by first optimizing object placement order, and then optimizing object positioning. Based on these heuristics, an ant colony optimization (ACO) algorithm is described to search first for optimal positioning order, and then for the optimal layout. We describe the results of numerical experiments, in which we test two versions of our ACO algorithm alongside local search methods previously described in the literature. Our results show that the constructive heuristic-based ACO performs better than existing methods on larger problem instances.
Submitted 24 January, 2010;
originally announced January 2010.
-
Intent expression using eye robot for mascot robot system
Authors:
Yoichi Yamazaki,
Fangyan Dong,
Yuta Masuda,
Yukiko Uehara,
Petar Kormushev,
Hai An Vu,
Phuc Quang Le,
Kaoru Hirota
Abstract:
An intent expression system using eye robots is proposed for a mascot robot system from a viewpoint of humatronics. The eye robot aims at providing a basic interface method for an information terminal robot system. To achieve better understanding of the displayed information, the importance and the degree of certainty of the information should be communicated along with the main content. The proposed intent expression system aims at conveying this additional information using the eye robot system. Eye motions are represented as the states in a pleasure-arousal space model. Changes in the model state are calculated by fuzzy inference according to the importance and degree of certainty of the displayed information. These changes influence the arousal-sleep coordinates in the space, which correspond to levels of liveliness during communication. The eye robot provides a basic interface for the mascot robot system that is easy to understand as an information terminal for home environments in a humatronics society.
Submitted 9 April, 2009;
originally announced April 2009.
-
Fuzzy inference based mentality estimation for eye robot agent
Authors:
Yoichi Yamazaki,
Fangyan Dong,
Yuta Masuda,
Yukiko Uehara,
Petar Kormushev,
Hai An Vu,
Phuc Quang Le,
Kaoru Hirota
Abstract:
Household robots need to communicate with human beings in a friendly fashion. To achieve better understanding of displayed information, the importance and certainty of the information should be communicated together with the main information. The proposed intent expression system aims to convey this additional information using an eye robot. The eye motions are represented as states in a pleasure-arousal space model. The change of the model state is calculated by fuzzy inference according to the importance and certainty of the displayed information. This change influences the arousal-sleep coordinate in the space, which corresponds to activeness in communication. The eye robot provides a basic interface for the mascot robot system, which is an easy-to-understand information terminal for home environments in a humatronics society.
Submitted 9 April, 2009;
originally announced April 2009.
-
Eligibility Propagation to Speed up Time Hopping for Reinforcement Learning
Authors:
Petar Kormushev,
Kohei Nomoto,
Fangyan Dong,
Kaoru Hirota
Abstract:
A mechanism called Eligibility Propagation is proposed to speed up the Time Hopping technique used for faster Reinforcement Learning in simulations. Eligibility Propagation gives Time Hopping abilities similar to those that eligibility traces provide for conventional Reinforcement Learning. It propagates values from one state to all of its temporal predecessors using a state transitions graph. Experiments on a simulated biped crawling robot confirm that Eligibility Propagation accelerates the learning process by more than a factor of 3.
Submitted 3 April, 2009;
originally announced April 2009.
-
Time Hopping technique for faster reinforcement learning in simulations
Authors:
Petar Kormushev,
Kohei Nomoto,
Fangyan Dong,
Kaoru Hirota
Abstract:
This preprint has been withdrawn by the author for revision
Submitted 6 September, 2011; v1 submitted 3 April, 2009;
originally announced April 2009.
-
Time manipulation technique for speeding up reinforcement learning in simulations
Authors:
Petar Kormushev,
Kohei Nomoto,
Fangyan Dong,
Kaoru Hirota
Abstract:
A technique for speeding up reinforcement learning algorithms by using time manipulation is proposed. It is applicable to failure-avoidance control problems running in a computer simulation. Turning the time of the simulation backwards on failure events is shown to speed up the learning by 260% and improve the state space exploration by 12% on the cart-pole balancing task, compared to the conventional Q-learning and Actor-Critic algorithms.
Submitted 27 March, 2009;
originally announced March 2009.