Search | arXiv e-print repository

A data-centric approach for assessing progress of Graph Neural Networks

Authors: Tianqi Zhao, Ngan Thi Dong, Alan Hanjalic, Megha Khosla

Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art results in node classification tasks. However, most improvements are in multi-class classification, with less focus on the cases where each node could have multiple labels. The first challenge in studying multi-label node classification is the scarcity of publicly available datasets. To address this, we collected and released three real-w… ▽ More Graph Neural Networks (GNNs) have achieved state-of-the-art results in node classification tasks. However, most improvements are in multi-class classification, with less focus on the cases where each node could have multiple labels. The first challenge in studying multi-label node classification is the scarcity of publicly available datasets. To address this, we collected and released three real-world biological datasets and developed a multi-label graph generator with tunable properties. We also argue that traditional notions of homophily and heterophily do not apply well to multi-label scenarios. Therefore, we define homophily and Cross-Class Neighborhood Similarity for multi-label classification and investigate $9$ collected multi-label datasets. Lastly, we conducted a large-scale comparative study with $8$ methods across nine datasets to evaluate current progress in multi-label node classification. We release our code at \url{https://github.com/Tianqi-py/MLGNC}. △ Less

Submitted 18 June, 2024; originally announced June 2024.

Journal ref: Published in Data-centric Machine Learning Research Worshop @ ICML 2024

arXiv:2404.08501 [pdf, ps, other]

Analyzing and Overcoming Local Optima in Complex Multi-Objective Optimization by Decomposition-Based Evolutionary Algorithms

Authors: Ting Dong, Haoxin Wang, Hengxi Zhang, Wenbo Ding

Abstract: When addressing the challenge of complex multi-objective optimization problems, particularly those with non-convex and non-uniform Pareto fronts, Decomposition-based Multi-Objective Evolutionary Algorithms (MOEADs) often converge to local optima, thereby limiting solution diversity. Despite its significance, this issue has received limited theoretical exploration. Through a comprehensive geometric… ▽ More When addressing the challenge of complex multi-objective optimization problems, particularly those with non-convex and non-uniform Pareto fronts, Decomposition-based Multi-Objective Evolutionary Algorithms (MOEADs) often converge to local optima, thereby limiting solution diversity. Despite its significance, this issue has received limited theoretical exploration. Through a comprehensive geometric analysis, we identify that the traditional method of Reference Point (RP) selection fundamentally contributes to this challenge. In response, we introduce an innovative RP selection strategy, the Weight Vector-Guided and Gaussian-Hybrid method, designed to overcome the local optima issue. This approach employs a novel RP type that aligns with weight vector directions and integrates a Gaussian distribution to combine three distinct RP categories. Our research comprises two main experimental components: an ablation study involving 14 algorithms within the MOEADs framework, spanning from 2014 to 2022, to validate our theoretical framework, and a series of empirical tests to evaluate the effectiveness of our proposed method against both traditional and cutting-edge alternatives. Results demonstrate that our method achieves remarkable improvements in both population diversity and convergence. △ Less

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2404.03233 [pdf, other]

Learn What You Want to Unlearn: Unlearning Inversion Attacks against Machine Unlearning

Authors: Hongsheng Hu, Shuo Wang, Tian Dong, Minhui Xue

Abstract: Machine unlearning has become a promising solution for fulfilling the "right to be forgotten", under which individuals can request the deletion of their data from machine learning models. However, existing studies of machine unlearning mainly focus on the efficacy and efficiency of unlearning methods, while neglecting the investigation of the privacy vulnerability during the unlearning process. Wi… ▽ More Machine unlearning has become a promising solution for fulfilling the "right to be forgotten", under which individuals can request the deletion of their data from machine learning models. However, existing studies of machine unlearning mainly focus on the efficacy and efficiency of unlearning methods, while neglecting the investigation of the privacy vulnerability during the unlearning process. With two versions of a model available to an adversary, that is, the original model and the unlearned model, machine unlearning opens up a new attack surface. In this paper, we conduct the first investigation to understand the extent to which machine unlearning can leak the confidential content of the unlearned data. Specifically, under the Machine Learning as a Service setting, we propose unlearning inversion attacks that can reveal the feature and label information of an unlearned sample by only accessing the original and unlearned model. The effectiveness of the proposed unlearning inversion attacks is evaluated through extensive experiments on benchmark datasets across various model architectures and on both exact and approximate representative unlearning approaches. The experimental results indicate that the proposed attack can reveal the sensitive information of the unlearned data. As such, we identify three possible defenses that help to mitigate the proposed attacks, while at the cost of reducing the utility of the unlearned model. The study in this paper uncovers an underexplored gap between machine unlearning and the privacy of unlearned data, highlighting the need for the careful design of mechanisms for implementing unlearning without leaking the information of the unlearned data. △ Less

Submitted 4 April, 2024; originally announced April 2024.

Comments: To Appear in the 45th IEEE Symposium on Security and Privacy, May 20-23, 2024

arXiv:2403.15297 [pdf, other]

Sphere Neural-Networks for Rational Reasoning

Authors: Tiansi Dong, Mateja Jamnik, Pietro Liò

Abstract: The success of Large Language Models (LLMs), e.g., ChatGPT, is witnessed by their planetary popularity, their capability of human-like communication, and also by their steadily improved reasoning performance. However, it remains unclear whether LLMs reason. It is an open problem how traditional neural networks can be qualitatively extended to go beyond the statistic paradigm and achieve high-level… ▽ More The success of Large Language Models (LLMs), e.g., ChatGPT, is witnessed by their planetary popularity, their capability of human-like communication, and also by their steadily improved reasoning performance. However, it remains unclear whether LLMs reason. It is an open problem how traditional neural networks can be qualitatively extended to go beyond the statistic paradigm and achieve high-level cognition. Here, we present a novel qualitative extension by generalising computational building blocks from vectors to spheres. We propose Sphere Neural Networks (SphNNs) for human-like reasoning through model construction and inspection, and develop SphNN for syllogistic reasoning, a microcosm of human rationality. SphNN is a hierarchical neuro-symbolic Kolmogorov-Arnold geometric GNN, and uses a neuro-symbolic transition map of neighbourhood spatial relations to transform the current sphere configuration towards the target. SphNN is the first neural model that can determine the validity of long-chained syllogistic reasoning in one epoch without training data, with the worst computational complexity of O(N). SphNN can evolve into various types of reasoning, such as spatio-temporal reasoning, logical reasoning with negation and disjunction, event reasoning, neuro-symbolic unification, and humour understanding (the highest level of cognition). All these suggest a new kind of Herbert A. Simon's scissors with two neural blades. SphNNs will tremendously enhance interdisciplinary collaborations to develop the two neural blades and realise deterministic neural reasoning and human-bounded rationality and elevate LLMs to reliable psychological AI. This work suggests that the non-zero radii of spheres are the missing components that prevent traditional deep-learning systems from reaching the realm of rational reasoning and cause LLMs to be trapped in the swamp of hallucination. △ Less

Submitted 24 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

arXiv:2312.00374 [pdf, other]

The Philosopher's Stone: Trojaning Plugins of Large Language Models

Authors: Tian Dong, Minhui Xue, Guoxing Chen, Rayne Holland, Shaofeng Li, Yan Meng, Zhen Liu, Hao** Zhu

Abstract: Open-source Large Language Models (LLMs) have recently gained popularity because of their comparable performance to proprietary LLMs. To efficiently fulfill domain-specialized tasks, open-source LLMs can be refined, without expensive accelerators, using low-rank adapters. However, it is still unknown whether low-rank adapters can be exploited to control LLMs. To address this gap, we demonstrate th… ▽ More Open-source Large Language Models (LLMs) have recently gained popularity because of their comparable performance to proprietary LLMs. To efficiently fulfill domain-specialized tasks, open-source LLMs can be refined, without expensive accelerators, using low-rank adapters. However, it is still unknown whether low-rank adapters can be exploited to control LLMs. To address this gap, we demonstrate that an infected adapter can induce, on specific triggers, an LLM to output content defined by an adversary and to even maliciously use tools. To train a Trojan adapter, we propose two novel attacks, POLISHED and FUSION, that improve over prior approaches. POLISHED uses LLM-enhanced paraphrasing to polish benchmark poisoned datasets. In contrast, in the absence of a dataset, FUSION leverages an over-poisoning procedure to transform a benign adaptor. In our experiments, we first conduct two case studies to demonstrate that a compromised LLM agent can execute malware to control system (e.g., LLM-driven robot) or launch a spear-phishing attack. Then, in terms of targeted misinformation, we show that our attacks provide higher attack effectiveness than the baseline and, for the purpose of attracting downloads, preserve or improve the adapter's utility. Finally, we design and evaluate three potential defenses, yet none proved entirely effective in safeguarding against our attacks. △ Less

Submitted 13 March, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

arXiv:2311.07766 [pdf, other]

Vision-Language Integration in Multimodal Video Transformers (Partially) Aligns with the Brain

Authors: Dota Tianai Dong, Mariya Toneva

Abstract: Integrating information from multiple modalities is arguably one of the essential prerequisites for grounding artificial intelligence systems with an understanding of the real world. Recent advances in video transformers that jointly learn from vision, text, and sound over time have made some progress toward this goal, but the degree to which these models integrate information from modalities stil… ▽ More Integrating information from multiple modalities is arguably one of the essential prerequisites for grounding artificial intelligence systems with an understanding of the real world. Recent advances in video transformers that jointly learn from vision, text, and sound over time have made some progress toward this goal, but the degree to which these models integrate information from modalities still remains unclear. In this work, we present a promising approach for probing a pre-trained multimodal video transformer model by leveraging neuroscientific evidence of multimodal information processing in the brain. Using brain recordings of participants watching a popular TV show, we analyze the effects of multi-modal connections and interactions in a pre-trained multi-modal video transformer on the alignment with uni- and multi-modal brain regions. We find evidence that vision enhances masked prediction performance during language processing, providing support that cross-modal representations in models can benefit individual modalities. However, we don't find evidence of brain-relevant information captured by the joint multi-modal transformer representations beyond that captured by all of the individual modalities. We finally show that the brain alignment of the pre-trained joint representation can be improved by fine-tuning using a task that requires vision-language inferences. Overall, our results paint an optimistic picture of the ability of multi-modal transformers to integrate vision and language in partially brain-relevant ways but also show that improving the brain alignment of these models may require new approaches. △ Less

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2307.16663 [pdf, other]

Word Sense Disambiguation as a Game of Neurosymbolic Darts

Authors: Tiansi Dong, Rafet Sifa

Abstract: Word Sense Disambiguation (WSD) is one of the hardest tasks in natural language understanding and knowledge engineering. The glass ceiling of 80% F1 score is recently achieved through supervised deep-learning, enriched by a variety of knowledge graphs. Here, we propose a novel neurosymbolic methodology that is able to push the F1 score above 90%. The core of our methodology is a neurosymbolic sens… ▽ More Word Sense Disambiguation (WSD) is one of the hardest tasks in natural language understanding and knowledge engineering. The glass ceiling of 80% F1 score is recently achieved through supervised deep-learning, enriched by a variety of knowledge graphs. Here, we propose a novel neurosymbolic methodology that is able to push the F1 score above 90%. The core of our methodology is a neurosymbolic sense embedding, in terms of a configuration of nested balls in n-dimensional space. The centre point of a ball well-preserves word embedding, which partially fix the locations of balls. Inclusion relations among balls precisely encode symbolic hypernym relations among senses, and enable simple logic deduction among sense embeddings, which cannot be realised before. We trained a Transformer to learn the map** from a contextualized word embedding to its sense ball embedding, just like playing the game of darts (a game of shooting darts into a dartboard). A series of experiments are conducted by utilizing pre-training n-ball embeddings, which have the coverage of around 70% training data and 75% testing data in the benchmark WSD corpus. The F1 scores in experiments range from 90.1% to 100.0% in all six groups of test data-sets (each group has 4 testing data with different sizes of n-ball embeddings). Our novel neurosymbolic methodology has the potential to break the ceiling of deep-learning approaches for WSD. Limitations and extensions of our current works are listed. △ Less

Submitted 25 July, 2023; originally announced July 2023.

arXiv:2305.10263 [pdf, other]

M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models

Authors: Chuang Liu, Renren **, Yuqi Ren, Linhao Yu, Tianyu Dong, Xiaohan Peng, Shuting Zhang, Jianxiang Peng, Peiyi Zhang, Qingqing Lyu, Xiaowen Su, Qun Liu, Deyi Xiong

Abstract: Large language models have recently made tremendous progress in a variety of aspects, e.g., cross-task generalization, instruction following. Comprehensively evaluating the capability of large language models in multiple tasks is of great importance. In this paper, we propose M3KE, a Massive Multi-Level Multi-Subject Knowledge Evaluation benchmark, which is developed to measure knowledge acquired… ▽ More Large language models have recently made tremendous progress in a variety of aspects, e.g., cross-task generalization, instruction following. Comprehensively evaluating the capability of large language models in multiple tasks is of great importance. In this paper, we propose M3KE, a Massive Multi-Level Multi-Subject Knowledge Evaluation benchmark, which is developed to measure knowledge acquired by Chinese large language models by testing their multitask accuracy in zero- and few-shot settings. We have collected 20,477 questions from 71 tasks. Our selection covers all major levels of Chinese education system, ranging from the primary school to college, as well as a wide variety of subjects, including humanities, history, politics, law, education, psychology, science, technology, art and religion. All questions are multiple-choice questions with four options, hence guaranteeing a standardized and unified assessment process. We've assessed a number of state-of-the-art open-source Chinese large language models on the proposed benchmark. The size of these models varies from 335M to 130B parameters. Experiment results demonstrate that they perform significantly worse than GPT-3.5 that reaches an accuracy of ~ 48% on M3KE. The dataset is available at https://github.com/tjunlp-lab/M3KE. △ Less

Submitted 20 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

arXiv:2304.10398 [pdf, other]

Multi-label Node Classification On Graph-Structured Data

Authors: Tianqi Zhao, Ngan Thi Dong, Alan Hanjalic, Megha Khosla

Abstract: Graph Neural Networks (GNNs) have shown state-of-the-art improvements in node classification tasks on graphs. While these improvements have been largely demonstrated in a multi-class classification scenario, a more general and realistic scenario in which each node could have multiple labels has so far received little attention. The first challenge in conducting focused studies on multi-label node… ▽ More Graph Neural Networks (GNNs) have shown state-of-the-art improvements in node classification tasks on graphs. While these improvements have been largely demonstrated in a multi-class classification scenario, a more general and realistic scenario in which each node could have multiple labels has so far received little attention. The first challenge in conducting focused studies on multi-label node classification is the limited number of publicly available multi-label graph datasets. Therefore, as our first contribution, we collect and release three real-world biological datasets and develop a multi-label graph generator to generate datasets with tunable properties. While high label similarity (high homophily) is usually attributed to the success of GNNs, we argue that a multi-label scenario does not follow the usual semantics of homophily and heterophily so far defined for a multi-class scenario. As our second contribution, we define homophily and Cross-Class Neighborhood Similarity for the multi-label scenario and provide a thorough analyses of the collected $9$ multi-label datasets. Finally, we perform a large-scale comparative study with $8$ methods and $9$ datasets and analyse the performances of the methods to assess the progress made by current state of the art in the multi-label node classification scenario. We release our benchmark at https://github.com/Tianqi-py/MLGNC. △ Less

Submitted 29 February, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

Comments: Published in TMLR 2023. Link: https://openreview.net/forum?id=EZhkV2BjDP

Journal ref: Transaction Of Machine Learning Research, 2835-8856, 2023

arXiv:2303.09158 [pdf, other]

Facial Affect Recognition based on Transformer Encoder and Audiovisual Fusion for the ABAW5 Challenge

Authors: Ziyang Zhang, Liuwei An, Zishun Cui, Ao xu, Tengteng Dong, Yueqi Jiang, **gyi Shi, Xin Liu, Xiao Sun, Meng Wang

Abstract: In this paper, we present our solutions for the 5th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW), which includes four sub-challenges of Valence-Arousal (VA) Estimation, Expression (Expr) Classification, Action Unit (AU) Detection and Emotional Reaction Intensity (ERI) Estimation. The 5th ABAW competition focuses on facial affect recognition utilizing different modalit… ▽ More In this paper, we present our solutions for the 5th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW), which includes four sub-challenges of Valence-Arousal (VA) Estimation, Expression (Expr) Classification, Action Unit (AU) Detection and Emotional Reaction Intensity (ERI) Estimation. The 5th ABAW competition focuses on facial affect recognition utilizing different modalities and datasets. In our work, we extract powerful audio and visual features using a large number of sota models. These features are fused by Transformer Encoder and TEMMA. Besides, to avoid the possible impact of large dimensional differences between various features, we design an Affine Module to align different features to the same dimension. Extensive experiments demonstrate that the superiority of the proposed method. For the VA Estimation sub-challenge, our method obtains the mean Concordance Correlation Coefficient (CCC) of 0.6066. For the Expression Classification sub-challenge, the average F1 Score is 0.4055. For the AU Detection sub-challenge, the average F1 Score is 0.5296. For the Emotional Reaction Intensity Estimation sub-challenge, the average pearson's correlations coefficient on the validation set is 0.3968. All of the results of four sub-challenges outperform the baseline with a large margin. △ Less

Submitted 20 March, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

arXiv:2212.11751 [pdf, other]

Mind Your Heart: Stealthy Backdoor Attack on Dynamic Deep Neural Network in Edge Computing

Authors: Tian Dong, Ziyuan Zhang, Han Qiu, Tianwei Zhang, Hewu Li, Terry Wang

Abstract: Transforming off-the-shelf deep neural network (DNN) models into dynamic multi-exit architectures can achieve inference and transmission efficiency by fragmenting and distributing a large DNN model in edge computing scenarios (e.g., edge devices and cloud servers). In this paper, we propose a novel backdoor attack specifically on the dynamic multi-exit DNN models. Particularly, we inject a backdoo… ▽ More Transforming off-the-shelf deep neural network (DNN) models into dynamic multi-exit architectures can achieve inference and transmission efficiency by fragmenting and distributing a large DNN model in edge computing scenarios (e.g., edge devices and cloud servers). In this paper, we propose a novel backdoor attack specifically on the dynamic multi-exit DNN models. Particularly, we inject a backdoor by poisoning one DNN model's shallow hidden layers targeting not this vanilla DNN model but only its dynamically deployed multi-exit architectures. Our backdoored vanilla model behaves normally on performance and cannot be activated even with the correct trigger. However, the backdoor will be activated when the victims acquire this model and transform it into a dynamic multi-exit architecture at their deployment. We conduct extensive experiments to prove the effectiveness of our attack on three structures (ResNet-56, VGG-16, and MobileNet) with four datasets (CIFAR-10, SVHN, GTSRB, and Tiny-ImageNet) and our backdoor is stealthy to evade multiple state-of-the-art backdoor detection or removal methods. △ Less

Submitted 22 December, 2022; originally announced December 2022.

Comments: Accepted to IEEE INFOCOM 2023

arXiv:2206.00240 [pdf, other]

Privacy for Free: How does Dataset Condensation Help Privacy?

Authors: Tian Dong, Bo Zhao, Lingjuan Lyu

Abstract: To prevent unintentional data leakage, research community has resorted to data generators that can produce differentially private data for model training. However, for the sake of the data privacy, existing solutions suffer from either expensive training cost or poor generalization performance. Therefore, we raise the question whether training efficiency and privacy can be achieved simultaneously.… ▽ More To prevent unintentional data leakage, research community has resorted to data generators that can produce differentially private data for model training. However, for the sake of the data privacy, existing solutions suffer from either expensive training cost or poor generalization performance. Therefore, we raise the question whether training efficiency and privacy can be achieved simultaneously. In this work, we for the first time identify that dataset condensation (DC) which is originally designed for improving training efficiency is also a better solution to replace the traditional data generators for private data generation, thus providing privacy for free. To demonstrate the privacy benefit of DC, we build a connection between DC and differential privacy, and theoretically prove on linear feature extractors (and then extended to non-linear feature extractors) that the existence of one sample has limited impact ($O(m/n)$) on the parameter distribution of networks trained on $m$ samples synthesized from $n (n \gg m)$ raw samples by DC. We also empirically validate the visual privacy and membership privacy of DC-synthesized data by launching both the loss-based and the state-of-the-art likelihood-based membership inference attacks. We envision this work as a milestone for data-efficient and privacy-preserving machine learning. △ Less

Submitted 1 June, 2022; originally announced June 2022.

Comments: Accepted by ICML 2022 as Oral

arXiv:2202.02139 [pdf, other]

Multi Objective Resource Optimization of Wireless Network Based on Cross Domain Virtual Network Embedding

Authors: Chao Wang, Tao Dong, Youxiang Duan, Qifeng Sun, Peiying Zhang

Abstract: The rapid development of virtual network architecture makes it possible for wireless network to be widely used. With the popularity of artificial intelligence (AI) industry in daily life, efficient resource allocation of wireless network has become a problem. Especially when network users request wireless network resources from different management domains, they still face many practical problems.… ▽ More The rapid development of virtual network architecture makes it possible for wireless network to be widely used. With the popularity of artificial intelligence (AI) industry in daily life, efficient resource allocation of wireless network has become a problem. Especially when network users request wireless network resources from different management domains, they still face many practical problems. From the perspective of virtual network embedding (VNE), this paper designs and implements a multi-objective optimization VNE algorithm for wireless network resource allocation. Resource allocation in virtual network is essentially a problem of allocating underlying resources for virtual network requests (VNRs). According to the proposed objective formula, we consider the optimization map** cost, network delay and VNR acceptance rate. VNE is completed by node map** and link map**. In the experiment and simulation stage, it is compared with other VNE algorithms, the cross domain VNE algorithm proposed in this paper is optimal in the above three indicators. This shows the effectiveness of the algorithm in wireless network resource allocation. △ Less

Submitted 3 February, 2022; originally announced February 2022.

arXiv:2201.03134 [pdf, other]

An Interpretable Federated Learning-based Network Intrusion Detection Framework

Authors: Tian Dong, Song Li, Han Qiu, Jialiang Lu

Abstract: Learning-based Network Intrusion Detection Systems (NIDSs) are widely deployed for defending various cyberattacks. Existing learning-based NIDS mainly uses Neural Network (NN) as a classifier that relies on the quality and quantity of cyberattack data. Such NN-based approaches are also hard to interpret for improving efficiency and scalability. In this paper, we design a new local-global computati… ▽ More Learning-based Network Intrusion Detection Systems (NIDSs) are widely deployed for defending various cyberattacks. Existing learning-based NIDS mainly uses Neural Network (NN) as a classifier that relies on the quality and quantity of cyberattack data. Such NN-based approaches are also hard to interpret for improving efficiency and scalability. In this paper, we design a new local-global computation paradigm, FEDFOREST, a novel learning-based NIDS by combining the interpretable Gradient Boosting Decision Tree (GBDT) and Federated Learning (FL) framework. Specifically, FEDFOREST is composed of multiple clients that extract local cyberattack data features for the server to train models and detect intrusions. A privacy-enhanced technology is also proposed in FEDFOREST to further defeat the privacy of the FL systems. Extensive experiments on 4 cyberattack datasets of different tasks demonstrate that FEDFOREST is effective, efficient, interpretable, and extendable. FEDFOREST ranks first in the collaborative learning and cybersecurity competition 2021 for Chinese college students. △ Less

Submitted 9 January, 2022; originally announced January 2022.

Comments: 12 pages, draft

arXiv:2111.13346 [pdf, other]

doi 10.1186/s12859?021?04484?y

A multitask transfer learning framework for the prediction of virus-human protein-protein interactions

Authors: Thi Ngan Dong, Graham Brogden, Gisa Gerold, Megha Khosla

Abstract: Viral infections are causing significant morbidity and mortality worldwide. Understanding the interaction patterns between a particular virus and human proteins plays a crucial role in unveiling the underlying mechanism of viral infection and pathogenesis. This could further help in the prevention and treatment of virus-related diseases. However, the task of predicting protein-protein interactions… ▽ More Viral infections are causing significant morbidity and mortality worldwide. Understanding the interaction patterns between a particular virus and human proteins plays a crucial role in unveiling the underlying mechanism of viral infection and pathogenesis. This could further help in the prevention and treatment of virus-related diseases. However, the task of predicting protein-protein interactions between a new virus and human cells is extremely challenging due to scarce data on virus-human interactions and fast mutation rates of most viruses. We developed a multitask transfer learning approach that exploits the information of around 24 million protein sequences and the interaction patterns from the human interactome to counter the problem of small training datasets. Instead of using hand-crafted protein features, we utilize statistically rich protein representations learned by a deep language modeling approach from a massive source of protein sequences. Additionally, we employ an additional objective which aims to maximize the probability of observing human protein-protein interactions. This additional task objective acts as a regularizer and also allows to incorporate domain knowledge to inform the virus-human protein-protein interaction prediction model. Our approach achieved competitive results on 13 benchmark datasets and the case study for the SAR-CoV-2 virus receptor. Experimental results show that our proposed model works effectively for both virus-human and bacteria-human protein-protein interaction prediction tasks. We share our code for reproducibility and future research at https://git.l3s.uni-hannover.de/dong/multitask-transfer. △ Less

Submitted 26 November, 2021; originally announced November 2021.

Journal ref: BMC Bioinformatics 2021

arXiv:2111.10085 [pdf, other]

Mate! Are You Really Aware? An Explainability-Guided Testing Framework for Robustness of Malware Detectors

Authors: Ruoxi Sun, Minhui Xue, Gareth Tyson, Tian Dong, Shaofeng Li, Shuo Wang, Hao** Zhu, Seyit Camtepe, Surya Nepal

Abstract: Numerous open-source and commercial malware detectors are available. However, their efficacy is threatened by new adversarial attacks, whereby malware attempts to evade detection, e.g., by performing feature-space manipulation. In this work, we propose an explainability-guided and model-agnostic testing framework for robustness of malware detectors when confronted with adversarial attacks. The fra… ▽ More Numerous open-source and commercial malware detectors are available. However, their efficacy is threatened by new adversarial attacks, whereby malware attempts to evade detection, e.g., by performing feature-space manipulation. In this work, we propose an explainability-guided and model-agnostic testing framework for robustness of malware detectors when confronted with adversarial attacks. The framework introduces the concept of Accrued Malicious Magnitude (AMM) to identify which malware features could be manipulated to maximize the likelihood of evading detection. We then use this framework to test several state-of-the-art malware detectors' abilities to detect manipulated malware. We find that (i) commercial antivirus engines are vulnerable to AMM-guided test cases; (ii) the ability of a manipulated malware generated using one detector to evade detection by another detector (i.e., transferability) depends on the overlap of features with large AMM values between the different detectors; and (iii) AMM values effectively measure the fragility of features (i.e., capability of feature-space manipulation to flip the prediction results) and explain the robustness of malware detectors facing evasion attacks. Our findings shed light on the limitations of current malware detectors, as well as how they can be improved. △ Less

Submitted 27 November, 2023; v1 submitted 19 November, 2021; originally announced November 2021.

Comments: Accepted at ESEC/FSE 2023. https://doi.org/10.1145/3611643.3616309

arXiv:2110.03175 [pdf, other]

Fingerprinting Multi-exit Deep Neural Network Models via Inference Time

Authors: Tian Dong, Han Qiu, Tianwei Zhang, Jiwei Li, Hewu Li, Jialiang Lu

Abstract: Transforming large deep neural network (DNN) models into the multi-exit architectures can overcome the overthinking issue and distribute a large DNN model on resource-constrained scenarios (e.g. IoT frontend devices and backend servers) for inference and transmission efficiency. Nevertheless, intellectual property (IP) protection for the multi-exit models in the wild is still an unsolved challenge… ▽ More Transforming large deep neural network (DNN) models into the multi-exit architectures can overcome the overthinking issue and distribute a large DNN model on resource-constrained scenarios (e.g. IoT frontend devices and backend servers) for inference and transmission efficiency. Nevertheless, intellectual property (IP) protection for the multi-exit models in the wild is still an unsolved challenge. Previous efforts to verify DNN model ownership mainly rely on querying the model with specific samples and checking the responses, e.g., DNN watermarking and fingerprinting. However, they are vulnerable to adversarial settings such as adversarial training and are not suitable for the IP verification for multi-exit DNN models. In this paper, we propose a novel approach to fingerprint multi-exit models via inference time rather than inference predictions. Specifically, we design an effective method to generate a set of fingerprint samples to craft the inference process with a unique and robust inference time cost as the evidence for model ownership. We conduct extensive experiments to prove the uniqueness and robustness of our method on three structures (ResNet-56, VGG-16, and MobileNet) and three datasets (CIFAR-10, CIFAR-100, and Tiny-ImageNet) under comprehensive adversarial settings. △ Less

Submitted 7 October, 2021; originally announced October 2021.

arXiv:2108.04820 [pdf, other]

MuCoMiD: A Multitask Convolutional Learning Framework for miRNA-Disease Association Prediction

Authors: Thi Ngan Dong, Megha Khosla

Abstract: Growing evidence from recent studies implies that microRNA or miRNA could serve as biomarkers in various complex human diseases. Since wet-lab experiments are expensive and time-consuming, computational techniques for miRNA-disease association prediction have attracted a lot of attention in recent years. Data scarcity is one of the major challenges in building reliable machine learning models. Dat… ▽ More Growing evidence from recent studies implies that microRNA or miRNA could serve as biomarkers in various complex human diseases. Since wet-lab experiments are expensive and time-consuming, computational techniques for miRNA-disease association prediction have attracted a lot of attention in recent years. Data scarcity is one of the major challenges in building reliable machine learning models. Data scarcity combined with the use of precalculated hand-crafted input features has led to problems of overfitting and data leakage. We overcome the limitations of existing works by proposing a novel multi-tasking graph convolution-based approach, which we refer to as MuCoMiD. MuCoMiD allows automatic feature extraction while incorporating knowledge from five heterogeneous biological information sources (interactions between miRNA/diseases and protein-coding genes (PCG), interactions between protein-coding genes, miRNA family information, and disease ontology) in a multi-task setting which is a novel perspective and has not been studied before. To effectively test the generalization capability of our model, we construct large-scale experiments on standard benchmark datasets as well as our proposed larger independent test sets and case studies. MuCoMiD shows an improvement of at least 3% in 5-fold CV evaluation on HMDDv2.0 and HMDDv3.0 datasets and at least 35% on larger independent test sets with unseen miRNA and diseases over state-of-the-art approaches. We share our code for reproducibility and future research at https://git.l3s.uni-hannover.de/dong/cmtt. △ Less

Submitted 29 November, 2021; v1 submitted 8 August, 2021; originally announced August 2021.

arXiv:2106.04174 [pdf, other]

Interpretable and Low-Resource Entity Matching via Decoupling Feature Learning from Decision Making

Authors: Zijun Yao, Chengjiang Li, Tiansi Dong, Xin Lv, Jifan Yu, Lei Hou, Juanzi Li, Yichi Zhang, Zelin Dai

Abstract: Entity Matching (EM) aims at recognizing entity records that denote the same real-world object. Neural EM models learn vector representation of entity descriptions and match entities end-to-end. Though robust, these methods require many resources for training, and lack of interpretability. In this paper, we propose a novel EM framework that consists of Heterogeneous Information Fusion (HIF) and Ke… ▽ More Entity Matching (EM) aims at recognizing entity records that denote the same real-world object. Neural EM models learn vector representation of entity descriptions and match entities end-to-end. Though robust, these methods require many resources for training, and lack of interpretability. In this paper, we propose a novel EM framework that consists of Heterogeneous Information Fusion (HIF) and Key Attribute Tree (KAT) Induction to decouple feature representation from matching decision. Using self-supervised learning and mask mechanism in pre-trained language modeling, HIF learns the embeddings of noisy attribute values by inter-attribute attention with unlabeled data. Using a set of comparison features and a limited amount of annotated data, KAT Induction learns an efficient decision tree that can be interpreted by generating entity matching rules whose structure is advocated by domain experts. Experiments on 6 public datasets and 3 industrial datasets show that our method is highly efficient and outperforms SOTA EM models in most cases. Our codes and datasets can be obtained from https://github.com/THU-KEG/HIF-KAT. △ Less

Submitted 8 June, 2021; originally announced June 2021.

arXiv:2105.00164 [pdf, other]

Hidden Backdoors in Human-Centric Language Models

Authors: Shaofeng Li, Hui Liu, Tian Dong, Benjamin Zi Hao Zhao, Minhui Xue, Hao** Zhu, Jialiang Lu

Abstract: Natural language processing (NLP) systems have been proven to be vulnerable to backdoor attacks, whereby hidden features (backdoors) are trained into a language model and may only be activated by specific inputs (called triggers), to trick the model into producing unexpected behaviors. In this paper, we create covert and natural triggers for textual backdoor attacks, \textit{hidden backdoors}, whe… ▽ More Natural language processing (NLP) systems have been proven to be vulnerable to backdoor attacks, whereby hidden features (backdoors) are trained into a language model and may only be activated by specific inputs (called triggers), to trick the model into producing unexpected behaviors. In this paper, we create covert and natural triggers for textual backdoor attacks, \textit{hidden backdoors}, where triggers can fool both modern language models and human inspection. We deploy our hidden backdoors through two state-of-the-art trigger embedding methods. The first approach via homograph replacement, embeds the trigger into deep neural networks through the visual spoofing of lookalike character replacement. The second approach uses subtle differences between text generated by language models and real natural text to produce trigger sentences with correct grammar and high fluency. We demonstrate that the proposed hidden backdoors can be effective across three downstream security-critical NLP tasks, representative of modern human-centric NLP systems, including toxic comment detection, neural machine translation (NMT), and question answering (QA). Our two hidden backdoor attacks can achieve an Attack Success Rate (ASR) of at least $97\%$ with an injection rate of only $3\%$ in toxic comment detection, $95.1\%$ ASR in NMT with less than $0.5\%$ injected data, and finally $91.12\%$ ASR against QA updated with only 27 poisoning data samples on a model previously trained with 92,024 samples (0.029\%). We are able to demonstrate the adversary's high success rate of attacks, while maintaining functionality for regular users, with triggers inconspicuous by the human administrators. △ Less

Submitted 28 September, 2021; v1 submitted 1 May, 2021; originally announced May 2021.

arXiv:2010.15015 [pdf, ps, other]

Towards Supporting Programming Education at Scale via Live Streaming

Authors: Yan Chen, Walter S. Lasecki, Tao Dong

Abstract: Live streaming, which allows streamers to broadcast their work to live viewers, is an emerging practice for teaching and learning computer programming. Participation in live streaming is growing rapidly, despite several apparent challenges, such as a general lack of training in pedagogy among streamers and scarce signals about a stream's characteristics (e.g., difficulty, style, and usefulness) to… ▽ More Live streaming, which allows streamers to broadcast their work to live viewers, is an emerging practice for teaching and learning computer programming. Participation in live streaming is growing rapidly, despite several apparent challenges, such as a general lack of training in pedagogy among streamers and scarce signals about a stream's characteristics (e.g., difficulty, style, and usefulness) to help viewers decide what to watch. To understand why people choose to participate in live streaming for teaching or learning programming, and how they cope with both apparent and non-obvious challenges, we interviewed 14 streamers and 12 viewers about their experience with live streaming programming. Among other results, we found that the casual and impromptu nature of live streaming makes it easier to prepare than pre-recorded videos, and viewers have the opportunity to shape the content and learning experience via real-time communication with both the streamer and each other. Nonetheless, we identified several challenges that limit the potential of live streaming as a learning medium. For example, streamers voiced privacy and harassment concerns, and existing streaming platforms do not adequately support viewer-streamer interactions, adaptive learning, and discovery and selection of streaming content. Based on these findings, we suggest specialized tools to facilitate knowledge sharing among people teaching and learning computer programming online, and we offer design recommendations that promote a healthy, safe, and engaging learning environment. △ Less

Submitted 28 October, 2020; originally announced October 2020.

Comments: Accepted to ACM CSCW 2020

ACM Class: H.5.1; J.4

arXiv:2007.07320 [pdf, other]

Learning Syllogism with Euler Neural-Networks

Authors: Tiansi Dong, Chengjiang Li, Christian Bauckhage, Juanzi Li, Stefan Wrobel, Armin B. Cremers

Abstract: Traditional neural networks represent everything as a vector, and are able to approximate a subset of logical reasoning to a certain degree. As basic logic relations are better represented by topological relations between regions, we propose a novel neural network that represents everything as a ball and is able to learn topological configuration as an Euler diagram. So comes the name Euler Neural… ▽ More Traditional neural networks represent everything as a vector, and are able to approximate a subset of logical reasoning to a certain degree. As basic logic relations are better represented by topological relations between regions, we propose a novel neural network that represents everything as a ball and is able to learn topological configuration as an Euler diagram. So comes the name Euler Neural-Network (ENN). The central vector of a ball is a vector that can inherit representation power of traditional neural network. ENN distinguishes four spatial statuses between balls, namely, being disconnected, being partially overlapped, being part of, being inverse part of. Within each status, ideal values are defined for efficient reasoning. A novel back-propagation algorithm with six Rectified Spatial Units (ReSU) can optimize an Euler diagram representing logical premises, from which logical conclusion can be deduced. In contrast to traditional neural network, ENN can precisely represent all 24 different structures of Syllogism. Two large datasets are created: one extracted from WordNet-3.0 covers all types of Syllogism reasoning, the other extracted all family relations from DBpedia. Experiment results approve the superior power of ENN in logical representation and reasoning. Datasets and source code are available upon request. △ Less

Submitted 20 July, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: 16 pages, 6 figures

arXiv:1910.06732 [pdf, other]

Cross-sectional Urban Scaling Fails in Predicting Temporal Growth of Cities

Authors: Gang Xu, Zhengzi Zhou, Limin Jiao, Ting Dong, Ruiqi Li

Abstract: Numerous urban indicators scale with population in a power law across cities, but whether the cross-sectional scaling law is applicable to the temporal growth of individual cities is unclear. Here we first find two paradoxical scaling relationships that urban built-up area sub-linearly scales with population across cities, but super-linearly scales with population over time in most individual citi… ▽ More Numerous urban indicators scale with population in a power law across cities, but whether the cross-sectional scaling law is applicable to the temporal growth of individual cities is unclear. Here we first find two paradoxical scaling relationships that urban built-up area sub-linearly scales with population across cities, but super-linearly scales with population over time in most individual cities because urban land expands faster than population grows. Different cities have diverse temporal scaling exponents and one city even has opposite temporal scaling regimes during two periods, strongly supporting the absence of single temporal scaling and further illustrating the failure of cross-sectional urban scaling in predicting temporal growth of cities. We propose a conceptual model that can clarify the essential difference and also connections between the cross-sectional scaling law and temporal trajectories of cities. Our model shows that cities have an extra growth of built-up area over time besides the supposed growth predicted by the cross-sectional scaling law. Disparities of extra growth among different-sized cities change the cross-sectional scaling exponent. Further analyses of GDP and other indicators confirm the contradiction between cross-sectional and temporal scaling relationships and the validity of the conceptual model. Our findings may open a new avenue towards the science of cities. △ Less

Submitted 17 October, 2019; v1 submitted 13 October, 2019; originally announced October 2019.

arXiv:1905.09483 [pdf, other]

Towards Generation and Evaluation of Comprehensive Map** Robot Datasets

Authors: Hongyu Chen, Xiting Zhao, Jianwen Luo, Zhijie Yang, Zehao Zhao, Haochuan Wan, Xiaoya Ye, Guangyuan Weng, Zhenpeng He, Tian Dong, Sören Schwertfeger

Abstract: This paper presents a fully hardware synchronized map** robot with support for a hardware synchronized external tracking system, for super-precise timing and localization. We also employ a professional, static 3D scanner for ground truth map collection. Three datasets are generated to evaluate the performance of map** algorithms within a room and between rooms. Based on these datasets we gener… ▽ More This paper presents a fully hardware synchronized map** robot with support for a hardware synchronized external tracking system, for super-precise timing and localization. We also employ a professional, static 3D scanner for ground truth map collection. Three datasets are generated to evaluate the performance of map** algorithms within a room and between rooms. Based on these datasets we generate maps and trajectory data, which is then fed into evaluation algorithms. The map** and evaluation procedures are made in a very easily reproducible manner for maximum comparability. In the end we can draw a couple of conclusions about the tested SLAM algorithms. △ Less

Submitted 24 August, 2019; v1 submitted 23 May, 2019; originally announced May 2019.

arXiv:1811.10776 [pdf]

Joint Representation Learning of Cross-lingual Words and Entities via Attentive Distant Supervision

Authors: Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu, Chengjiang Li, Xu Chen, Tiansi Dong

Abstract: Joint representation learning of words and entities benefits many NLP tasks, but has not been well explored in cross-lingual settings. In this paper, we propose a novel method for joint representation learning of cross-lingual words and entities. It captures mutually complementary knowledge, and enables cross-lingual inferences among knowledge bases and texts. Our method does not require parallel… ▽ More Joint representation learning of words and entities benefits many NLP tasks, but has not been well explored in cross-lingual settings. In this paper, we propose a novel method for joint representation learning of cross-lingual words and entities. It captures mutually complementary knowledge, and enables cross-lingual inferences among knowledge bases and texts. Our method does not require parallel corpora, and automatically generates comparable data via distant supervision using multi-lingual knowledge bases. We utilize two types of regularizers to align cross-lingual words and entities, and design knowledge attention and cross-lingual attention to further reduce noises. We conducted a series of experiments on three tasks: word translation, entity relatedness, and cross-lingual entity linking. The results, both qualitatively and quantitatively, demonstrate the significance of our method. △ Less

Submitted 26 November, 2018; originally announced November 2018.

Comments: 11 pages, EMNLP2018

arXiv:1807.04191 [pdf, other]

A Computational Method for Evaluating UI Patterns

Authors: Bardia Doosti, Tao Dong, Biplab Deka, Jeffrey Nichols

Abstract: UI design languages, such as Google's Material Design, make applications both easier to develop and easier to learn by providing a set of standard UI components. Nonetheless, it is hard to assess the impact of design languages in the wild. Moreover, designers often get stranded by strong-opinionated debates around the merit of certain UI components, such as the Floating Action Button and the Navig… ▽ More UI design languages, such as Google's Material Design, make applications both easier to develop and easier to learn by providing a set of standard UI components. Nonetheless, it is hard to assess the impact of design languages in the wild. Moreover, designers often get stranded by strong-opinionated debates around the merit of certain UI components, such as the Floating Action Button and the Navigation Drawer. To address these challenges, this short paper introduces a method for measuring the impact of design languages and informing design debates through analyzing a dataset consisting of view hierarchies, screenshots, and app metadata for more than 9,000 mobile apps. Our data analysis shows that use of Material Design is positively correlated to app ratings, and to some extent, also the number of installs. Furthermore, we show that use of UI components vary by app category, suggesting a more nuanced view needed in design debates. △ Less

Submitted 11 July, 2018; originally announced July 2018.

arXiv:1806.01743 [pdf, other]

A Machine Learning Framework for Stock Selection

Authors: XingYu Fu, **Hong Du, YiFeng Guo, MingWen Liu, Tao Dong, XiuWen Duan

Abstract: This paper demonstrates how to apply machine learning algorithms to distinguish good stocks from the bad stocks. To this end, we construct 244 technical and fundamental features to characterize each stock, and label stocks according to their ranking with respect to the return-to-volatility ratio. Algorithms ranging from traditional statistical learning methods to recently popular deep learning met… ▽ More This paper demonstrates how to apply machine learning algorithms to distinguish good stocks from the bad stocks. To this end, we construct 244 technical and fundamental features to characterize each stock, and label stocks according to their ranking with respect to the return-to-volatility ratio. Algorithms ranging from traditional statistical learning methods to recently popular deep learning method, e.g. Logistic Regression (LR), Random Forest (RF), Deep Neural Network (DNN), and the Stacking, are trained to solve the classification task. Genetic Algorithm (GA) is also used to implement feature selection. The effectiveness of the stock selection strategy is validated in Chinese stock market in both statistical and practical aspects, showing that: 1) Stacking outperforms other models reaching an AUC score of 0.972; 2) Genetic Algorithm picks a subset of 114 features and the prediction performances of all models remain almost unchanged after the selection procedure, which suggests some features are indeed redundant; 3) LR and DNN are radical models; RF is risk-neutral model; Stacking is somewhere between DNN and RF. 4) The portfolios constructed by our models outperform market average in back tests. △ Less

Submitted 8 August, 2018; v1 submitted 5 June, 2018; originally announced June 2018.

arXiv:1711.07835 [pdf]

Robust Object Tracking Based on Self-adaptive Search Area

Authors: Taihang Dong, Sheng Zhong

Abstract: Discriminative correlation filter (DCF) based trackers have recently achieved excellent performance with great computational efficiency. However, DCF based trackers suffer boundary effects, which result in unstable performance in challenging situations exhibiting fast motion. In this paper, we propose a novel method to mitigate this side-effect in DCF based trackers. We change the search area acco… ▽ More Discriminative correlation filter (DCF) based trackers have recently achieved excellent performance with great computational efficiency. However, DCF based trackers suffer boundary effects, which result in unstable performance in challenging situations exhibiting fast motion. In this paper, we propose a novel method to mitigate this side-effect in DCF based trackers. We change the search area according to the prediction of target motion. When the object moves fast, broad search area could alleviate boundary effects and reserve the probability of locating the object. When the object moves slowly, narrow search area could prevent effect of useless background information and improve computational efficiency to attain real-time performance. This strategy can impressively soothe boundary effects in situations exhibiting fast motion and motion blur, and it can be used in almost all DCF based trackers. The experiments on OTB benchmark show that the proposed framework improves the performance compared with the baseline trackers. △ Less

Submitted 15 November, 2020; v1 submitted 21 November, 2017; originally announced November 2017.

Comments: 10 pages, 4 figures, 3 tables, SPIE 10th International Symposium on Multispectral Image Processing and Pattern Recognition

arXiv:1711.07829 [pdf]

Discussion among Different Methods of Updating Model Filter in Object Tracking

Authors: Taihang Dong, Sheng Zhong

Abstract: Discriminative correlation filters (DCF) have recently shown excellent performance in visual object tracking area. In this paper, we summarize the methods of updating model filter from discriminative correlation filter (DCF) based tracking algorithms and analyzes similarities and differences among these methods. We deduce the relationship between updating coefficient in high dimension (kernel tric… ▽ More Discriminative correlation filters (DCF) have recently shown excellent performance in visual object tracking area. In this paper, we summarize the methods of updating model filter from discriminative correlation filter (DCF) based tracking algorithms and analyzes similarities and differences among these methods. We deduce the relationship between updating coefficient in high dimension (kernel trick), updating filter in frequency domain and updating filter in spatial domain, and analyze the difference among these different ways. We also analyze the difference between the updating filter directly and updating filter's numerator (object response power) with updating filter's denominator (filter's power). The experiments about comparing different updating methods and visualizing the template filters are used to prove our derivation. △ Less

Submitted 15 November, 2020; v1 submitted 21 November, 2017; originally announced November 2017.

Comments: 8 pages, 3 figures, SPIE 10th International Symposium on Multispectral Image Processing and Pattern Recognition

Showing 1–29 of 29 results for author: Dong, T