An Automated Vulnerability Detection Framework for Smart Contracts

Authors: Feng Mi, Chen Zhao, Zhuoyi Wang, Sadaf MD Halim, Xiaodi Li, Zhouxiang Wu, Latifur Khan, Bhavani Thuraisingham

Abstract: With the increasing adoption of blockchain technology in providing decentralized solutions to various problems, smart contracts have become more popular to the point that billions of US Dollars are currently exchanged every day through such technology. Meanwhile, various vulnerabilities in smart contracts have been exploited by attackers to steal cryptocurrencies worth millions of dollars. The automatic detection of smart contract vulnerabilities is therefore an essential research problem. Existing solutions to this problem rely particularly on human experts to define features or rules to detect vulnerabilities. However, this often causes many vulnerabilities to be ignored, and such solutions are inefficient in detecting new vulnerabilities. In this study, to overcome such challenges, we propose a framework to automatically detect vulnerabilities in smart contracts on the blockchain. More specifically, first, we utilize novel feature vector generation techniques from the bytecode of smart contracts, since the source code of smart contracts is rarely publicly available. Next, the collected vectors are fed into our novel metric learning-based deep neural network (DNN) to get the detection result. We conduct comprehensive experiments on large-scale benchmarks, and the quantitative results demonstrate the effectiveness and efficiency of our approach.

Submitted 20 January, 2023; originally announced January 2023.

arXiv:2110.10287 [pdf, other]

Multi-concept adversarial attacks

Authors: Vibha Belavadi, Yan Zhou, Murat Kantarcioglu, Bhavani M. Thuraisingham

Abstract: As machine learning (ML) techniques are being increasingly used in many applications, their vulnerability to adversarial attacks becomes well-known. Test time attacks, usually launched by adding adversarial noise to test instances, have been shown effective against the deployed ML models. In practice, one test input may be leveraged by different ML models. Test time attacks targeting a single ML model often neglect their impact on other ML models. In this work, we empirically demonstrate that naively attacking the classifier learning one concept may negatively impact classifiers trained to learn other concepts. For example, for the online image classification scenario, when the Gender classifier is under attack, the (wearing) Glasses classifier is simultaneously attacked with the accuracy dropped from 98.69 to 88.42. This raises an interesting question: is it possible to attack one set of classifiers without impacting the other set that uses the same test instance? Answers to the above research question have interesting implications for protecting privacy against ML model misuse. Attacking ML models that pose unnecessary risks of privacy invasion can be an important tool for protecting individuals from harmful privacy exploitation. In this paper, we address the above research question by developing novel attack techniques that can simultaneously attack one set of ML models while preserving the accuracy of the other. In the case of linear classifiers, we provide a theoretical framework for finding an optimal solution to generate such adversarial examples. Using this theoretical framework, we develop a multi-concept attack strategy in the context of deep learning. Our results demonstrate that our techniques can successfully attack the target classes while protecting the protected classes in many different settings, which is not possible with the existing test-time attack-single strategies.

Submitted 19 October, 2021; originally announced October 2021.

Comments: 20 pages, 28 figures, 9 tables
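
For the linear-classifier case mentioned in the abstract above, one simple way to picture the idea is to perturb an input along the target classifier's weight vector after removing the component that the protected classifier can see. The sketch below is only an illustration under that assumption and is not the paper's optimal solution; the function names, weight vectors, and step size are all hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def protected_preserving_perturbation(w_target, w_protected, step):
    """Return a perturbation that lowers the target score w_target . x
    while leaving the protected score w_protected . x unchanged.

    Hypothetical illustration: project w_target onto the subspace
    orthogonal to w_protected, then step against that direction.
    """
    scale = dot(w_target, w_protected) / dot(w_protected, w_protected)
    direction = [t - scale * p for t, p in zip(w_target, w_protected)]
    return [-step * d for d in direction]


# Toy weights and input (assumed values, chosen only for the example).
w_target = [1.0, 2.0, 0.0]
w_protected = [0.0, 1.0, 1.0]
x = [1.0, 1.0, 1.0]

delta = protected_preserving_perturbation(w_target, w_protected, step=1.0)
x_adv = [a + d for a, d in zip(x, delta)]

print("target score:", dot(w_target, x), "->", dot(w_target, x_adv))
print("protected score:", dot(w_protected, x), "->", dot(w_protected, x_adv))
```

With these toy values the target score falls while the protected score is unchanged, because the perturbation is orthogonal to the protected weight vector. A larger step would be needed to cross a specific decision threshold.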

arXiv:2108.09435 [pdf, other]

Fairness-Aware Online Meta-learning

Authors: Chen Zhao, Feng Chen, Bhavani Thuraisingham

Abstract: In contrast to offline working fashions, two research paradigms are devised for online learning: (1) Online Meta Learning (OML) learns good priors over model parameters (or learning to learn) in a sequential setting where tasks are revealed one after another. Although it provides a sub-linear regret bound, such techniques completely ignore the importance of learning with fairness, which is a significant hallmark of human intelligence. (2) Online Fairness-Aware Learning. This setting captures many classification problems for which fairness is a concern. But it aims to attain zero-shot generalization without any task-specific adaptation. This therefore limits the capability of a model to adapt to newly arrived data. To overcome such issues and bridge the gap, in this paper we propose, for the first time, a novel online meta-learning algorithm, namely FFML, which is under the setting of unfairness prevention. The key part of FFML is to learn good priors of an online fair classification model's primal and dual parameters that are associated with the model's accuracy and fairness, respectively. The problem is formulated in the form of a bi-level convex-concave optimization. Theoretical analysis provides sub-linear upper bounds for loss regret and for violation of cumulative fairness constraints. Our experiments demonstrate the versatility of FFML by applying it to classification on three real-world datasets and show substantial improvements over the best prior work on the tradeoff between fairness and classification accuracy.

Submitted 21 August, 2021; originally announced August 2021.

Comments: KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

arXiv:2107.13639 [pdf, other]

Imbalanced Adversarial Training with Reweighting

Authors: Wentao Wang, Han Xu, Xiaorui Liu, Yaxin Li, Bhavani Thuraisingham, Jiliang Tang

Abstract: Adversarial training has been empirically proven to be one of the most effective and reliable defense methods against adversarial attacks. However, almost all existing studies about adversarial training are focused on balanced datasets, where each class has an equal amount of training examples. Research on adversarial training with imbalanced training datasets is rather limited. As the initial effort to investigate this problem, we reveal the facts that adversarially trained models present two distinguished behaviors from naturally trained models in imbalanced datasets: (1) Compared to natural training, adversarially trained models can suffer much worse performance on under-represented classes, when the training dataset is extremely imbalanced. (2) Traditional reweighting strategies may lose efficacy to deal with the imbalance issue for adversarial training. For example, upweighting the under-represented classes will drastically hurt the model's performance on well-represented classes, and as a result, finding an optimal reweighting value can be tremendously challenging. In this paper, to further understand our observations, we theoretically show that the poor data separability is one key reason causing this strong tension between under-represented and well-represented classes. Motivated by this finding, we propose Separable Reweighted Adversarial Training (SRAT) to facilitate adversarial training under imbalanced scenarios, by learning more separable features for different classes. Extensive experiments on various datasets verify the effectiveness of the proposed framework.

Submitted 28 July, 2021; originally announced July 2021.
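
As a point of reference for the "traditional reweighting strategies" the abstract above refers to, a common baseline weights each class by the inverse of its frequency. The sketch below shows only that generic baseline, not SRAT; the helper names and the toy label list are assumptions made for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rare classes count more.

    n is the number of examples and k the number of distinct classes.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def weighted_mean_loss(losses, labels, weights):
    """Average per-example losses, weighting each by its class weight."""
    total = sum(weights[y] * loss for loss, y in zip(losses, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Toy imbalanced label set (assumed): 8 examples of "a", 2 of "b".
labels = ["a"] * 8 + ["b"] * 2
weights = inverse_frequency_weights(labels)

# Per-example losses (assumed): the rare class is fit worse.
losses = [0.1] * 8 + [1.0] * 2
print(weights)
print(weighted_mean_loss(losses, labels, weights))
```

Under this scheme the two classes contribute equally to the averaged loss despite the 8-to-2 imbalance, which is the kind of upweighting the abstract says can hurt well-represented classes under adversarial training.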

arXiv:2104.01495 [pdf, other]

Towards Self-Adaptive Metric Learning On the Fly

Authors: Yang Gao, Yi-Fan Li, Swarup Chandra, Latifur Khan, Bhavani Thuraisingham

Abstract: Good quality similarity metrics can significantly facilitate the performance of many large-scale, real-world applications. Existing studies have proposed various solutions to learn a Mahalanobis or bilinear metric in an online fashion by either restricting distances between similar (dissimilar) pairs to be smaller (larger) than a given lower (upper) bound or requiring similar instances to be separated from dissimilar instances with a given margin. However, these linear metrics learned by leveraging fixed bounds or margins may not perform well in real-world applications, especially when data distributions are complex. We aim to address the open challenge of "Online Adaptive Metric Learning" (OAML) for learning adaptive metric functions on the fly. Unlike traditional online metric learning methods, OAML is significantly more challenging since the learned metric could be non-linear and the model has to be self-adaptive as more instances are observed. In this paper, we present a new online metric learning framework that attempts to tackle the challenge by learning an ANN-based metric with adaptive model complexity from a stream of constraints. In particular, we propose a novel Adaptive-Bound Triplet Loss (ABTL) to effectively utilize the input constraints and present a novel Adaptive Hedge Update (AHU) method for online updating the model parameters. We empirically validate the effectiveness and efficacy of our framework on various applications such as real-world image classification, facial verification, and image retrieval.

Submitted 3 April, 2021; originally announced April 2021.

Comments: Accepted by WWW 2019 (Long Paper, Oral)
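
The fixed-margin constraint that the abstract above contrasts with is usually written as a standard triplet loss. The sketch below shows that conventional fixed-margin form only; it is not the paper's Adaptive-Bound Triplet Loss, and the Euclidean distance, the margin value, and the toy points are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard fixed-margin triplet loss.

    Zero when the negative is at least `margin` farther from the
    anchor than the positive is; positive otherwise.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Toy 2-D embeddings (assumed values).
anchor = [0.0, 0.0]
near = [0.0, 1.0]   # distance 1 from the anchor
far = [3.0, 4.0]    # distance 5 from the anchor

print(triplet_loss(anchor, near, far))  # constraint satisfied
print(triplet_loss(anchor, far, near))  # constraint violated
```

The margin here is a single constant chosen in advance, which is the limitation the abstract attributes to fixed bounds or margins.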

arXiv:2012.11810 [pdf, other]

Progressive One-shot Human Parsing

Authors: Haoyu He, Jing Zhang, Bhavani Thuraisingham, Dacheng Tao

Abstract: Prior human parsing models are limited to parsing humans into classes pre-defined in the training data, which is not flexible enough to generalize to unseen classes, e.g., new clothing in fashion analysis. In this paper, we propose a new problem named one-shot human parsing (OSHP) that requires parsing humans into an open set of reference classes defined by any single reference example. During training, only base classes defined in the training set are exposed, which can overlap with part of the reference classes. We devise a novel Progressive One-shot Parsing network (POPNet) to address two critical challenges, i.e., testing bias and small sizes. POPNet consists of two collaborative metric learning modules named Attention Guidance Module and Nearest Centroid Module, which can learn representative prototypes for base classes and quickly transfer the ability to unseen classes during testing, thereby reducing testing bias. Moreover, POPNet adopts a progressive human parsing framework that can incorporate the learned knowledge of parent classes at the coarse granularity to help recognize the descendant classes at the fine granularity, thereby handling the small sizes issue. Experiments on the ATR-OS benchmark tailored for OSHP demonstrate POPNet outperforms other representative one-shot segmentation models by large margins and establishes a strong baseline. Source code can be found at https://github.com/Charleshhy/One-shot-Human-Parsing.

Submitted 7 May, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

Comments: Accepted in AAAI 2021. 9 pages, 4 figures

arXiv:2012.07006 [pdf, other]

DeepSweep: An Evaluation Framework for Mitigating DNN Backdoor Attacks using Data Augmentation

Authors: Han Qiu, Yi Zeng, Shangwei Guo, Tianwei Zhang, Meikang Qiu, Bhavani Thuraisingham

Abstract: Public resources and services (e.g., datasets, training platforms, pre-trained models) have been widely adopted to ease the development of Deep Learning-based applications. However, if the third-party providers are untrusted, they can inject poisoned samples into the datasets or embed backdoors in those models. Such an integrity breach can cause severe consequences, especially in safety- and security-critical applications. Various backdoor attack techniques have been proposed for higher effectiveness and stealthiness. Unfortunately, existing defense solutions are not practical to thwart those attacks in a comprehensive way. In this paper, we investigate the effectiveness of data augmentation techniques in mitigating backdoor attacks and enhancing DL models' robustness. An evaluation framework is introduced to achieve this goal. Specifically, we consider a unified defense solution, which (1) adopts a data augmentation policy to fine-tune the infected model and eliminate the effects of the embedded backdoor; (2) uses another augmentation policy to preprocess input samples and invalidate the triggers during inference. We propose a systematic approach to discover the optimal policies for defending against different backdoor attacks by comprehensively evaluating 71 state-of-the-art data augmentation functions. Extensive experiments show that our identified policy can effectively mitigate eight different kinds of backdoor attacks and outperform five existing defense methods. We envision this framework can be a good benchmark tool to advance future DNN backdoor studies.

Submitted 11 April, 2021; v1 submitted 13 December, 2020; originally announced December 2020.

arXiv:2009.01267 [pdf]

COVID-19: The Information Warfare Paradigm Shift

Authors: Jan Kallberg, Rosemary A. Burk, Bhavani Thuraisingham

Abstract: In Kuhn's The Structure of Scientific Revolutions, the critical term is paradigm-shift: the moment when it suddenly becomes evident that earlier assumptions are no longer correct and the plurality of the scientific community that studies this domain accepts the change. These types of events can be scientific findings or, as in social science, a system shock that creates a punctured equilibrium that sets the stage for the developments. In information warfare, the studies and government lines of effort of recent years have been to engage fake news and electoral interference and to fight extremist social media as the primary combat theater in the information space, and the tools to influence a targeted audience. The COVID-19 pandemic generates a rebuttal of these assumptions. Even if fake news and extremist social media content may exploit fault lines in our society and create civil disturbance, tensions between federal and local government, and massive protests, these are still effects that impact a part of the population. What we have seen with COVID-19, as an indicator, is that what is related to public health is far more powerful in swinging public sentiment and creating reactions within the citizenry that trigger impact at a larger magnitude, rippling through society in multiple directions.

Submitted 2 September, 2020; originally announced September 2020.

arXiv:1908.06971 [pdf, other]

ChainNet: Learning on Blockchain Graphs with Topological Features

Authors: Nazmiye Ceren Abay, Cuneyt Gurcan Akcora, Yulia R. Gel, Umar D. Islambekov, Murat Kantarcioglu, Yahui Tian, Bhavani Thuraisingham

Abstract: With the emergence of blockchain technologies and the associated cryptocurrencies, such as Bitcoin, understanding network dynamics behind Blockchain graphs has become a rapidly evolving research direction. Unlike other financial networks, such as stock and currency trading, blockchain based cryptocurrencies have the entire transaction graph accessible to the public (i.e., all transactions can be downloaded and analyzed). A natural question is then to ask whether the dynamics of the transaction graph impacts the price of the underlying cryptocurrency. We show that standard graph features such as degree distribution of the transaction graph may not be sufficient to capture network dynamics and its potential impact on fluctuations of Bitcoin price. In contrast, the new graph associated topological features, computed using the tools of persistent homology, are found to exhibit a high utility for predicting Bitcoin price dynamics. Using the proposed persistent homology-based techniques, we offer a new elegant, easily extendable and computationally light approach for graph representation learning on Blockchain.

Submitted 18 August, 2019; originally announced August 2019.

Comments: To Appear in the 2019 IEEE International Conference on Data Mining (ICDM)

arXiv:1703.08859 [pdf, ps, other]

The INSuRE Project: CAE-Rs Collaborate to Engage Students in Cybersecurity Research

Authors: Alan Sherman, M. Dark, A. Chan, R. Chong, T. Morris, L. Oliva, J. Springer, B. Thuraisingham, C. Vatcher, R. Verma, S. Wetzel

Abstract: Since fall 2012, several National Centers of Academic Excellence in Cyber Defense Research (CAE-Rs) fielded a collaborative course to engage students in solving applied cybersecurity research problems. We describe our experiences with this Information Security Research and Education (INSuRE) research collaborative. We explain how we conducted our project-based research course, give examples of student projects, and discuss the outcomes and lessons learned.

Submitted 26 March, 2017; originally announced March 2017.

Comments: A shorter version of this paper has been submitted to IEEE Security and Privacy

arXiv:1105.1982 [pdf, ps, other]

Secure Data Processing in a Hybrid Cloud

Authors: Vaibhav Khadilkar, Murat Kantarcioglu, Bhavani Thuraisingham, Sharad Mehrotra

Abstract: Cloud computing has made it possible for a user to select a computing service precisely when needed. However, certain factors such as security of data and regulatory issues will impact a user's choice of using such a service. A solution to these problems is the use of a hybrid cloud that combines a user's local computing capabilities (for mission- or organization-critical tasks) with a public cloud (for less influential tasks). We foresee three challenges that must be overcome before the adoption of a hybrid cloud approach: 1) data design: How to partition relations in a hybrid cloud? The solution to this problem must account for the sensitivity of attributes in a relation as well as the workload of a user; 2) data security: How to protect a user's data in a public cloud with encryption while enabling query processing over this encrypted data? and 3) query processing: How to execute queries efficiently over both encrypted and unencrypted data? This paper addresses these challenges and incorporates their solutions into an add-on tool for a Hadoop and Hive based cloud computing infrastructure.

Submitted 10 May, 2011; originally announced May 2011.

Comments: 16 pages (13 pages + 3 page appendix), 5 figures

ACM Class: D.4.6; H.3.3; H.3.4

arXiv:0710.3979 [pdf]

Toward Trusted Sharing of Network Packet Traces Using Anonymization: Single-Field Privacy/Analysis Tradeoffs

Authors: William Yurcik, Clay Woolam, Greg Hellings, Latifur Khan, Bhavani Thuraisingham

Abstract: Network data needs to be shared for distributed security analysis. Anonymization of network data for sharing sets up a fundamental tradeoff between privacy protection and security analysis capability. This privacy/analysis tradeoff has been acknowledged by many researchers, but this is the first paper to provide empirical measurements to characterize the privacy/analysis tradeoff for an enterprise dataset. Specifically, we perform anonymization options on single fields within network packet traces and then make measurements using intrusion detection system alarms as a proxy for security analysis capability. Our results show: (1) two fields have a zero sum tradeoff (more privacy lessens security analysis and vice versa) and (2) eight fields have a more complex tradeoff (that is not zero sum) in which privacy and analysis can be accomplished simultaneously.

Submitted 26 October, 2007; v1 submitted 22 October, 2007; originally announced October 2007.

Comments: 8 pages, 1 figure, 4 tables

ACM Class: C.2.0; C.2.3; C.2.m; D.3.4; K.6.5
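
To make the idea of anonymizing a single field of a packet record concrete, the sketch below replaces one field with a keyed hash and leaves the others untouched. The abstract above does not say which anonymization options were used, so keyed hashing is an assumed stand-in, and the record layout, field names, and key are hypothetical.

```python
import hashlib
import hmac


def anonymize_field(record, field, key):
    """Return a copy of `record` with one field replaced by a keyed hash.

    The same input value always maps to the same pseudonym under a
    given key, so per-field consistency is kept while the raw value
    is hidden. All other fields are copied unchanged.
    """
    out = dict(record)
    raw = str(record[field]).encode("utf-8")
    digest = hmac.new(key, raw, hashlib.sha256).hexdigest()
    out[field] = digest[:16]  # truncated pseudonym for readability
    return out


# Hypothetical packet summary and secret key (illustration only).
packet = {"src_ip": "10.0.0.5", "dst_ip": "192.0.2.7", "dst_port": 443}
key = b"example-secret-key"

anonymized = anonymize_field(packet, "src_ip", key)
print(anonymized)
```

Anonymizing one field at a time in this way is what allows the effect on downstream analysis, such as intrusion detection alarms, to be measured per field.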

Showing 1–12 of 12 results for author: Thuraisingham, B