-
Inverse-Free Fast Natural Gradient Descent Method for Deep Learning
Authors:
Xinwei Ou,
Ce Zhu,
Xiaolin Huang,
Yipeng Liu
Abstract:
Second-order optimization techniques have the potential to achieve faster convergence rates compared to first-order methods through the incorporation of second-order derivatives or statistics. However, their utilization in deep learning is limited due to their computational inefficiency. Various approaches have been proposed to address this issue, primarily centered on minimizing the size of the matrix to be inverted. Nevertheless, the necessity of performing the inverse operation iteratively persists. In this work, we present a fast natural gradient descent (FNGD) method that only requires inversion during the first epoch. Specifically, it is revealed that natural gradient descent (NGD) is essentially a weighted sum of per-sample gradients. Our novel approach further proposes to share these weighted coefficients across epochs without affecting empirical performance. Consequently, FNGD exhibits similarities to the average sum in first-order methods, leading to the computational complexity of FNGD being comparable to that of first-order methods. Extensive experiments on image classification and machine translation tasks demonstrate the efficiency of the proposed FNGD. For training ResNet-18 on CIFAR-100, FNGD can achieve a speedup of 2.07$\times$ compared with KFAC. For training Transformer on Multi30K, FNGD outperforms AdamW by 2.4 BLEU score while requiring almost the same training time.
Submitted 28 April, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
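The abstract's claim that NGD reduces to a weighted sum of per-sample gradients can be illustrated with a small numerical sketch. It assumes a damped empirical Fisher matrix built from the per-sample gradients, which may differ from the exact formulation in the paper; the names `U`, `damping`, and both helper functions are hypothetical and chosen only for this illustration.

```python
import numpy as np

def natural_gradient_direct(U, damping):
    """Solve (U U^T / n + damping * I) x = mean gradient in parameter space (d x d system)."""
    d, n = U.shape
    fisher = U @ U.T / n + damping * np.eye(d)  # assumed damped empirical Fisher
    return np.linalg.solve(fisher, U.mean(axis=1))

def per_sample_weights(U, damping):
    """Weights w such that U @ w equals the damped natural gradient (n x n system)."""
    n = U.shape[1]
    gram = U.T @ U  # n x n Gram matrix of per-sample gradients
    return np.linalg.solve(gram + n * damping * np.eye(n), np.ones(n))

rng = np.random.default_rng(0)
U = rng.standard_normal((50, 8))  # 8 hypothetical per-sample gradients of dimension 50
w = per_sample_weights(U, damping=0.1)
weighted_sum = U @ w              # natural gradient as a weighted sum of the columns of U
direct = natural_gradient_direct(U, damping=0.1)
```

Both routes give the same vector by the Woodbury (push-through) identity. Under this assumption, reusing `w` across epochs would replace the linear solve with a single weighted sum, which is the kind of saving the abstract describes.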
-
A Preliminary Study on Using Large Language Models in Software Pentesting
Authors:
Kumar Shashwat,
Francis Hahn,
Xinming Ou,
Dmitry Goldgof,
Lawrence Hall,
Jay Ligatti,
S. Raj Rajgopalan,
Armin Ziaie Tabari
Abstract:
Large language models (LLMs) are perceived to offer promising potential for automating security tasks, such as those found in security operation centers (SOCs). As a first step towards evaluating this perceived potential, we investigate the use of LLMs in software pentesting, where the main task is to automatically identify software security vulnerabilities in source code. We hypothesize that an LLM-based AI agent can be improved over time for a specific security task as human operators interact with it. Such improvement can be made, as a first step, by engineering prompts fed to the LLM based on the responses produced, to include relevant contexts and structures so that the model provides more accurate results. Such engineering efforts become sustainable if the prompts that are engineered to produce better results on current tasks also produce better results on future unknown tasks. To examine this hypothesis, we utilize the OWASP Benchmark Project 1.2, which contains 2,740 hand-crafted source code test cases containing various types of vulnerabilities. We divide the test cases into training and testing data, where we engineer the prompts based on the training data (only), and evaluate the final system on the testing data. We compare the AI agent's performance on the testing data against the performance of the agent without the prompt engineering. We also compare the AI agent's results against those from SonarQube, a widely used static code analyzer for security testing. We built and tested multiple versions of the AI agent using different off-the-shelf LLMs -- Google's Gemini-pro, as well as OpenAI's GPT-3.5-Turbo and GPT-4-Turbo (with both chat completion and assistant APIs). The results show that using LLMs is a viable approach to build an AI agent for software pentesting that can improve through repeated use and prompt engineering.
Submitted 30 January, 2024;
originally announced January 2024.
-
NB-IoT Uplink Synchronization by Change Point Detection of Phase Series in NTNs
Authors:
Jiaqi Jiang,
Yihang Huang,
Yin Xu,
Runnan Liu,
XiaoWu Ou,
Dazhi He
Abstract:
Non-Terrestrial Networks (NTNs) are widely recognized as a potential solution to achieve ubiquitous connections of Narrow Bandwidth Internet of Things (NB-IoT). In order to adopt NTNs in NB-IoT, one of the main challenges is the uplink synchronization of the Narrowband Physical Random Access procedure, which refers to the estimation of time of arrival (ToA) and carrier frequency offset (CFO). Due to the large propagation delay and Doppler shift in NTNs, traditional estimation methods for Terrestrial Networks (TNs) cannot be applied in NTNs directly. In this context, we design a two-stage ToA and CFO estimation scheme including coarse estimation and fine estimation based on abrupt change point detection (CPD) of phase series with machine learning. Our method achieves high estimation accuracy of ToA and CFO under low signal-to-noise ratio (SNR) and large Doppler shift conditions and extends the estimation range without enhancing Random Access preambles.
Submitted 4 June, 2023;
originally announced June 2023.
-
Low Rank Optimization for Efficient Deep Learning: Making A Balance between Compact Architecture and Fast Training
Authors:
Xinwei Ou,
Zhangxin Chen,
Ce Zhu,
Yipeng Liu
Abstract:
Deep neural networks have achieved great success in many data processing applications. However, the high computational complexity and storage cost make deep learning hard to use on resource-constrained devices, and its high power consumption is not environmentally friendly. In this paper, we focus on low-rank optimization for efficient deep learning techniques. In the space domain, deep neural networks are compressed by low rank approximation of the network parameters, which directly reduces the storage requirement with a smaller number of network parameters. In the time domain, the network parameters can be trained in a few subspaces, which enables efficient training for fast convergence. The model compression in the spatial domain is summarized into three categories as pre-train, pre-set, and compression-aware methods, respectively. With a series of integrable techniques discussed, such as sparse pruning, quantization, and entropy coding, we can ensemble them in an integration framework with lower computational complexity and storage. Besides summarizing recent technical advances, we have two findings to motivate future work: one is that the effective rank outperforms other sparse measures for network compression. The other is a spatial and temporal balance for tensorized neural networks.
Submitted 21 March, 2023;
originally announced March 2023.
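As a minimal sketch of the "low rank approximation of the network parameters" mentioned in the abstract, the example below factors a single weight matrix with a truncated SVD and compares parameter counts. The matrix shape, the rank, and the helper name `low_rank_factors` are assumptions for illustration; the surveyed methods cover many other decompositions, including tensor formats.

```python
import numpy as np

def low_rank_factors(W, rank):
    """Return A (m x rank), B (rank x n) with A @ B the best rank-`rank` approximation of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale each kept left singular vector by its singular value
    B = Vt[:rank, :]
    return A, B, s

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))     # hypothetical dense layer weights
A, B, s = low_rank_factors(W, rank=4)

dense_params = W.size                 # 64 * 32 stored values
factored_params = A.size + B.size     # 4 * (64 + 32) stored values
error = np.linalg.norm(W - A @ B)     # Frobenius norm of the approximation error
```

By the Eckart-Young theorem, the Frobenius error equals the root of the sum of the squared discarded singular values, so the singular value spectrum indicates how much rank a layer can give up.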
-
What are Attackers after on IoT Devices? An approach based on a multi-phased multi-faceted IoT honeypot ecosystem and data clustering
Authors:
Armin Ziaie Tabari,
Xinming Ou,
Anoop Singhal
Abstract:
The growing number of Internet of Things (IoT) devices makes it imperative to be aware of the real-world threats they face in terms of cybersecurity. While honeypots have been historically used as decoy devices to help researchers/organizations gain a better understanding of the dynamic of threats on a network and their impact, IoT devices pose a unique challenge for this purpose due to the variety of devices and their physical connections. In this work, by observing real-world attackers' behavior in a low-interaction honeypot ecosystem, we (1) presented a new approach to creating a multi-phased, multi-faceted honeypot ecosystem, which gradually increases the sophistication of honeypots' interactions with adversaries, (2) designed and developed a low-interaction honeypot for cameras that allowed researchers to gain a deeper understanding of what attackers are targeting, and (3) devised an innovative data analytics method to identify the goals of adversaries. Our honeypots have been active for over three years. We were able to collect increasingly sophisticated attack data in each phase. Furthermore, our data analytics points to the fact that the vast majority of attack activities captured in the honeypots share significant similarity, and can be clustered and grouped to better understand the goals, patterns, and trends of IoT attacks in the wild.
Submitted 20 December, 2021;
originally announced December 2021.
-
Topology-Aware Hashing for Effective Control Flow Graph Similarity Analysis
Authors:
Yuping Li,
Jiyong Jang,
Xinming Ou
Abstract:
Control Flow Graph (CFG) similarity analysis is an essential technique for a variety of security analysis tasks, including malware detection and malware clustering. Even though various algorithms have been developed, existing CFG similarity analysis methods still suffer from limited efficiency, accuracy, and usability. In this paper, we propose a novel fuzzy hashing scheme called topology-aware hashing (TAH) for effective and efficient CFG similarity analysis. Given the CFGs constructed from program binaries, we extract blended n-gram graphical features of the CFGs, encode the graphical features into numeric vectors (called graph signatures), and then measure the graph similarity by comparing the graph signatures. We further employ a fuzzy hashing technique to convert the numeric graph signatures into smaller fixed-size fuzzy hash signatures for efficient similarity calculation. Our comprehensive evaluation demonstrates that TAH is more effective and efficient compared to existing CFG comparison techniques. To demonstrate the applicability of TAH to real-world security analysis tasks, we develop a binary similarity analysis tool based on TAH, and show that it outperforms existing similarity analysis tools while conducting malware clustering.
Submitted 14 April, 2020;
originally announced April 2020.
-
A First Step Towards Understanding Real-world Attacks on IoT Devices
Authors:
Armin Ziaie Tabari,
Xinming Ou
Abstract:
With the rapid growth of Internet of Things (IoT) devices, it is imperative to proactively understand the real-world cybersecurity threats posed to them. This paper describes our initial efforts towards building a honeypot ecosystem as a means of gathering and analyzing real attack data against IoT devices. A primary condition for a honeypot to yield useful insights is to make attackers believe it is a real system used by humans and organizations. IoT devices pose unique challenges in this respect, due to the large variety of device types and their physically connected nature. We thus create a multiphased approach in building a honeypot ecosystem, where researchers can gradually increase a low-interaction honeypot's sophistication in emulating an IoT device by observing real-world attackers' behaviors. We deployed honeypots both on-premise and in the cloud, with associated analysis and vetting infrastructures to ensure these honeypots cannot be easily identified as such and appear to be real systems. In doing so we were able to attract increasingly sophisticated attack data. We present the design of this honeypot ecosystem and our observations on the attack data so far. Our data shows that real-world attackers are explicitly going after IoT devices, and some captured activities seem to involve direct human interaction (as opposed to scripted automatic activities). We also build a low-interaction honeypot for IoT cameras, called Honeycamera, that presents seemingly real videos to attackers. This is our first step towards building a more comprehensive honeypot ecosystem that will allow researchers to gain a concrete understanding of what attackers are going after on IoT devices, so as to more proactively protect them.
Submitted 2 March, 2020;
originally announced March 2020.
-
Network Reconnaissance and Vulnerability Excavation of Secure DDS Systems
Authors:
Ruffin White,
Gianluca Caiazza,
Chenxu Jiang,
Xinyue Ou,
Zhiyue Yang,
Agostino Cortesi,
Henrik Christensen
Abstract:
Data Distribution Service (DDS) is a real-time peer-to-peer protocol that serves as a scalable middleware between distributed networked systems found in many Industrial IoT domains such as automotive, medical, energy, and defense. Since the initial ratification of the standard, specifications have introduced a Security Model and Service Plugin Interface (SPI) architecture, facilitating authenticated encryption and data-centric access control while preserving interoperable data exchange. However, as of Secure DDS v1.1, the default plugin specifications presently exchange digitally signed capability lists of both participants in the clear during the crypto handshake for permission attestation, thus breaching confidentiality of the context of the connection. In this work, we present an attacker model that makes use of network reconnaissance afforded by this leaked context in conjunction with formal verification and model checking to arbitrarily reason about the underlying topology and reachability of information flow, enabling targeted attacks such as selective denial of service, adversarial partitioning of the data bus, or vulnerability excavation of vendor implementations.
Submitted 14 August, 2019;
originally announced August 2019.
-
VulDeePecker: A Deep Learning-Based System for Vulnerability Detection
Authors:
Zhen Li,
Deqing Zou,
Shouhuai Xu,
Xinyu Ou,
Hai Jin,
Sujuan Wang,
Zhijun Deng,
Yuyi Zhong
Abstract:
The automatic detection of software vulnerabilities is an important research problem. However, existing solutions to this problem rely on human experts to define features and often miss many vulnerabilities (i.e., incurring high false negative rate). In this paper, we initiate the study of using deep learning-based vulnerability detection to relieve human experts from the tedious and subjective task of manually defining features. Since deep learning is motivated to deal with problems that are very different from the problem of vulnerability detection, we need some guiding principles for applying deep learning to vulnerability detection. In particular, we need to find representations of software programs that are suitable for deep learning. For this purpose, we propose using code gadgets to represent programs and then transform them into vectors, where a code gadget is a number of (not necessarily consecutive) lines of code that are semantically related to each other. This leads to the design and implementation of a deep learning-based vulnerability detection system, called Vulnerability Deep Pecker (VulDeePecker). In order to evaluate VulDeePecker, we present the first vulnerability dataset for deep learning approaches. Experimental results show that VulDeePecker can achieve much fewer false negatives (with reasonable false positives) than other approaches. We further apply VulDeePecker to 3 software products (namely Xen, Seamonkey, and Libav) and detect 4 vulnerabilities, which are not reported in the National Vulnerability Database but were "silently" patched by the vendors when releasing later versions of these products; in contrast, these vulnerabilities are almost entirely missed by the other vulnerability detection systems we experimented with.
Submitted 5 January, 2018;
originally announced January 2018.
-
Android Malware Clustering through Malicious Payload Mining
Authors:
Yuping Li,
Jiyong Jang,
Xin Hu,
Xinming Ou
Abstract:
Clustering has been well studied for desktop malware analysis as an effective triage method. Conventional similarity-based clustering techniques, however, cannot be immediately applied to Android malware analysis due to the excessive use of third-party libraries in Android application development and the widespread use of repackaging in malware development. We design and implement an Android malware clustering system through iterative mining of malicious payload and checking whether malware samples share the same version of malicious payload. Our system utilizes a hierarchical clustering technique and an efficient bit-vector format to represent Android apps. Experimental results demonstrate that our clustering approach achieves precision of 0.90 and recall of 0.75 for Android Genome malware dataset, and average precision of 0.98 and recall of 0.96 with respect to manually verified ground-truth.
Submitted 15 July, 2017;
originally announced July 2017.
-
Investigating the Impacts of Recommendation Agents on Impulsive Purchase Behaviour
Authors:
Hui Zhu,
Zherui Yang,
Carol Xj Ou,
Hongwei Liu,
Robert M Davison
Abstract:
The usage of recommendation agents (RAs) in the online marketplace can help consumers to locate their desired products. RAs can help consumers effectively obtain comprehensive product information and compare their candidate target products. As a result, RAs have affected consumers' shopping behaviour. In this study, we investigate the usage and the influence of RAs in the online marketplace. Based on the Stimulus-Organism-Response (SOR) model, we propose that the stimulus of using RAs (informativeness, product search effectiveness and the lack of sociality stress) can affect consumers' attitude (perceived control and satisfaction), which further affects their behavioural outcomes like impulsive purchase. We validate this research model with survey data from 157 users of RAs. The data largely support the proposed model and indicate that the RAs can significantly contribute to impulsive purchase behaviour in online marketplaces. Theoretical and practical contributions are discussed.
Submitted 4 June, 2016;
originally announced June 2016.
-
Makeup like a superstar: Deep Localized Makeup Transfer Network
Authors:
Si Liu,
Xinyu Ou,
Ruihe Qian,
Wei Wang,
Xiaochun Cao
Abstract:
In this paper, we propose a novel Deep Localized Makeup Transfer Network to automatically recommend the most suitable makeup for a female and synthesize the makeup on her face. Given a before-makeup face, her most suitable makeup is determined automatically. Then, both the before-makeup and the reference faces are fed into the proposed Deep Transfer Network to generate the after-makeup face. Our end-to-end makeup transfer network has several nice properties, including: (1) complete functions: foundation, lip gloss, and eye shadow transfer; (2) cosmetic specific: different cosmetics are transferred in different manners; (3) localized: different cosmetics are applied on different facial regions; (4) producing natural-looking results without obvious artifacts; (5) controllable makeup lightness: various results from light makeup to heavy makeup can be generated. Qualitative and quantitative experiments show that our network performs much better than the methods of [Guo and Sim, 2009] and two variants of NeuralStyle [Gatys et al., 2015a].
Submitted 24 April, 2016;
originally announced April 2016.
-
Fourier ptychographic reconstruction using Poisson maximum likelihood and truncated Wirtinger gradient
Authors:
Liheng Bian,
Jinli Suo,
Jaebum Chung,
Xiaoze Ou,
Changhuei Yang,
Feng Chen,
Qionghai Dai
Abstract:
Fourier ptychographic microscopy (FPM) is a novel computational coherent imaging technique for high space-bandwidth product imaging. Mathematically, Fourier ptychographic (FP) reconstruction can be implemented as a phase retrieval optimization process, in which we only obtain low resolution intensity images corresponding to the sub-bands of the sample's high resolution (HR) spatial spectrum, and aim to retrieve the complex HR spectrum. In real setups, the measurements always suffer from various degenerations such as Gaussian noise, Poisson noise, speckle noise and pupil location error, which would largely degrade the reconstruction. To efficiently address these degenerations, we propose a novel FP reconstruction method under a gradient descent optimization framework in this paper. The technique utilizes Poisson maximum likelihood for better signal modeling, and truncated Wirtinger gradient for error removal. Results on both simulated data and real data captured using our laser FPM setup show that the proposed method outperforms other state-of-the-art algorithms. Also, we have released our source code for non-commercial use.
Submitted 1 March, 2016;
originally announced March 2016.
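The combination of a Poisson maximum likelihood objective with a Wirtinger gradient named in the abstract can be written out for a generic phase retrieval model. This is a sketch under assumptions: the measurement vectors $a_i$, the unknown spectrum $x$, the retained index set $\mathcal{T}$, and the step size $\mu$ are notation introduced here, and the paper's actual forward model (pupil function, sub-band selection) and truncation rule are not reproduced.

```latex
% Assumed measurement model: intensities y_i are Poisson with mean |a_i^H x|^2
y_i \sim \mathrm{Poisson}\!\left(|a_i^{H} x|^{2}\right), \qquad i = 1, \dots, m

% Negative log-likelihood, dropping terms that do not depend on x
f(x) = \sum_{i=1}^{m} \left( |a_i^{H} x|^{2} - y_i \log |a_i^{H} x|^{2} \right)

% Wirtinger gradient of f with respect to the conjugate of x
\nabla f(x) = \sum_{i=1}^{m} \left( 1 - \frac{y_i}{|a_i^{H} x|^{2}} \right) a_i a_i^{H} x

% Truncated gradient step: only measurements in a retained index set T contribute
x \leftarrow x - \mu \sum_{i \in \mathcal{T}} \left( 1 - \frac{y_i}{|a_i^{H} x|^{2}} \right) a_i a_i^{H} x
```

Restricting the sum to $\mathcal{T}$ is what makes the gradient "truncated": measurements judged unreliable at the current estimate are left out of the update so that outliers do not dominate the step.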