
INGREX: An Interactive Explanation Framework for Graph Neural Networks

Authors: Tien-Cuong Bui, Van-Duc Le, Wen-Syan Li, Sang Kyun Cha

Abstract: Graph Neural Networks (GNNs) are widely used in many modern applications, necessitating explanations for their decisions. However, the complexity of GNNs makes it difficult to explain predictions. Even though several methods have been proposed lately, they can only provide simple and static explanations, which are difficult for users to understand in many scenarios. Therefore, we introduce INGREX, an interactive explanation framework for GNNs designed to aid users in comprehending model predictions. Our framework is implemented based on multiple explanation algorithms and advanced libraries. We demonstrate our framework in three scenarios covering common demands for GNN explanations to present its effectiveness and helpfulness.

Submitted 2 November, 2022; originally announced November 2022.

Comments: 4 pages, 5 figures, This paper is under review for IEEE ICDE 2023

arXiv:2210.11094 [pdf, other]

Toward Multiple Specialty Learners for Explaining GNNs via Online Knowledge Distillation

Authors: Tien-Cuong Bui, Van-Duc Le, Wen-syan Li, Sang Kyun Cha

Abstract: Graph Neural Networks (GNNs) have become increasingly ubiquitous in numerous applications and systems, necessitating explanations of their predictions, especially when making critical decisions. However, explaining GNNs is challenging due to the complexity of graph data and model execution. Despite additional computational costs, post-hoc explanation approaches have been widely adopted due to the generality of their architectures. Intrinsically interpretable models provide instant explanations but are usually model-specific, which can only explain particular GNNs. Therefore, we propose a novel GNN explanation framework named SCALE, which is general and fast for explaining predictions. SCALE trains multiple specialty learners to explain GNNs since constructing one powerful explainer to examine attributions of interactions in input graphs is complicated. In training, a black-box GNN model guides learners based on an online knowledge distillation paradigm. In the explanation phase, explanations of predictions are provided by multiple explainers corresponding to trained learners. Specifically, edge masking and random walk with restart procedures are executed to provide structural explanations for graph-level and node-level predictions, respectively. A feature attribution module provides overall summaries and instance-level feature contributions. We compare SCALE with state-of-the-art baselines via quantitative and qualitative experiments to prove its explanation correctness and execution performance. We also conduct a series of ablation studies to understand the strengths and weaknesses of the proposed framework.

Submitted 20 October, 2022; originally announced October 2022.

Comments: 13 pages, 11 figures, A preliminary paper under review of IEEE ICDE 2023
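
The SCALE abstract above mentions random walk with restart as the procedure behind its node-level structural explanations. The sketch below is only a generic illustration of that procedure, not the paper's implementation: the function name, the adjacency-matrix input, the restart probability of 0.15, and the handling of isolated nodes are all assumptions made here for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.15, iters=100):
    """Approximate random-walk-with-restart scores by power iteration.

    adj     -- square adjacency matrix as a list of lists (hypothetical input format)
    seed    -- index of the node the walker restarts from
    restart -- probability of jumping back to the seed at each step (assumed value)
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    scores = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(iters):
        nxt = [0.0] * n
        for i in range(n):
            if degree[i] == 0:
                # Isolated node: return its walking mass to the seed (a design choice).
                nxt[seed] += (1.0 - restart) * scores[i]
                continue
            for j in range(n):
                if adj[i][j]:
                    # Spread the walking mass of node i over its neighbours.
                    nxt[j] += (1.0 - restart) * scores[i] * adj[i][j] / degree[i]
        # The restarting fraction of all mass goes back to the seed.
        nxt[seed] += restart * sum(scores)
        scores = nxt
    return scores


# Path graph 0 - 1 - 2 - 3, walking from node 0.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path, seed=0)
```

Nodes that the walker visits often relative to the seed receive higher scores, which is why such scores are commonly read as a measure of structural relevance to the seed node.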

arXiv:2011.14344 [pdf, other]

Generative Pre-training for Paraphrase Generation by Representing and Predicting Spans in Exemplars

Authors: Tien-Cuong Bui, Van-Duc Le, Hai-Thien To, Sang Kyun Cha

Abstract: Paraphrase generation is a long-standing problem and serves an essential role in many natural language processing problems. Despite some encouraging results, recent methods either confront the problem of favoring generic utterances or need to retrain the model from scratch for each new dataset. This paper presents a novel approach to paraphrasing sentences, extended from the GPT-2 model. We develop a template masking technique, named first-order masking, to mask out irrelevant words in exemplars utilizing POS taggers. As a result, the paraphrasing task is changed to predicting spans in masked templates. Our proposed approach outperforms competitive baselines, especially in the semantic preservation aspect. To prevent the model from being biased towards a given template, we introduce a technique, referred to as second-order masking, which utilizes a Bernoulli distribution to control the visibility of the first-order-masked template's tokens. Moreover, this technique allows the model to provide various paraphrased sentences in testing by adjusting the second-order-masking level. For scale-up objectives, we compare the performance of two alternative template-selection methods, which shows that they are equivalent in preserving semantic information.

Submitted 29 November, 2020; originally announced November 2020.

Comments: 8 pages, 4 figures, Accepted to IEEE International Conference on Big Data and Smart Computing 2021
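
The abstract above describes second-order masking as using a Bernoulli distribution to control which tokens of an already masked template stay visible. A minimal sketch of one way to read that, assuming each still-visible token is hidden independently with probability `p_hide`; the `[MASK]` placeholder, the function name, and the parameter names are hypothetical and are not taken from the paper.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token


def second_order_mask(template, p_hide, rng=None):
    """Hide each still-visible template token with probability p_hide.

    template -- list of tokens, some of which may already be MASK
                (the result of a first-order masking step)
    p_hide   -- Bernoulli parameter; 0.0 keeps the template, 1.0 hides everything
    rng      -- optional random.Random instance for reproducibility
    """
    if rng is None:
        rng = random.Random()
    masked = []
    for token in template:
        # One Bernoulli draw per visible token; masked tokens stay masked.
        if token != MASK and rng.random() < p_hide:
            masked.append(MASK)
        else:
            masked.append(token)
    return masked


template = ["how", "can", "I", MASK, "my", MASK, "?"]
variant = second_order_mask(template, p_hide=0.3, rng=random.Random(0))
```

Under this reading, raising `p_hide` leaves the model less of the template to copy, which matches the abstract's remark that adjusting the masking level yields more varied paraphrases.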

arXiv:2011.10749 [pdf, other]

doi 10.1109/TSE.2022.3187689

Revisiting Binary Code Similarity Analysis using Interpretable Feature Engineering and Lessons Learned

Authors: Dongkwan Kim, Eunsoo Kim, Sang Kil Cha, Sooel Son, Yongdae Kim

Abstract: Binary code similarity analysis (BCSA) is widely used for diverse security applications, including plagiarism detection, software license violation detection, and vulnerability discovery. Despite the surging research interest in BCSA, it is significantly challenging to perform new research in this field for several reasons. First, most existing approaches focus only on the end results, namely, increasing the success rate of BCSA, by adopting uninterpretable machine learning. Moreover, they utilize their own benchmark, sharing neither the source code nor the entire dataset. Finally, researchers often use different terminologies or even use the same technique without citing the previous literature properly, which makes it difficult to reproduce or extend previous work. To address these problems, we take a step back from the mainstream and contemplate fundamental research questions for BCSA. Why does a certain technique or a certain feature show better results than the others? Specifically, we conduct the first systematic study on the basic features used in BCSA by leveraging interpretable feature engineering on a large-scale benchmark. Our study reveals various useful insights on BCSA. For example, we show that a simple interpretable model with a few basic features can achieve a comparable result to that of recent deep learning-based approaches. Furthermore, we show that the way we compile binaries or the correctness of underlying binary analysis tools can significantly affect the performance of BCSA. Lastly, we make all our source code and benchmark public and suggest future directions in this field to help further research.

Submitted 6 July, 2022; v1 submitted 21 November, 2020; originally announced November 2020.

Comments: 23 pages, accepted to IEEE Transactions on Software Engineering (June 2022)

arXiv:2009.11543 [pdf, other]

Compressed Key Sort and Fast Index Reconstruction

Authors: Yongsik Kwon, Cheol Ryu, Sang Kyun Cha, Arthur H. Lee, Kunsoo Park, Bongki Moon

Abstract: In this paper we propose an index key compression scheme based on the notion of distinction bits by proving that the distinction bits of index keys are sufficient information to determine the sorted order of the index keys correctly. While the actual compression ratio may vary depending on the characteristics of datasets (an average of 2.76 to one compression ratio was observed in our experiments), the index key compression scheme leads to significant performance improvements during the reconstruction of large-scale indexes. Our index key compression can be effectively used in database replication and index recovery of modern main-memory database systems.

Submitted 24 September, 2020; originally announced September 2020.

Comments: 26 pages and 13 figures
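
The abstract above claims that distinction bits alone determine the sorted order of index keys. The sketch below illustrates that claim on fixed-width integer keys, assuming a distinction bit is the most significant bit position at which two adjacent keys in sorted order differ; the function names and the integer key representation are illustrative assumptions, not the paper's data structures.

```python
def distinction_bit_positions(keys, width):
    """Collect the positions (0 = most significant bit) at which
    adjacent keys in sorted order first differ."""
    ordered = sorted(set(keys))
    positions = set()
    for a, b in zip(ordered, ordered[1:]):
        diff = a ^ b
        # bit_length() locates the highest set bit of the XOR,
        # i.e. the first bit at which a and b disagree.
        positions.add(width - diff.bit_length())
    return sorted(positions)


def compress(key, positions, width):
    """Keep only the bits of key found at the given positions."""
    out = 0
    for p in positions:
        out = (out << 1) | ((key >> (width - 1 - p)) & 1)
    return out


keys = [0b10110010, 0b10110111, 0b00010001, 0b11110000]
positions = distinction_bit_positions(keys, width=8)   # [0, 1, 5]
by_compressed = sorted(keys, key=lambda k: compress(k, positions, 8))
# by_compressed is the same order as sorted(keys): 8-bit keys shrink to 3 bits here.
```

The order is preserved because the first bit at which any two keys differ is also the first differing bit of some adjacent pair between them, so that position is always among the retained bits.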

arXiv:2007.03169 [pdf, other]

Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning

Authors: Dongsu Zhang, Junha Chun, Sang Kyun Cha, Young Min Kim

Abstract: We propose spatial semantic embedding network (SSEN), a simple, yet efficient algorithm for 3D instance segmentation using deep metric learning. The raw 3D reconstruction of an indoor environment suffers from occlusions and noise, and is produced without any meaningful distinction between individual entities. For high-level intelligent tasks from a large scale scene, 3D instance segmentation recognizes individual instances of objects. We approach the instance segmentation by simply learning the correct embedding space that maps individual instances of objects into distinct clusters that reflect both spatial and semantic information. Unlike previous approaches that require complex pre-processing or post-processing, our implementation is compact and fast with competitive performance, maintaining scalability on large scenes with high resolution voxels. We demonstrate the state-of-the-art performance of our algorithm in the ScanNet 3D instance segmentation benchmark on AP score.

Submitted 6 July, 2020; originally announced July 2020.

arXiv:2001.04107 [pdf, other]

Montage: A Neural Network Language Model-Guided JavaScript Engine Fuzzer

Authors: Suyoung Lee, HyungSeok Han, Sang Kil Cha, Sooel Son

Abstract: JavaScript (JS) engine vulnerabilities pose significant security threats affecting billions of web browsers. While fuzzing is a prevalent technique for finding such vulnerabilities, there have been few studies that leverage the recent advances in neural network language models (NNLMs). In this paper, we present Montage, the first NNLM-guided fuzzer for finding JS engine vulnerabilities. The key aspect of our technique is to transform a JS abstract syntax tree (AST) into a sequence of AST subtrees that can directly train prevailing NNLMs. We demonstrate that Montage is capable of generating valid JS tests, and show that it outperforms previous studies in terms of finding vulnerabilities. Montage found 37 real-world bugs, including three CVEs, in the latest JS engines, demonstrating its efficacy in finding JS engine bugs.

Submitted 14 January, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

Comments: 18 pages, accepted at USENIX Security '20

arXiv:1912.00649 [pdf, other]

An Attention-Based Speaker Naming Method for Online Adaptation in Non-Fixed Scenarios

Authors: Jungwoo Pyo, Joohyun Lee, Youngjune Park, Tien-Cuong Bui, Sang Kyun Cha

Abstract: A speaker naming task, which finds and identifies the active speaker in a certain movie or drama scene, is crucial for dealing with high-level video analysis applications such as automatic subtitle labeling and video summarization. Modern approaches have usually exploited biometric features with a gradient-based method instead of rule-based algorithms. In certain situations, however, a naive gradient-based method does not work efficiently. For example, when new characters are added to the target identification list, the neural network needs to be frequently retrained to identify new people, which causes delays in model preparation. In this paper, we present an attention-based method which reduces the model setup time by updating the newly added data via online adaptation without a gradient update process. We compare the attention-based method with existing gradient-based methods on three evaluation metrics (accuracy, memory usage, setup time) under various controlled settings of speaker naming. Also, we applied existing speaker naming models and the attention-based model to real video to prove that our approach shows comparable accuracy to the existing state-of-the-art models and even higher accuracy in some cases.

Submitted 2 December, 2019; originally announced December 2019.

Comments: AAAI 2020 Workshop on Interactive and Conversational Recommendation Systems (WICRS)

arXiv:1911.12919 [pdf, other]

Spatiotemporal deep learning model for citywide air pollution interpolation and prediction

Authors: Van-Duc Le, Tien-Cuong Bui, Sang Kyun Cha

Abstract: Air pollution has recently become one of the greatest concerns for big cities. Predicting air quality for any region and at any time is a critical requirement of urban citizens. However, air pollution prediction for a whole city is a challenging problem because many spatiotemporal factors affect air pollution throughout the city. Collecting as many of them as possible could help us forecast air pollution better. In this research, we present many spatiotemporal datasets collected over Seoul, Korea, a city that currently suffers heavily from air pollution. These datasets include air pollution data, meteorological data, traffic volume, average driving speed, and air pollution indexes of external areas which are known to impact Seoul's air pollution. To the best of our knowledge, traffic volume and average driving speed data are two new datasets in air pollution research. In addition, recent research in air pollution has tried to build models to interpolate and predict air pollution in the city. Nevertheless, these models mostly focused on predicting air quality in discrete locations or used hand-crafted spatial and temporal features. In this paper, we propose using the Convolutional Long Short-Term Memory (ConvLSTM) model, a combination of Convolutional Neural Networks and Long Short-Term Memory, which automatically manipulates both the spatial and temporal features of the data. Specifically, we introduce how to transform the air pollution data into sequences of images, which allows the ConvLSTM model to interpolate and predict air quality for the entire city at the same time. We prove that our approach is suitable for spatiotemporal air pollution problems and also outperforms other related research.

Submitted 28 November, 2019; originally announced November 2019.

Comments: Accepted at BigComp2020

arXiv:1907.00957 [pdf]

doi 10.1038/s41928-020-0385-0

Magnetic skyrmion artificial synapse for neuromorphic computing

Authors: Kyung Mee Song, Jae-Seung Jeong, Biao Pan, Xichao Zhang, Jing Xia, Sun Kyung Cha, Tae-Eon Park, Kwangsu Kim, Simone Finizio, Joerg Raabe, Joonyeon Chang, Yan Zhou, Weisheng Zhao, Wang Kang, Hyunsu Ju, Seonghoon Woo

Abstract: Since the experimental discovery of magnetic skyrmions achieved one decade ago, there have been significant efforts to bring the virtual particles into all-electrical fully functional devices, inspired by their fascinating physical and topological properties suitable for future low-power electronics. Here, we experimentally demonstrate such a device: an electrically operating skyrmion-based artificial synaptic device designed for neuromorphic computing. We show that controlled current-induced creation, motion, detection and deletion of skyrmions in ferrimagnetic multilayers can be harnessed in a single device at room temperature to imitate the behaviors of biological synapses. Using simulations, we demonstrate that such skyrmion-based synapses could be used to perform neuromorphic pattern-recognition computing using a handwritten recognition data set, reaching an accuracy of ~89 percent, comparable to the software-based training accuracy of ~94 percent. Chip-level simulation then highlights the potential of the skyrmion synapse compared to existing technologies. Our findings experimentally illustrate the basic concepts of skyrmion-based fully functional electronic devices while providing a new building block in the emerging field of spintronics-based bio-inspired computing.

Submitted 30 September, 2019; v1 submitted 1 July, 2019; originally announced July 2019.

Comments: 11 pages, 4 figures

Journal ref: Nature Electronics 3, 148 (2020)

arXiv:1812.00140 [pdf, ps, other]

The Art, Science, and Engineering of Fuzzing: A Survey

Authors: Valentin J. M. Manes, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J. Schwartz, Maverick Woo

Abstract: Among the many software vulnerability discovery techniques available today, fuzzing has remained highly popular due to its conceptual simplicity, its low barrier to deployment, and its vast amount of empirical evidence in discovering real-world software vulnerabilities. At a high level, fuzzing refers to a process of repeatedly running a program with generated inputs that may be syntactically or semantically malformed. While researchers and practitioners alike have invested a large and diverse effort towards improving fuzzing in recent years, this surge of work has also made it difficult to gain a comprehensive and coherent view of fuzzing. To help preserve and bring coherence to the vast literature of fuzzing, this paper presents a unified, general-purpose model of fuzzing together with a taxonomy of the current fuzzing literature. We methodically explore the design decisions at every stage of our model fuzzer by surveying the related literature and innovations in the art, science, and engineering that make modern-day fuzzers effective.

Submitted 7 April, 2019; v1 submitted 30 November, 2018; originally announced December 2018.

Comments: 29 pages, under submission to ACM Computing Surveys (July 2018) - 2018.12.10 update: correct minor mistakes in overview table - 2019.02.16 update: source clean - 2019.04.08: submission to TSE, 21 pages

arXiv:1805.00432 [pdf]

Real-time Air Pollution prediction model based on Spatiotemporal Big data

Authors: V. Duc Le, Sang Kyun Cha

Abstract: Air pollution is one of the greatest concerns for urban areas. Many countries have constructed monitoring stations to collect pollution values hourly. Recently, a research project in Daegu city, Korea, has carried out real-time air quality monitoring via sensors installed on taxis running across the whole city. The collected data is huge (1-second interval) and has both spatial and temporal components. In this paper, based on this spatiotemporal big data, we propose a real-time air pollution prediction model based on a Convolutional Neural Network (CNN) algorithm for the image-like spatial distribution of air pollution. For the temporal information in the data, we introduce a combination of a Long Short-Term Memory (LSTM) unit for time series data and a Neural Network model for other air pollution impact factors, such as weather conditions, to build a hybrid prediction model. This model is simple in architecture but still provides good prediction ability.

Submitted 9 August, 2018; v1 submitted 5 April, 2018; originally announced May 2018.

Comments: 6 pages

Journal ref: The International Conference on Big data, IoT, and Cloud Computing (BIC 2018)

Showing 1–12 of 12 results for author: Cha, S K