-
Prime+Retouch: When Cache is Locked and Leaked
Authors:
Jaehyuk Lee,
Fan Sang,
Taesoo Kim
Abstract:
Caches on modern commodity CPUs have become one of the major sources of side-channel leakage and have been abused as a new attack vector. To thwart cache-based side-channel attacks, two types of countermeasures have been proposed: detection-based ones that limit the amount of microarchitectural traces an attacker can leave, and cache prefetching-and-locking techniques that claim to prevent such leakage by disallowing evictions of sensitive data. In this paper, we present the Prime+Retouch attack, which completely bypasses these defense schemes by accurately inferring cache activity from the metadata of the cache replacement policy. Prime+Retouch has three notable properties: 1) it incurs no eviction of the victim's data, allowing us to bypass the two known mitigation schemes, 2) it requires minimal synchronization of only one memory access to the attacker's pre-primed cache lines, and 3) it leaks data through non-shared memory, because the underlying eviction metadata is shared.
We demonstrate Prime+Retouch on two architectures: the predominant Intel x86 and the emerging Apple M1. We show how Prime+Retouch can break the T-table implementation of AES protected by robust cache side-channel mitigations such as Cloak, under both normal and SGX-protected environments. We also demonstrate the feasibility of the Prime+Retouch attack on the M1 platform, which imposes more restrictions: precise measurement tools such as the core clock cycle timer and performance counters are inaccessible to the attacker. Furthermore, we are the first to demystify the undisclosed cache architecture and eviction policy of the L1 data cache on the Apple M1 architecture. We also devise a user-space, noise-free cache monitoring tool by repurposing Intel TSX.
Submitted 23 February, 2024;
originally announced February 2024.
-
Graphene: Infrastructure Security Posture Analysis with AI-generated Attack Graphs
Authors:
Xin Jin,
Charalampos Katsis,
Fan Sang,
Jiahao Sun,
Elisa Bertino,
Ramana Rao Kompella,
Ashish Kundu
Abstract:
The rampant occurrence of cybersecurity breaches imposes substantial limitations on the progress of network infrastructures, leading to compromised data, financial losses, potential harm to individuals, and disruptions in essential services. The current security landscape demands the urgent development of a holistic security assessment solution that encompasses vulnerability analysis and investigates the potential exploitation of these vulnerabilities as attack paths. In this paper, we propose Graphene, an advanced system designed to provide a detailed analysis of the security posture of computing infrastructures. Using user-provided information, such as device details and software versions, Graphene performs a comprehensive security assessment. This assessment includes identifying associated vulnerabilities and constructing potential attack graphs that adversaries can exploit. Furthermore, Graphene evaluates the exploitability of these attack paths and quantifies the overall security posture through a scoring mechanism. The system takes a holistic approach by analyzing security layers encompassing hardware, system, network, and cryptography. Furthermore, Graphene delves into the interconnections between these layers, exploring how vulnerabilities in one layer can be leveraged to exploit vulnerabilities in others. In this paper, we present the end-to-end pipeline implemented in Graphene, showcasing the systematic approach adopted for conducting this thorough security analysis.
Submitted 30 April, 2024; v1 submitted 20 December, 2023;
originally announced December 2023.
-
Edge Security: Challenges and Issues
Authors:
Xin Jin,
Charalampos Katsis,
Fan Sang,
Jiahao Sun,
Ashish Kundu,
Ramana Kompella
Abstract:
Edge computing is a paradigm that shifts data processing services to the network edge, where data are generated. While such an architecture provides faster processing and response, among other benefits, it also raises critical security issues and challenges that must be addressed. This paper discusses the security threats and vulnerabilities emerging from the edge network architecture spanning from the hardware layer to the system layer. We further discuss privacy and regulatory compliance challenges in such networks. Finally, we argue the need for a holistic approach to analyze edge network security posture, which must consider knowledge from each layer.
Submitted 14 June, 2022;
originally announced June 2022.
-
P2FaaS: Toward Privacy-Preserving Fuzzing as a Service
Authors:
Fan Sang,
Daehee Jang,
Ming-Wei Shih,
Taesoo Kim
Abstract:
Global corporations (e.g., Google and Microsoft) have recently introduced a new model of cloud services, fuzzing-as-a-service (FaaS). Despite effectively alleviating the cost of fuzzing, the model comes with privacy concerns. For example, the end user has to trust both the cloud and service providers, who have access to the application to be fuzzed. Such concerns arise because the platform is under the control of its provider and because the application and the fuzzer are highly coupled. In this paper, we propose P2FaaS, a new ecosystem that preserves the end user's privacy while providing FaaS in the cloud. The key idea of P2FaaS is to use Intel SGX to prevent cloud and service providers from learning information about the application. Our preliminary evaluation shows that P2FaaS imposes a 45% runtime overhead on fuzzing compared to the baseline. In addition, P2FaaS demonstrates that, with the recently introduced Intel SGX Card hardware, the fuzzing service can be scaled up to multiple servers without native SGX support.
Submitted 24 September, 2019;
originally announced September 2019.
-
Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition
Authors:
Erik F. Tjong Kim Sang,
Fien De Meulder
Abstract:
We describe the CoNLL-2003 shared task: language-independent named entity recognition. We give background information on the data sets (English and German) and the evaluation method, present a general overview of the systems that have taken part in the task and discuss their performance.
Submitted 12 June, 2003;
originally announced June 2003.
-
Introduction to the CoNLL-2002 Shared Task: Language-Independent Named Entity Recognition
Authors:
Erik F. Tjong Kim Sang
Abstract:
We describe the CoNLL-2002 shared task: language-independent named entity recognition. We give background information on the data sets and the evaluation method, present a general overview of the systems that have taken part in the task and discuss their performance.
Submitted 5 September, 2002;
originally announced September 2002.
-
Memory-Based Shallow Parsing
Authors:
Erik F. Tjong Kim Sang
Abstract:
We present memory-based learning approaches to shallow parsing and apply these to five tasks: base noun phrase identification, arbitrary base phrase recognition, clause detection, noun phrase parsing and full parsing. We use feature selection techniques and system combination methods for improving the performance of the memory-based learner. Our approach is evaluated on standard data sets and the results are compared with those of other systems. This reveals that our approach works well for base phrase identification while its application towards recognizing embedded structures leaves some room for improvement.
Submitted 24 April, 2002;
originally announced April 2002.
-
Combining a self-organising map with memory-based learning
Authors:
James Hammerton,
Erik F. Tjong Kim Sang
Abstract:
Memory-based learning (MBL) has enjoyed considerable success in corpus-based natural language processing (NLP) tasks and is thus a reliable method of getting a high level of performance when building corpus-based NLP systems. However, there is a bottleneck in MBL whereby any novel testing item has to be compared against all the training items in the memory base. For this reason there has been some interest in various forms of memory editing, whereby some method of selecting a subset of the memory base is employed to reduce the number of comparisons. This paper investigates the use of a modified self-organising map (SOM) to select a subset of the memory items for comparison. This method involves reducing the number of comparisons to a value proportional to the square root of the number of training items. The method is tested on the identification of base noun phrases in the Wall Street Journal corpus, using sections 15 to 18 for training and section 20 for testing.
Submitted 15 July, 2001;
originally announced July 2001.
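The square-root claim in the abstract above can be illustrated with a little arithmetic. The following is only a sketch under a loudly stated assumption: the training items are spread evenly over the map units, so a query costs one comparison per unit plus one per item stored on the winning unit. The function names and the even-spread assumption are illustrative and are not taken from the paper.

```python
import math


def comparisons_with_units(n_items: int, n_units: int) -> int:
    """Comparisons per query: one per map unit, plus one per item on the best unit.

    Assumes (hypothetically) that items are spread evenly over the units.
    """
    if n_items < 1 or n_units < 1:
        raise ValueError("n_items and n_units must be positive")
    return n_units + math.ceil(n_items / n_units)


def balanced_unit_count(n_items: int) -> int:
    """Unit count near sqrt(n_items), which balances the two terms above."""
    return max(1, math.isqrt(n_items))


n = 10_000
k = balanced_unit_count(n)
# Full memory-based lookup compares against all n items; the two-stage lookup
# needs roughly 2 * sqrt(n) comparisons under the even-spread assumption.
print(n, comparisons_with_units(n, k))
```

With 10,000 items and 100 units, the two-stage lookup needs 200 comparisons instead of 10,000. A real map will not fill its units evenly, so actual counts would vary.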
-
Learning Computational Grammars
Authors:
John Nerbonne,
Anja Belz,
Nicola Cancedda,
Herve Dejean,
James Hammerton,
Rob Koeling,
Stasinos Konstantopoulos,
Miles Osborne,
Franck Thollard,
Erik F. Tjong Kim Sang
Abstract:
This paper reports on the "Learning Computational Grammars" (LCG) project, a postdoc network devoted to studying the application of machine learning techniques to grammars suitable for computational use. We were interested in a more systematic survey to understand the relevance of many factors to the success of learning, esp. the availability of annotated data, the kind of dependencies in the data, and the availability of knowledge bases (grammars). We focused on syntax, esp. noun phrase (NP) syntax.
Submitted 15 July, 2001;
originally announced July 2001.
-
Introduction to the CoNLL-2001 Shared Task: Clause Identification
Authors:
Erik F. Tjong Kim Sang,
Herve Dejean
Abstract:
We describe the CoNLL-2001 shared task: dividing text into clauses. We give background information on the data sets, present a general overview of the systems that have taken part in the shared task and briefly discuss their performance.
Submitted 15 July, 2001;
originally announced July 2001.
-
Introduction to the CoNLL-2000 Shared Task: Chunking
Authors:
Erik F. Tjong Kim Sang,
Sabine Buchholz
Abstract:
We describe the CoNLL-2000 shared task: dividing text into syntactically related non-overlapping groups of words, so-called text chunking. We give background information on the data sets, present a general overview of the systems that have taken part in the shared task and briefly discuss their performance.
Submitted 18 September, 2000;
originally announced September 2000.
-
Applying System Combination to Base Noun Phrase Identification
Authors:
Erik F. Tjong Kim Sang,
Walter Daelemans,
Herve Dejean,
Rob Koeling,
Yuval Krymolowski,
Vasin Punyakanok,
Dan Roth
Abstract:
We use seven machine learning algorithms for one task: identifying base noun phrases. The results have been processed by different system combination methods and all of these outperformed the best individual result. We have applied the seven learners with the best combinator, a majority vote of the top five systems, to a standard data set and managed to improve the best published result for this data set.
Submitted 17 August, 2000;
originally announced August 2000.
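The majority-vote combination mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each system emits one chunk tag per token, and the tag names, the function name `majority_vote`, and the tie-breaking rule (prefer the earliest-listed system) are all hypothetical choices.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-token tags from several systems by majority vote.

    `predictions` holds one tag sequence per system, all of equal length.
    Ties go to the tag proposed by the earliest-listed system (an assumption).
    """
    if not predictions:
        raise ValueError("need at least one system output")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all systems must tag the same number of tokens")

    voted = []
    for tags in zip(*predictions):  # tags proposed for one token
        counts = Counter(tags)
        best = max(counts.values())
        # First tag, in system order, that reaches the highest count.
        voted.append(next(t for t in tags if counts[t] == best))
    return voted


systems = [
    ["B", "I", "O"],
    ["B", "O", "O"],
    ["O", "I", "O"],
]
print(majority_vote(systems))  # ['B', 'I', 'O']
```

Voting token by token can produce tag sequences that no single system proposed, so a practical combinator may need a consistency check afterwards.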
-
Noun Phrase Recognition by System Combination
Authors:
Erik F. Tjong Kim Sang
Abstract:
The performance of machine learning algorithms can be improved by combining the output of different systems. In this paper we apply this idea to the recognition of noun phrases. We generate different classifiers by using different representations of the data. By combining the results with voting techniques described in (Van Halteren et al. 1998) we manage to improve the best reported performances on standard data sets for base noun phrases and arbitrary noun phrases.
Submitted 10 May, 2000;
originally announced May 2000.
-
Representing Text Chunks
Authors:
Erik F. Tjong Kim Sang,
Jorn Veenstra
Abstract:
Dividing sentences into chunks of words is a useful preprocessing step for parsing, information extraction and information retrieval. Ramshaw and Marcus (1995) introduced a "convenient" data representation for chunking by converting it to a tagging task. In this paper we will examine seven different data representations for the problem of recognizing noun phrase chunks. We will show that the data representation choice has a minor influence on chunking performance. However, equipped with the most suitable data representation, our memory-based learning chunker was able to improve the best published chunking results for a standard data set.
Submitted 6 July, 1999;
originally announced July 1999.
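To make "converting chunking to a tagging task" concrete, here is a small sketch of one widely used encoding, in which the first token of each chunk is tagged `B`, later chunk tokens `I`, and everything else `O`. The abstract does not name its seven representations, so treating this particular scheme as one of them is an assumption; the function names and the example spans are hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def spans_to_tags(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping chunk spans as one B/I/O tag per token."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B"            # chunk-initial token
        for i in range(start + 1, end):
            tags[i] = "I"            # chunk-internal tokens
    return tags


def tags_to_spans(tags: List[str]) -> List[Span]:
    """Decode B/I/O tags back into chunk spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B" or tag == "O":
            if start is not None:    # close the chunk that was open
                spans.append((start, i))
                start = None
            if tag == "B":
                start = i
        elif tag == "I" and start is None:
            start = i                # tolerate an I without a preceding B
    if start is not None:
        spans.append((start, len(tags)))
    return spans


tokens = ["He", "reckons", "the", "current", "account", "deficit"]
chunks = [(0, 1), (2, 6)]            # hypothetical noun phrase spans
tags = spans_to_tags(len(tokens), chunks)
print(list(zip(tokens, tags)))
```

Once chunks are encoded this way, any per-token classifier can be trained on the tags, and decoding the predicted tags recovers the chunk boundaries.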