Exploiting the Layered Intrinsic Dimensionality of Deep Models for Practical Adversarial Training

Authors: Enes Altinisik, Safa Messaoud, Husrev Taha Sencar, Hassan Sajjad, Sanjay Chawla

Abstract: Despite being a heavily researched topic, Adversarial Training (AT) is rarely, if ever, deployed in practical AI systems for two primary reasons: (i) the gained robustness is frequently accompanied by a drop in generalization and (ii) generating adversarial examples (AEs) is computationally prohibitively expensive. To address these limitations, we propose SMAAT, a new AT algorithm that leverages the manifold conjecture, stating that off-manifold AEs lead to better robustness while on-manifold AEs result in better generalization. Specifically, SMAAT aims at generating a higher proportion of off-manifold AEs by perturbing the intermediate deepnet layer with the lowest intrinsic dimension. This systematically results in better scalability compared to classical AT as it reduces the PGD chain length required for generating the AEs. Additionally, our study provides, to the best of our knowledge, the first explanation for the difference in the generalization and robustness trends between vision and language models, i.e., AT results in a drop in generalization in vision models whereas, in encoder-based language models, generalization either improves or remains unchanged. We show that vision transformers and decoder-based models tend to have low intrinsic dimensionality in the earlier layers of the network (more off-manifold AEs), while encoder-based models have low intrinsic dimensionality in the later layers.
We demonstrate the efficacy of SMAAT on several tasks, including robustifying (i) sentiment classifiers, (ii) safety filters in decoder-based models, and (iii) retrievers in RAG setups. SMAAT requires only 25-33% of the GPU time compared to standard AT, while significantly improving robustness across all applications and maintaining comparable generalization.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2403.17068 [pdf, other]

Semantic Ranking for Automated Adversarial Technique Annotation in Security Text

Authors: Udesh Kumarasinghe, Ahmed Lekssays, Husrev Taha Sencar, Sabri Boughorbel, Charitha Elvitigala, Preslav Nakov

Abstract: We introduce a new method for extracting structured threat behaviors from threat intelligence text. Our method is based on a multi-stage ranking architecture that allows jointly optimizing for efficiency and effectiveness. Therefore, we believe this problem formulation better aligns with the real-world nature of the task considering the large number of adversary techniques and the extensive body of threat intelligence created by security analysts. Our findings show that the proposed system yields state-of-the-art performance results for this task. Results show that our method has a top-3 recall performance of 81% in identifying the relevant technique among 193 top-level techniques. Our tests also demonstrate that our system performs significantly better (+40%) than the widely used large language models when tested under a zero-shot setting.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2309.03647 [pdf, other]

ProvG-Searcher: A Graph Representation Learning Approach for Efficient Provenance Graph Search

Authors: Enes Altinisik, Fatih Deniz, Husrev Taha Sencar

Abstract: We present ProvG-Searcher, a novel approach for detecting known APT behaviors within system security logs. Our approach leverages provenance graphs, a comprehensive graph representation of event logs, to capture and depict data provenance relations by mapping system entities as nodes and their interactions as edges. We formulate the task of searching provenance graphs as a subgraph matching problem and employ a graph representation learning method. The central component of our search methodology involves embedding of subgraphs in a vector space where subgraph relationships can be directly evaluated. We achieve this through the use of order embeddings that simplify subgraph matching to straightforward comparisons between a query and precomputed subgraph representations. To address challenges posed by the size and complexity of provenance graphs, we propose a graph partitioning scheme and a behavior-preserving graph reduction method. Overall, our technique offers significant computational efficiency, allowing most of the search computation to be performed offline while incorporating a lightweight comparison step during query execution. Experimental results on standard datasets demonstrate that ProvG-Searcher achieves superior performance, with an accuracy exceeding 99% in detecting query behaviors and a false positive rate of approximately 0.02%, outperforming other approaches.

Submitted 19 December, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

arXiv:2211.16316 [pdf, other]

A3T: Accuracy Aware Adversarial Training

Authors: Enes Altinisik, Safa Messaoud, Husrev Taha Sencar, Sanjay Chawla

Abstract: Adversarial training has been empirically shown to be more prone to overfitting than standard training. The exact underlying reasons still need to be fully understood. In this paper, we identify one cause of overfitting related to current practices of generating adversarial samples from misclassified samples. To address this, we propose an alternative approach that leverages the misclassified samples to mitigate the overfitting problem. We show that our approach achieves better generalization while having comparable robustness to state-of-the-art adversarial training methods on a wide range of computer vision, natural language processing, and tabular tasks.

Submitted 29 November, 2022; originally announced November 2022.

arXiv:2211.05533 [pdf, other]

GREENER: Graph Neural Networks for News Media Profiling

Authors: Panayot Panayotov, Utsav Shukla, Husrev Taha Sencar, Mohamed Nabeel, Preslav Nakov

Abstract: We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias. This is an important but under-studied problem related to disinformation and "fake news" detection, but it addresses the issue at a coarser granularity compared to looking at an individual article or an individual claim. This is useful as it allows profiling entire media outlets in advance. Unlike previous work, which has focused primarily on text (e.g., on the text of the articles published by the target website, or on the textual description in their social media profiles or in Wikipedia), here our main focus is on modeling the similarity between media outlets based on the overlap of their audience. This is motivated by homophily considerations, i.e., the tendency of people to have connections to people with similar interests, which we extend to media, hypothesizing that similar types of media would be read by similar kinds of users. In particular, we propose GREENER (GRaph nEural nEtwork for News mEdia pRofiling), a model that builds a graph of inter-media connections based on their audience overlap, and then uses graph neural networks to represent each medium. We find that such representations are quite useful for predicting the factuality and the bias of news media outlets, yielding improvements over state-of-the-art results reported on two datasets.
When augmented with conventionally used representations obtained from news articles, Twitter, YouTube, Facebook, and Wikipedia, prediction accuracy is found to improve by 2.5-27 macro-F1 points for the two tasks.

Submitted 10 November, 2022; originally announced November 2022.

arXiv:2211.05523 [pdf, other]

Impact of Adversarial Training on Robustness and Generalizability of Language Models

Authors: Enes Altinisik, Hassan Sajjad, Husrev Taha Sencar, Safa Messaoud, Sanjay Chawla

Abstract: Adversarial training is widely acknowledged as the most effective defense against adversarial attacks. However, it is also well established that achieving both robustness and generalization in adversarially trained models involves a trade-off. The goal of this work is to provide an in-depth comparison of different approaches for adversarial training in language models. Specifically, we study the effect of pre-training data augmentation as well as training-time input perturbations vs. embedding space perturbations on the robustness and generalization of transformer-based language models. Our findings suggest that better robustness can be achieved by pre-training data augmentation or by training with input space perturbation. However, training with embedding space perturbation significantly improves generalization. A linguistic correlation analysis of neurons of the learned models reveals that the improved generalization is due to 'more specialized' neurons. To the best of our knowledge, this is the first work to carry out a deep qualitative analysis of different methods of generating adversarial examples in adversarial training of language models.

Submitted 10 December, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

arXiv:2210.01797 [pdf, other]

Ten Years after ImageNet: A 360° Perspective on AI

Authors: Sanjay Chawla, Preslav Nakov, Ahmed Ali, Wendy Hall, Issa Khalil, Xiaosong Ma, Husrev Taha Sencar, Ingmar Weber, Michael Wooldridge, Ting Yu

Abstract: It is ten years since neural networks made their spectacular comeback. Prompted by this anniversary, we take a holistic perspective on Artificial Intelligence (AI). Supervised Learning for cognitive tasks is effectively solved - provided we have enough high-quality labeled data. However, deep neural network models are not easily interpretable, and thus the debate between blackbox and whitebox modeling has come to the fore. The rise of attention networks, self-supervised learning, generative modeling, and graph neural networks has widened the application space of AI. Deep Learning has also propelled the return of reinforcement learning as a core building block of autonomous decision making systems. The possible harms made possible by new AI technologies have raised socio-technical issues such as transparency, fairness, and accountability. The dominance of AI by Big-Tech who control talent, computing resources, and most importantly, data may lead to an extreme AI divide. Failure to meet high expectations in high-profile and much-heralded flagship projects like self-driving vehicles could trigger another AI winter.

Submitted 30 September, 2022; originally announced October 2022.

arXiv:2206.05679 [pdf, other]

Exploration of Enterprise Server Data to Assess Ease of Modeling System Behavior

Authors: Enes Altinisik, Husrev Taha Sencar, Mohamed Nabeel, Issa Khalil, Ting Yu

Abstract: Enterprise networks are one of the major targets for cyber attacks due to the vast amount of sensitive and valuable data they contain. A common approach to detecting attacks in the enterprise environment relies on modeling the behavior of users and systems to identify unexpected deviations. The feasibility of this approach crucially depends on how well attack-related events can be isolated from benign and mundane system activities. Despite the significant focus on end-user systems, the background behavior of servers running critical services for the enterprise is less studied. To guide the design of detection methods tailored for servers, in this work, we examine system event records from 46 servers in a large enterprise obtained over a duration of ten weeks. We analyze the rareness characteristics and the similarity of the provenance relations in the event log data. Our findings show that server activity, in general, is highly variant over time and dissimilar across different types of servers. However, careful consideration of the profiling window of historical events and service-level grouping of servers improves rareness measurements by 24.5%. Further, utilizing better contextual representations, the similarity in provenance relationships could be improved. An important implication of our findings is that detection techniques developed considering experimental setups with non-representative characteristics may perform poorly in practice.

Submitted 12 June, 2022; originally announced June 2022.

arXiv:2201.02949 [pdf, other]

doi 10.1109/TIFS.2022.3204210

Video Source Characterization Using Encoding and Encapsulation Characteristics

Authors: Enes Altinisik, Husrev Taha Sencar, Diram Tabaa

Abstract: We introduce a new method for camera-model identification. Our approach combines two independent aspects of video file generation corresponding to video coding and media data encapsulation. To this end, a joint representation of the overall file metadata is developed and used in conjunction with a two-level hierarchical classification method. At the first level, our method groups videos into metaclasses considering several abstractions that represent high-level structural properties of file metadata. This is followed by a more nuanced classification of classes that comprise each metaclass. The method is evaluated on more than 20K videos obtained by combining four public video datasets. Tests show that a balanced accuracy of 91% is achieved in correctly identifying the class of a video among 119 video classes. This corresponds to an improvement of 6.5% over the conventional approach based on video file encapsulation characteristics. Furthermore, we investigate a setting relevant to forensic file recovery operations where file metadata cannot be located or are missing but video data is partially available. By estimating a partial list of encoding parameters from coded video data, we demonstrate that an identification accuracy of 57% can be achieved in camera-model identification in the absence of any other file metadata.

Submitted 28 August, 2022; v1 submitted 9 January, 2022; originally announced January 2022.

arXiv:2104.14522 [pdf, other]

doi 10.1109/TIFS.2021.3118876

Automatic Generation of H.264 Parameter Sets to Recover Video File Fragments

Authors: Enes Altinisik, Hüsrev Taha Sencar

Abstract: We address the problem of decoding video file fragments when the necessary encoding parameters are missing. With this objective, we propose a method that automatically generates H.264 video headers containing these parameters and extracts coded pictures in the partially available compressed video data. To accomplish this, we examined a very large corpus of videos to learn patterns of encoding settings commonly used by encoders and created a parameter dictionary. Further, to facilitate a more efficient search, our method identifies characteristics of a coded bitstream to discriminate the entropy coding mode. It also utilizes the application logs created by the decoder to identify correct parameter values. Evaluation of the effectiveness of the proposed method on more than 55K videos with diverse provenance shows that it can generate valid headers on average in 11.3 decoding trials per video. This result represents an improvement by more than a factor of 10 over the conventional approach of video header stitching to recover video file fragments.

Submitted 13 September, 2021; v1 submitted 29 April, 2021; originally announced April 2021.

arXiv:2103.16235 [pdf, other]

BLEKeeper: Response Time Behavior Based Man-In-The-Middle Attack Detection

Authors: Muhammed Ali Yurdagul, Husrev Taha Sencar

Abstract: Bluetooth Low Energy (BLE) has become one of the most popular wireless communication protocols and is used in billions of smart devices. Despite several security features, the hardware and software limitations of these devices make them vulnerable to man-in-the-middle (MITM) attacks. Due to the use of these devices in increasingly diverse and safety-critical applications, the capability to detect MITM attacks has become more critical. To address this challenge, we propose the use of the response time behavior of a BLE device observed in relation to select read and write operations and introduce an active MITM attack detection system that identifies changes in response time. Our measurements on several BLE devices show that their response time behavior exhibits very high regularity, making it a very reliable attack indicator that cannot be concealed by an attacker. Test results show that our system can very accurately and quickly detect MITM attacks while requiring a simple learning approach.

Submitted 30 March, 2021; originally announced March 2021.

arXiv:2103.12506 [pdf, ps, other]

A Survey on Predicting the Factuality and the Bias of News Media

Authors: Preslav Nakov, Husrev Taha Sencar, Jisun An, Haewoon Kwak

Abstract: The present level of proliferation of fake, biased, and propagandistic content online has made it impossible to fact-check every single suspicious claim or article, either manually or automatically. Thus, many researchers are shifting their attention to higher granularity, aiming to profile entire news outlets, which makes it possible to detect likely "fake news" the moment it is published, by simply checking the reliability of its source. Source factuality is also an important element of systems for automatic fact-checking and "fake news" detection, as they need to assess the reliability of the evidence they retrieve online. Political bias detection, which in the Western political landscape is about predicting left-center-right bias, is an equally important topic, which has experienced a similar shift towards profiling entire news outlets. Moreover, there is a clear connection between the two, as highly biased media are less likely to be factual; yet, the two problems have been addressed separately. In this survey, we review the state of the art on media profiling for factuality and bias, arguing for the need to model them jointly. We further discuss interesting recent advances in using different information sources and modalities, which go beyond the text of the articles the target news outlet has published. Finally, we discuss current challenges and outline future research directions.

Submitted 16 March, 2021; originally announced March 2021.

Comments: factuality of reporting, fact-checking, political ideology, media bias, disinformation, propaganda, social media, news media

MSC Class: 68T50; ACM Class: I.2.7

arXiv:2008.11985 [pdf, ps, other]

Estimating Uniqueness of I-Vector Representation of Human Voice

Authors: Erkam Sinan Tandogan, Husrev Taha Sencar

Abstract: We study the individuality of the human voice with respect to a widely used feature representation of speech utterances, namely, the i-vector model. As a first step toward this goal, we compare and contrast uniqueness measures proposed for different biometric modalities. Then, we introduce a new uniqueness measure that evaluates the entropy of i-vectors while taking into account speaker-level variations. Our measure operates in the discrete feature space and relies on accurate estimation of the distribution of i-vectors. Therefore, i-vectors are quantized while ensuring that both the quantized and original representations yield similar speaker verification performance. Uniqueness estimates are obtained from two newly generated datasets and the public VoxCeleb dataset. The first custom dataset contains more than one and a half million speech samples of 20,741 speakers obtained from TEDx Talks videos. The second one includes over twenty-one thousand speech samples from 1,595 actors that are extracted from movie dialogues. Using this data, we analyzed how several factors, such as the number of speakers, number of samples per speaker, sample durations, and diversity of utterances affect uniqueness estimates. Most notably, we determine that the discretization of i-vectors does not cause a reduction in speaker recognition performance.
Our results show that the degree of distinctiveness offered by i-vector-based representation may reach 43-70 bits considering 5-second long speech samples; however, under less constrained variations in speech, uniqueness estimates are found to reduce by around 30 bits. We also find that doubling the sample duration increases the distinctiveness of the i-vector representation by around 20 bits.

Submitted 3 March, 2021; v1 submitted 27 August, 2020; originally announced August 2020.

Comments: 13 pages

arXiv:2008.08138 [pdf, other]

doi 10.2352/ISSN.2470-1173.2021.4.MWSF-338

PRNU Estimation from Encoded Videos Using Block-Based Weighting

Authors: Enes Altinisik, Kasim Tasdemir, Husrev Taha Sencar

Abstract: Estimating the photo-response non-uniformity (PRNU) of an imaging sensor from videos is a challenging task due to complications created by several processing steps in the camera imaging pipeline. Among these steps, video coding is one of the most disruptive to PRNU estimation because of its lossy nature. Since videos are always stored in a compressed format, the ability to cope with the disruptive effects of encoding is central to reliable attribution. In this work, by focusing on the block-based operation of widely used video coding standards, we present an improved approach to PRNU estimation that exploits this behavior. To this purpose, several PRNU weighting schemes that utilize block-level parameters, such as encoding block type, quantization strength, and rate-distortion value, are proposed and compared. Our results show that the use of the coding rate of a block serves as a better estimator for the strength of PRNU with almost three times improvement in the matching statistic at low to medium coding bitrates as compared to the basic estimation method developed for photos.

Submitted 28 January, 2021; v1 submitted 18 August, 2020; originally announced August 2020.

arXiv:1912.05018 [pdf, other]

doi 10.1109/TIFS.2020.3016830

Source Camera Verification from Strongly Stabilized Videos

Authors: Enes Altinisik, Husrev Taha Sencar

Abstract: Image stabilization performed during imaging and/or post-processing poses one of the most significant challenges to photo-response non-uniformity based source camera attribution from videos. When performed digitally, stabilization involves cropping, warping, and inpainting of video frames to eliminate unwanted camera motion. Hence, successful attribution requires the inversion of these transformations in a blind manner. To address this challenge, we introduce a source camera verification method for videos that takes into account the spatially variant nature of stabilization transformations and assumes a larger degree of freedom in their search. Our method identifies transformations at a sub-frame level, incorporates a number of constraints to validate their correctness, and offers computational flexibility in the search for the correct transformation. The method also adopts a holistic approach in countering disruptive effects of other video generation steps, such as video coding and downsizing, for more reliable attribution. Tests performed on one public and two custom datasets show that the proposed method is able to verify the source of 23-30% of all videos that underwent stronger stabilization, depending on computation load, without a significant impact on false attribution.

Submitted 22 July, 2020; v1 submitted 26 November, 2019; originally announced December 2019.

arXiv:1905.09611 [pdf, other]

doi 10.1109/TIFS.2019.2945190

Mitigation of H.264 and H.265 Video Compression for Reliable PRNU Estimation

Authors: Enes Altınışık, Kasım Taşdemir, Hüsrev Taha Sencar

Abstract: The photo-response non-uniformity (PRNU) is a distinctive image sensor characteristic, and an imaging device inadvertently introduces its sensor's PRNU into all media it captures. Therefore, the PRNU can be regarded as a camera fingerprint and used for source attribution. The imaging pipeline in a camera, however, involves various processing steps that are detrimental to PRNU estimation. In the context of photographic images, these challenges are successfully addressed and the method for estimating a sensor's PRNU pattern is well established. However, various additional challenges related to generation of videos remain largely untackled. With this perspective, this work introduces methods to mitigate disruptive effects of widely deployed H.264 and H.265 video compression standards on PRNU estimation. Our approach involves an intervention in the decoding process to eliminate a filtering procedure applied at the decoder to reduce blockiness. It also utilizes decoding parameters to develop a weighting scheme and adjust the contribution of video frames at the macroblock level to PRNU estimation process. Results obtained on videos captured by 28 cameras show that our approach increases the PRNU matching metric up to more than five times over the conventional estimation method tailored for photos.

Submitted 23 May, 2019; originally announced May 2019.

Showing 1–16 of 16 results for author: Sencar, H T