Search | arXiv e-print repository

Enhancing Energy Sector Resilience: Integrating Security by Design Principles

Authors: Dov Shirtz, Inna Koberman, Aviad Elyashar, Rami Puzis, Yuval Elovici

Abstract: Security by design (SbD) is a concept for developing and maintaining systems that are, to the greatest extent possible, free from security vulnerabilities and impervious to security attacks. In addition to technical aspects, such as how to develop robust industrial control system (ICS) hardware, software, communication products, etc., SbD also includes soft aspects, such as organizational managerial attitude and behavior, and employee awareness. Under the SbD concept, systems (ICS in our context) will be considered more trustworthy by users. Users' trust in the systems will be derived from meticulous adherence to the SbD processes and policies. In accordance with the SbD concept, security is considered, and security measures are implemented, at every stage of the product and systems development life cycle, rather than afterwards. This document presents the security requirements for the implementation of SbD in industrial control systems. The information presented does not negate any existing security and cyber security standards, etc. Instead, we strongly recommend that organizations implement and comply with those standards and best practices. Security by design is not a one-time process. It starts at the very beginning of the product or system design and continues throughout its lifecycle. Due to the benefits of SbD, namely a higher level of security and robustness to cyber attacks, all organizations associated with the energy sector should strive to establish an ecosystem.
The requirements presented in this document may be perceived as burdensome by organizations. However, strict compliance with the requirements and existing security standards and best practices, including continuous monitoring, as specified in this document, is essential to realize an ecosystem driven and protected by SbD.

Submitted 18 February, 2024; originally announced February 2024.

Comments: 66 pages, 2 figures

ACM Class: K.6.5

arXiv:2305.06786 [pdf, other]

ReMark: Receptive Field based Spatial WaterMark Embedding Optimization using Deep Network

Authors: Natan Semyonov, Rami Puzis, Asaf Shabtai, Gilad Katz

Abstract: Watermarking is one of the most important copyright protection tools for digital media. The most challenging type of watermarking is the imperceptible one, which embeds identifying information in the data while retaining the latter's original quality. To fulfill its purpose, watermarks need to withstand various distortions whose goal is to damage their integrity. In this study, we investigate a novel deep learning-based architecture for embedding imperceptible watermarks. The key insight guiding our architecture design is the need to correlate the dimensions of our watermarks with the sizes of receptive fields (RF) of modules of our architecture. This adaptation makes our watermarks more robust, while also enabling us to generate them in a way that better maintains image quality. Extensive evaluations on a wide variety of distortions show that the proposed method is robust against most common distortions on watermarks including collusive distortion.

Submitted 11 May, 2023; originally announced May 2023.

arXiv:2212.14404 [pdf, other]

Cross Version Defect Prediction with Class Dependency Embeddings

Authors: Moti Cohen, Lior Rokach, Rami Puzis

Abstract: Software Defect Prediction aims at predicting which software modules are most likely to contain defects. The idea behind this approach is to save time during the development process by helping find bugs early. Defect Prediction models are based on historical data. Specifically, one can use data collected from past software distributions, or Versions, of the same target application under analysis. Defect Prediction based on past versions is called Cross Version Defect Prediction (CVDP). Traditionally, Static Code Metrics are used to predict defects. In this work, we use the Class Dependency Network (CDN) as another predictor for defects, combined with static code metrics. CDN data contains structural information about the target application being analyzed. Usually, CDN data is analyzed using different handcrafted network measures, like Social Network metrics. Our approach uses network embedding techniques to leverage CDN information without having to build the metrics manually. In order to use the embeddings between versions, we incorporate different embedding alignment techniques. To evaluate our approach, we performed experiments on 24 software release pairs and compared it against several benchmark methods. In these experiments, we analyzed the performance of two different graph embedding techniques, three anchor selection approaches, and two alignment techniques. We also built a meta-model based on two different embeddings and achieved a statistically significant improvement in AUC of 4.7% (p < 0.002) over the baseline method.

Submitted 29 December, 2022; originally announced December 2022.

arXiv:2211.06325 [pdf, other]

Can one hear the position of nodes?

Authors: Rami Puzis

Abstract: Wave propagation through nodes and links of a network forms the basis of spectral graph theory. Nevertheless, the sound emitted by nodes within the resonating chamber formed by a network is not well studied. The sound emitted by vibrations of individual nodes reflects the structure of the overall network topology but also the location of the node within the network. In this article, a sound recognition neural network is trained to infer centrality measures from the nodes' waveforms. In addition to advancing network representation learning, sounds emitted by nodes are plausible in most cases. Auralization of the network topology may open new directions in arts, competing with network visualization.

Submitted 10 November, 2022; originally announced November 2022.

Comments: Presented at Complex Networks 2022, Palermo, Italy

arXiv:2208.05750 [pdf, other]

A Survey of MulVAL Extensions and Their Attack Scenarios Coverage

Authors: David Tayouri, Nick Baum, Asaf Shabtai, Rami Puzis

Abstract: Organizations employ various adversary models in order to assess the risk and potential impact of attacks on their networks. Attack graphs represent vulnerabilities and actions an attacker can take to identify and compromise an organization's assets. Attack graphs facilitate both visual presentation and algorithmic analysis of attack scenarios in the form of attack paths. MulVAL is a generic open-source framework for constructing logical attack graphs, which has been widely used by researchers and practitioners and extended by them with additional attack scenarios. This paper surveys all of the existing MulVAL extensions, and maps all MulVAL interaction rules to MITRE ATT&CK Techniques to estimate their attack scenarios coverage. This survey aligns current MulVAL extensions along unified ontological concepts and highlights the existing gaps. It paves the way for methodical improvement of MulVAL and the comprehensive modeling of the entire landscape of adversarial behaviors captured in MITRE ATT&CK.

Submitted 11 August, 2022; originally announced August 2022.

arXiv:2204.02057 [pdf, other]

Large-Scale Shill Bidder Detection in E-commerce

Authors: Michael Fire, Rami Puzis, Dima Kagan, Yuval Elovici

Abstract: User feedback is one of the most effective methods to build and maintain trust in electronic commerce platforms. Unfortunately, dishonest sellers often bend over backward to manipulate users' feedback or place phony bids in order to increase their own sales and harm competitors. The black market of user feedback, supported by a plethora of shill bidders, prospers on top of legitimate electronic commerce. In this paper, we investigate the ecosystem of shill bidders based on large-scale data by analyzing hundreds of millions of users who performed billions of transactions, and we propose a machine-learning-based method for identifying communities of users that methodically provide dishonest feedback. Our results show that (1) shill bidders can be identified with high precision based on their transaction and feedback statistics; and (2) in contrast to legitimate buyers and sellers, shill bidders form cliques to support each other.

Submitted 21 April, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

arXiv:2012.12498 [pdf, other]

Fake News Data Collection and Classification: Iterative Query Selection for Opaque Search Engines with Pseudo Relevance Feedback

Authors: Aviad Elyashar, Maor Reuben, Rami Puzis

Abstract: Retrieving information from an online search engine is the first and most important step in many data mining tasks. Most of the search engines currently available on the web, including all social media platforms, are black boxes (a.k.a. opaque) supporting short keyword queries. In these settings, retrieving all posts and comments discussing a particular news item automatically and at large scale is a challenging task. In this paper, we propose a method for generating short keyword queries given a prototype document. The proposed iterative query selection algorithm (IQS) interacts with the opaque search engine to iteratively improve the query. It is evaluated on the Twitter TREC Microblog 2012 and TREC-COVID 2019 datasets, showing superior performance compared to the state of the art. IQS is applied to automatically collect a large-scale fake news dataset of about 70K true and fake news items. The dataset, publicly available for research, includes more than 22M accounts and 61M tweets in Twitter-approved format. We demonstrate the usefulness of the dataset for the fake news detection task, achieving state-of-the-art performance.

Submitted 21 February, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

arXiv:2011.14224 [pdf, other]

Cyberbiosecurity: DNA Injection Attack in Synthetic Biology

Authors: Dor Farbiash, Rami Puzis

Abstract: Today arbitrary synthetic DNA can be ordered online and delivered within several days. In order to regulate both intentional and unintentional generation of dangerous substances, most synthetic gene providers screen DNA orders. A weakness in the Screening Framework Guidance for Providers of Synthetic Double-Stranded DNA allows screening protocols based on this guidance to be circumvented using a generic obfuscation procedure inspired by early malware obfuscation techniques. Furthermore, accessibility and automation of the synthetic gene engineering workflow, combined with insufficient cybersecurity controls, allow malware to interfere with biological processes within the victim's lab, closing the loop with the possibility of an exploit written into a DNA molecule presented by Ney et al. in USENIX Security'17. Here we present an end-to-end cyberbiological attack, in which unwitting biologists may be tricked into generating dangerous substances within their labs. Consequently, despite common biosecurity assumptions, the attacker does not need to have physical contact with the generated substance. The most challenging part of the attack, decoding of the obfuscated DNA, is executed within living cells while using primitive biological operations commonly employed by biologists during in-vivo gene editing. This attack scenario underlines the need to harden the synthetic DNA supply chain with protections against cyberbiological threats. To address these threats we propose an improved screening protocol that takes into account in-vivo gene editing.

Submitted 28 November, 2020; originally announced November 2020.

arXiv:2010.01380 [pdf, other]

Predicting traffic overflows on private peering

Authors: Elad Rapaport, Ingmar Poese, Polina Zilberman, Oliver Holschke, Rami Puzis

Abstract: Large content providers and content distribution network operators usually connect with large Internet service providers (eyeball networks) through dedicated private peering. The capacity of these private network interconnects is provisioned to match the volume of the real content demand by the users. Unfortunately, in case of a surge in traffic demand, for example due to content trending in a certain country, the capacity of the private interconnect may be depleted and the content provider/distributor would have to reroute the excess traffic through transit providers. Although such overflow events are rare, they have significant negative impacts on content providers, Internet service providers, and end-users. These include unexpected delays and disruptions reducing the user experience quality, as well as direct costs paid by the Internet service provider to the transit providers. If the traffic overflow events could be predicted, the Internet service providers would be able to influence the routes chosen for the excess traffic to reduce the costs and increase user experience quality. In this article we propose a method based on an ensemble of deep learning models to predict overflow events over a short-term horizon of 2-6 hours and predict the specific interconnections that will ingress the overflow traffic. The method was evaluated with 2.5 years' traffic measurement data from a large European Internet service provider, resulting in a true-positive rate of 0.8 while maintaining a 0.05 false-positive rate.
The lockdown imposed by the COVID-19 pandemic reduced the overflow prediction accuracy. Nevertheless, starting from the end of April 2020 with the gradual lockdown release, the old models trained before the pandemic perform equally well.

Submitted 3 October, 2020; originally announced October 2020.

arXiv:2005.11838 [pdf, other]

How Does That Sound? Multi-Language SpokenName2Vec Algorithm Using Speech Generation and Deep Learning

Authors: Aviad Elyashar, Rami Puzis, Michael Fire

Abstract: Searching for information about a specific person is an online activity frequently performed by many users. In most cases, users rely on queries containing a name, which they send to Web search engines to find what they are looking for. Typically, Web search engines provide just a few accurate results associated with a name-containing query. Currently, most solutions for suggesting synonyms in online search are based on pattern matching and phonetic encoding; however, very often the performance of such solutions is less than optimal. In this paper, we propose SpokenName2Vec, a novel and generic approach which addresses the similar name suggestion problem by utilizing automated speech generation and deep learning to produce spoken name embeddings. These sophisticated and innovative embeddings capture the way people pronounce names in any language and accent. Utilizing the name pronunciation can be helpful for both differentiating and detecting names that sound alike but are written differently. The proposed approach was demonstrated on a large-scale dataset consisting of 250,000 forenames and evaluated using a machine learning classifier and 7,399 names with their verified synonyms. The performance of the proposed approach was found to be superior to 10 other algorithms evaluated in this study, including widely used phonetic and string similarity algorithms, and two recently proposed algorithms. The results obtained suggest that the proposed approach could serve as a useful and valuable tool for solving the similar name suggestion problem.

Submitted 21 July, 2020; v1 submitted 24 May, 2020; originally announced May 2020.

Comments: arXiv admin note: text overlap with arXiv:1912.04003

arXiv:2003.03663 [pdf, other]

ATHAFI: Agile Threat Hunting And Forensic Investigation

Authors: Rami Puzis, Polina Zilberman, Yuval Elovici

Abstract: Attackers rapidly change their attacks to evade detection. Even the most sophisticated Intrusion Detection Systems that are based on artificial intelligence and advanced data analytics cannot keep pace with the rapid development of new attacks. When standard detection mechanisms fail or do not provide sufficient forensic information to investigate and mitigate attacks, targeted threat hunting performed by competent personnel is used. Unfortunately, many organizations do not have enough security analysts to perform threat hunting tasks, and today the level of automation of threat hunting is low. In this paper we describe a framework for agile threat hunting and forensic investigation (ATHAFI), which automates the threat hunting process at multiple levels. Adaptive targeted data collection, attack hypotheses generation, hypotheses testing, and continuous threat intelligence feeds allow simple investigations to be performed in a fully automated manner. The increased level of automation will significantly boost the analyst's productivity during investigation of the harshest cases. A special Workflow Generation module adapts the threat hunting procedures either to the latest Threat Intelligence obtained from external sources (e.g. National CERT) or to the likeliest attack hypotheses generated by the Attack Hypotheses Generation module. The combination of Attack Hypotheses Generation and Workflow Generation enables intelligent adjustment of workflows, which react to emerging threats effectively.

Submitted 7 March, 2020; originally announced March 2020.

arXiv:2003.02575 [pdf, other]

DANTE: A framework for mining and monitoring darknet traffic

Authors: Dvir Cohen, Yisroel Mirsky, Yuval Elovici, Rami Puzis, Manuel Kamp, Tobias Martin, Asaf Shabtai

Abstract: Trillions of network packets are sent over the Internet to destinations which do not exist. This 'darknet' traffic captures the activity of botnets and other malicious campaigns aiming to discover and compromise devices around the world. In order to mine threat intelligence from this data, one must be able to handle large streams of logs and represent the traffic patterns in a meaningful way. However, by observing how network ports (services) are used, it is possible to capture the intent of each transmission. In this paper, we present DANTE: a framework and algorithm for mining darknet traffic. DANTE learns the meaning of targeted network ports by applying Word2Vec to observed port sequences. Then, when a host sends a new sequence, DANTE represents the transmission as the average embedding of the ports found in that sequence. Finally, DANTE uses a novel and incremental time-series cluster tracking algorithm on observed sequences to detect recurring behaviors and new emerging threats. To evaluate the system, we ran DANTE on a full year of darknet traffic (over three terabytes) collected by the largest telecommunications provider in Europe, Deutsche Telekom, and analyzed the results. DANTE discovered 1,177 new emerging threats and was able to track malicious campaigns over time. We also compared DANTE to the current best approach and found DANTE to be more practical and effective at detecting darknet traffic patterns.

Submitted 5 March, 2020; originally announced March 2020.
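
Illustrative sketch (not part of the listing, and not the authors' code): the DANTE abstract above says a transmission is represented as the average embedding of the ports in its sequence. Assuming port vectors have already been learned, that averaging step might look like the following. The function name `average_embedding`, the 2-dimensional toy vectors in `TOY_PORT_VECTORS`, and the choice to skip ports without a vector are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def average_embedding(
    ports: Sequence[int], port_vectors: Dict[int, List[float]]
) -> List[float]:
    """Return the element-wise mean of the vectors of the ports in a sequence.

    Ports that have no learned vector are skipped (an assumption of this
    sketch; the abstract does not say how unseen ports are handled).
    """
    known = [port_vectors[p] for p in ports if p in port_vectors]
    if not known:
        raise ValueError("no port in the sequence has an embedding")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical toy vectors standing in for embeddings learned from port sequences.
TOY_PORT_VECTORS: Dict[int, List[float]] = {
    22: [1.0, 0.0],
    23: [0.0, 1.0],
    80: [1.0, 1.0],
}

if __name__ == "__main__":
    print(average_embedding([22, 23], TOY_PORT_VECTORS))
```

In a real pipeline the dictionary would be replaced by vectors produced by an embedding model trained on observed port sequences; the averaging step itself stays the same.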

arXiv:2003.01518 [pdf, other]

SoK: A Survey of Open-Source Threat Emulators

Authors: Polina Zilberman, Rami Puzis, Sunders Bruskin, Shai Shwarz, Yuval Elovici

Abstract: Threat emulators are tools or sets of scripts that emulate cyber attacks or malicious behavior. They can be used to create and launch single procedure attacks and multi-step attacks; the resulting attacks may be known or unknown cyber attacks. The motivation for using threat emulators varies and includes the need to perform automated security audits in organizations or reduce the size of red teams in order to lower pen testing costs; or the desire to create baseline tests for security tools under development or supply pen testers with another tool in their arsenal. In this paper, we review and compare various open-source threat emulators. We focus on tactics and techniques from the MITRE ATT&CK Enterprise matrix and determine whether they can be performed and tested with the emulators. We develop a comprehensive methodology for our qualitative and quantitative comparison of threat emulators with respect to general features, such as prerequisites, attack definition, cleanup, and more. Finally, we discuss the circumstances in which one threat emulator is preferred over another. This survey can help security teams, security developers, and product deployment teams examine their network environment or products with the most suitable threat emulator. Using the guidelines provided, a team can select the threat emulator that best meets their needs without evaluating all of them.

Submitted 2 October, 2020; v1 submitted 3 March, 2020; originally announced March 2020.

arXiv:2002.09832 [pdf, other]

Sequence Preserving Network Traffic Generation

Authors: Sigal Shaked, Amos Zamir, Roman Vainshtein, Moshe Unger, Lior Rokach, Rami Puzis, Bracha Shapira

Abstract: We present the Network Traffic Generator (NTG), a framework for perturbing recorded network traffic with the purpose of generating diverse but realistic background traffic for network simulation and what-if analysis in enterprise environments. The framework preserves many characteristics of the original traffic recorded in an enterprise, as well as sequences of network activities. Using the proposed framework, the original traffic flows are profiled using 200 cross-protocol features. The traffic is aggregated into flows of packets between IP pairs and clustered into groups of similar network activities. Sequences of network activities are then extracted. We examined two methods for extracting sequences of activities: a Markov model and a neural language model. Finally, new traffic is generated using the extracted model. We developed a prototype of the framework and conducted extensive experiments based on two real network traffic collections. Hypothesis testing was used to examine the difference between the distribution of original and generated features, showing that 30-100% of the extracted features were preserved. Small differences between n-gram perplexities in sequences of network activities in the original and generated traffic indicate that sequences of network activities were well preserved.

Submitted 23 February, 2020; originally announced February 2020.
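
Illustrative sketch (not part of the listing, and not the authors' code): the NTG abstract above mentions a Markov model as one way to extract sequences of network activities. A minimal first-order version, assuming each flow has already been assigned an activity label by clustering, could estimate transition probabilities as below. The function name `fit_transitions` and the labels `"dns"`, `"http"`, and `"smtp"` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Sequence


def fit_transitions(
    sequences: Iterable[Sequence[str]],
) -> Dict[str, Dict[str, float]]:
    """Estimate first-order Markov transition probabilities between labels.

    Each input sequence is an ordered list of activity labels. The result
    maps a label to the empirical distribution over the label that follows it.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for seq in sequences:
        # Count every adjacent (current, next) pair in the sequence.
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return {
        cur: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for cur, followers in counts.items()
    }


if __name__ == "__main__":
    model = fit_transitions([["dns", "http", "http"], ["dns", "smtp"]])
    print(model)
```

New label sequences could then be sampled by repeatedly drawing the next label from the distribution of the current one; how the framework itself turns such sequences back into traffic is not described in the abstract.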

arXiv:2001.05668 [pdf, other]

The Chameleon Attack: Manipulating Content Display in Online Social Media

Authors: Aviad Elyashar, Sagi Uziel, Abigail Paradise, Rami Puzis

Abstract: Online social networks (OSNs) are ubiquitous, attracting millions of users all over the world. Being a popular communication medium, OSNs are exploited in a variety of cyber attacks. In this article, we discuss the Chameleon attack technique, a new type of OSN-based trickery where malicious posts and profiles change the way they are displayed to OSN users to conceal themselves before the attack or avoid detection. Using this technique, adversaries can, for example, avoid censorship by concealing true content when it is about to be inspected; acquire social capital to promote new content while piggybacking on a trending one; cause embarrassment and serious reputation damage by tricking a victim into liking, retweeting, or commenting on a message that they would not normally engage with, without any indication of the trickery within the OSN. An experiment performed with closed Facebook groups of sports fans shows that (1) Chameleon pages can pass the moderation filters by changing the way their posts are displayed and (2) moderators do not distinguish between regular and Chameleon pages. We list the OSN weaknesses that facilitate the Chameleon attack and propose a set of mitigation guidelines.

Submitted 24 January, 2020; v1 submitted 16 January, 2020; originally announced January 2020.

arXiv:1912.04003 [pdf, other]

It Runs in the Family: Searching for Synonyms Using Digitized Family Trees

Authors: Aviad Elyashar, Rami Puzis, Michael Fire

Abstract: Searching for a person's name is a common online activity. However, Web search engines provide few accurate results to queries containing names. In contrast to a general word which has only one correct spelling, there are several legitimate spellings of a given name. Today, most techniques used to suggest synonyms in online search are based on pattern matching and phonetic encoding; however, they often perform poorly. As a result, there is a need for an effective tool for improved synonym suggestion. In this paper, we propose a revolutionary approach for tackling the problem of synonym suggestion. Our novel algorithm, GRAFT, utilizes historical data collected from genealogy websites, along with network algorithms. GRAFT is a general algorithm that suggests synonyms using a graph based on names derived from digitized ancestral family trees. Synonyms are extracted from this graph, which is constructed using generic ordering functions that outperform other algorithms that suggest synonyms based on a single dimension, a factor that limits their performance. We evaluated GRAFT's performance on three ground truth datasets of forenames and surnames, including a large-scale online genealogy dataset with over 16 million profiles and more than 700,000 unique forenames and 500,000 surnames. We compared GRAFT's performance at suggesting synonyms to 10 other algorithms, including phonetic encoding, string similarity algorithms, and machine and deep learning algorithms.
The results show GRAFT's superiority with respect to both forenames and surnames and demonstrate its use as a tool to improve synonym suggestion.

Submitted 29 January, 2021; v1 submitted 9 December, 2019; originally announced December 2019.

Comments: 20 pages

arXiv:1906.10922 [pdf]

Challenges for Security Assessment of Enterprises in the IoT Era

Authors: Yael Mathov, Noga Agmon, Asaf Shabtai, Rami Puzis, Nils Ole Tippenhauer, Yuval Elovici

Abstract: For years, attack graphs have been an important tool for security assessment of enterprise networks, but IoT devices, a new player in the IT world, might threaten the reliability of this tool. In this paper, we review the challenges that must be addressed when using attack graphs to model and analyze enterprise networks that include IoT devices. In addition, we propose novel ideas and countermeasures aimed at addressing these challenges.

Submitted 26 June, 2019; originally announced June 2019.

Comments: 11 pages, 4 figures, 1 table

arXiv:1906.10229 [pdf, other]

Evaluating the Information Security Awareness of Smartphone Users

Authors: Ron Bitton, Kobi Boymgold, Rami Puzis, Asaf Shabtai

Abstract: Information security awareness (ISA) is a practice focused on the set of skills that help a user successfully mitigate a social engineering attack. Previous studies have presented various methods for evaluating the ISA of both PC and mobile users. These methods rely primarily on subjective data sources such as interviews, surveys, and questionnaires that are influenced by human interpretation and sincerity. Furthermore, previous methods for evaluating ISA did not address the differences between classes of social engineering attacks. In this paper, we present a novel framework designed for evaluating the ISA of smartphone users with respect to specific social engineering attack classes. In addition to questionnaires, the proposed framework utilizes objective data sources: a mobile agent and a network traffic monitor, both of which are used to analyze the actual behavior of users. We empirically evaluated the ISA scores assessed from the three data sources (namely, the questionnaires, mobile agent, and network traffic monitor) by conducting a long-term user study involving 162 smartphone users. All participants were exposed to four different security challenges that resemble real-life social engineering attacks. These challenges were used to assess the ability of the proposed framework to derive a relevant ISA score.
The results of our experiment show that: (1) the self-reported behavior of the users differs significantly from their actual behavior; and (2) ISA scores derived from data collected by the mobile agent or the network traffic monitor are highly correlated with the users' success in mitigating social engineering attacks.

Submitted 24 June, 2019; originally announced June 2019.

Comments: Under review in NDSS 2020

arXiv:1904.05853 [pdf, other]

doi 10.1145/3317549.3323411

Deployment Optimization of IoT Devices through Attack Graph Analysis

Authors: Noga Agmon, Asaf Shabtai, Rami Puzis

Abstract: The Internet of things (IoT) has become an integral part of our life at both work and home. However, these IoT devices are prone to vulnerability exploits due to their low cost, low resources, the diversity of vendors, and proprietary firmware. Moreover, short range communication protocols (e.g., Bluetooth or ZigBee) open additional opportunities for the lateral movement of an attacker within an organization. Thus, the type and location of IoT devices may significantly change the level of network security of the organizational network. In this paper, we quantify the level of network security based on an augmented attack graph analysis that accounts for the physical location of IoT devices and their communication capabilities. We use the depth-first branch and bound (DFBnB) heuristic search algorithm to solve two optimization problems: Full Deployment with Minimal Risk (FDMR) and Maximal Utility without Risk Deterioration (MURD). An admissible heuristic is proposed to accelerate the search. The proposed method is evaluated using a real network with simulated deployment of IoT devices. The results demonstrate (1) the contribution of the augmented attack graphs to quantifying the impact of IoT devices deployed within the organization on security, and (2) the effectiveness of the optimized IoT deployment.

Submitted 11 April, 2019; originally announced April 2019.

arXiv:1903.02601 [pdf, other]

Attack Graph Obfuscation

Authors: Rami Puzis, Hadar Polad, Bracha Shapira

Abstract: Before executing an attack, adversaries usually explore the victim's network in an attempt to infer the network topology and identify vulnerabilities in the victim's servers and personal computers. Falsifying the information collected by the adversary post-penetration may significantly slow lateral movement and increase the amount of noise generated within the victim's network. We investigate the effect of fake vulnerabilities within a real enterprise network on the attacker's performance. We use attack graphs to model the path of an attacker making its way towards a target in a given network. We use combinatorial optimization to find the optimal assignments of fake vulnerabilities. We demonstrate the feasibility of our deception-based defense by presenting results of experiments with a large-scale real network. We show that adding fake vulnerabilities forces the adversary to invest a significant amount of effort, in terms of time and exploitability cost.

Submitted 6 March, 2019; originally announced March 2019.

arXiv:1807.00125 [pdf]

Generation of Automatic and Realistic Artificial Profiles

Authors: Abigail Paradise, Dvir Cohen, Asaf Shabtai, Rami Puzis

Abstract: Online social networks (OSNs) are abused by cyber criminals for various malicious activities. One of the most effective approaches for detecting malicious activity in OSNs involves the use of social network honeypots - artificial profiles that are deliberately planted within OSNs in order to attract abusers. Honeypot profiles have been used in detecting spammers, potential cyber attackers, and advanced attackers. Therefore, there is a growing need for the ability to reliably generate realistic artificial honeypot profiles in OSNs. In this research we present 'ProfileGen' - a method for the automated generation of profiles for professional social networks, giving particular attention to producing realistic education and employment records. 'ProfileGen' creates honeypot profiles that are similar to actual data by extrapolating the characteristics and properties of real data items. Evaluation by 70 domain experts confirms the method's ability to generate realistic artificial profiles that are indistinguishable from real profiles, demonstrating that our method can be applied to generate realistic artificial profiles for a wide range of applications.

Submitted 30 June, 2018; originally announced July 2018.

arXiv:1801.03734 [pdf, other]

PALE: Partially Asynchronous Agile Leader Election

Authors: Bronislav Sidik, Rami Puzis, Polina Zilberman, Yuval Elovici

Abstract: Many tasks executed in dynamic distributed systems, such as sensor networks or enterprise environments with bring-your-own-device policy, require central coordination by a leader node. In the past it has been proven that distributed leader election in dynamic environments with constant changes and asynchronous communication is not possible. Thus, state-of-the-art leader election algorithms are not applicable in asynchronous environments with constant network changes. Some algorithms converge only after the network stabilizes (an unrealistic requirement in many dynamic environments). Other algorithms reach consensus in the presence of network changes but require a global clock or some level of communication synchronization. Determining the weakest assumptions, under which leader election is possible, remains an unresolved problem. In this study we present a leader election algorithm that operates in the presence of changes and under weak (realistic) assumptions regarding message delays and regarding the clock drifts of the distributed nodes. The proposed algorithm is self-sufficient, easy to implement and can be extended to support multiple regions, self-stabilization, and wireless ad-hoc networks. We prove the algorithm's correctness and provide a complexity analysis of the time, space, and number of messages required to elect a leader.

Submitted 11 January, 2018; originally announced January 2018.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:1710.06699 [pdf, other]

Detecting Clickbait in Online Social Media: You Won't Believe How We Did It

Authors: Aviad Elyashar, Jorge Bendahan, Rami Puzis

Abstract: In this paper, we propose an approach for the detection of clickbait posts in online social media (OSM). Clickbait posts are short catchy phrases that attract a user's attention and entice the user to click through to an article. The approach is based on a machine learning (ML) classifier capable of distinguishing between clickbait and legitimate posts published in OSM. The suggested classifier is based on a variety of features, including image related features, linguistic analysis, and methods for abuser detection. In order to evaluate our method, we used two datasets provided by Clickbait Challenge 2017. The best performance obtained by the ML classifier was an AUC of 0.8, an accuracy of 0.812, precision of 0.819, and recall of 0.966. In addition, as opposed to previous studies, we found that clickbait post titles are statistically significantly shorter than legitimate post titles. Finally, we found that counting the number of formal English words in the given content is useful for clickbait detection.

Submitted 18 October, 2017; originally announced October 2017.

arXiv:1708.02763 [pdf, other]

Has the Online Discussion Been Manipulated? Quantifying Online Discussion Authenticity within Online Social Media

Authors: Aviad Elyashar, Jorge Bendahan, Rami Puzis

Abstract: Online social media (OSM) has an enormous influence in today's world. Some individuals view OSM as fertile ground for abuse and use it to disseminate misinformation and political propaganda, slander competitors, and spread spam. The crowdturfing industry employs large numbers of bots and human workers to manipulate OSM and misrepresent public opinion. The detection of online discussion topics manipulated by OSM abusers is an emerging issue attracting significant attention. In this paper, we propose an approach for quantifying the authenticity of online discussions based on the similarity of OSM accounts participating in the discussion to known abusers and legitimate accounts. Our method uses several similarity functions for the analysis and classification of OSM accounts. The proposed methods are demonstrated using Twitter data collected for this study and a previously published Arabic honeypot dataset. The former includes manually labeled accounts and abusers who participated in crowdturfing platforms. Evaluation of the topic's authenticity, derived from account similarity functions, shows that the suggested approach is effective for discriminating between topics that were strongly promoted by abusers and topics that attracted authentic public interest.

Submitted 4 January, 2018; v1 submitted 9 August, 2017; originally announced August 2017.

arXiv:1705.07490 [pdf]

MindDesktop: a general purpose brain computer interface

Authors: Ori Ossmy, Ofir Tam, Rami Puzis, Lior Rokach, Ohad Inbar, Yuval Elovici

Abstract: Recent advances in electroencephalography (EEG) and electromyography (EMG) enable communication for people with severe disabilities. In this paper we present a system that enables the use of regular computers using an off-the-shelf EEG/EMG headset, providing a pointing device and virtual keyboard that can be used to operate any Windows-based system, minimizing the user effort required for interacting with a personal computer. Effectiveness of the proposed system is evaluated by a usability study, indicating a decreasing learning curve for completing various tasks. The proposed system is available in the link provided.

Submitted 21 May, 2017; originally announced May 2017.

arXiv:1701.00220 [pdf]

Classification of Smartphone Users Using Internet Traffic

Authors: Andrey Finkelstein, Ron Biton, Rami Puzis, Asaf Shabtai

Abstract: Today, smartphone devices are owned by a large portion of the population and have become a very popular platform for accessing the Internet. Smartphones provide the user with immediate access to information and services. However, they can easily expose the user to many privacy risks. Applications that are installed on the device and entities with access to the device's Internet traffic can reveal private information about the smartphone user and steal sensitive content stored on the device or transmitted by the device over the Internet. In this paper, we present a method to reveal various demographics and technical computer skills of smartphone users by their Internet traffic records, using machine learning classification models. We implement and evaluate the method on real life data of smartphone users and show that smartphone users can be classified by their gender, smoking habits, software programming experience, and other characteristics.

Submitted 1 January, 2017; originally announced January 2017.

arXiv:1609.02945 [pdf]

Pinpoint Influential Posts and Authors

Authors: Luiza Nacshon, Rami Puzis, Amparo Sanmateho

Abstract: This research presents an analytical model that aims to pinpoint influential posts across a social web comprised of a corpus of posts. The model employs the Latent Dirichlet Allocation algorithm to associate posts with topics, and the TF-IDF metric to identify the key posts associated with each topic. The model was demonstrated in the domain of customer relationship by enabling careful monitoring of evolving "storms" created by individuals which tend to impact large audiences (either positively or negatively). Future research should be engaged in order to extend the scope of the corpus by including additional relevant publicly available sources.

Submitted 9 September, 2016; originally announced September 2016.

arXiv:1608.03307 [pdf]

Floware: Balanced Flow Monitoring in Software Defined Networks

Authors: Luiza Nacshon, Rami Puzis, Polina Zilberman

Abstract: OpenFlow is a protocol implementing Software Defined Networking, a new networking paradigm, which segregates packet forwarding and accounting (performed on switches) from the routing decisions and advanced protocols (executed on a central controller). This segregation increases agility and flexibility of a networking infrastructure and reduces its operational expenses. OpenFlow controllers expose standard interfaces to facilitate a variety of networking applications. In particular, a monitoring application can use these interfaces to push into the OpenFlow switches rules that collect traffic flow statistics at different aggregation levels. The aggregation level determines the monitoring accuracy and the induced network overhead. In this paper, we propose Floware, an OpenFlow application that allows discovery and monitoring of active flows at any required aggregation level. Floware balances the monitoring overhead among many switches in order to reduce its negative effect on network performance. In addition, Floware integrates with monitoring systems based on legacy protocols such as NetFlow. We demonstrate the application with soft switches emulated in Mininet, the Floodlight controller, and the NetFlow Analyzer as a legacy network analysis and intrusion detection system. Evaluation results demonstrate the positive impact of balanced monitoring.

Submitted 4 December, 2016; v1 submitted 10 August, 2016; originally announced August 2016.

arXiv:1601.00184 [pdf]

The Security of WebRTC

Authors: Ben Feher, Lior Sidi, Asaf Shabtai, Rami Puzis

Abstract: WebRTC is an API that allows users to share streaming information, whether it is text, sound, video or files. It is supported by all major browsers and has a flexible underlying infrastructure. In this study we review current WebRTC structure and security in the contexts of communication disruption, modification and eavesdropping. In addition, we examine WebRTC security in a few representative scenarios, setting up and simulating real WebRTC environments and attacks.

Submitted 2 January, 2016; originally announced January 2016.

arXiv:1410.2480

Efficient On-line Detection of Temporal Patterns

Authors: Shlomi Dolev, Jonathan Goldfeld, Rami Puzis

Abstract: Identifying a temporal pattern of events is a fundamental task of on-line (real-time) verification. We present efficient schemes for on-line monitoring of events for identifying desired/undesired patterns of events. The schemes use preprocessing to ensure that the number of comparisons during run-time is minimized. In particular, the first comparison following the time point when an execution sub-sequence cannot be further extended to satisfy the temporal requirements halts the process that monitors the sub-sequence.

Submitted 27 May, 2015; v1 submitted 9 October, 2014; originally announced October 2014.

Comments: withdrawn due to submission policy

arXiv:1303.3741 [pdf, other]

Organization Mining Using Online Social Networks

Authors: Michael Fire, Rami Puzis, Yuval Elovici

Abstract: Mature social networking services are one of the greatest assets of today's organizations. This valuable asset, however, can also be a threat to an organization's confidentiality. Members of social networking websites expose not only their personal information, but also details about the organizations for which they work. In this paper we analyze several commercial organizations by mining data which their employees have exposed on Facebook, LinkedIn, and other publicly available sources. Using a web crawler designed for this purpose, we extract a network of informal social relationships among employees of a given target organization. Our results, obtained using centrality analysis and Machine Learning techniques applied to the structure of the informal relationships network, show that it is possible to identify leadership roles within the organization solely by this means. It is also possible to gain valuable non-trivial insights on an organization's structure by clustering its social network and gathering publicly available information on the employees within each cluster. Organizations wanting to conceal their internal structure, identity of leaders, location and specialization of branch offices, etc., must enforce strict policies to control the use of social media by their employees.

Submitted 2 September, 2013; v1 submitted 15 March, 2013; originally announced March 2013.

Comments: Draft Version

arXiv:1205.1357 [pdf, other]

Detecting Spammers via Aggregated Historical Data Set

Authors: Eitan Menahem, Rami Puzis

Abstract: The battle between email service providers and senders of mass unsolicited emails (Spam) continues to gain traction. Vast numbers of Spam emails are sent mainly from automatic botnets distributed over the world. One method for mitigating Spam in a computationally efficient manner is fast and accurate blacklisting of the senders. In this work we propose a new sender reputation mechanism that is based on an aggregated historical data-set which encodes the behavior of mail transfer agents over time. A historical data-set is created from labeled logs of received emails. We use machine learning algorithms to build a model that predicts the spammingness of mail transfer agents in the near future. The proposed mechanism is targeted mainly at large enterprises and email service providers and can be used for updating both the black and the white lists. We evaluate the proposed mechanism using 9.5M anonymized log entries obtained from the biggest Internet service provider in Europe. Experiments show that the proposed method detects more than 94% of the Spam emails that escaped the blacklist (i.e., TPR), while having less than 0.5% false-alarms. Therefore, the effectiveness of the proposed method is much higher than that of previously reported reputation mechanisms, which rely on email logs. In addition, the proposed method, when used for updating both the black and white lists, eliminated the need for automatic content inspection of 4 out of 5 incoming emails, which resulted in a dramatic reduction in the filtering computational load.

Submitted 7 May, 2012; originally announced May 2012.

Comments: This is a conference version of the HDS research. 13 pages 10 figures

ACM Class: C.2.0; H.4.3

arXiv:0904.0352 [pdf]

doi 10.1016/j.ipl.2009.07.019

Incremental Deployment of Network Monitors Based on Group Betweenness Centrality

Authors: Shlomi Dolev, Yuval Elovici, Rami Puzis, Polina Zilberman

Abstract: In many applications we are required to increase the deployment of a distributed monitoring system on an evolving network. In this paper we present a new method for finding candidate locations for additional deployment in the network. This method is based on the Group Betweenness Centrality (GBC) measure that is used to estimate the influence of a group of nodes over the information flow in the network. The new method assists in finding the location of k additional monitors in the evolving network, such that the portion of additional traffic covered is at least (1-1/e) of the optimal.

Submitted 2 October, 2020; v1 submitted 2 April, 2009; originally announced April 2009.

Journal ref: Information Processing Letters, 109(20), 1172-1176 (2009)
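
Note on the (1-1/e) bound in the abstract above: a guarantee of this form is characteristic of greedy selection over a monotone submodular objective. The sketch below is not the authors' algorithm. It assumes a simplified setting in which each candidate node is described by the set of traffic flows it can observe, and it substitutes plain set coverage for Group Betweenness Centrality. The function name greedy_monitor_placement and the toy flows_seen_by data are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def greedy_monitor_placement(
    coverage: Dict[Hashable, Set[Hashable]], k: int
) -> List[Hashable]:
    """Pick up to k nodes, each time taking the node that adds the most
    not-yet-covered flows (plain set coverage, an assumed stand-in for GBC)."""
    covered: Set[Hashable] = set()
    chosen: List[Hashable] = []
    for _ in range(k):
        best, best_gain = None, 0
        # Sorted iteration makes tie-breaking deterministic.
        for node in sorted(coverage, key=str):
            if node in chosen:
                continue
            gain = len(coverage[node] - covered)  # marginal coverage gain
            if gain > best_gain:
                best, best_gain = node, gain
        if best is None:  # no remaining node adds new coverage
            break
        chosen.append(best)
        covered |= coverage[best]
    return chosen


# Hypothetical toy input: node -> set of flow ids visible at that node.
flows_seen_by = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}
print(greedy_monitor_placement(flows_seen_by, 2))  # ['c', 'a']
```

Because the marginal gain of a node can only shrink as more flows become covered, the greedy choice in this simplified setting covers at least (1-1/e) of what the best possible set of k nodes would cover.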

Showing 1–33 of 33 results for author: Puzis, R