-
CRATOR: a Dark Web Crawler
Authors:
Daniel De Pascale,
Giuseppe Cascavilla,
Damian A. Tamburri,
Willem-Jan Van Den Heuvel
Abstract:
Dark web crawling is a complex process that involves specific methodologies and techniques to navigate the Tor network and extract data from hidden services. This study proposes a general dark web crawler designed to extract pages handling security protocols, such as captchas, efficiently. Our approach uses a combination of seed URL lists, link analysis, and scanning to discover new content. We al…
▽ More
Dark web crawling is a complex process that involves specific methodologies and techniques to navigate the Tor network and extract data from hidden services. This study proposes a general dark web crawler designed to extract pages handling security protocols, such as captchas, efficiently. Our approach uses a combination of seed URL lists, link analysis, and scanning to discover new content. We also incorporate methods for user-agent rotation and proxy usage to maintain anonymity and avoid detection. We evaluate the effectiveness of our crawler using metrics such as coverage, performance and robustness. Our results demonstrate that our crawler effectively extracts pages handling security protocols while maintaining anonymity and avoiding detection. Our proposed dark web crawler can be used for various applications, including threat intelligence, cybersecurity, and online investigations.
△ Less
Submitted 10 May, 2024;
originally announced May 2024.
-
Real-world K-Anonymity Applications: the \textsc{KGen} approach and its evaluation in Fraudulent Transactions
Authors:
Daniel De Pascale,
Giuseppe Cascavilla,
Damian A. Tamburri,
Willem-Jan Van Den Heuvel
Abstract:
K-Anonymity is a property for the measurement, management, and governance of the data anonymization. Many implementations of k-anonymity have been described in state of the art, but most of them are not able to work with a large number of attributes in a "Big" dataset, i.e., a dataset drawn from Big Data. To address this significant shortcoming, we introduce and evaluate \textsc{KGen} an approach…
▽ More
K-Anonymity is a property for the measurement, management, and governance of the data anonymization. Many implementations of k-anonymity have been described in state of the art, but most of them are not able to work with a large number of attributes in a "Big" dataset, i.e., a dataset drawn from Big Data. To address this significant shortcoming, we introduce and evaluate \textsc{KGen} an approach to K-anonymity featuring Genetic Algorithms. \textsc{KGen} promotes such a meta-heuristic approach since it can solve the problem by finding a pseudo-optimal solution in a reasonable time over a considerable load of input. \textsc{KGen} allows the data manager to guarantee a high anonymity level while preserving the usability and preventing loss of information entropy over the data. Differently from other approaches that provide optimal global solutions catered for small datasets, \textsc{KGen} works properly also over Big datasets while still providing a good-enough solution. Evaluation results show how our approach can still work efficiently on a real world dataset, provided by Dutch Tax Authority, with 47 attributes (i.e., the columns of the dataset to be anonymized) and over 1.5K+ observations (i.e., the rows of that dataset), as well as on a dataset with 97 attributes and over 3942 observations.
△ Less
Submitted 31 March, 2022;
originally announced April 2022.
-
Internet-of-Things Architectures for Secure Cyber-Physical Spaces: the VISOR Experience Report
Authors:
Daniel De Pascale,
Giuseppe Cascavilla,
Mirella Sangiovanni,
Damian A. Tamburri,
Willem-Jan van den Heuvel
Abstract:
Internet of things (IoT) technologies are becoming a more and more widespread part of civilian life in common urban spaces, which are rapidly turning into cyber-physical spaces. Simultaneously, the fear of terrorism and crime in such public spaces is ever-increasing. Due to the resulting increased demand for security, video-based IoT surveillance systems have become an important area for research.…
▽ More
Internet of things (IoT) technologies are becoming a more and more widespread part of civilian life in common urban spaces, which are rapidly turning into cyber-physical spaces. Simultaneously, the fear of terrorism and crime in such public spaces is ever-increasing. Due to the resulting increased demand for security, video-based IoT surveillance systems have become an important area for research. Considering the large number of devices involved in the illicit recognition task, we conducted a field study in a Dutch Easter music festival in a national interest project called VISOR to select the most appropriate device configuration in terms of performance and results. We iteratively architected solutions for the security of cyber-physical spaces using IoT devices. We tested the performance of multiple federated devices encompassing drones, closed-circuit television, smart phone cameras, and smart glasses to detect real-case scenarios of potentially malicious activities such as mosh-pits and pick-pocketing. Our results pave the way to select optimal IoT architecture configurations -- i.e., a mix of CCTV, drones, smart glasses, and camera phones in our case -- to make safer cyber-physical spaces' a reality.
△ Less
Submitted 1 April, 2022;
originally announced April 2022.