-
Unravelling the Use of Digital Twins to Assist Decision- and Policy-Making in Smart Cities
Authors:
Lucy Temple,
Gabriela Viale Pereira,
Lukas Daniel Klausner
Abstract:
This short paper represents a systematic literature review that sets the basis for the future development of a framework for digital twin-based decision support in the public sector, specifically for the smart city domain. The final aim of the research is to model context-specific digital twins for aiding the decision-making processes in smart cities and devise methods for defining the policy agenda. Overall, this short paper provides a foundation, based on the main concepts from existing literature, for further research in the role and applications of urban digital twins to assist decision- and policy-making in smart cities. The existing literature analyses common applications of digital twins in smart city development with a focus on supporting decision- and policy-making. Future work will centre on developing a digital-twin-based sustainable smart city and defining different scenarios concerning challenges of good governance, especially so-called wicked problems, in smaller-scale urban and non-urban contexts.
Submitted 31 May, 2024;
originally announced May 2024.
-
Warum wir es für eine gute Idee gehalten haben, eine DACH-Spieledatenbank aufzubauen
Authors:
Eugen Pfister,
Aurelia Brandenburg,
Adrian Demleitner,
Lukas Daniel Klausner
Abstract:
We are in the process of creating a database of digital games from the DACH region. This article provides an insight into the context in which it was created and the underlying methodological considerations behind the games database. The database was compiled collaboratively and lists digital games developed in Germany, Austria and Switzerland up to the year 2000. In this report, we outline our initial considerations and the various stages of realisation as well as the input data on which the database was built, the aims of the data model and the difficulties we faced during the creation process. We then pin down the current status of the games database and give an outlook on the project's future plans.
--
Unser Werkstattbericht gibt Einblick in den Entstehungskontext sowie die zugrundeliegenden methodischen Überlegungen hinter der von den Autor*innen publizierten Spieledatenbank. Diese wurde kollaborativ erarbeitet und führt digitale Spiele, die in Deutschland, Österreich und der Schweiz bis zum Jahr 2000 entwickelt wurden. In diesem Bericht skizzieren wir neben unseren Ausgangsüberlegungen und den verschiedenen Arbeitsschritten bei der Realisierung außerdem auch, auf welcher Datenbasis die Datenbank aufgebaut und geprüft wurde, was die Ziele des Datenmodells sind und mit welchen Schwierigkeiten wir im Prozess der Erstellung konfrontiert waren. Hiernach ordnen wir den aktuellen Stand der Spieledatenbank ein und geben einen Ausblick auf die weiteren Pläne des Projekts.
Submitted 19 January, 2024;
originally announced January 2024.
-
Delete My Account: Impact of Data Deletion on Machine Learning Classifiers
Authors:
Tobias Dam,
Maximilian Henzl,
Lukas Daniel Klausner
Abstract:
Users are more aware than ever of the importance of their own data, thanks to reports about security breaches and leaks of private, often sensitive data in recent years. Additionally, the GDPR has been in effect in the European Union for over three years and many people have encountered its effects in one way or another. Consequently, more and more users are actively protecting their personal data. One way to do this is to make use of the right to erasure guaranteed in the GDPR, which has potential implications for a number of different fields, such as big data and machine learning.
Our paper presents an in-depth analysis about the impact of the use of the right to erasure on the performance of machine learning models on classification tasks. We conduct various experiments utilising different datasets as well as different machine learning algorithms to analyse a variety of deletion behaviour scenarios. Due to the lack of credible data on actual user behaviour, we make reasonable assumptions for various deletion modes and biases and provide insight into the effects of different plausible scenarios for right to erasure usage on data quality of machine learning. Our results show that the impact depends strongly on the amount of data deleted, the particular characteristics of the dataset and the bias chosen for deletion and assumptions on user behaviour.
Submitted 17 November, 2023;
originally announced November 2023.
-
A Survey of Dataspace Connector Implementations
Authors:
Tobias Dam,
Lukas Daniel Klausner,
Sebastian Neumaier,
Torsten Priebe
Abstract:
The concept of dataspaces aims to facilitate secure and sovereign data exchange among multiple stakeholders. Technical implementations known as "connectors" support the definition of usage control policies and the verifiable enforcement of such policies. This paper provides an overview of existing literature and reviews current open-source dataspace connector implementations that are compliant with the International Data Spaces (IDS) standard. To assess maturity and readiness, we review four implementations with regard to their architecture, underlying data model and usage control language.
Submitted 9 January, 2024; v1 submitted 20 September, 2023;
originally announced September 2023.
-
"This (Smart) Town Ain't Big Enough": Smart Small Towns and Digital Twins for Sustainable Urban and Regional Development
Authors:
Gabriela Viale Pereira,
Lukas Daniel Klausner,
Lucy Temple,
Thomas Delissen,
Thomas Lampoltshammer,
Torsten Priebe
Abstract:
One of the major challenges today lies in the creation of governance concepts for regional development that not only promote growth but, at the same time, ensure promotion of inclusiveness, fairness, and resilience. Digital twins can support policymakers in developing smart, sustainable solutions for cities and regions and, therefore, urban and non-urban environments. The project SCiNDTiLA (Smart Cities aNd Digital Twins in Lower Austria) aims to define the state-of-the-art in the field of smart cities, identify interdependencies, critical components and stakeholders, and provide a roadmap for smart cities with application to both smaller-scale urban and non-urban environments. SCiNDTiLA uses the foundations of complexity theory and computational social science methods to model Austrian towns and regions as smart cities/regions and thus as systems of socio-technical interaction to guide policy decision-making toward sustainable development.
Submitted 9 August, 2023;
originally announced August 2023.
-
Smart Cities and Digital Twins in Lower Austria
Authors:
Gabriela Viale Pereira,
Lukas Daniel Klausner,
Lucy Temple,
Thomas Delissen,
Thomas Lampoltshammer,
Torsten Priebe
Abstract:
Smart city solutions require innovative governance approaches together with the smart use of technology, such as digital twins, by city managers and policymakers to manage the big societal challenges. The project Smart Cities aNd Digital Twins in Lower Austria (SCiNDTiLA) extends the state of the art of research in several contributing disciplines and uses the foundations of complexity theory and computational social science methods to develop a digital-twin-based smart city model. The project will also apply a novel transdisciplinary process to conceptualise sustainable smart cities and validate the smart city generic model. The outcomes will be translated into a roadmap highlighting methodologies, guidelines and policy recommendations for tackling societal challenges in smart cities with a focus on rescaling the entire framework to be transferred to regions, smaller towns and non-urban environments, such as rural areas and smart villages, in ways that fit the respective local governance, ethical and operational capacity context.
Submitted 13 July, 2023;
originally announced July 2023.
-
Participatory Research as a Path to Community-Informed, Gender-Fair Machine Translation
Authors:
Dagmar Gromann,
Manuel Lardelli,
Katta Spiel,
Sabrina Burtscher,
Lukas Daniel Klausner,
Arthur Mettinger,
Igor Miladinovic,
Sigrid Schefer-Wenzl,
Daniela Duh,
Katharina Bühn
Abstract:
Recent years have seen a strongly increased visibility of non-binary people in public discourse. Accordingly, considerations of gender-fair language go beyond a binary conception of male/female. However, language technology, especially machine translation (MT), still suffers from binary gender bias. Proposing a solution for gender-fair MT beyond the binary from a purely technological perspective might fall short of accommodating different target user groups and in the worst case might lead to misgendering. To address this challenge, we propose a method and case study building on participatory action research to include experiential experts, i.e., queer and non-binary people, translators, and MT experts, in the MT design process. The case study focuses on German, where central findings are the importance of context dependency to avoid identity invalidation and a desire for customizable MT solutions.
Submitted 15 June, 2023;
originally announced June 2023.
-
"Schöne neue Lieferkettenwelt": Workers' Voice und Arbeitsstandards in Zeiten algorithmischer Vorhersage
Authors:
Lukas Daniel Klausner,
Maximilian Heimstädt,
Leonhard Dobusch
Abstract:
The complexity and increasingly tight coupling of supply chains pose a major logistical challenge for leading companies. Another challenge is that leading companies -- under pressure from consumers, a critical public and legislative measures such as supply chain laws -- have to take more responsibility than before for their suppliers' labour standards. In this paper, we discuss a new approach that leading companies are using to try to address these challenges: algorithmic prediction of business risks, but also environmental and social risks. We describe the technical and cultural conditions for algorithmic prediction and explain how -- from the perspective of leading companies -- it helps to address both challenges. We then develop scenarios on how and with what kind of social consequences algorithmic prediction can be used by leading companies. From the scenarios, we derive policy options for different stakeholder groups to help develop algorithmic prediction towards improving labour standards and worker voice.
--
Die Komplexität und zunehmend enge Kopplung vieler Lieferketten stellt eine große logistische Herausforderung für Leitunternehmen dar. Eine weitere Herausforderung besteht darin, dass Leitunternehmen -- gedrängt durch Konsument:innen, eine kritische Öffentlichkeit und gesetzgeberische Maßnahmen wie die Lieferkettengesetze -- stärker als bisher Verantwortung für Arbeitsstandards in ihren Zulieferbetrieben übernehmen müssen. In diesem Beitrag diskutieren wir einen neuen Ansatz, mit dem Leitunternehmen versuchen, diese Herausforderungen zu bearbeiten: die algorithmische Vorhersage von betriebswirtschaftlichen, aber auch ökologischen und sozialen Risiken. Wir beschreiben die technischen und kulturellen Bedingungen für algorithmische Vorhersage und erklären, wie diese -- aus Perspektive von Leitunternehmen -- bei der Bearbeitung beider Herausforderungen hilft. Anschließend entwickeln wir Szenarien, wie und mit welchen sozialen Konsequenzen algorithmische Vorhersage durch Leitunternehmen eingesetzt werden kann. Aus den Szenarien leiten wir Handlungsoptionen für verschiedene Stakeholder-Gruppen ab, die dabei helfen sollen, algorithmische Vorhersage im Sinne einer Verbesserung von Arbeitsstandards und Workers' Voice weiterzuentwickeln.
Submitted 19 May, 2023;
originally announced May 2023.
-
Towards a Critical Open-Source Software Database
Authors:
Tobias Dam,
Lukas Daniel Klausner,
Sebastian Neumaier
Abstract:
Open-source software (OSS) plays a vital role in the modern software ecosystem. However, the maintenance and sustainability of OSS projects can be challenging. In this paper, we present the CrOSSD project, which aims to build a database of OSS projects and measure their current project "health" status. In the project, we will use both quantitative and qualitative metrics to evaluate the health of OSS projects. The quantitative metrics will be gathered through automated crawling of meta information such as the number of contributors, commits and lines of code. Qualitative metrics will be gathered for selected "critical" projects through manual analysis and automated tools, including aspects such as sustainability, funding, community engagement and adherence to security policies. The results of the analysis will be presented on a user-friendly web platform, which will allow users to view the health of individual OSS projects as well as the overall health of the OSS ecosystem. With this approach, the CrOSSD project provides a comprehensive and up-to-date view of the health of OSS projects, making it easier for developers, maintainers and other stakeholders to understand the health of OSS projects and make informed decisions about their use and maintenance.
Submitted 2 May, 2023;
originally announced May 2023.
-
"Es geht um Respekt, nicht um Technologie": Erkenntnisse aus einem Interessensgruppen-übergreifenden Workshop zu genderfairer Sprache und Sprachtechnologie
Authors:
Sabrina Burtscher,
Katta Spiel,
Lukas Daniel Klausner,
Manuel Lardelli,
Dagmar Gromann
Abstract:
With the increasing attention non-binary people receive in Western societies, strategies of gender-fair language have started to move away from binary (only female/male) concepts of gender. Nevertheless, hardly any approaches to take these identities into account in machine translation models exist so far. A lack of understanding of the socio-technical implications of such technologies risks further reproducing linguistic mechanisms of oppression and mislabelling. In this paper, we describe the methods and results of a workshop on gender-fair language and language technologies, which was led and organised by ten researchers from TU Wien, St. Pölten UAS, FH Campus Wien and the University of Vienna and took place in Vienna in autumn 2021. A wide range of interest groups and their representatives were invited to ensure that the topic could be dealt with holistically. Accordingly, we aimed to include translators, machine translation experts and non-binary individuals (as "community experts") on an equal footing. Our analysis shows that gender in machine translation requires a high degree of context sensitivity, that developers of such technologies need to position themselves cautiously in a process still under social negotiation, and that flexible approaches seem most adequate at present. We then illustrate steps that follow from our results for the field of gender-fair language technologies so that technological developments can adequately line up with social advancements.
----
Mit zunehmender gesamtgesellschaftlicher Wahrnehmung nicht-binärer Personen haben sich in den letzten Jahren auch Konzepte von genderfairer Sprache von der bisher verwendeten Binarität (weiblich/männlich) entfernt. Trotzdem gibt es bislang nur wenige Ansätze dazu, diese Identitäten in maschineller Übersetzung abzubilden. Ein fehlendes Verständnis unterschiedlicher sozio-technischer Implikationen derartiger Technologien birgt in sich die Gefahr, fehlerhafte Ansprachen und Bezeichnungen sowie sprachliche Unterdrückungsmechanismen zu reproduzieren. In diesem Beitrag beschreiben wir die Methoden und Ergebnisse eines Workshops zu genderfairer Sprache in technologischen Zusammenhängen, der im Herbst 2021 in Wien stattgefunden hat. Zehn Forscher*innen der TU Wien, FH St. Pölten, FH Campus Wien und Universität Wien organisierten und leiteten den Workshop. Dabei wurden unterschiedlichste Interessensgruppen und deren Vertreter*innen breit gestreut eingeladen, um sicherzustellen, dass das Thema holistisch behandelt werden kann. Dementsprechend setzten wir uns zum Ziel, Machine-Translation-Entwickler*innen, Übersetzer*innen, und nicht-binäre Privatpersonen (als "Lebenswelt-Expert*innen") gleichberechtigt einzubinden. Unsere Analyse zeigt, dass Geschlecht in maschineller Übersetzung eine maßgeblich kontextsensible Herangehensweise erfordert, die Entwicklung von Sprachtechnologien sich vorsichtig in einem sich noch in Aushandlung befindlichen gesellschaftlichen Prozess positionieren muss, und flexible Ansätze derzeit am adäquatesten erscheinen. Wir zeigen auf, welche nächsten Schritte im Bereich genderfairer Technologien notwendig sind, damit technische mit sozialen Entwicklungen mithalten können.
Submitted 6 September, 2022;
originally announced September 2022.
-
Wer ist schuld, wenn Algorithmen irren? Entscheidungsautomatisierung, Organisationen und Verantwortung
Authors:
Angelika Adensamer,
Rita Gsenger,
Lukas Daniel Klausner
Abstract:
Algorithmic decision support (ADS) is increasingly used in a whole array of different contexts and structures in various areas of society, influencing many people's lives. Its use raises questions, among others, about accountability, transparency and responsibility. Our article aims to give a brief overview of the central issues connected to ADS, responsibility and decision-making in organisational contexts and identify open questions and research gaps. Furthermore, we describe a set of guidelines and a complementary digital tool to assist practitioners in mapping responsibility when introducing ADS within their organisational context.
--
Algorithmenunterstützte Entscheidungsfindung (algorithmic decision support, ADS) kommt in verschiedenen Kontexten und Strukturen vermehrt zum Einsatz und beeinflusst in diversen gesellschaftlichen Bereichen das Leben vieler Menschen. Ihr Einsatz wirft einige Fragen auf, unter anderem zu den Themen Rechenschaft, Transparenz und Verantwortung. Im Folgenden möchten wir einen Überblick über die wichtigsten Fragestellungen rund um ADS, Verantwortung und Entscheidungsfindung in organisationalen Kontexten geben und einige offene Fragen und Forschungslücken aufzeigen. Weiters beschreiben wir als konkrete Hilfestellung für die Praxis einen von uns entwickelten Leitfaden samt ergänzendem digitalem Tool, welches Anwender:innen insbesondere bei der Verortung und Zuordnung von Verantwortung bei der Nutzung von ADS in organisationalen Kontexten helfen soll.
Submitted 21 July, 2022;
originally announced July 2022.
-
"Computer Says No": Algorithmic Decision Support and Organisational Responsibility
Authors:
Angelika Adensamer,
Rita Gsenger,
Lukas Daniel Klausner
Abstract:
Algorithmic decision support is increasingly used in a whole array of different contexts and structures in various areas of society, influencing many people's lives. Its use raises questions, among others, about accountability, transparency and responsibility. While there is substantial research on the issue of algorithmic systems and responsibility in general, there is little to no prior research on organisational responsibility and its attribution. Our article aims to fill that gap; we give a brief overview of the central issues connected to ADS, responsibility and decision-making in organisational contexts and identify open questions and research gaps. Furthermore, we describe a set of guidelines and a complementary digital tool to assist practitioners in mapping responsibility when introducing ADS within their organisational context.
Submitted 23 June, 2022; v1 submitted 21 October, 2021;
originally announced October 2021.
-
Towards Resilient Artificial Intelligence: Survey and Research Issues
Authors:
Oliver Eigner,
Sebastian Eresheim,
Peter Kieseberg,
Lukas Daniel Klausner,
Martin Pirker,
Torsten Priebe,
Simon Tjoa,
Fiammetta Marulli,
Francesco Mercaldo
Abstract:
Artificial intelligence (AI) systems are becoming critical components of today's IT landscapes. Their resilience against attacks and other environmental influences needs to be ensured just like for other IT assets. Considering the particular nature of AI, and machine learning (ML) in particular, this paper provides an overview of the emerging field of resilient AI and presents research issues the authors identify as potential future work.
Submitted 18 September, 2021;
originally announced September 2021.
-
"Part Man, Part Machine, All Cop": Automation in Policing
Authors:
Angelika Adensamer,
Lukas Daniel Klausner
Abstract:
Digitisation, automation and datafication permeate policing and justice more and more each year -- from predictive policing methods through recidivism prediction to automated biometric identification at the border. The sociotechnical issues surrounding the use of such systems raise questions and reveal problems, both old and new. Our article reviews contemporary issues surrounding automation in policing and the legal system, finds common issues and themes in various different examples, introduces the distinction between human "retail bias" and algorithmic "wholesale bias", and argues for shifting the viewpoint on the debate to focus on both workers' rights and organisational responsibility as well as fundamental rights and the right to an effective remedy.
Submitted 24 June, 2021;
originally announced June 2021.
-
Trust Me If You Can: Trusted Transformation Between (JSON) Schemas to Support Global Authentication of Education Credentials
Authors:
Stefan More,
Peter Grassberger,
Felix Hörandner,
Andreas Abraham,
Lukas Daniel Klausner
Abstract:
Recruiters and institutions around the world struggle with the verification of diplomas issued in a diverse and global education setting. Firstly, it is a nontrivial problem to identify bogus institutions selling education credentials. While institutions are often accredited by qualified authorities on a regional level, there is no global authority fulfilling this task. Secondly, many different data schemas are used to encode education credentials, which represents a considerable challenge to automated processing. Consequently, significant manual effort is required to verify credentials.
In this paper, we tackle these challenges by introducing a decentralized and open system to automatically verify the legitimacy of issuers and interpret credentials in unknown schemas. We do so by enabling participants to publish transformation information, which enables verifiers to transform credentials into their preferred schema. Due to the lack of a global root of trust, we utilize a distributed ledger to build a decentralized web of trust, which verifiers can query to gather information on the trustworthiness of issuing institutions and to establish trust in transformation information. Going beyond diploma fraud, our system can be generalized to tackle the analogous problem in other domains lacking a root of trust and agreements on data schemas.
Submitted 24 June, 2021;
originally announced June 2021.
-
$k$-Anonymity in Practice: How Generalisation and Suppression Affect Machine Learning Classifiers
Authors:
Djordje Slijepčević,
Maximilian Henzl,
Lukas Daniel Klausner,
Tobias Dam,
Peter Kieseberg,
Matthias Zeppelzauer
Abstract:
The protection of private information is a crucial issue in data-driven research and business contexts. Typically, techniques like anonymisation or (selective) deletion are introduced in order to allow data sharing, e.g. in the case of collaborative research endeavours. For use with anonymisation techniques, the $k$-anonymity criterion is one of the most popular, with numerous scientific publications on different algorithms and metrics. Anonymisation techniques often require changing the data and thus necessarily affect the results of machine learning models trained on the underlying data. In this work, we conduct a systematic comparison and detailed investigation into the effects of different $k$-anonymisation algorithms on the results of machine learning models. We investigate a set of popular $k$-anonymisation algorithms with different classifiers and evaluate them on different real-world datasets. Our systematic evaluation shows that with an increasingly strong $k$-anonymity constraint, the classification performance generally degrades, but to varying degrees and strongly depending on the dataset and anonymisation method. Furthermore, Mondrian can be considered as the method with the most appealing properties for subsequent classification.
Submitted 22 June, 2022; v1 submitted 9 February, 2021;
originally announced February 2021.
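As an illustration of the $k$-anonymity criterion named in the abstract above, the following is a minimal sketch and not the paper's method: it is not Mondrian or any of the evaluated algorithms. The toy records, the ten-year age bins and the function names are all assumptions made for this example. It shows generalisation (coarsening a quasi-identifier) followed by suppression (dropping records whose group is still too small).

```python
from collections import Counter


def generalise_age(age, width=10):
    """Generalisation: replace an exact age by an inclusive range label."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)
    return all(c >= k for c in counts.values())


def suppress_small_groups(rows, quasi_identifiers, k):
    """Suppression: drop rows whose equivalence class has fewer than k members."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)
    return [r for r in rows
            if counts[tuple(r[q] for q in quasi_identifiers)] >= k]


# Hypothetical toy records; "age" and "zip" act as quasi-identifiers.
records = [
    {"age": 23, "zip": "3100", "label": 1},
    {"age": 27, "zip": "3100", "label": 0},
    {"age": 31, "zip": "3100", "label": 1},
    {"age": 38, "zip": "3100", "label": 0},
    {"age": 45, "zip": "3500", "label": 1},
]

generalised = [{**r, "age": generalise_age(r["age"])} for r in records]
anonymised = suppress_small_groups(generalised, ["age", "zip"], k=2)
```

In this toy case generalisation alone is not enough for k = 2, because one record remains alone in its group; suppression removes it, at the cost of one training example. That trade-off between privacy constraint and remaining data is the kind of effect the abstract describes for classifiers.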
-
Anomaly Detection Support Using Process Classification
Authors:
Sebastian Eresheim,
Lukas Daniel Klausner,
Patrick Kochberger
Abstract:
Anomaly detection systems need to consider a lot of information when scanning for anomalies. One example is the context of the process in which an anomaly might occur, because anomalies for one process might not be anomalies for a different one. Therefore data -- such as system events -- need to be assigned to the program they originate from. This paper investigates whether it is possible to infer from a list of system events the program whose behavior caused the occurrence of these system events. To that end, we model transition probabilities between non-equivalent events and apply the $k$-nearest neighbors algorithm. This system is evaluated on non-malicious, real-world data using four different evaluation scores. Our results suggest that the approach proposed in this paper is capable of correctly inferring program names from system events.
Submitted 13 January, 2021;
originally announced January 2021.
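The general idea described in the abstract above (transition probabilities between events as features, classified with $k$-nearest neighbors) might be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: "non-equivalent events" is interpreted here simply as skipping transitions from an event to itself, and the event alphabet, program labels and traces are hypothetical.

```python
from collections import Counter
import math

def transition_vector(events, alphabet):
    """Flattened row-normalised transition matrix over `alphabet`."""
    # Assumption: only transitions between differing consecutive events are counted.
    counts = Counter((a, b) for a, b in zip(events, events[1:]) if a != b)
    vec = []
    for a in alphabet:
        row_total = sum(counts[(a, b)] for b in alphabet)
        for b in alphabet:
            vec.append(counts[(a, b)] / row_total if row_total else 0.0)
    return vec

def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical event alphabet and labelled traces (illustrative data only).
alphabet = ["open", "read", "write", "close"]
traces = [
    (["open", "read", "write", "write", "close"], "editor"),
    (["open", "write", "close", "open", "write", "close"], "editor"),
    (["open", "read", "read", "close"], "reader"),
    (["open", "read", "close", "open", "read", "close"], "reader"),
]
train = [(transition_vector(ev, alphabet), label) for ev, label in traces]

query = transition_vector(["open", "read", "close"], alphabet)
print(knn_predict(train, query, k=3))
```

Euclidean distance and a plain majority vote are used only for simplicity; the abstract does not state which distance measure or tie-breaking rule the paper uses.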
-
Typosquatting for Fun and Profit: Cross-Country Analysis of Pop-Up Scam
Authors:
Tobias Dam,
Lukas Daniel Klausner,
Sebastian Schrittwieser
Abstract:
Today, many different types of scams can be found on the internet. Online criminals are always finding new creative ways to trick internet users, be it in the form of lottery scams, downloading scam apps for smartphones or fake gambling websites. This paper presents a large-scale study on one particular delivery method of online scam: pop-up scam on typosquatting domains. Typosquatting describes the concept of registering domains which are very similar to existing ones while deliberately containing common typing errors; these domains are then used to trick online users who believe they are browsing the intended website. Pop-up scam uses JavaScript alert boxes to present a message; because alert boxes are a blocking user interface element, they attract the user's attention very effectively.
Our study of typosquatting domains derived from the Majestic Million list, conducted from an Austrian IP address, found a total of 2577 pop-up messages on 1219 distinct typosquatting URLs, of which 1538 were malicious. Approximately a third of those distinct URLs (403) were targeted and displayed pop-up messages to one specific HTTP user agent only. Based on our scans, we present an in-depth analysis as well as a detailed classification of different targeting parameters (user agent and language) which triggered varying kinds of pop-up scams. Furthermore, we expound the differences between current pop-up scam characteristics and those found in a previous scan performed in late 2018, and we examine the use of IDN homograph attacks as well as the application of message localisation using additional scans with IP addresses from the United States and Japan.
Submitted 2 April, 2020;
originally announced April 2020.
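To make the notion of typosquatting from the abstract above concrete, the following minimal sketch generates candidate domains containing common typing errors (a dropped character, a doubled character, two adjacent characters swapped). It is an illustration only and not the candidate-generation method used in the study; the function name `typo_variants` is hypothetical, and the sketch treats everything before the first dot as the name to mutate.

```python
def typo_variants(domain):
    """Return sorted typo candidates for the part of `domain` before the first dot."""
    name, dot, tld = domain.partition(".")
    variants = set()
    for i in range(len(name)):
        # Omission: drop the character at position i.
        variants.add(name[:i] + name[i + 1:])
        # Duplication: type the character at position i twice.
        variants.add(name[:i] + name[i] * 2 + name[i + 1:])
        # Transposition: swap the characters at positions i and i + 1.
        if i + 1 < len(name):
            variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # Remove the original name and the empty string, which are not typos.
    variants.discard(name)
    variants.discard("")
    return sorted(v + dot + tld for v in variants)

candidates = typo_variants("example.com")
print(len(candidates))
print(candidates[:5])
```

Other common error classes, such as substituting a neighbouring key on the keyboard, are deliberately left out to keep the sketch short.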
-
Ich weiß, was du nächsten Sommer getan haben wirst: Predictive Policing in Österreich
Authors:
Angelika Adensamer,
Lukas Daniel Klausner
Abstract:
Predictive policing is a data-based, predictive analytical technique used in law enforcement. In this paper, we give an overview of the current situation in Austria and discuss technical, sociopolitical and legal questions raised by the use of PP, such as the lack of awareness of discriminatory structures in society, the biases in data underlying PP and the lack of reflection on the basic premises and feedback mechanisms of PP. Violations of fundamental rights without cause are not allowed by the Austrian Code of Criminal Procedure (Strafprozeßordnung, StPO), the Security Police Act (Sicherheitspolizeigesetz, SPG) or the Act concerning Police Protection of the State (Polizeiliches Staatsschutzgesetz, PStSG); the principle of allowing police intervention only on the basis of concrete threats or suspicion must remain absolute. Considering the numerous problems (not least from the point of view of legal policy), we conclude that the use of PP should be eschewed and that resources and planning should instead be focussed on solving the social problems which actually cause crime.
-----
Predictive policing is a data-based, forecast-driven model for police work. In this article, we give an overview of the current state of affairs in Austria and discuss the technical, sociopolitical and legal problems that arise from it, such as insufficient awareness of processes of societal discrimination, the biased data on which PP is based, and a lack of reflection on underlying assumptions and feedback effects. Interferences with fundamental rights without cause are covered neither by the StPO nor by the SPG or the PStSG; the basic principle that the police may act only when there is a concrete threat or suspicion of an offence must continue to be respected. In our view, given the numerous problems (and also for reasons of legal policy), PP should be dispensed with, and resources and thought should instead be invested in solving those societal problems that lead to crime.
Submitted 23 October, 2019; v1 submitted 1 July, 2019;
originally announced July 2019.
-
Large-Scale Analysis of Pop-Up Scam on Typosquatting URLs
Authors:
Tobias Dam,
Lukas Daniel Klausner,
Damjan Buhov,
Sebastian Schrittwieser
Abstract:
Today, many different types of scams can be found on the internet. Online criminals are always finding new creative ways to trick internet users, be it in the form of lottery scams, downloading scam apps for smartphones or fake gambling websites. This paper presents a large-scale study on one particular delivery method of online scam: pop-up scam on typosquatting domains. Typosquatting describes the concept of registering domains which are very similar to existing ones while deliberately containing common typing errors; these domains are then used to trick online users who believe they are browsing the intended website. Pop-up scam uses JavaScript alert boxes to present a message; because alert boxes are a blocking user interface element, they attract the user's attention very effectively.
Our study of typosquatting domains derived from the Alexa Top 1 Million list found a total of 9857 pop-up messages on 8255 distinct typosquatting URLs, of which 8828 were malicious. The vast majority of those distinct URLs (7176) were targeted and displayed pop-up messages to one specific HTTP user agent only. Based on our scans, we present an in-depth analysis as well as a detailed classification of different targeting parameters (user agent and language) which triggered varying kinds of pop-up scams.
Submitted 25 June, 2019;
originally announced June 2019.