Using Self-supervised Learning Can Improve Model Fairness

Authors: Sofia Yfantidou, Dimitris Spathis, Marios Constantinides, Athena Vakali, Daniele Quercia, Fahim Kawsar

Abstract: Self-supervised learning (SSL) has become the de facto training paradigm of large models, where pre-training is followed by supervised fine-tuning using domain-specific data and labels. Despite demonstrating comparable performance with supervised methods, comprehensive efforts to assess SSL's impact on machine learning fairness (i.e., performing equally on different demographic breakdowns) are lacking. Hypothesizing that SSL models would learn more generic, hence less biased representations, this study explores the impact of pre-training and fine-tuning strategies on fairness. We introduce a fairness assessment framework for SSL, comprising five stages: defining dataset requirements, pre-training, fine-tuning with gradual unfreezing, assessing representation similarity conditioned on demographics, and establishing domain-specific evaluation processes. We evaluate our method's generalizability on three real-world human-centric datasets (i.e., MIMIC, MESA, and GLOBEM) by systematically comparing hundreds of SSL and fine-tuned models on various dimensions spanning from the intermediate representations to appropriate evaluation metrics. Our findings demonstrate that SSL can significantly improve model fairness, while maintaining performance on par with supervised methods, exhibiting up to a 30% increase in fairness with minimal loss in performance through self-supervision. We posit that such differences can be attributed to representation dissimilarities found between the best- and the worst-performing demographics across models, up to 13x greater for protected attributes with larger performance discrepancies between segments.

Submitted 4 June, 2024; originally announced June 2024.

Comments: arXiv admin note: text overlap with arXiv:2401.01640

arXiv:2401.01640 [pdf, other]

Evaluating Fairness in Self-supervised and Supervised Models for Sequential Data

Authors: Sofia Yfantidou, Dimitris Spathis, Marios Constantinides, Athena Vakali, Daniele Quercia, Fahim Kawsar

Abstract: Self-supervised learning (SSL) has become the de facto training paradigm of large models where pre-training is followed by supervised fine-tuning using domain-specific data and labels. Hypothesizing that SSL models would learn more generic, hence less biased, representations, this study explores the impact of pre-training and fine-tuning strategies on fairness (i.e., performing equally on different demographic breakdowns). Motivated by human-centric applications on real-world timeseries data, we interpret inductive biases on the model, layer, and metric levels by systematically comparing SSL models to their supervised counterparts. Our findings demonstrate that SSL has the capacity to achieve performance on par with supervised methods while significantly enhancing fairness--exhibiting up to a 27% increase in fairness with a mere 1% loss in performance through self-supervision. Ultimately, this work underscores SSL's potential in human-centric computing, particularly high-stakes, data-scarce application domains like healthcare.

Submitted 3 January, 2024; originally announced January 2024.

Comments: Paper accepted in Human-Centric Representation Learning workshop at AAAI 2024 (https://hcrl-workshop.github.io/2024/)

arXiv:2311.04705 [pdf, other]

Negotiation Strategies in Ubiquitous Human-Computer Interaction: A Novel Storyboards Scale & Field Study

Authors: Sofia Yfantidou, Georgia Yfantidou, Panagiota Balaska, Athena Vakali

Abstract: In today's connected society, self-tracking technologies (STTs), such as wearables and mobile fitness apps, empower humans to improve their health and well-being through ubiquitous physical activity monitoring, with several personal and societal benefits. Despite the advances in such technologies' hardware, low user engagement and decreased effectiveness limitations demand more informed and theoretically-founded Human-Computer Interaction designs. To address these challenges, we build upon the previously unexplored Leisure Constraints Negotiation Model and the Transtheoretical Model to systematically define and assess the effectiveness of STTs' features that acknowledge users' contextual constraints and establish human-negotiated STTs narratives. Specifically, we introduce and validate a human-centric scale, StoryWear, which exploits and explores eleven dimensions of negotiation strategies that humans utilize to overcome constraints regarding exercise participation, captured through an inclusive storyboards format. Based on our preliminary studies, StoryWear shows high reliability, rendering it suitable for future work in ubiquitous computing. Our results indicate that negotiation strategies vary in perceived effectiveness and have higher appeal for existing STTs' users, with self-motivation, commitment, and understanding of the negative impact of non-exercise placed at the top. Finally, we give actionable guidelines for real-world implementation and a commentary on the future of personalized training.

Submitted 8 November, 2023; originally announced November 2023.

arXiv:2307.12075 [pdf, other]

doi 10.1145/3565066.3608685

The State of Algorithmic Fairness in Mobile Human-Computer Interaction

Authors: Sofia Yfantidou, Marios Constantinides, Dimitris Spathis, Athena Vakali, Daniele Quercia, Fahim Kawsar

Abstract: This paper explores the intersection of Artificial Intelligence and Machine Learning (AI/ML) fairness and mobile human-computer interaction (MobileHCI). Through a comprehensive analysis of MobileHCI proceedings published between 2017 and 2022, we first aim to understand the current state of algorithmic fairness in the community. By manually analyzing 90 papers, we found that only a small portion (5%) thereof adheres to modern fairness reporting, such as analyses conditioned on demographic breakdowns. At the same time, the overwhelming majority draws its findings from highly-educated, employed, and Western populations. We situate these findings within recent efforts to capture the current state of algorithmic fairness in mobile and wearable computing, and envision that our results will serve as an open invitation to the design and development of fairer ubiquitous technologies.

Submitted 22 July, 2023; originally announced July 2023.

Comments: arXiv admin note: text overlap with arXiv:2303.15585

Journal ref: 25th International Conference on Mobile Human-Computer Interaction (MobileHCI '23 Companion), September 26--29, 2023, Athens, Greece

arXiv:2307.09958 [pdf, other]

Bias in Internet Measurement Platforms

Authors: Pavlos Sermpezis, Lars Prehn, Sofia Kostoglou, Marcel Flores, Athena Vakali, Emile Aben

Abstract: Network operators and researchers frequently use Internet measurement platforms (IMPs), such as RIPE Atlas, RIPE RIS, or RouteViews for, e.g., monitoring network performance, detecting routing events, topology discovery, or route optimization. To interpret the results of their measurements and avoid pitfalls or wrong generalizations, users must understand a platform's limitations. To this end, this paper studies an important limitation of IMPs, the bias, which exists due to the non-uniform deployment of the vantage points. Specifically, we introduce a generic framework to systematically and comprehensively quantify the multi-dimensional (e.g., across location, topology, network types, etc.) biases of IMPs. Using the framework and open datasets, we perform a detailed analysis of biases in IMPs that confirms well-known (to the domain experts) biases and sheds light on less-known or unexplored biases. To facilitate IMP users to obtain awareness of and explore bias in their measurements, as well as further research and analyses (e.g., methods for mitigating bias), we publicly share our code and data, and provide online tools (API, Web app, etc.) that calculate and visualize the bias in measurement setups.

Submitted 24 July, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

arXiv:2307.09819 [pdf]

Analyzing large scale political discussions on Twitter: the use case of the Greek wiretapping scandal (#ypoklopes)

Authors: Ilias Dimitriadis, Dimitrios P. Giakatos, Stelios Karamanidis, Pavlos Sermpezis, Kelly Kiki, Athena Vakali

Abstract: In this paper, we study the Greek wiretappings scandal, which was revealed in 2022 and attracted a lot of attention from the press and citizens. Specifically, we propose a methodology for collecting data and analyzing patterns of online public discussions on Twitter. We apply our methodology to the Greek wiretappings use case, and present findings related to the evolution of the discussion over time, its polarization, and the role of the media. The methodology can be of wider use and replicated for other topics. Finally, we publicly provide an open dataset and online resources with the results.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2305.05303 [pdf, other]

ENCOVIZ: An open-source, secure and multi-role energy consumption visualisation platform

Authors: Efstratios Voulgaris, Ilias Dimitriadis, Dimitrios P. Giakatos, Athena Vakali, Athanasios Papakonstantinou, Dimitris Chatzigiannis

Abstract: The need for a more energy-efficient future is now more evident than ever and has led to the continuous growth of sectors with greater potential for energy savings, such as smart buildings, energy consumption meters, etc. The large volume of energy-related data produced is a huge advantage but, at the same time, it creates a new problem: the need to structure, organize and efficiently present this meaningful information. In this context, we present the ENCOVIZ platform, a multi-role, extensible, secure, energy consumption visualization platform with built-in analytics. ENCOVIZ has been built in accordance with the best visualisation practices, on top of open source technologies and includes (i) multi-role functionalities, (ii) the automated ingestion of energy consumption data and (iii) proper visualisations and information to support effective decision making both for energy providers and consumers.

Submitted 9 May, 2023; originally announced May 2023.

Comments: 5 pages, 4 figures

arXiv:2303.15592 [pdf, other]

doi 10.1145/3610914

Uncovering Bias in Personal Informatics

Authors: Sofia Yfantidou, Pavlos Sermpezis, Athena Vakali, Ricardo Baeza-Yates

Abstract: Personal informatics (PI) systems, powered by smartphones and wearables, enable people to lead healthier lifestyles by providing meaningful and actionable insights that break down barriers between users and their health information. Today, such systems are used by billions of users for monitoring not only physical activity and sleep but also vital signs and women's and heart health, among others. Despite their widespread usage, the processing of sensitive PI data may suffer from biases, which may entail practical and ethical implications. In this work, we present the first comprehensive empirical and analytical study of bias in PI systems, including biases in raw data and in the entire machine learning life cycle. We use the most detailed framework to date for exploring the different sources of bias and find that biases exist both in the data generation and the model learning and implementation streams. According to our results, the most affected minority groups are users with health issues, such as diabetes, joint issues, and hypertension, and female users, whose data biases are propagated or even amplified by learning models, while intersectional biases can also be observed.

Submitted 19 July, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

Report number: Volume: 7 Number: 3, Article: 139

Journal ref: IMWUT 2023

arXiv:2303.15585 [pdf, other]

Beyond Accuracy: A Critical Review of Fairness in Machine Learning for Mobile and Wearable Computing

Authors: Sofia Yfantidou, Marios Constantinides, Dimitris Spathis, Athena Vakali, Daniele Quercia, Fahim Kawsar

Abstract: The field of mobile and wearable computing is undergoing a revolutionary integration of machine learning. Devices can now diagnose diseases, predict heart irregularities, and unlock the full potential of human cognition. However, the underlying algorithms powering these predictions are not immune to biases with respect to sensitive attributes (e.g., gender, race), leading to discriminatory outcomes. The goal of this work is to explore the extent to which the mobile and wearable computing community has adopted ways of reporting information about datasets and models to surface and, eventually, counter biases. Our systematic review of papers published in the Proceedings of the ACM Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT) journal from 2018-2022 indicates that, while there has been progress made on algorithmic fairness, there is still ample room for growth. Our findings show that only a small portion (5%) of published papers adheres to modern fairness reporting, while the overwhelming majority thereof focuses on accuracy or error metrics. To generalize these results across venues of similar scope, we analyzed recent proceedings of ACM MobiCom, MobiSys, and SenSys, IEEE Pervasive, and IEEE Transactions on Mobile Computing, and found no deviation from our primary result. In light of these findings, our work provides practical guidelines for the design and development of mobile and wearable technologies that not only strive for accuracy but also fairness.

Submitted 22 September, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

arXiv:2303.06478 [pdf, other]

PyPoll: A python library automating mining of networks, discussions and polarization on Twitter

Authors: Dimitrios Panteleimon Giakatos, Pavlos Sermpezis, Athena Vakali

Abstract: Today, online social networks have a high impact on our society as more and more people use them for communicating with each other, expressing their opinions, participating in public discussions, etc. In particular, Twitter is one of the most popular social network platforms people mainly use for political discussions. This attracted the interest of many research studies that analyzed social phenomena on Twitter, by collecting data, analysing communication patterns, and exploring the structure of user networks. While previous works share many common methodologies for data collection and analysis, these are mainly re-implemented every time by researchers in a custom way. In this paper, we introduce PyPoll, an open-source Python library that operationalizes common analysis tasks for Twitter discussions. With PyPoll, users can perform Twitter graph mining, calculate the polarization index and generate interactive visualizations without needing third-party tools. We believe that PyPoll can help researchers automate their tasks by giving them methods that are easy to use. Also, we demonstrate the use of the library by presenting two use cases: the PyPoll visualization app, an online application for graph visualizing and sharing, and the Political Lighthouse, a Web portal for displaying the polarization in various political topics on Twitter.

Submitted 11 March, 2023; originally announced March 2023.

arXiv:2212.14731 [pdf, other]

doi 10.1109/HealthCom54947.2022.9982730

UBIWEAR: An end-to-end, data-driven framework for intelligent physical activity prediction to empower mHealth interventions

Authors: Asterios Bampakis, Sofia Yfantidou, Athena Vakali

Abstract: It is indisputable that physical activity is vital for an individual's health and wellness. However, a global prevalence of physical inactivity has induced significant personal and socioeconomic implications. In recent years, a significant amount of work has showcased the capabilities of self-tracking technology to create positive health behavior change. This work is motivated by the potential of personalized and adaptive goal-setting techniques in encouraging physical activity via self-tracking. To this end, we propose UBIWEAR, an end-to-end framework for intelligent physical activity prediction, with the ultimate goal to empower data-driven goal-setting interventions. To achieve this, we experiment with numerous machine learning and deep learning paradigms as a robust benchmark for physical activity prediction tasks. To train our models, we utilize "MyHeart Counts", an open, large-scale dataset collected in-the-wild from thousands of users. We also propose a prescriptive framework for self-tracking aggregated data preprocessing, to facilitate data wrangling of real-world, noisy data. Our best model achieves a MAE of 1087 steps, 65% lower than the state of the art in terms of absolute error, proving the feasibility of the physical activity prediction task, and paving the way for future research.

Submitted 3 January, 2023; v1 submitted 30 December, 2022; originally announced December 2022.

Comments: 2022 IEEE International Conference on E-health Networking, Application & Services (HealthCom), Pages 56-62

arXiv:2211.02415 [pdf, other]

doi 10.1016/j.simpat.2022.102620

Multilingual Name Entity Recognition and Intent Classification Employing Deep Learning Architectures

Authors: Sofia Rizou, Antonia Paflioti, Angelos Theofilatos, Athena Vakali, George Sarigiannidis, Konstantinos Ch. Chatzisavvas

Abstract: Named Entity Recognition and Intent Classification are among the most important subfields of the field of Natural Language Processing. Recent research has led to the development of faster, more sophisticated and efficient models to tackle the problems posed by those two tasks. In this work we explore the effectiveness of two separate families of Deep Learning networks for those tasks: Bidirectional Long Short-Term Memory networks and Transformer-based networks. The models were trained and tested on the ATIS benchmark dataset for both English and Greek languages. The purpose of this paper is to present a comparative study of the two groups of networks for both languages and showcase the results of our experiments. The models, being the current state-of-the-art, yielded impressive results and achieved high performance.

Submitted 4 November, 2022; originally announced November 2022.

Comments: 24 pages, 5 figures, 11 tables, dataset available

ACM Class: I.2.7; I.2.1; I.2.11

Journal ref: Simulation Modelling Practice and Theory, Vol. 120, 102620 (2022)

arXiv:2210.14189 [pdf, other]

Benchmarking Graph Neural Networks for Internet Routing Data

Authors: Dimitrios Panteleimon Giakatos, Sofia Kostoglou, Pavlos Sermpezis, Athena Vakali

Abstract: The Internet is composed of networks, called Autonomous Systems (or ASes), interconnected to each other, thus forming a large graph. While the AS-graph is known and a multitude of data is available for the ASes (i.e., node attributes), the research on applying graph machine learning (ML) methods on Internet data has not attracted a lot of attention. In this work, we provide a benchmarking framework aiming to facilitate research on Internet data using graph-ML and graph neural network (GNN) methods. Specifically, we compile a dataset with heterogeneous node/AS attributes by collecting data from multiple online sources, and preprocessing them so that they can be easily used as input in GNN architectures. Then, we create a framework/pipeline for applying GNNs on the compiled data. For a set of tasks, we perform a benchmarking of different GNN models (as well as non-GNN ML models) to test their efficiency; our results can serve as a common baseline for future research and provide initial insights for the application of GNNs on Internet data.

Submitted 25 October, 2022; originally announced October 2022.

arXiv:2209.10361 [pdf, other]

MulBot: Unsupervised Bot Detection Based on Multivariate Time Series

Authors: Lorenzo Mannocci, Stefano Cresci, Anna Monreale, Athina Vakali, Maurizio Tesconi

Abstract: Online social networks are actively involved in the removal of malicious social bots due to their role in the spread of low quality information. However, most of the existing bot detectors are supervised classifiers incapable of capturing the evolving behavior of sophisticated bots. Here we propose MulBot, an unsupervised bot detector based on multivariate time series (MTS). For the first time, we exploit multidimensional temporal features extracted from user timelines. We manage the multidimensionality with an LSTM autoencoder, which projects the MTS in a suitable latent space. Then, we perform a clustering step on this encoded representation to identify dense groups of very similar users -- a known sign of automation. Finally, we perform a binary classification task achieving f1-score $= 0.99$, outperforming state-of-the-art methods (f1-score $\le 0.97$). Not only does MulBot achieve excellent results in the binary classification task, but we also demonstrate its strengths in a novel and practically-relevant task: detecting and separating different botnets. In this multi-class classification task we achieve f1-score $= 0.96$. We conclude by estimating the importance of the different features used in our model and by evaluating MulBot's capability to generalize to new unseen bots, thus proposing a solution to the generalization deficiencies of supervised bot detectors.

Submitted 21 September, 2022; originally announced September 2022.

arXiv:2206.01421 [pdf, other]

doi 10.1145/3511047.3538029

12 Years of Self-tracking for Promoting Physical Activity from a User Diversity Perspective: Taking Stock and Thinking Ahead

Authors: Sofia Yfantidou, Pavlos Sermpezis, Athena Vakali

Abstract: Despite the indisputable personal and societal benefits of regular physical activity, a large portion of the population does not follow the recommended guidelines, harming their health and wellness. The World Health Organization has called upon governments, practitioners, and researchers to accelerate action to address the global prevalence of physical inactivity. To this end, an emerging wave of research in ubiquitous computing has been exploring the potential of interactive self-tracking technology in encouraging positive health behavior change. Numerous findings indicate the benefits of personalization and inclusive design regarding increasing the motivational appeal and overall effectiveness of behavior change systems, with the ultimate goal of empowering and facilitating people to achieve their goals. However, most interventions still adopt a "one-size-fits-all" approach to their design, assuming equal effectiveness for all system features in spite of individual and collective user differences. To this end, we analyze a corpus of 12 years of research in self-tracking technology for health behavior change, focusing on physical activity, to identify those design elements that have proven most effective in inciting desirable behavior across diverse population segments. We then provide actionable recommendations for designing and evaluating behavior change self-tracking technology based on age, gender, occupation, fitness, and health condition. Finally, we engage in a critical commentary on the diversity of the domain and discuss ethical concerns surrounding tailored interventions and directions for moving forward.

Submitted 3 June, 2022; originally announced June 2022.

arXiv:2205.15707 [pdf, other]

CALEB: A Conditional Adversarial Learning Framework to Enhance Bot Detection

Authors: George Dialektakis, Ilias Dimitriadis, Athena Vakali

Abstract: The high growth of Online Social Networks (OSNs) over the last few years has allowed automated accounts, known as social bots, to gain ground. As highlighted by other researchers, most of these bots have malicious purposes and tend to mimic human behavior, posing high-level security threats on OSN platforms. Moreover, recent studies have shown that social bots evolve over time by reforming and reinventing unforeseen and sophisticated characteristics, making them capable of evading the current machine learning state-of-the-art bot detection systems. This work is motivated by the critical need to establish adaptive bot detection methods in order to proactively capture unseen evolved bots towards healthier OSN interactions. In contrast with most earlier supervised ML approaches, which are limited by the inability to effectively detect new types of bots, this paper proposes CALEB, a robust end-to-end proactive framework based on the Conditional Generative Adversarial Network (CGAN) and its extension, Auxiliary Classifier GAN (AC-GAN), to simulate bot evolution by creating realistic synthetic instances of different bot types. These simulated evolved bots augment existing bot datasets and therefore enhance the detection of emerging generations of bots before they even appear! Furthermore, we show that our augmentation approach surpasses other earlier augmentation techniques which fail at simulating evolving bots. Extensive experimentation on well-established public bot datasets shows that our approach offers a performance boost of up to 10% regarding the detection of new unseen bots. Finally, the use of the AC-GAN Discriminator as a bot detector has outperformed former ML approaches, showcasing the efficiency of our end-to-end framework.

Submitted 31 May, 2022; originally announced May 2022.

arXiv:2109.02358 [pdf, other]

Pointspectrum: Equivariance Meets Laplacian Filtering for Graph Representation Learning

Authors: Marinos Poiitis, Pavlos Sermpezis, Athena Vakali

Abstract: Graph Representation Learning (GRL) has become essential for modern graph data mining and learning tasks. GRL aims to capture the graph's structural information and exploit it in combination with node and edge attributes to compute low-dimensional representations. While Graph Neural Networks (GNNs) have been used in state-of-the-art GRL architectures, they have been shown to suffer from over-smoothing when many GNN layers need to be stacked. In a different GRL approach, spectral methods based on graph filtering have emerged to address over-smoothing; however, up to now, they employ traditional neural networks that cannot efficiently exploit the structure of graph data. Motivated by this, we propose PointSpectrum, a spectral method that incorporates a set equivariant network to account for a graph's structure. PointSpectrum enhances the efficiency and expressiveness of spectral methods, while it outperforms or competes with state-of-the-art GRL methods. Overall, PointSpectrum addresses over-smoothing by employing a graph filter and captures a graph's structure through set equivariance, lying at the intersection of GNNs and spectral methods. Our findings are promising for the benefits and applicability of this architectural shift for spectral methods and GRL.

Submitted 7 September, 2021; v1 submitted 6 September, 2021; originally announced September 2021.

Comments: 13 pages, 8 figures, 6 tables
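
The PointSpectrum abstract above mentions graph filtering based on the Laplacian. As a rough illustration only, the sketch below applies a generic low-pass Laplacian smoothing step to a scalar node signal. It is not the filter used in the paper: the function name `laplacian_smooth`, the dictionary-based graph format, and the parameters `alpha` and `steps` are all assumptions made for this example.

```python
def laplacian_smooth(adjacency, signal, alpha=0.5, steps=1):
    """Apply x <- x - alpha * L x, with L = D - A, `steps` times.

    adjacency: dict mapping node -> list of neighbours
               (undirected graph; hypothetical input format)
    signal:    dict mapping node -> float feature value
    alpha:     step size (an assumption; should be small relative to
               the maximum node degree for the step to smooth)
    """
    current = dict(signal)
    for _ in range(steps):
        updated = {}
        for node, value in current.items():
            # (L x)_i = sum over neighbours j of (x_i - x_j)
            laplacian_term = sum(
                value - current[n] for n in adjacency.get(node, [])
            )
            updated[node] = value - alpha * laplacian_term
        current = updated
    return current
```

Each step pulls a node's value towards those of its neighbours, which is the basic low-pass behaviour that spectral graph filters build on; a constant signal is left unchanged.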

arXiv:2105.02346 [pdf, ps, other]

Estimating the Impact of BGP Prefix Hijacking

Authors: Pavlos Sermpezis, Vasileios Kotronis, Konstantinos Arakadakis, Athena Vakali

Abstract: BGP prefix hijacking is a critical threat to the resilience and security of communications in the Internet. While several mechanisms have been proposed to prevent, detect or mitigate hijacking events, it has not been studied how to accurately quantify the impact of an ongoing hijack. When detecting a hijack, existing methods do not estimate how many networks in the Internet are affected (before and/or after its mitigation). In this paper, we study fundamental and practical aspects of the problem of estimating the impact of an ongoing hijack through network measurements. We derive analytical results for the involved trade-offs and limits, and investigate the performance of different measurement approaches (control/data-plane measurements) and use of public measurement infrastructure. Our findings provide useful insights for the design of accurate hijack impact estimation methodologies. Based on these insights, we design (i) a lightweight and practical estimation methodology that employs ** measurements, and (ii) an estimator that employs public infrastructure measurements and eliminates correlations between them to improve the accuracy. We validate the proposed methodologies and findings against results from hijacking experiments we conduct in the real Internet.

Submitted 5 May, 2021; originally announced May 2021.

Comments: IFIP Networking conference 2021

arXiv:2104.11483 [pdf, ps, other]

doi 10.1145/3592621

14 Years of Self-Tracking Technology for mHealth -- Literature Review: Lessons Learnt and the PAST SELF Framework

Authors: Sofia Yfantidou, Pavlos Sermpezis, Athena Vakali

Abstract: In today's connected society, many people rely on mHealth and self-tracking (ST) technology to help them adopt healthier habits with a focus on breaking their sedentary lifestyle and staying fit. However, there is scarce evidence of such technological interventions' effectiveness, and there are no standardized methods to evaluate their impact on people's physical activity (PA) and health. This work aims to help ST practitioners and researchers by empowering them with systematic guidelines and a framework for designing and evaluating technological interventions to facilitate health behavior change (HBC) and user engagement (UE), focusing on increasing PA and decreasing sedentariness. To this end, we conduct a literature review of 129 papers between 2008 and 2022, which identifies the core ST HCI design methods and their efficacy, as well as the most comprehensive list to date of UE evaluation metrics for ST. Based on the review's findings, we propose PAST SELF, a framework to guide the design and evaluation of ST technology that has potential applications in industrial and scientific settings. Finally, to facilitate researchers and practitioners, we complement this paper with an open corpus and an online, adaptive exploration tool for the PAST SELF data.

Submitted 29 April, 2022; v1 submitted 23 April, 2021; originally announced April 2021.

Comments: 40 pages, 10 figures

arXiv:2009.10802 [pdf]

My tweets bring all the traits to the yard: Predicting personality and relational traits in Online Social Networks

Authors: Dimitra Karanatsiou, Pavlos Sermpezis, Jon Gruda, Konstantinos Kafetsios, Ilias Dimitriadis, Athena Vakali

Abstract: Users in Online Social Networks (OSNs) leave traces that reflect their personality characteristics. The study of these traces is important for a number of fields, such as social science, psychology, OSNs, marketing, and others. Despite a marked increase in research on personality prediction based on online behavior, the focus has been heavily on individual personality traits, largely neglecting relational facets of personality. This study aims to address this gap by providing a prediction model for holistic personality profiling in OSNs that includes socio-relational traits (attachment orientations) in combination with standard personality traits. Specifically, we first designed a feature engineering methodology that extracts a wide range of features (accounting for behavior, language, and emotions) from OSN accounts of users. Then, we designed a machine learning model that predicts scores for the psychological traits of the users based on the extracted features. The proposed model architecture is inspired by characteristics embedded in psychological theory, i.e., utilizing interrelations among personality facets, and leads to increased accuracy in comparison with state-of-the-art approaches. To demonstrate the usefulness of this approach, we applied our model to two datasets, one of random OSN users and one of organizational leaders, and compared their psychological profiles. Our findings demonstrate that the two groups can be clearly separated using only their psychological profiles, which opens a promising direction for future research on OSN user characterization and classification.

Submitted 22 September, 2020; originally announced September 2020.

arXiv:2005.10646 [pdf, other]

On the Aggression Diffusion Modeling and Minimization in Online Social Networks

Authors: Marinos Poiitis, Athena Vakali, Nicolas Kourtellis

Abstract: Aggression in online social networks has been studied mostly from the perspective of machine learning, which detects such behavior in a static context. However, the way aggression diffuses in the network has received little attention, as it embeds modeling challenges. In fact, modeling how aggression propagates from one user to another is an important research topic, since it can enable effective aggression monitoring, especially in media platforms which up to now apply simplistic user blocking techniques. In this paper, we address aggression propagation modeling and minimization on Twitter, since it is a popular microblogging platform at which aggression had several onsets. We propose various methods building on two well-known diffusion models, Independent Cascade (IC) and Linear Threshold (LT), to study the aggression evolution in the social network. We experimentally investigate how well each method can model aggression propagation using real Twitter data, while varying parameters such as seed user selection, graph edge weighting, and users' activation timing. We find that the best performing strategies are those that select seed users with a degree-based approach, weigh user edges based on their social circles' overlaps, and activate users according to their aggression levels. We further employ the best performing models to predict which ordinary real users could become aggressive (and vice versa) in the future, and achieve up to AUC=0.89 in this prediction task. Finally, we investigate aggression minimization by launching competitive cascades to "inform" and "heal" aggressors. We show that IC and LT models can be used in aggression minimization, providing less intrusive alternatives to the blocking techniques currently employed by popular online social network platforms.

Submitted 30 August, 2021; v1 submitted 21 May, 2020; originally announced May 2020.

Comments: 23 pages, 10 figures, 3 tables, submitted to TWEB
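
The abstract above builds on the Independent Cascade (IC) diffusion model. For readers unfamiliar with it, here is a minimal sketch of one plain IC simulation run. It assumes a uniform activation probability on every edge and a dictionary-based graph, neither of which comes from the paper (which weighs edges by social-circle overlap and activates users by aggression level); the function name `independent_cascade` and its parameters are hypothetical.

```python
import random


def independent_cascade(graph, seeds, prob, rng=None):
    """Simulate one run of the basic Independent Cascade model.

    graph: dict mapping node -> list of out-neighbours (hypothetical format)
    seeds: iterable of initially active nodes
    prob:  uniform per-edge activation probability (an assumption made
           for this sketch, not the paper's edge weighting)
    rng:   optional random.Random instance for reproducible runs
    """
    if rng is None:
        rng = random.Random()
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbour in graph.get(node, []):
                # A newly active node gets exactly one attempt per out-edge.
                if neighbour not in active and rng.random() < prob:
                    active.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return active
```

Because a single run is random, spread is usually estimated by averaging the size of the returned set over many runs with different seeds for the random generator.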

arXiv:1907.08873 [pdf, other]

Detecting Cyberbullying and Cyberaggression in Social Media

Authors: Despoina Chatzakou, Ilias Leontiadis, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Athena Vakali, Nicolas Kourtellis

Abstract: Cyberbullying and cyberaggression are increasingly worrisome phenomena affecting people across all demographics. More than half of young social media users worldwide have been exposed to such prolonged and/or coordinated digital harassment. Victims can experience a wide range of emotions, with negative consequences such as embarrassment, depression, and isolation from other community members, which carry the risk of leading to even more critical consequences, such as suicide attempts. In this work, we take the first concrete steps to understand the characteristics of abusive behavior on Twitter, one of today's largest social media platforms. We analyze 1.2 million users and 2.1 million tweets, comparing users participating in discussions around seemingly normal topics like the NBA, to those more likely to be hate-related, such as the Gamergate controversy, or the gender pay inequality at the BBC station. We also explore specific manifestations of abusive behavior, i.e., cyberbullying and cyberaggression, in one of the hate-related communities (Gamergate). We present a robust methodology to distinguish bullies and aggressors from normal Twitter users by considering text, user, and network-based attributes. Using various state-of-the-art machine learning algorithms, we classify these accounts with over 90% accuracy and AUC. Finally, we discuss the current status of Twitter user accounts marked as abusive by our methodology, and study the performance of potential mechanisms that can be used by Twitter to suspend users in the future.

Submitted 20 July, 2019; originally announced July 2019.

Comments: To appear in ACM Transactions on the Web (TWEB)

arXiv:1802.00393 [pdf, other]

Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior

Authors: Antigoni-Maria Founta, Constantinos Djouvas, Despoina Chatzakou, Ilias Leontiadis, Jeremy Blackburn, Gianluca Stringhini, Athena Vakali, Michael Sirivianos, Nicolas Kourtellis

Abstract: In recent years, offensive, abusive and hateful language, sexism, racism and other types of aggressive and cyberbullying behavior have been manifesting with increased frequency, and in many online social media platforms. In fact, past scientific work focused on studying these forms in popular media, such as Facebook and Twitter. Building on such work, we present an 8-month study of the various forms of abusive behavior on Twitter, in a holistic fashion. Departing from past work, we examine a wide variety of labeling schemes, which cover different forms of abusive behavior, at the same time. We propose an incremental and iterative methodology that utilizes the power of crowdsourcing to annotate a large-scale collection of tweets with a set of abuse-related labels. In fact, by applying our methodology, including statistical analysis for label merging or elimination, we identify a reduced but robust set of labels. Finally, we offer a first overview and findings of our collected and annotated dataset of 100 thousand tweets, which we make publicly available for further scientific exploration.

Submitted 15 April, 2018; v1 submitted 1 February, 2018; originally announced February 2018.

Comments: crowdsourcing, abusive behavior, hate speech, Twitter, aggression, bullying

MSC Class: 68T06 ACM Class: K.4.2

arXiv:1802.00385 [pdf, other]

A Unified Deep Learning Architecture for Abuse Detection

Authors: Antigoni-Maria Founta, Despoina Chatzakou, Nicolas Kourtellis, Jeremy Blackburn, Athena Vakali, Ilias Leontiadis

Abstract: Hate speech, offensive language, sexism, racism and other types of abusive behavior have become a common phenomenon in many online social media platforms. In recent years, such diverse abusive behaviors have been manifesting with increased frequency and levels of intensity. This is due to the openness and willingness of popular media platforms, such as Twitter and Facebook, to host content of sensitive or controversial topics. However, these platforms have not adequately addressed the problem of online abusive behavior, and their responsiveness to the effective detection and blocking of such inappropriate behavior remains limited. In the present paper, we study this complex problem by following a more holistic approach, which considers the various aspects of abusive behavior. To make the approach tangible, we focus on Twitter data and analyze user and textual properties from different angles of abusive posting behavior. We propose a deep learning architecture, which utilizes a wide variety of available metadata, and combines it with automatically-extracted hidden patterns within the text of the tweets, to detect multiple abusive behavioral norms which are highly inter-related. We apply this unified architecture in a seamless, transparent fashion to detect different types of abusive behavior (hate speech, sexism vs. racism, bullying, sarcasm, etc.) without the need for any tuning of the model architecture for each task. We test the proposed approach with multiple datasets addressing different and multiple abusive behaviors on Twitter. Our results demonstrate that it largely outperforms the state-of-the-art methods (between 21 and 45% improvement in AUC, depending on the dataset).

Submitted 21 February, 2018; v1 submitted 1 February, 2018; originally announced February 2018.

Comments: abusive behavior, Twitter, aggression, bullying, deep learning, machine learning

MSC Class: 68T06 ACM Class: K.4.2

arXiv:1705.03345 [pdf, other]

Hate is not Binary: Studying Abusive Behavior of #GamerGate on Twitter

Authors: Despoina Chatzakou, Nicolas Kourtellis, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Athena Vakali

Abstract: Over the past few years, online bullying and aggression have become increasingly prominent, and manifested in many different forms on social media. However, there is little work analyzing the characteristics of abusive users and what distinguishes them from typical social media users. In this paper, we start addressing this gap by analyzing tweets containing a large amount of abusiveness. We focus on a Twitter dataset revolving around the Gamergate controversy, which led to many incidents of cyberbullying and cyberaggression on various gaming and social media platforms. We study the properties of the users tweeting about Gamergate, the content they post, and the differences in their behavior compared to typical Twitter users. We find that while their tweets are often seemingly about aggressive and hateful subjects, "Gamergaters" do not exhibit common expressions of online anger, and in fact primarily differ from typical users in that their tweets are less joyful. They are also more engaged than typical Twitter users, which is an indication as to how and why this controversy is still ongoing. Surprisingly, we find that Gamergaters are less likely to be suspended by Twitter, thus we analyze their properties to identify differences from typical users and what may have led to their suspension. We perform an unsupervised machine learning analysis to detect clusters of users who, though currently active, could be considered for suspension since they exhibit similar behaviors to suspended users. Finally, we confirm the usefulness of our analyzed features by emulating the Twitter suspension mechanism with a supervised learning method, achieving very good precision and recall.

Submitted 9 May, 2017; originally announced May 2017.

Comments: In 28th ACM Conference on Hypertext and Social Media (ACM HyperText 2017)

arXiv:1702.07784 [pdf, other]

Measuring #GamerGate: A Tale of Hate, Sexism, and Bullying

Authors: Despoina Chatzakou, Nicolas Kourtellis, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Athena Vakali

Abstract: Over the past few years, online aggression and abusive behaviors have occurred in many different forms and on a variety of platforms. In extreme cases, these incidents have evolved into hate, discrimination, and bullying, and even materialized into real-world threats and attacks against individuals or groups. In this paper, we study the Gamergate controversy. Started in August 2014 in the online gaming world, it quickly spread across various social networking platforms, ultimately leading to many incidents of cyberbullying and cyberaggression. We focus on Twitter, presenting a measurement study of a dataset of 340k unique users and 1.6M tweets to study the properties of these users, the content they post, and how they differ from random Twitter users. We find that users involved in this "Twitter war" tend to have more friends and followers, are generally more engaged and post tweets with negative sentiment, less joy, and more hate than random users. We also perform preliminary measurements on how the Twitter suspension mechanism deals with such abusive behaviors. While we focus on Gamergate, our methodology to collect and analyze tweets related to aggressive and bullying activities is of independent interest.

Submitted 24 February, 2017; originally announced February 2017.

Comments: WWW Cybersafety Workshop 2017

arXiv:1702.06877 [pdf, other]

Mean Birds: Detecting Aggression and Bullying on Twitter

Authors: Despoina Chatzakou, Nicolas Kourtellis, Jeremy Blackburn, Emiliano De Cristofaro, Gianluca Stringhini, Athena Vakali

Abstract: In recent years, bullying and aggression against users on social media have grown significantly, causing serious consequences to victims of all demographics. In particular, cyberbullying affects more than half of young social media users worldwide, and has also led to teenage suicides, prompted by prolonged and/or coordinated digital harassment. Nonetheless, tools and technologies for understanding and mitigating it are scarce and mostly ineffective. In this paper, we present a principled and scalable approach to detect bullying and aggressive behavior on Twitter. We propose a robust methodology for extracting text, user, and network-based attributes, studying the properties of cyberbullies and aggressors, and what features distinguish them from regular users. We find that bully users post less, participate in fewer online communities, and are less popular than normal users, while aggressors are quite popular and tend to include more negativity in their posts. We evaluate our methodology using a corpus of 1.6M tweets posted over 3 months, and show that machine learning classification algorithms can accurately detect users exhibiting bullying and aggressive behavior, achieving over 90% AUC.

Submitted 12 May, 2017; v1 submitted 22 February, 2017; originally announced February 2017.

Showing 1–27 of 27 results for author: Vakali, A