Search | arXiv e-print repository

Chronoblox: Chronophotographic Sequential Graph Visualization

Authors: Quentin Lobbé, Camille Roth, Lena Mangold

Abstract: We introduce Chronoblox, a system for visualizing dynamic graphs. Chronoblox consists of a chronophotography of a sequence of graph snapshots based on a single embedding space common to all time periods. The goal of Chronoblox is to project all snapshots onto a common visualization space so as to represent both local and global dynamics at a glance. In this short paper, we review both the embedding and spatialization strategies. We then explain the way in which Chronoblox translates micro to meso structural evolution visually. We finally evaluate our approach using a synthetic network before illustrating it on a real world retweet network.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2401.08236

Interpreting Node Embedding Distances Through $n$-order Proximity Neighbourhoods

Authors: Dougal Shakespeare, Camille Roth

Abstract: In the field of node representation learning the task of interpreting latent dimensions has become a prominent, well-studied research topic. The contribution of this work focuses on appraising the interpretability of another rarely-exploited feature of node embeddings increasingly utilised in recommendation and consumption diversity studies: inter-node embedded distances. Introducing a new method to measure how understandable the distances between nodes are, our work assesses how well the proximity weights derived from a network before embedding relate to the node closeness measurements after embedding. Testing several classical node embedding models, our findings reach a conclusion familiar to practitioners albeit rarely cited in literature - the matrix factorisation model SVD is the most interpretable through 1, 2 and even higher-order proximities.

Submitted 16 January, 2024; originally announced January 2024.

Comments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission

MSC Class: 68R10

arXiv:2311.18705 [pdf, other]

Quantifying metadata-block structure relationships in networks using description length

Authors: Lena Mangold, Camille Roth

Abstract: Network analysis is often enriched by including an examination of node metadata. In the context of understanding the mesoscale of networks it is often assumed that node groups based on metadata and node groups based on connectivity patterns are intrinsically linked. Recently, this assumption has been challenged and it has been demonstrated that metadata might be entirely unrelated to structure or, similarly, multiple sets of metadata might be relevant to the structure of a network in different ways. We propose the metablox tool to quantify the relationship between a network's node metadata and its mesoscale structure, measuring the strength of the relationship and the type of structural arrangement exhibited by the metadata. Our tool incorporates a way to distinguish significantly relevant relationships and can be used as part of systematic meta analyses comparing large numbers of networks, which we demonstrate on a number of synthetic and empirical networks.

Submitted 9 April, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

arXiv:2310.01586 [pdf, other]

Experiences Readying Applications for Exascale

Authors: Paul T. Bauman, Reuben D. Budiardja, Dmytro Bykov, Noel Chalmers, Jacqueline Chen, Nicholas Curtis, Marc Day, Markus Eisenbach, Lucas Esclapez, Alessandro Fanfarillo, William Freitag, Nicholas Frontiere, Antigoni Georgiadou, Joseph Glenski, Kalyana Gottiparthi, Marc T. Henry de Frahan, Gustav R. Jansen, Wayne Joubert, Justin G. Lietz, Jakub Kurzak, Nicholas Malaya, Bronson Messer, Damon McDougall, Paul Mullowney, Stephen Nichols, et al. (7 additional authors not shown)

Abstract: The advent of exascale computing invites an assessment of existing best practices for developing application readiness on the world's largest supercomputers. This work details observations from the last four years in preparing scientific applications to run on the Oak Ridge Leadership Computing Facility's (OLCF) Frontier system. This paper addresses a range of topics in software including programmability, tuning, and portability considerations that are key to moving applications from existing systems to future installations. A set of representative workloads provides case studies for general system and software testing. We evaluate the use of early access systems for development across several generations of hardware. Finally, we discuss how best practices were identified and disseminated to the community through a wide range of activities including user-guides and trainings. We conclude with recommendations for ensuring application readiness on future leadership computing systems.

Submitted 2 October, 2023; originally announced October 2023.

Comments: Accepted at SC23

arXiv:2306.09817 [pdf]

INDCOR white paper 4: Evaluation of Interactive Narrative Design For Complexity Representations

Authors: Christian Roth, Breanne Pitt, Lāsma Šķestere, Jonathan Barbara, Agnes Karolina Bakk, Kirsty Dunlop, Maria del Mar Grandio, Miguel Barreda, Despoina Sampatakou, Michael Schlauch

Abstract: While a strength of Interactive Digital Narratives (IDN) is its support for multiperspectivity, this also poses a substantial challenge to its evaluation. Moreover, evaluation has to assess the system's ability to represent a complex reality as well as the user's understanding of that complex reality as a result of the experience of interacting with the system. This is needed to measure an IDN's efficiency and effectiveness in representing the chosen complex phenomenon. We here present some empirical methods employed by INDCOR members in their research including UX toolkits and scales. Particularly, we consider the impact of IDN on transformative learning and its evaluation through self-reporting and other alternatives.

Submitted 11 June, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

Comments: arXiv admin note: text overlap with arXiv:2010.10135

arXiv:2302.02787 [pdf, other]

Generative models for two-ground-truth partitions in networks

Authors: Lena Mangold, Camille Roth

Abstract: A myriad of approaches have been proposed to characterise the mesoscale structure of networks - most often as a partition based on patterns variously called communities, blocks, or clusters. Clearly, distinct methods designed to detect different types of patterns may provide a variety of answers to the network's mesoscale structure. Yet, even multiple runs of a given method can sometimes yield diverse and conflicting results, producing entire landscapes of partitions which potentially include multiple (locally optimal) mesoscale explanations of the network. Such ambiguity motivates a closer look at the ability of these methods to find multiple qualitatively different 'ground truth' partitions in a network. Here, we propose the stochastic cross-block model (SCBM), a generative model which allows for two distinct partitions to be built into the mesoscale structure of a single benchmark network. We demonstrate a use case of the benchmark model by appraising the power of stochastic block models (SBMs) to detect implicitly planted coexisting bi-community and core-periphery structures of different strengths. Given our model design and experimental set-up, we find that the ability to detect the two partitions individually varies by SBM variant and that coexistence of both partitions is recovered only in a very limited number of cases. Our findings suggest that in most instances only one - in some way dominating - structure can be detected, even in the presence of other partitions. They underline the need for considering entire landscapes of partitions when different competing explanations exist and motivate future research to advance partition coexistence detection methods. Our model also contributes to the field of benchmark networks more generally by enabling further exploration of the ability of new and existing methods to detect ambiguity in the mesoscale structure of networks.

Submitted 5 October, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

arXiv:2212.06681 [pdf, other]

doi 10.4000/oeconomia.15729

The two sides of the Environmental Kuznets Curve: a socio-semantic analysis

Authors: Telmo Menezes, Antonin Pottier, Camille Roth

Abstract: Since the 1990s, the Environmental Kuznets Curve (EKC) hypothesis posits an inverted U-shaped relationship between pollutants and economic development. The hypothesis has attracted a lot of research. We provide here a review of more than 2000 articles that have been published on the EKC. We aim at mapping the development of this specialized research, both in term of actors and of content, and to trace the transformation it has undergone from its beginning to the present. To that end, we combine traditional bibliometric analysis and semantic analysis with a novel method, that enables us to recover the type of pollutants that are studied and the empirical claims made on EKC (whether the hypothesis is invalidated or not). We principally exhibit the existence of a few epistemic communities that are related to distinct time periods, topics and, to some extent, proportion of positive results on EKC.

Submitted 30 October, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

MSC Class: 91C99 ACM Class: J.4; I.2.7

Journal ref: Œconomia, 13-2 (2023) 279-321

arXiv:2112.10526 [pdf, other]

doi 10.21468/SciPostPhysCodeb.7

NetKet 3: Machine Learning Toolbox for Many-Body Quantum Systems

Authors: Filippo Vicentini, Damian Hofmann, Attila Szabó, Dian Wu, Christopher Roth, Clemens Giuliani, Gabriel Pescia, Jannes Nys, Vladimir Vargas-Calderon, Nikita Astrakhantsev, Giuseppe Carleo

Abstract: We introduce version 3 of NetKet, the machine learning toolbox for many-body quantum physics. NetKet is built around neural-network quantum states and provides efficient algorithms for their evaluation and optimization. This new version is built on top of JAX, a differentiable programming and accelerated linear algebra framework for the Python programming language. The most significant new feature is the possibility to define arbitrary neural network ansätze in pure Python code using the concise notation of machine-learning frameworks, which allows for just-in-time compilation as well as the implicit generation of gradients thanks to automatic differentiation. NetKet 3 also comes with support for GPU and TPU accelerators, advanced support for discrete symmetry groups, chunking to scale up to thousands of degrees of freedom, drivers for quantum dynamics applications, and improved modularity, allowing users to use only parts of the toolbox as a foundation for their own code.

Submitted 18 August, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

Comments: 55 pages, 5 figures. Accompanying code at https://github.com/netket/netket

Journal ref: SciPost Phys. Codebases 7 (2022)

arXiv:2112.00554 [pdf, other]

Quoting is not Citing: Disentangling Affiliation and Interaction on Twitter

Authors: Camille Roth, Jonathan St-Onge, Katrin Herms

Abstract: Interaction networks are generally much less homophilic than affiliation networks, accommodating for many more cross-cutting links. By statistically assigning a political valence to users from their network-level affiliation patterns, and by further contrasting interaction and affiliation (quotes and retweets) within specific discursive events, namely quote trees, we describe a variety of cross-cutting patterns which significantly nuance the traditional "echo chamber" narrative.

Submitted 1 December, 2021; originally announced December 2021.

Comments: Proc. Complex Networks'21, 10th International Conference on Complex Networks and their Applications

arXiv:2109.03915 [pdf, other]

doi 10.1145/3460231.3474269

Follow the guides: disentangling human and algorithmic curation in online music consumption

Authors: Quentin Villermet, Jérémie Poiroux, Manuel Moussallam, Thomas Louail, Camille Roth

Abstract: The role of recommendation systems in the diversity of content consumption on platforms is a much-debated issue. The quantitative state of the art often overlooks the existence of individual attitudes toward guidance, and eventually of different categories of users in this regard. Focusing on the case of music streaming, we analyze the complete listening history of about 9k users over one year and demonstrate that there is no blanket answer to the intertwinement of recommendation use and consumption diversity: it depends on users. First we compute for each user the relative importance of different access modes within their listening history, introducing a trichotomy distinguishing so-called `organic' use from algorithmic and editorial guidance. We thereby identify four categories of users. We then focus on two scales related to content diversity, both in terms of dispersion -- how much users consume the same content repeatedly -- and popularity -- how popular is the content they consume. We show that the two types of recommendation offered by music platforms -- algorithmic and editorial -- may drive the consumption of more or less diverse content in opposite directions, depending also strongly on the type of users. Finally, we compare users' streaming histories with the music programming of a selection of popular French radio stations during the same period. While radio programs are usually more tilted toward repetition than users' listening histories, they often program more songs from less popular artists. On the whole, our results highlight the nontrivial effects of platform-mediated recommendation on consumption, and lead us to speak of `filter niches' rather than `filter bubbles'. They hint at further ramifications for the study and design of recommendation systems.

Submitted 8 September, 2021; originally announced September 2021.

Comments: (c) The Authors 2021. To be published in Proc. 15th ACM Conference on Recommender Systems (RecSys '21), Sep 27-Oct 1, 2021, Amsterdam, NL. 16 pages, 6 figures, 3 tables. This is the authors' version of the work. It is posted here for your personal use. Not for redistribution

arXiv:2109.03538 [pdf, other]

Tracing Affordance and Item Adoption on Music Streaming Platforms

Authors: Dougal Shakespeare, Camille Roth

Abstract: Popular music streaming platforms offer users a diverse network of content exploration through a triad of affordances: organic, algorithmic and editorial access modes. Whilst offering great potential for discovery, such platform developments also pose the modern user with daily adoption decisions on two fronts: platform affordance adoption and the adoption of recommendations therein. Following a carefully constrained set of Deezer users over a 2-year observation period, our work explores factors driving user behaviour in the broad sense, by differentiating users on the basis of their temporal daily usage, adoption of the main platform affordances, and the ways in which they react to them, especially in terms of recommendation adoption. Diverging from a perspective common in studies on the effects of recommendation, we assume and confirm that users exhibit very diverse behaviours in using and adopting the platform affordances. The resulting complex and quite heterogeneous picture demonstrates that there is no blanket answer for adoption practices of both recommendation features and recommendations.

Submitted 8 September, 2021; originally announced September 2021.

Comments: ISMIR 2021 pre-print

arXiv:2011.01040 [pdf, other]

doi 10.1145/3328905.3332513

Poster: A Real-World Distributed Infrastructure for Processing Financial Data at Scale

Authors: Sebastian Frischbier, Mario Paic, Alexander Echler, Christian Roth

Abstract: Financial markets are event- and data-driven to an extremely high degree. For making decisions and triggering actions stakeholders require notifications about significant events and reliable background information that meet their individual requirements in terms of timeliness, accuracy, and completeness. As one of Europe's leading providers of financial data and regulatory solutions vwd processes an average of 18 billion event notifications from 500+ data sources for 30 million symbols per day. Our large-scale distributed event-based systems handle daily peak rates of 1+ million event notifications per second and additional load generated by singular pivotal events with global impact. In this poster we give practical insights into our IT systems. We outline the infrastructure we operate and the event-driven architecture we apply at vwd. In particular we showcase the (geo)distributed publish/subscribe broker network we operate across locations and countries to provide market data to our customers with varying quality of information (QoI) properties.

Submitted 29 October, 2020; originally announced November 2020.

Comments: Authors' version of the accepted submission; final version published by ACM as part of the proceedings of DEBS '19: The 13th ACM International Conference on Distributed and Event-based Systems (DEBS '19); 2 pages, 1 figure; vwd Vereinigte Wirtschaftsdienste GmbH is by now known as Infront Financial Technology GmbH (part of the Infront group)

arXiv:2009.09067 [pdf, other]

Computational appraisal of gender representativeness in popular movies

Authors: Antoine Mazieres, Telmo Menezes, Camille Roth

Abstract: Gender representation in mass media has long been mainly studied by qualitatively analyzing content. This article illustrates how automated computational methods may be used in this context to scale up such empirical observations and increase their resolution and significance. We specifically apply a face and gender detection algorithm on a broad set of popular movies spanning more than three decades to carry out a large-scale appraisal of the on-screen presence of women and men. Beyond the confirmation of a strong under-representation of women, we exhibit a clear temporal trend towards a fairer representativeness. We further contrast our findings with respect to movie genre, budget, and various audience-related features such as movie gross and user ratings. We lastly propose a fine description of significant asymmetries in the mise-en-scène and mise-en-cadre of characters in relation to their gender and the spatial composition of a given frame.

Submitted 12 May, 2021; v1 submitted 16 September, 2020; originally announced September 2020.

Comments: 13 pages, 7 figures, 1 table

arXiv:2001.05324 [pdf, other]

doi 10.1371/journal.pone.0231703

Tubes & Bubbles -- Topological confinement of YouTube recommendations

Authors: Camille Roth, Antoine Mazières, Telmo Menezes

Abstract: The role of recommendation algorithms in online user confinement is at the heart of a fast-growing literature. Recent empirical studies generally suggest that filter bubbles may principally be observed in the case of explicit recommendation (based on user-declared preferences) rather than implicit recommendation (based on user activity). We focus on YouTube which has become a major online content provider but where confinement has until now been little-studied in a systematic manner. Starting from a diverse number of seed videos, we first describe the properties of the sets of suggested videos in order to design a sound exploration protocol able to capture latent recommendation graphs recursively induced by these suggestions. These graphs form the background of potential user navigations along non-personalized recommendations. From there, be it in topological, topical or temporal terms, we show that the landscape of what we call mean-field YouTube recommendations is often prone to confinement dynamics. Moreover, the most confined recommendation graphs i.e., potential bubbles, seem to be organized around sets of videos that garner the highest audience and thus plausibly viewing time.

Submitted 15 January, 2020; originally announced January 2020.

Comments: 10 pages, 7 figures, 1 table

Journal ref: PLOS ONE 15(4): e0231703 (2020)

arXiv:1908.10784 [pdf, other]

Semantic Hypergraphs

Authors: Telmo Menezes, Camille Roth

Abstract: Approaches to Natural language processing (NLP) may be classified along a double dichotomy open/opaque - strict/adaptive. The former axis relates to the possibility of inspecting the underlying processing rules, the latter to the use of fixed or adaptive rules. We argue that many techniques fall into either the open-strict or opaque-adaptive categories. Our contribution takes steps in the open-adaptive direction, which we suggest is likely to provide key instruments for interdisciplinary research. The central idea of our approach is the Semantic Hypergraph (SH), a novel knowledge representation model that is intrinsically recursive and accommodates the natural hierarchical richness of natural language. The SH model is hybrid in two senses. First, it attempts to combine the strengths of ML and symbolic approaches. Second, it is a formal language representation that reduces but tolerates ambiguity and structural variability. We will see that SH enables simple yet powerful methods of pattern detection, and features a good compromise for intelligibility both for humans and machines. It also provides a semantically deep starting point (in terms of explicit meaning) for further algorithms to operate and collaborate on. We show how modern NLP ML-based building blocks can be used in combination with a random forest classifier and a simple search tree to parse NL to SH, and that this parser can achieve high precision in a diversity of text categories. We define a pattern language representable in SH itself, and a process to discover knowledge inference rules. We then illustrate the efficiency of the SH framework in a variety of tasks, including conjunction decomposition, open information extraction, concept taxonomy inference and co-reference resolution, and an applied example of claim and conflict analysis in a news corpus.

Submitted 18 February, 2021; v1 submitted 28 August, 2019; originally announced August 2019.

arXiv:1908.03206 [pdf, other]

doi 10.1007/978-3-030-34843-4_2

Managing the Complexity of Processing Financial Data at Scale -- an Experience Report

Authors: Sebastian Frischbier, Mario Paic, Alexander Echler, Christian Roth

Abstract: Financial markets are extremely data-driven and regulated. Participants rely on notifications about significant events and background information that meet their requirements regarding timeliness, accuracy, and completeness. As one of Europe's leading providers of financial data and regulatory solutions vwd processes a daily average of 18 billion notifications from 500+ data sources for 30 million symbols. Our large-scale geo-distributed systems handle daily peak rates of 1+ million notifications/sec. In this paper we give practical insights about the different types of complexity we face regarding the data we process, the systems we operate, and the regulatory constraints we must comply with. We describe the volume, variety, velocity, and veracity of the data we process, the infrastructure we operate, and the architecture we apply. We illustrate the load patterns created by trading and how the markets' attention to the Brexit vote and similar events stressed our systems.

Submitted 8 August, 2019; originally announced August 2019.

Comments: 12 pages, 2 figures, to be published in the proceedings of the 10th Complex Systems Design & Management conference (CSD&M'19) by Springer

arXiv:1907.10401 [pdf, ps, other]

Algorithmic Distortion of Informational Landscapes

Authors: Camille Roth

Abstract: The possible impact of algorithmic recommendation on the autonomy and free choice of Internet users is being increasingly discussed, especially in terms of the rendering of information and the structuring of interactions. This paper aims at reviewing and framing this issue along a double dichotomy. The first one addresses the discrepancy between users' intentions and actions (1) under some algorithmic influence and (2) without it. The second one distinguishes algorithmic biases on (1) prior information rearrangement and (2) posterior information arrangement. In all cases, we focus on and differentiate situations where algorithms empirically appear to expand the cognitive and social horizon of users, from those where they seem to limit that horizon. We additionally suggest that these biases may not be properly appraised without taking into account the underlying social processes which algorithms are building upon.

Submitted 19 July, 2019; originally announced July 2019.

Journal ref: Intellectica, In press

arXiv:1907.07962

doi 10.3390/info10080250

Interactional and Informational Attention on Twitter

Authors: Agathe Baltzer, Márton Karsai, Camille Roth

Abstract: Twitter may be considered as a decentralized social information processing platform whose users constantly receive their followees' information feeds, which they may in turn dispatch to their followers. This decentralization is not devoid of hierarchy and heterogeneity, both in terms of activity and attention. In particular, we appraise the distribution of attention at the collective and individual level, which exhibits the existence of attentional constraints and focus effects. We observe that most users usually concentrate their attention on a limited core of peers and topics, and discuss the relationship between interactional and informational attention processes -- all of which, we suggest, may be useful to refine influence models by enabling the consideration of differential attention likelihood depending on users, their activity levels and peers' positions.

Submitted 18 July, 2019; originally announced July 2019.

Comments: 16 pages, 6 figures

Journal ref: Information 2019, 10(8), 250

arXiv:1906.12332

doi 10.1007/978-3-030-14683-2_4

Automatic Discovery of Families of Network Generative Processes

Authors: Telmo Menezes, Camille Roth

Abstract: Designing plausible network models typically requires scholars to form a priori intuitions on the key drivers of network formation. Oftentimes, these intuitions are supported by the statistical estimation of a selection of network evolution processes which will form the basis of the model to be developed. Machine learning techniques have lately been introduced to assist the automatic discovery of generative models. These approaches may more broadly be described as "symbolic regression", where fundamental network dynamic functions, rather than just parameters, are evolved through genetic programming. This chapter first aims at reviewing the principles, efforts and the emerging literature in this direction, which is very much aligned with the idea of creating artificial scientists. Our contribution then aims more specifically at building upon an approach recently developed by us [Menezes & Roth, 2014] in order to demonstrate the existence of families of networks that may be described by similar generative processes. In other words, symbolic regression may be used to group networks according to their inferred genotype (in terms of generative processes) rather than their observed phenotype (in terms of statistical/topological features). Our empirical case is based on an original data set of 238 anonymized ego-centered networks of Facebook friends, further yielding insights on the formation of sociability networks.

Submitted 26 June, 2019; originally announced June 2019.

Journal ref: DOOCN 2017: Dynamics On and Of Complex Networks III, pp.83-111, 2019, 978-3-030-14682-5

arXiv:1711.07220

Integrating Privacy-Enhancing Technologies into the Internet Infrastructure

Authors: David Harborth, Dominik Herrmann, Stefan Köpsell, Sebastian Pape, Christian Roth, Hannes Federrath, Dogan Kesdogan, Kai Rannenberg

Abstract: The AN.ON-Next project aims to integrate privacy-enhancing technologies into the internet's infrastructure and establish them in the consumer mass market. The technologies in focus include a basis protection at internet service provider level, an improved overlay network-based protection and a concept for privacy protection in the emerging 5G mobile network. A crucial success factor will be the viable adjustment and development of standards, business models and pricing strategies for those new technologies.

Submitted 20 November, 2017; originally announced November 2017.

Comments: 8 pages

arXiv:1704.01036

doi 10.1080/01621459.1971.10482356

Natural Scales in Geographical Patterns

Authors: Telmo Menezes, Camille Roth

Abstract: Human mobility is known to be distributed across several orders of magnitude of physical distances, which makes it generally difficult to endogenously find or define typical and meaningful scales. Relevant analyses, from movements to geographical partitions, seem to be relative to some ad-hoc scale, or no scale at all. Relying on geotagged data collected from photo-sharing social media, we apply community detection to movement networks constrained by increasing percentiles of the distance distribution. Using a simple parameter-free discontinuity detection algorithm, we discover clear phase transitions in the community partition space. The detection of these phases constitutes the first objective method of characterising endogenous, natural scales of human movement. Our study covers nine regions, ranging from cities to countries of various sizes and a transnational area. For all regions, the number of natural scales is remarkably low (2 or 3). Further, our results hint at scale-related behaviours rather than scale-related users. The partitions of the natural scales allow us to draw discrete multi-scale geographical boundaries, potentially capable of providing key insights in fields such as epidemiology or cultural contagion where the introduction of spatial boundaries is pivotal.

Submitted 4 April, 2017; originally announced April 2017.

Journal ref: Scientific Reports, Nature Publishing Group, 2017, 7 (45823)

arXiv:1409.2390

doi 10.1038/srep06284

Symbolic regression of generative network models

Authors: Telmo Menezes, Camille Roth

Abstract: Networks are a powerful abstraction with applicability to a variety of scientific fields. Models explaining their morphology and growth processes permit a wide range of phenomena to be more systematically analysed and understood. At the same time, creating such models is often challenging and requires insights that may be counter-intuitive. Yet there currently exists no general method to arrive at better models. We have developed an approach to automatically detect realistic decentralised network growth models from empirical data, employing a machine learning technique inspired by natural selection and defining a unified formalism to describe such models as computer programs. As the proposed method is completely general and does not assume any pre-existing models, it can be applied "out of the box" to any given network. To validate our approach empirically, we systematically rediscover pre-defined growth laws underlying several canonical network generation models and credible laws for diverse real-world networks. We were able to find programs that are simple enough to lead to an actual understanding of the mechanisms proposed, namely for a simple brain and a social network.

Submitted 8 September, 2014; originally announced September 2014.

Journal ref: Scientific Reports volume 4, Article number: 6284 (2015)

arXiv:1402.5878

Friend Inspector: A Serious Game to Enhance Privacy Awareness in Social Networks

Authors: Alexandra Cetto, Michael Netter, Günther Pernul, Christian Richthammer, Moritz Riesner, Christian Roth, Johannes Sänger

Abstract: Currently, many users of Social Network Sites are insufficiently aware of who can see their shared personal items. Nonetheless, most approaches focus on enhancing privacy in Social Networks through improved privacy settings, neglecting the fact that privacy awareness is a prerequisite for privacy control. Social Network users first need to know about privacy issues before being able to make adjustments. In this paper, we introduce Friend Inspector, a serious game that allows its users to playfully increase their privacy awareness on Facebook. Since its launch, Friend Inspector has attracted a significant number of visitors, emphasising the need for better tools to understand privacy settings on Social Networks.

Submitted 20 February, 2014; originally announced February 2014.

Report number: IDGEI/2014/01

arXiv:1212.4950

doi 10.1109/Allerton.2012.6483283

Data Mapping for Unreliable Memories

Authors: Christoph Roth, Christian Benkeser, Christoph Studer, Georgios Karakonstantis, Andreas Burg

Abstract: Future digital signal processing (DSP) systems must provide robustness on algorithm and application level to the presence of reliability issues that come along with corresponding implementations in modern semiconductor process technologies. In this paper, we address this issue by investigating the impact of unreliable memories on general DSP systems. In particular, we propose a novel framework to characterize the effects of unreliable memories, which enables us to devise novel methods to mitigate the associated performance loss. We propose to deploy specifically designed data representations, which have the capability of substantially improving the system reliability compared to that realized by conventional data representations used in digital integrated circuits, such as 2s complement or sign-magnitude number formats. To demonstrate the efficacy of the proposed framework, we analyze the impact of unreliable memories on coded communication systems, and we show that the deployment of optimized data representations substantially improves the error-rate performance of such systems.

Submitted 20 December, 2012; originally announced December 2012.

Comments: Proc. of the IEEE Allerton Conference, 2012

arXiv:1111.2018

Intrinsically Dynamic Network Communities

Authors: Bivas Mitra, Lionel Tabourier, Camille Roth

Abstract: Community finding algorithms for networks have recently been extended to dynamic data. Most of these recent methods aim at exhibiting community partitions from successive graph snapshots and thereafter connecting or smoothing these partitions using clever time-dependent features and sampling techniques. These approaches are nonetheless achieving longitudinal rather than dynamic community detection. We assume that communities are fundamentally defined by the repetition of interactions among a set of nodes over time. According to this definition, analyzing the data by considering successive snapshots induces a significant loss of information: we suggest that it blurs essentially dynamic phenomena - such as communities based on repeated inter-temporal interactions, nodes switching from a community to another across time, or the possibility that a community survives while its members are being integrally replaced over a longer time period. We propose a formalism which aims at tackling this issue in the context of time-directed datasets (such as citation networks), and present several illustrations on both empirical and synthetic dynamic networks. We eventually introduce intrinsically dynamic metrics to qualify temporal community structure and emphasize their possible role as an estimator of the quality of the community detection - taking into account the fact that various empirical contexts may call for distinct 'community' definitions and detection criteria.

Submitted 8 November, 2011; originally announced November 2011.

Comments: 27 pages, 11 figures

arXiv:1105.5294

A long-time limit of world subway networks

Authors: Camille Roth, Soong Moon Kang, Michael Batty, Marc Barthelemy

Abstract: We study the temporal evolution of the structure of the world's largest subway networks in an exploratory manner. We show that, remarkably, all these networks converge to a shape which shares similar generic features despite their geographic and economic differences. This limiting shape is made of a core with branches radiating from it. For most of these networks, the average degree of a node (station) within the core has a value of order 2.5 and the proportion of k=2 nodes in the core is larger than 60%. The number of branches scales roughly as the square root of the number of stations, the current proportion of branches represents about half of the total number of stations, and the average diameter of branches is about twice the average radial extension of the core. Spatial measures such as the number of stations at a given distance to the barycenter display a first regime which grows as r^2 followed by another regime with different exponents, and eventually saturates. These results -- difficult to interpret in the framework of fractal geometry -- confirm and yield a natural explanation in the geometric picture of this core and their branches: the first regime corresponds to a uniform core, while the second regime is controlled by the interstation spacing on branches. The apparent convergence towards a unique network shape in the temporal limit suggests the existence of dominant, universal mechanisms governing the evolution of these structures.

Submitted 16 May, 2012; v1 submitted 26 May, 2011; originally announced May 2011.

Comments: 11 pages, 13 figures, revised version, accepted for publication in Royal Society Interface

Journal ref: Journal of the Royal Society Interface, 9:2540-2550 (2012)

arXiv:1012.3023

Generating constrained random graphs using multiple edge switches

Authors: Lionel Tabourier, Camille Roth, Jean-Philippe Cointet

Abstract: The generation of random graphs using edge swaps provides a reliable method to draw uniformly random samples of sets of graphs respecting some simple constraints, e.g. degree distributions. However, in general, it is not necessarily possible to access all graphs obeying some given constraints through a classical switching procedure calling on pairs of edges. We therefore propose to get round this issue by generalizing this classical approach through the use of higher-order edge switches. This method, which we denote by "k-edge switching", makes it possible to progressively improve the covered portion of a set of constrained graphs, thereby providing an increasing, asymptotically certain confidence on the statistical representativeness of the obtained sample.

Submitted 3 February, 2012; v1 submitted 14 December, 2010; originally announced December 2010.

Comments: 15 pages
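
For readers unfamiliar with the classical pairwise procedure that the abstract above takes as its starting point, the following is a minimal sketch of a degree-preserving edge swap on a simple undirected graph. It is not the authors' k-edge switching method; the function names `edge_swap_sample` and `degree_sequence`, the `n_swaps` and `seed` parameters, and the rejection rules are illustrative assumptions.

```python
import random
from collections import Counter


def degree_sequence(edges):
    """Return a Counter mapping each node to its degree."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def edge_swap_sample(edges, n_swaps, seed=0):
    """Attempt n_swaps pairwise edge swaps on a simple undirected graph.

    Each attempt picks two edges (a, b) and (c, d) and proposes replacing
    them with (a, d) and (c, b). Proposals that would create a self-loop
    or a duplicate edge are rejected, so the graph stays simple and every
    node keeps its degree.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        # Randomize the orientation of the second edge so that both
        # possible rewirings of the pair can be proposed.
        if rng.random() < 0.5:
            c, d = d, c
        if a == d or c == b:
            continue  # would create a self-loop
        new1 = tuple(sorted((a, d)))
        new2 = tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set:
            continue  # would create a duplicate edge
        edge_set.discard(edge_list[i])
        edge_set.discard(edge_list[j])
        edge_set.add(new1)
        edge_set.add(new2)
        edge_list[i] = new1
        edge_list[j] = new2
    return edge_list


if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
    shuffled = edge_swap_sample(ring, n_swaps=100, seed=1)
    print(shuffled)
    print(degree_sequence(shuffled) == degree_sequence(ring))
```

This sketch only enforces the degree constraint; as the abstract notes, under additional constraints a pairwise procedure of this kind may not reach every admissible graph, which is the motivation given for higher-order switches.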

arXiv:1009.0119

doi 10.1109/SocialCom.2010.26

Precursors and Laggards: An Analysis of Semantic Temporal Relationships on a Blog Network

Authors: Telmo Menezes, Camille Roth, Jean-Philippe Cointet

Abstract: We explore the hypothesis that it is possible to obtain information about the dynamics of a blog network by analysing the temporal relationships between blogs at a semantic level, and that this type of analysis adds to the knowledge that can be extracted by studying the network only at the structural level of URL links. We present an algorithm to automatically detect fine-grained discussion topics, characterized by n-grams and time intervals. We then propose a probabilistic model to estimate the temporal relationships that blogs have with one another. We define the precursor score of blog A in relation to blog B as the probability that A enters a new topic before B, discounting the effect created by asymmetric posting rates. Network-level metrics of precursor and laggard behavior are derived from these dyadic precursor score estimations. This model is used to analyze a network of French political blogs. The scores are compared to traditional link degree metrics. We obtain insights into the dynamics of topic participation on this network, as well as the relationship between precursor/laggard and linking behaviors. We validate and analyze results with the help of an expert on the French blogosphere. Finally, we propose possible applications to the improvement of search engine ranking algorithms.

Submitted 1 September, 2010; originally announced September 2010.

Journal ref: IEEE SocialCom Intl Conf on Social Computing, Minneapolis, Minnesota, Aug 2010

arXiv:0909.3080

Socio-semantic dynamics in a blog network

Authors: Jean-Philippe Cointet, Camille Roth

Abstract: The blogosphere can be construed as a knowledge network made of bloggers who are interacting through a social network to share, exchange or produce information. We claim that the social and semantic dimensions are essentially co-determined and propose to investigate the co-evolutionary dynamics of the blogosphere by examining two intertwined issues: First, how does knowledge distribution drive new interactions and thus influence the social network topology? Second, which role do structural network properties play in the information circulation in the system? We adopt an empirical standpoint by analyzing the semantic and social activity of a portion of the US political blogosphere, monitored over a period of four months.

Submitted 16 September, 2009; originally announced September 2009.

Journal ref: IEEE International Conference on Social Computing (SocialCom-09), Vancouver : Canada (2009)

arXiv:nlin/0509007

Lattices for Dynamic, Hierarchic & Overlapping Categorization: the Case of Epistemic Communities

Authors: Camille Roth, Paul Bourgine

Abstract: We present a method for hierarchic categorization and taxonomy evolution description. We focus on the structure of epistemic communities (ECs), or groups of agents sharing common knowledge concerns. Introducing a formal framework based on Galois lattices, we categorize ECs in an automated and hierarchically structured way and propose criteria for selecting the most relevant epistemic communities - for instance, ECs gathering a certain proportion of agents and thus prototypical of major fields. This process produces a manageable, insightful taxonomy of the community. Then, the longitudinal study of these static pictures makes possible an historical description. In particular, we capture stylized facts such as field progress, decline, specialization, interaction (merging or splitting), and paradigm emergence. The detection of such patterns in social networks could fruitfully be applied to other contexts.

Submitted 4 September, 2005; originally announced September 2005.

Comments: 14 pages, 8 figures

arXiv:nlin/0507021

Measuring Generalized Preferential Attachment in Dynamic Social Networks

Authors: Camille Roth

Abstract: The mechanism of preferential attachment underpins most recent social network formation models. Yet few authors attempt to check or quantify assumptions on this mechanism. We call generalized preferential attachment any kind of preference to interact with other agents with respect to any node property. We then introduce tools for measuring empirically and characterizing comprehensively such phenomena, and apply these tools to a socio-semantic network of scientific collaborations, investigating in particular homophilic behavior. This opens the way to a whole class of realistic and credible social network morphogenesis models.

Submitted 19 July, 2005; v1 submitted 12 July, 2005; originally announced July 2005.

Comments: 9 pages, 6 figures (v2: added property correlation measures, and various remarks)

arXiv:nlin/0409013

doi 10.1080/08898480590931404

Epistemic communities: description and hierarchic categorization

Authors: Camille Roth, Paul Bourgine

Abstract: Social scientists have shown an increasing interest in understanding the structure of knowledge communities, and particularly the organization of "epistemic communities", that is groups of agents sharing common knowledge concerns. However, most existing approaches are based only on either social relationships or semantic similarity, while there has been roughly no attempt to link social and semantic aspects. In this paper, we introduce a formal framework addressing this issue and propose a method based on Galois lattices (or concept lattices) for categorizing epistemic communities in an automated and hierarchically structured fashion. Suggesting that our process allows us to rebuild a whole community structure and taxonomy, and notably fields and subfields gathering a certain proportion of agents, we eventually apply it to empirical data to exhibit these alleged structural properties, and successfully compare our results with categories spontaneously given by domain experts.

Submitted 7 September, 2004; v1 submitted 6 September, 2004; originally announced September 2004.

Comments: (v2: some typos corrected in sec. 3.2)

Journal ref: Mathematical Population Studies 12(2) (2005) 107-130

Showing 1–32 of 32 results for author: Roth, C