Search | arXiv e-print repository

Link Me Baby One More Time: Social Music Discovery on Spotify

Authors: Shazia'Ayn Babul, Desislava Hristova, Antonio Lima, Renaud Lambiotte, Mariano Beguerisse-Díaz

Abstract: We explore the social and contextual factors that influence the outcome of person-to-person music recommendations and discovery. Specifically, we use data from Spotify to investigate how a link sent from one user to another results in the receiver engaging with the music of the shared artist. We consider several factors that may influence this process, such as the strength of the sender-receiver relationship, the user's role in the Spotify social network, their music social cohesion, and how similar the new artist is to the receiver's taste. We find that the receiver of a link is more likely to engage with a new artist when (1) they have similar music taste to the sender and the shared track is a good fit for their taste, (2) they have a stronger and more intimate tie with the sender, and (3) the shared artist is popular amongst the receiver's connections. Finally, we use these findings to build a Random Forest classifier to predict whether a shared music track will result in the receiver's engagement with the shared artist. This model elucidates which type of social and contextual features are most predictive, although peak performance is achieved when a diverse set of features are included. These findings provide new insights into the multifaceted mechanisms underpinning the interplay between music discovery and social processes.

Submitted 7 May, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

arXiv:2309.03516 [pdf, other]

Topological fingerprints for audio identification

Authors: Wojciech Reise, Ximena Fernández, Maria Dominguez, Heather A. Harrington, Mariano Beguerisse-Díaz

Abstract: We present a topological audio fingerprinting approach for robustly identifying duplicate audio tracks. Our method applies persistent homology on local spectral decompositions of audio signals, using filtered cubical complexes computed from mel-spectrograms. By encoding the audio content in terms of local Betti curves, our topological audio fingerprints enable accurate detection of time-aligned audio matchings. Experimental results demonstrate the accuracy of our algorithm in the detection of tracks with the same audio content, even when subjected to various obfuscations. Our approach outperforms existing methods in scenarios involving topological distortions, such as time stretching and pitch shifting.

Submitted 7 September, 2023; originally announced September 2023.

Comments: 26 pages

MSC Class: 55N31; 68U10; 62R40

arXiv:2105.05733 [pdf, other]

Thematic recommendations on knowledge graphs using multilayer networks

Authors: Mariano Beguerisse-Díaz, Dimitrios Korkinof, Till Hoffmann

Abstract: We present a framework to generate and evaluate thematic recommendations based on multilayer network representations of knowledge graphs (KGs). In this representation, each layer encodes a different type of relationship in the KG, and directed interlayer couplings connect the same entity in different roles. The relative importance of different types of connections is captured by an intuitive salience matrix that can be estimated from data, tuned to incorporate domain knowledge, address different use cases, or respect business logic. We apply an adaptation of the personalised PageRank algorithm to multilayer models of KGs to generate item-item recommendations. These recommendations reflect the knowledge we hold about the content and are suitable for thematic and/or cold-start recommendation settings. Evaluating thematic recommendations from user data presents unique challenges that we address by developing a method to evaluate recommendations relying on user-item ratings, yet respecting their thematic nature. We also show that the salience matrix can be estimated from user data. We demonstrate the utility of our methods by significantly improving consumption metrics in an AB test where collaborative filtering delivered subpar performance. We also apply our approach to movie recommendation using publicly-available data to ensure the reproducibility of our results. We demonstrate that our approach outperforms existing thematic recommendation methods and is even competitive with collaborative filtering approaches.

Submitted 12 May, 2021; originally announced May 2021.

Comments: 20 pages, 5 figures
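
Illustrative note on the abstract above: the sketch below shows ordinary personalised PageRank on a single-layer graph, computed by power iteration. It is not the authors' multilayer adaptation and does not use a salience matrix. The function name `personalised_pagerank`, the dense adjacency-list input, the damping value 0.85, and the choice to return dangling mass to the seed are all assumptions made for illustration.

```python
def personalised_pagerank(adj, seed, damping=0.85, iters=100):
    """Power iteration for personalised PageRank on a weighted digraph.

    adj[i][j] is the weight of the edge i -> j (0 means no edge).
    All teleportation goes to the single node `seed` (an assumption;
    a distribution over several seed nodes is equally possible).
    """
    n = len(adj)
    out_weight = [sum(row) for row in adj]
    rank = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(iters):
        nxt = [0.0] * n
        for i in range(n):
            if out_weight[i] == 0:
                # Dangling node: send its mass back to the seed.
                nxt[seed] += damping * rank[i]
                continue
            for j in range(n):
                if adj[i][j]:
                    nxt[j] += damping * rank[i] * adj[i][j] / out_weight[i]
        # Teleport the remaining mass to the seed.
        nxt[seed] += 1.0 - damping
        rank = nxt
    return rank


# Directed 3-cycle 0 -> 1 -> 2 -> 0, personalised on node 0.
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
scores = personalised_pagerank(cycle, seed=0)
```

Ranking the other items by `scores` gives item-item recommendations relative to the seed item; in this cycle the scores decrease with distance from the seed.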

arXiv:2010.00143 [pdf, other]

doi 10.1103/PhysRevResearch.3.023249

Opinion dynamics on tie-decay networks

Authors: Kashin Sugishita, Mason A. Porter, Mariano Beguerisse-Díaz, Naoki Masuda

Abstract: In social networks, interaction patterns typically change over time. We study opinion dynamics on tie-decay networks in which tie strength increases instantaneously when there is an interaction and decays exponentially between interactions. Specifically, we formulate continuous-time Laplacian dynamics and a discrete-time DeGroot model of opinion dynamics on these tie-decay networks, and we carry out numerical computations for the continuous-time Laplacian dynamics. We examine the speed of convergence by studying the spectral gaps of combinatorial Laplacian matrices of tie-decay networks. First, we compare the spectral gaps of the Laplacian matrices of tie-decay networks that we construct from empirical data with the spectral gaps for corresponding randomized and aggregate networks. We find that the spectral gaps for the empirical networks tend to be smaller than those for the randomized and aggregate networks. Second, we study the spectral gap as a function of the tie-decay rate and time. Intuitively, we expect small tie-decay rates to lead to fast convergence because the influence of each interaction between two nodes lasts longer for smaller decay rates. Moreover, as time progresses and more interactions occur, we expect eventual convergence. However, we demonstrate that the spectral gap need not decrease monotonically with respect to the decay rate or increase monotonically with respect to time. Our results highlight the importance of the interplay between the times that edges strengthen and decay in temporal networks.

Submitted 3 July, 2021; v1 submitted 30 September, 2020; originally announced October 2020.

Comments: 15 pages, 8 figures, 2 tables

Journal ref: Phys. Rev. Research 3, 023249 (2021)
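
Illustrative note on the abstract above: the rule "tie strength increases instantaneously when there is an interaction and decays exponentially between interactions" can be sketched for a single pair of nodes as follows. The function names, the unit increment per interaction, and the symbol `alpha` for the decay rate are assumptions made for illustration, not taken from the paper.

```python
import math


def tie_strength(event_times, t, alpha, increment=1.0):
    """Tie strength at time t for one pair of nodes.

    Each interaction at time s <= t contributes
    increment * exp(-alpha * (t - s)).
    """
    return sum(
        increment * math.exp(-alpha * (t - s))
        for s in event_times
        if s <= t
    )


def step(strength, dt, alpha, interaction=False, increment=1.0):
    """Advance a tie by dt: decay first, then add the jump if an
    interaction occurs at the end of the interval."""
    decayed = strength * math.exp(-alpha * dt)
    return decayed + increment if interaction else decayed


# Two interactions, at t = 0 and t = 1, with a half-life of one time unit.
alpha = math.log(2)
direct = tie_strength([0.0, 1.0], t=1.0, alpha=alpha)

# The same value obtained incrementally, as one would with streaming data.
b = step(0.0, 0.0, alpha, interaction=True)  # interaction at t = 0
b = step(b, 1.0, alpha, interaction=True)    # decay to t = 1, then jump
```

The incremental form only needs the current strength and the time since the last update, which is why the closed-form sum and the step-by-step update agree.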

arXiv:1905.13098 [pdf, other]

doi 10.1103/PhysRevE.100.062304

Customer mobility and congestion in supermarkets

Authors: Fabian Ying, Alisdair O. G. Wallis, Mariano Beguerisse-Díaz, Mason A. Porter, Sam D. Howison

Abstract: The analysis and characterization of human mobility using population-level mobility models is important for numerous applications, ranging from the estimation of commuter flows in cities to modeling trade flows between countries. However, almost all of these applications have focused on large spatial scales, which typically range between intra-city scales to inter-country scales. In this paper, we investigate population-level human mobility models on a much smaller spatial scale by using them to estimate customer mobility flow between supermarket zones. We use anonymized, ordered customer-basket data to infer empirical mobility flow in supermarkets, and we apply variants of the gravity and intervening-opportunities models to fit this mobility flow and estimate the flow on unseen data. We find that a doubly-constrained gravity model and an extended radiation model (which is a type of intervening-opportunities model) can successfully estimate 65–70% of the flow inside supermarkets. Using a gravity model as a case study, we then investigate how to reduce congestion in supermarkets using mobility models. We model each supermarket zone as a queue, and we use a gravity model to identify store layouts with low congestion, which we measure either by the maximum number of visits to a zone or by the total mean queue size. We then use a simulated-annealing algorithm to find store layouts with lower congestion than a supermarket's original layout. In these optimized store layouts, we find that popular zones are often in the perimeter of a store.
Our research gives insight both into how customers move in supermarkets and into how retailers can arrange stores to reduce congestion. It also provides a case study of human mobility on small spatial scales.

Submitted 26 September, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

Journal ref: Phys. Rev. E 100, 062304 (2019)

arXiv:1805.00193 [pdf, other]

Tie-decay networks in continuous time and eigenvector-based centralities

Authors: Walid Ahmad, Mason A. Porter, Mariano Beguerisse-Díaz

Abstract: Network theory is a useful framework for studying interconnected systems of interacting entities. Many networked systems evolve continuously in time, but most existing methods for the analysis of time-dependent networks rely on discrete or discretized time. In this paper, we propose an approach for studying networks that evolve in continuous time by distinguishing between interactions, which we model as discrete contacts, and ties, which encode the strengths of relationships as functions of time. To illustrate our tie-decay network formalism, we adapt the well-known PageRank centrality score to our tie-decay framework in a mathematically tractable and computationally efficient way. We apply this framework to a synthetic example and then use it to study a network of retweets during the 2012 National Health Service controversy in the United Kingdom. Our work also provides guidance for similar generalizations of other tools from network theory to continuous-time networks with tie decay, including for applications to streaming data.

Submitted 31 December, 2020; v1 submitted 1 May, 2018; originally announced May 2018.

arXiv:1701.00289 [pdf, other]

doi 10.1098/rsos.170154

Integrating sentiment and social structure to determine preference alignments: The Irish Marriage Referendum

Authors: David J. P. O'Sullivan, Guillermo Garduño-Hernández, James P. Gleeson, Mariano Beguerisse-Díaz

Abstract: We examine the relationship between social structure and sentiment through the analysis of a large collection of tweets about the Irish Marriage Referendum of 2015. We obtain the sentiment of every tweet with the hashtags #marref and #marriageref that was posted in the days leading to the referendum, and construct networks to aggregate sentiment and use it to study the interactions among users. Our results show that the sentiment of mention tweets posted by users is correlated with the sentiment of received mentions, and there are significantly more connections between users with similar sentiment scores than among users with opposite scores in the mention and follower networks. We combine the community structure of the two networks with the activity level of the users and sentiment scores to find groups of users who support voting 'yes' or 'no' in the referendum. There were numerous conversations between users on opposing sides of the debate in the absence of follower connections, which suggests that there were efforts by some users to establish dialogue and debate across ideological divisions. Our analysis shows that social structure can be integrated successfully with sentiment to analyse and understand the disposition of social media users. These results have potential applications in the integration of data and meta-data to study opinion dynamics, public opinion modelling, and polling.

Submitted 18 February, 2017; v1 submitted 1 January, 2017; originally announced January 2017.

Comments: 16 pages, 12 figures

Journal ref: R. Soc. open sci., 4, 170154 (2017)

arXiv:1605.01639 [pdf, other]

Flux-dependent graphs for metabolic networks

Authors: Mariano Beguerisse-Díaz, Gabriel Bosque, Diego Oyarzún, Jesús Picó, Mauricio Barahona

Abstract: Cells adapt their metabolic fluxes in response to changes in the environment. We present a framework for the systematic construction of flux-based graphs derived from organism-wide metabolic networks. Our graphs encode the directionality of metabolic fluxes via edges that represent the flow of metabolites from source to target reactions. The methodology can be applied in the absence of a specific biological context by modelling fluxes probabilistically, or can be tailored to different environmental conditions by incorporating flux distributions computed through constraint-based approaches such as Flux Balance Analysis. We illustrate our approach on the central carbon metabolism of Escherichia coli and on a metabolic model of human hepatocytes. The flux-dependent graphs under various environmental conditions and genetic perturbations exhibit systemic changes in their topological and community structure, which capture the re-routing of metabolic fluxes and the varying importance of specific reactions and pathways. By integrating constraint-based models and tools from network science, our framework allows the study of context-specific metabolic responses at a system level beyond standard pathway descriptions.

Submitted 28 March, 2018; v1 submitted 5 May, 2016; originally announced May 2016.

Comments: 26 Pages, 9 Figures

arXiv:1508.05764 [pdf, other]

doi 10.1177/2055207616688841

The 'who' and 'what' of #diabetes on Twitter

Authors: Mariano Beguerisse-Díaz, Amy K. McLennan, Guillermo Garduño-Hernández, Mauricio Barahona, Stanley J. Ulijaszek

Abstract: Social media are being increasingly used for health promotion, yet the landscape of users, messages and interactions in such fora is poorly understood. Studies of social media and diabetes have focused mostly on patients, or public agencies addressing it, but have not looked broadly at all the participants or the diversity of content they contribute. We study Twitter conversations about diabetes through the systematic analysis of 2.5 million tweets collected over 8 months and the interactions between their authors. We address three questions: (1) what themes arise in these tweets?, (2) who are the most influential users?, (3) which type of users contribute to which themes? We answer these questions using a mixed-methods approach, integrating techniques from anthropology, network science and information retrieval such as thematic coding, temporal network analysis, and community and topic detection. Diabetes-related tweets fall within broad thematic groups: health information, news, social interaction, and commercial. At the same time, humorous messages and references to popular culture appear consistently, more than any other type of tweet. We classify authors according to their temporal 'hub' and 'authority' scores. Whereas the hub landscape is diffuse and fluid over time, top authorities are highly persistent across time and comprise bloggers, advocacy groups and NGOs related to diabetes, as well as for-profit entities without specific diabetes expertise.
Top authorities fall into seven interest communities as derived from their Twitter follower network. Our findings have implications for public health professionals and policy makers who seek to use social media as an engagement tool and to inform policy design.

Submitted 30 January, 2017; v1 submitted 24 August, 2015; originally announced August 2015.

Comments: 25 pages, 11 figures, 7 tables. Supplemental spreadsheet available from http://journals.sagepub.com/doi/suppl/10.1177/2055207616688841, Digital Health, Vol 3, 2017

arXiv:1311.6785 [pdf, other]

doi 10.1098/rsif.2014.0940

Interest communities and flow roles in directed networks: the Twitter network of the UK riots

Authors: Mariano Beguerisse-Díaz, Guillermo Garduño-Hernández, Borislav Vangelov, Sophia N. Yaliraki, Mauricio Barahona

Abstract: Directionality is a crucial ingredient in many complex networks in which information, energy or influence are transmitted. In such directed networks, analysing flows (and not only the strength of connections) is crucial to reveal important features of the network that might go undetected if the orientation of connections is ignored. We showcase here a flow-based approach for community detection in networks through the study of the network of the most influential Twitter users during the 2011 riots in England. Firstly, we use directed Markov Stability to extract descriptions of the network at different levels of coarseness in terms of interest communities, i.e., groups of nodes within which flows of information are contained and reinforced. Such interest communities reveal user groupings according to location, profession, employer, and topic. The study of flows also allows us to generate an interest distance, which affords a personalised view of the attention in the network as viewed from the vantage point of any given user. Secondly, we analyse the profiles of incoming and outgoing long-range flows with a combined approach of role-based similarity and the novel relaxed minimum spanning tree algorithm to reveal that the users in the network can be classified into five roles. These flow roles go beyond the standard leader/follower dichotomy and differ from classifications based on regular/structural equivalence.
We then show that the interest communities fall into distinct informational organigrams characterised by a different mix of user roles reflecting the quality of dialogue within them. Our generic framework can be used to provide insight into how flows are generated, distributed, preserved and consumed in directed networks.

Submitted 8 October, 2014; v1 submitted 26 November, 2013; originally announced November 2013.

Comments: 32 pages, 14 figures. Supplementary Spreadsheet available from: http://www2.imperial.ac.uk/~mbegueri/Docs/riotsCommunities.zip or http://rsif.royalsocietypublishing.org/content/11/101/20140940/suppl/DC1

Journal ref: J. R. Soc. Interface 6 December 2014 vol. 11 no. 101 20140940

arXiv:1309.1795 [pdf, ps, other]

doi 10.1109/GlobalSIP.2013.6737046

Finding role communities in directed networks using Role-Based Similarity, Markov Stability and the Relaxed Minimum Spanning Tree

Authors: Mariano Beguerisse-Díaz, Borislav Vangelov, Mauricio Barahona

Abstract: We present a framework to cluster nodes in directed networks according to their roles by combining Role-Based Similarity (RBS) and Markov Stability, two techniques based on flows. First we compute the RBS matrix, which contains the pairwise similarities between nodes according to the scaled number of in- and out-directed paths of different lengths. The weighted RBS similarity matrix is then transformed into an undirected similarity network using the Relaxed Minimum-Spanning Tree (RMST) algorithm, which uses the geometric structure of the RBS matrix to unblur the network, such that edges between nodes with high, direct RBS are preserved. Finally, we partition the RMST similarity network into role-communities of nodes at all scales using Markov Stability to find a robust set of roles in the network. We showcase our framework through a biological and a man-made network.

Submitted 6 September, 2013; originally announced September 2013.

Comments: 4 pages, 2 figures

Journal ref: Global Conference on Signal and Information Processing (GlobalSIP), 2013 IEEE, pp.937,940, 3-5 Dec. 2013

arXiv:0906.4675 [pdf, ps, other]

doi 10.1063/1.3475411

Competition for Popularity in Bipartite Networks

Authors: Mariano Beguerisse-Diaz, Mason A. Porter, Jukka-Pekka Onnela

Abstract: We present a dynamical model for rewiring and attachment in bipartite networks in which edges are added between nodes that belong to catalogs that can either be fixed in size or growing in size. The model is motivated by an empirical study of data from the video rental service Netflix, which invites its users to give ratings to the videos available in its catalog. We find that the distribution of the number of ratings given by users and that of the number of ratings received by videos both follow a power law with an exponential cutoff. We also examine the activity patterns of Netflix users and find bursts of intense video-rating activity followed by long periods of inactivity. We derive ordinary differential equations to model the acquisition of edges by the nodes over time and obtain the corresponding time-dependent degree distributions. We then compare our results with the Netflix data and find good agreement. We conclude with a discussion of how catalog models can be used to study systems in which agents are forced to choose, rate, or prioritize their interactions from a very large set of options.

Submitted 27 May, 2010; v1 submitted 25 June, 2009; originally announced June 2009.

Comments: 13 Pages, 19 Figures

Showing 1–12 of 12 results for author: Beguerisse-Díaz, M