Search | arXiv e-print repository

The Joy of Neural Painting

Authors: Ernesto Diaz-Aviles, Claudia Orellana-Rodriguez, Beth Jochim

Abstract: Neural Painters is a class of models that follows a GAN framework to generate brushstrokes, which are then composed to create paintings. GANs are great generative models for AI Art, but they are known to be notoriously difficult to train. To overcome GANs' limitations and to speed up the Neural Painter training, we applied Transfer Learning to the process, reducing it from days to only hours while achieving the same level of visual aesthetics in the final paintings generated. We report our approach and results in this work.

Submitted 22 November, 2021; v1 submitted 19 November, 2021; originally announced November 2021.

Report number: 2019-LAI-CUEVA-X01 ACM Class: I.2; I.4.9

arXiv:2109.06990 [pdf, other]

Personalization, Privacy, and Me

Authors: Reshma Narayanan Kutty, Claudia Orellana-Rodriguez, Igor Brigadir, Ernesto Diaz-Aviles

Abstract: News recommendation and personalization is not a solved problem. People are growing concerned about their data being collected in excess in the name of personalization and about its use for purposes other than the ones they would consider reasonable. Our experience in building personalization products for publishers while safeguarding user privacy led us to investigate the user perspective on privacy and personalization further. We conducted a survey to explore people's experience with personalization and privacy and the viewpoints of different age groups. In this paper, we share with publishers and the community our major findings, which can inform the algorithmic design and implementation of the next generation of news recommender systems; these systems must put the human at their core and balance personalization and privacy to reap the benefits of both.

Submitted 14 September, 2021; originally announced September 2021.

Comments: ACM CCS Concepts: Information systems~Recommender systems, Information systems~Personalization, Security and privacy~Human and societal aspects of security and privacy, General and reference~Surveys and overviews. Keywords: Personalization, Privacy, Survey

arXiv:2109.03955 [pdf]

doi 10.1145/3460231.3478884

NU:BRIEF -- A Privacy-aware Newsletter Personalization Engine for Publishers

Authors: Ernesto Diaz-Aviles, Claudia Orellana-Rodriguez, Igor Brigadir, Reshma Narayanan Kutty

Abstract: Newsletters have (re-)emerged as a powerful tool for publishers to engage with their readers directly and more effectively. Despite the diversity in their audiences, publishers' newsletters remain largely a one-size-fits-all offering, which is suboptimal. In this paper, we present NU:BRIEF, a web application for publishers that enables them to personalize their newsletters without harvesting personal data. Personalized newsletters build a habit and become a great conversion tool for publishers, providing an alternative reader-generated revenue model to a declining ad/clickbait-centered business model.

Submitted 8 September, 2021; originally announced September 2021.

Comments: Fifteenth ACM Conference on Recommender Systems (RecSys '21), September 27-October 1, 2021, Amsterdam, Netherlands

arXiv:1611.03426 [pdf, other]

Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter?

Authors: Avaré Stewart, Sara Romano, Nattiya Kanhabua, Sergio Di Martino, Wolf Siberski, Antonino Mazzeo, Wolfgang Nejdl, Ernesto Diaz-Aviles

Abstract: Social media services such as Twitter are a valuable source of information for decision support systems. Many studies have shown that this also holds for the medical domain, where Twitter is considered a viable tool for public health officials to sift through relevant information for the early detection, management, and control of epidemic outbreaks. This is possible due to the inherent capability of social media services to transmit information faster than traditional channels. However, the majority of current studies have limited their scope to the detection of common and seasonal recurring health events (e.g., Influenza-like Illness), partially due to the noisy nature of Twitter data, which makes outbreak detection and management very challenging. Within the European project M-Eco, we developed a Twitter-based Epidemic Intelligence (EI) system, which is designed to also handle a more general class of unexpected and aperiodic outbreaks. In particular, we faced three main research challenges in this endeavor: 1) dynamic classification to manage terminology evolution of Twitter messages, 2) alert generation to produce reliable outbreak alerts analyzing the (noisy) tweet time series, and 3) ranking and recommendation to support domain experts for better assessment of the generated alerts. In this paper, we empirically evaluate our proposed approach to these challenges using real-world outbreak datasets and a large collection of tweets. We validate our solution with domain experts, describe our experiences, and give a more realistic view on the benefits and issues of analyzing social media for public health.

Submitted 10 November, 2016; originally announced November 2016.

Comments: ACM CCS Concepts: Applied computing - Health informatics; Information systems - Web mining; Document filtering; Novelty in information retrieval; Recommender systems; Human-centered computing - Social media

arXiv:1604.00647 [pdf, other]

Multi-Relational Learning at Scale with ADMM

Authors: Lucas Drumond, Ernesto Diaz-Aviles, Lars Schmidt-Thieme

Abstract: Learning from multi-relational data that contains noise, ambiguities, or duplicate entities is essential to a wide range of applications such as statistical inference based on Web Linked Data, recommender systems, computational biology, and natural language processing. These tasks usually require working with very large and complex datasets (e.g., the Web graph); however, current approaches to multi-relational learning are not practical for such scenarios due to their high computational complexity and poor scalability on large data. In this paper, we propose a novel and scalable approach for multi-relational factorization based on consensus optimization. Our model, called ConsMRF, is based on the Alternating Direction Method of Multipliers (ADMM) framework, which enables us to optimize each target relation using a smaller set of parameters than the state-of-the-art competitors in this task. Due to ADMM's nature, ConsMRF can be easily parallelized, which makes it suitable for large multi-relational data. Experiments on large Web datasets derived from DBpedia, Wikipedia and YAGO show the efficiency and performance improvement of ConsMRF over strong competitors. In addition, ConsMRF's near-linear scalability indicates great potential to tackle Web-scale problem sizes.

Submitted 3 April, 2016; originally announced April 2016.

Comments: Keywords: Multi-Relational Learning, Distributed Learning, Factorization Models, ADMM

arXiv:1509.05257 [pdf, other]

(Blue) Taxi Destination and Trip Time Prediction from Partial Trajectories

Authors: Hoang Thanh Lam, Ernesto Diaz-Aviles, Alessandra Pascale, Yiannis Gkoufas, Bei Chen

Abstract: Real-time estimation of destination and travel time for taxis is of great importance for existing electronic dispatch systems. We present an approach based on trip matching and ensemble learning, in which we leverage the patterns observed in a dataset of roughly 1.7 million taxi journeys to predict the corresponding final destination and travel time for ongoing taxi trips, as a solution for the ECML/PKDD Discovery Challenge 2015 competition. The results of our empirical evaluation show that our approach is effective and very robust, which led our team -- BlueTaxi -- to the 3rd and 7th position of the final rankings for the trip time and destination prediction tasks, respectively. Given the fact that the final rankings were computed using a very small test set (with only 320 trips) we believe that our approach is one of the most robust solutions for the challenge based on the consistency of our good results across the test sets.

Submitted 17 September, 2015; originally announced September 2015.

Comments: ECML/PKDD Discovery Challenge 2015

ACM Class: I.2.6; I.5.2

arXiv:1508.02884 [pdf, other]

Towards Real-time Customer Experience Prediction for Telecommunication Operators

Authors: Ernesto Diaz-Aviles, Fabio Pinelli, Karol Lynch, Zubair Nabi, Yiannis Gkoufas, Eric Bouillet, Francesco Calabrese, Eoin Coughlan, Peter Holland, Jason Salzwedel

Abstract: Telecommunications operators' (telcos) traditional sources of income, voice and SMS, are shrinking due to customers using over-the-top (OTT) applications such as WhatsApp or Viber. In this challenging environment it is critical for telcos to maintain or grow their market share by providing users with as good an experience as possible on their network. But the task of extracting customer insights from the vast amounts of data collected by telcos is growing in complexity and scale every day. How can we measure and predict the quality of a user's experience on a telco network in real-time? That is the problem that we address in this paper. We present an approach to capture, in (near) real-time, the mobile customer experience in order to assess which conditions lead the user to place a call to a telco's customer care center. To this end, we follow a supervised learning approach for prediction and train our 'Restricted Random Forest' model using, as a proxy for bad experience, the observed customer transactions in the telco data feed before the user places a call to a customer care center. We evaluate our approach using a rich dataset provided by a major African telecommunications company and a novel big data architecture for both the training and scoring of predictive models. Our empirical study shows our solution to be effective at predicting user experience by inferring if a customer will place a call based on his current context. These promising results open new possibilities for improved customer service, which will help telcos to reduce churn rates and improve customer experience, both factors that directly impact their revenue growth.

Submitted 24 September, 2015; v1 submitted 12 August, 2015; originally announced August 2015.

Comments: IEEE 2015 BigData Conference (to appear). Keywords: Telecom operators; Customer Care; Big Data; Predictive Analytics

ACM Class: I.2.6; K.4.0; H.3.3

arXiv:1412.7990 [pdf, other]

doi 10.1145/2668067.2668072

Predicting User Engagement in Twitter with Collaborative Ranking

Authors: Ernesto Diaz-Aviles, Hoang Thanh Lam, Fabio Pinelli, Stefano Braghin, Yiannis Gkoufas, Michele Berlingerio, Francesco Calabrese

Abstract: Collaborative Filtering (CF) is a core component of popular web-based services such as Amazon, YouTube, Netflix, and Twitter. Most applications use CF to recommend a small set of items to the user. For instance, YouTube presents to a user a list of top-n videos she would likely watch next based on her rating and viewing history. Current methods of CF evaluation have been focused on assessing the quality of a predicted rating or the ranking performance for top-n recommended items. However, restricting the recommender system evaluation to these two aspects is rather limiting and neglects other dimensions that could better characterize a well-perceived recommendation. In this paper, instead of optimizing rating or top-n recommendation, we focus on the task of predicting which items generate the highest user engagement. In particular, we use Twitter as our testbed and cast the problem as a Collaborative Ranking task where the rich features extracted from the metadata of the tweets help to complement the transaction information limited to user ids, item ids, ratings and timestamps. We learn a scoring function that directly optimizes the user engagement in terms of nDCG@10 on the predicted ranking. Experiments conducted on an extended version of the MovieTweetings dataset, released as part of the RecSys Challenge 2014, show the effectiveness of our approach.

Submitted 26 December, 2014; originally announced December 2014.

Comments: RecSysChallenge'14 at RecSys 2014, October 10, 2014, Foster City, CA, USA

ACM Class: H.3.3; I.2.6

Journal ref: In Proceedings of the 2014 Recommender Systems Challenge (RecSysChallenge'14). ACM, New York, NY, USA, Pages 41, 6 pages

arXiv:1407.4832 [pdf, other]

Collaborative Filtering Ensemble for Personalized Name Recommendation

Authors: Bernat Coma-Puig, Ernesto Diaz-Aviles, Wolfgang Nejdl

Abstract: Out of thousands of names to choose from, picking the right one for your child is a daunting task. In this work, our objective is to help parents make an informed decision while choosing a name for their baby. We follow a recommender system approach and combine, in an ensemble, the individual rankings produced by simple collaborative filtering algorithms in order to produce a personalized list of names that meets the individual parents' taste. Our experiments were conducted using real-world data collected from the query logs of 'nameling' (nameling.net), an online portal for searching and exploring names, which corresponds to the dataset released in the context of the ECML PKDD Discovery Challenge 2013. Our approach is intuitive, easy to implement, and features fast training and prediction steps.

Submitted 16 July, 2014; originally announced July 2014.

Comments: Top-N recommendation; personalized ranking; given name recommendation

ACM Class: H.3.3; I.2.6

Journal ref: Proceedings of the ECML PKDD Discovery Challenge - Recommending Given Names. Co-located with ECML PKDD 2013. Prague, Czech Republic, September 27, 2013

arXiv:1203.1378 [pdf, other]

Epidemic Intelligence for the Crowd, by the Crowd (Full Version)

Authors: Ernesto Diaz-Aviles, Avaré Stewart, Edward Velasco, Kerstin Denecke, Wolfgang Nejdl

Abstract: Tracking Twitter for public health has shown great potential. However, most recent work has been focused on correlating Twitter messages to influenza rates, a disease that exhibits a marked seasonal pattern. In the presence of sudden outbreaks, how can social media streams be used to strengthen surveillance capacity? In May 2011, Germany reported an outbreak of Enterohemorrhagic Escherichia coli (EHEC). It was one of the largest described outbreaks of EHEC/HUS worldwide and the largest in Germany. In this work, we study the crowd's behavior in Twitter during the outbreak. In particular, we report how tracking Twitter helped to detect key user messages that triggered signal detection alarms before MedISys and other well established early warning systems. We also introduce a personalized learning to rank approach that exploits the relationships discovered by: (i) latent semantic topics computed using Latent Dirichlet Allocation (LDA), and (ii) observing the social tagging behavior in Twitter, to rank tweets for epidemic intelligence. Our results provide the grounds for new public health research based on social media.

Submitted 5 March, 2012; originally announced March 2012.

Comments: A short version of this work has been accepted for publication at the International AAAI Conference on Weblogs and Social Media (ICWSM 2012)

arXiv:1101.0654 [pdf, ps, other]

Personalized Event-Based Surveillance and Alerting Support for the Assessment of Risk

Authors: Avaré Stewart, Ricardo Lage, Ernesto Diaz-Aviles, Peter Dolog

Abstract: In a typical Event-Based Surveillance setting, a stream of web documents is continuously monitored for disease reporting. A structured representation of the disease reporting events is extracted from the raw text, and the events are then aggregated to produce signals, which are intended to represent early warnings against potential public health threats. To public health officials, these warnings represent an overwhelming list of "one-size-fits-all" information for risk assessment. To reduce this overload, two techniques are proposed. First, filtering signals according to the user's preferences (e.g., location, disease, symptoms, etc.) helps reduce the undesired noise. Second, re-ranking the filtered signals, according to an individual's feedback and annotation, allows a user-specific, prioritized ranking of the most relevant warnings. We introduce an approach that takes into account this two-step process of: 1) filtering and 2) re-ranking the results of reporting signals. For this, Collaborative Filtering and Personalization are common techniques used to support users in dealing with the large amount of information that they face.

Submitted 4 January, 2011; originally announced January 2011.

Comments: International Meeting on Emerging Diseases and Surveillance. IMED 2011 - Poster Session - Vienna, Austria. February 4-7, 2011

arXiv:0902.0798 [pdf, ps, other]

Alleviating Media Bias Through Intelligent Agent Blogging

Authors: Ernesto Diaz-Aviles

Abstract: Consumers of mass media must have a comprehensive, balanced and plural selection of news to get an unbiased perspective, but achieving this goal can be very challenging, laborious and time consuming. The development of news stories over time, their (in)consistency, and the different levels of coverage across media outlets are challenges that a conscientious reader has to overcome in order to alleviate bias. In this paper we present an intelligent agent framework currently facilitating analysis of the main sources of on-line news in El Salvador. We show how prior tools of text analysis and Web 2.0 technologies can be combined with minimal manual intervention to help individuals in their rational decision process, while holding media outlets accountable for their work.

Submitted 4 February, 2009; originally announced February 2009.

ACM Class: I.2.11; J.4; H.3

arXiv:0812.4461 [pdf, ps, other]

Mining User Profiles to Support Structure and Explanation in Open Social Networking

Authors: Avaré Stewart, Ernesto Diaz-Aviles, Wolfgang Nejdl

Abstract: The proliferation of media sharing and social networking websites has brought with it vast collections of site-specific user generated content. The result is a Social Networking Divide in which the concepts and structure common across different sites are hidden. The knowledge and structures from one social site are not adequately exploited to provide new information and resources to the same or different users in comparable social sites. For music bloggers, this latent structure forces them to select sub-optimal blogrolls. However, by integrating the social activities of music bloggers and listeners, we are able to overcome this limitation: improving the quality of the blogroll neighborhoods, in terms of similarity, by 85 percent when using tracks and by 120 percent when integrating tags from another site.

Submitted 23 December, 2008; originally announced December 2008.

Comments: International Workshop on Interacting with Multimedia Content in the Social Semantic Web (IMC-SSW 2008). Collocated with the 3rd International Conference on Semantic and Digital Media Technologies (SAMT 2008), Koblenz, Germany, Dec. 03 2008

ACM Class: H.3.3; H.3.5

Journal ref: In Proceedings of the International Workshop on Interacting with Multimedia Content in the Social Semantic Web (IMC-SSW'08), pages 21-30. Koblenz, Germany, Dec. 3, 2008

arXiv:0812.4460 [pdf, ps, other]

Emergence of Spontaneous Order Through Neighborhood Formation in Peer-to-Peer Recommender Systems

Authors: Ernesto Diaz-Aviles, Lars Schmidt-Thieme, Cai-Nicolas Ziegler

Abstract: The advent of the Semantic Web necessitates paradigm shifts away from centralized client/server architectures towards decentralization and peer-to-peer computation, making the existence of central authorities superfluous and even impossible. At the same time, recommender systems are gaining considerable impact in e-commerce, providing people with recommendations that are personalized and tailored to their very needs. These recommender systems have traditionally been deployed with stark centralized scenarios in mind, operating in closed communities detached from their host network's outer perimeter. We aim at marrying these two worlds, i.e., decentralized peer-to-peer computing and recommender systems, in one agent-based framework. Our architecture features an epidemic-style protocol maintaining neighborhoods of like-minded peers in a robust, self-organizing fashion. In order to demonstrate our architecture's ability to retain scalability, robustness and to allow for convergence towards high-quality recommendations, we conduct offline experiments on top of the popular MovieLens dataset.

Submitted 23 December, 2008; originally announced December 2008.

Comments: WWW '05 International Workshop on Innovations in Web Infrastructure (IWI '05) May 10, 2005, Chiba, Japan

ACM Class: C.2.4; H.3.3

Journal ref: WWW '05 International Workshop on Innovations in Web Infrastructure (IWI '05) May 10, 2005, Chiba, Japan

Showing 1–14 of 14 results for author: Diaz-Aviles, E