Towards an ontology of portions of matter to support multi-scale analysis and provenance tracking

Authors: Lucas Valadares Vieira, Mara Abel, Fabricio Henrique Rodrigues, Tiago Prince Sales, Claudenir M. Fonseca

Abstract: This paper presents an ontology of portions of matter with practical implications across scientific and industrial domains. The ontology is developed under the Unified Foundational Ontology (UFO), which uses the concept of quantity to represent topologically maximally self-connected portions of matter. The proposed ontology introduces the granuleOf parthood relation, holding between objects and portions of matter. It also discusses the constitution of quantities by collections of granules, the representation of sub-portions of matter, and the tracking of matter provenance between quantities using historical relations. A case study is then presented to demonstrate the use of the portion of matter ontology in the geology domain for an Oil & Gas industry application. In the case study, we model the historical relation between an original portion of rock and the sub-portions created during the industrial process. Finally, future research directions are outlined, including investigating granularity levels and defining a taxonomy of events.

Submitted 1 June, 2024; originally announced June 2024.

ACM Class: I.2.4

arXiv:2405.15896 [pdf, other]

Enhancing Augmentative and Alternative Communication with Card Prediction and Colourful Semantics

Authors: Jayr Pereira, Francisco Rodrigues, Jaylton Pereira, Cleber Zanchettin, Robson Fidalgo

Abstract: This paper presents an approach to enhancing Augmentative and Alternative Communication (AAC) systems by integrating Colourful Semantics (CS) with transformer-based language models specifically tailored for Brazilian Portuguese. We introduce an adapted BERT model, BERTptCS, which incorporates the CS framework for improved prediction of communication cards. The primary aim is to enhance the accuracy and contextual relevance of communication card predictions, which are essential in AAC systems for individuals with complex communication needs (CCN). We compared BERTptCS with a baseline model, BERTptAAC, which lacks CS integration. Our results demonstrate that BERTptCS significantly outperforms BERTptAAC in various metrics, including top-k accuracy, Mean Reciprocal Rank (MRR), and Entropy@K. Integrating CS into the language model improves prediction accuracy and offers a more intuitive and contextual understanding of user inputs, facilitating more effective communication.

Submitted 24 May, 2024; originally announced May 2024.
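
Two of the metrics named in the abstract above, top-k accuracy and Mean Reciprocal Rank, have standard definitions that can be sketched in a few lines. The snippet below is only an illustration of those textbook definitions, not the authors' evaluation code; the function names and the toy card vocabulary are hypothetical, and Entropy@K is left out because the abstract does not define it.

```python
def top_k_accuracy(rankings, targets, k):
    """Fraction of cases whose target card appears among the first k predictions."""
    hits = sum(1 for ranked, target in zip(rankings, targets) if target in ranked[:k])
    return hits / len(targets)


def mean_reciprocal_rank(rankings, targets):
    """Average of 1/rank of the target card; a missing target contributes 0."""
    total = 0.0
    for ranked, target in zip(rankings, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)  # ranks are 1-based
    return total / len(targets)


# Hypothetical ranked card predictions for three contexts.
rankings = [
    ["eat", "drink", "play"],
    ["go", "want", "eat"],
    ["play", "go", "want"],
]
targets = ["eat", "eat", "sleep"]

print(top_k_accuracy(rankings, targets, k=1))
print(top_k_accuracy(rankings, targets, k=3))
print(mean_reciprocal_rank(rankings, targets))
```

MRR rewards placing the correct card near the top of the list, while top-k accuracy only checks whether it appears within the first k suggestions, which is why the two are usually reported together.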

arXiv:2402.11973 [pdf, other]

Bayesian Active Learning for Censored Regression

Authors: Frederik Boe Hüttel, Christoffer Riis, Filipe Rodrigues, Francisco Câmara Pereira

Abstract: Bayesian active learning is based on information-theoretic approaches that focus on maximising the information that new observations provide to the model parameters. This is commonly done by maximising the Bayesian Active Learning by Disagreement (BALD) acquisition function. However, we highlight that it is challenging to estimate BALD when the new data points are subject to censorship, where only clipped values of the targets are observed. To address this, we derive the entropy and the mutual information for censored distributions and derive the BALD objective for active learning in censored regression ($\mathcal{C}$-BALD). We propose a novel modelling approach to estimate the $\mathcal{C}$-BALD objective and use it for active learning in the censored setting. Across a wide range of datasets and models, we demonstrate that $\mathcal{C}$-BALD outperforms other Bayesian active learning methods in censored regression.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2401.05322 [pdf, other]

Arrival Time Prediction for Autonomous Shuttle Services in the Real World: Evidence from Five Cities

Authors: Carolin Schmidt, Mathias Tygesen, Filipe Rodrigues

Abstract: Urban mobility is on the cusp of transformation with the emergence of shared, connected, and cooperative automated vehicles. Yet, for them to be accepted by customers, trust in their punctuality is vital. Many pilot initiatives operate without a fixed schedule, thus enhancing the importance of reliable arrival time (AT) predictions. This study presents an AT prediction system for autonomous shuttles, utilizing separate models for dwell and running time predictions, validated on real-world data from five cities. Alongside established methods such as XGBoost, we explore the benefits of integrating spatial data using graph neural networks (GNN). To accurately handle the case of a shuttle bypassing a stop, we propose a hierarchical model combining a random forest classifier and a GNN. The results for the final AT prediction are promising, showing low errors even when predicting several stops ahead. Yet, no single model emerges as universally superior, and we provide insights into the characteristics of pilot sites that influence the model selection process. Finally, we identify dwell time prediction as the key determinant in overall AT prediction accuracy when autonomous shuttles are deployed in low-traffic areas or under regulatory speed limits. This research provides insights into the current state of autonomous public transport prediction models and paves the way for more data-informed decision-making as the field advances.

Submitted 10 January, 2024; originally announced January 2024.

arXiv:2311.11200 [pdf, other]

Scale-free networks: improved inference

Authors: Nixon Jerez-Lillo, Francisco A. Rodrigues, Pedro L. Ramos

Abstract: The power-law distribution plays a crucial role in complex networks as well as various applied sciences. Investigating whether the degree distribution of a network follows a power-law distribution is an important concern. The commonly used inferential methods for estimating the model parameters often yield biased estimates, which can lead to the rejection of the hypothesis that a model conforms to a power-law. In this paper, we discuss improved methods that utilize Bayesian inference to obtain accurate estimates and precise credibility intervals. The inferential methods are derived for both continuous and discrete distributions. These methods reveal that objective Bayesian approaches return nearly unbiased estimates for the parameters of both models. Notably, in the continuous case, we identify an explicit posterior distribution. This work enhances the power of goodness-of-fit tests, enabling us to accurately discern whether a network or any other dataset adheres to a power-law distribution. We apply the proposed approach to fit degree distributions for more than 5,000 synthetic networks and over 3,000 real networks. The results indicate that our method is more suitable in practice, as it yields a frequency of acceptance close to the specified nominal level.

Submitted 18 November, 2023; originally announced November 2023.

Comments: 28 pages, 9 figures

arXiv:2310.10368 [pdf, other]

Machine learning in physics: a short guide

Authors: Francisco A. Rodrigues

Abstract: Machine learning is a rapidly growing field with the potential to revolutionize many areas of science, including physics. This review provides a brief overview of machine learning in physics, covering the main concepts of supervised, unsupervised, and reinforcement learning, as well as more specialized topics such as causal inference, symbolic regression, and deep learning. We present some of the principal applications of machine learning in physics and discuss the associated challenges and perspectives.

Submitted 16 October, 2023; originally announced October 2023.

Comments: 8 pages, 1 figure. Europhysics Letters (EPL), 2023

arXiv:2310.09131 [pdf, other]

Machine learning-based prediction of Q-voter model in complex networks

Authors: Aruane M. Pineda, Paul Kent, Colm Connaughton, Francisco A. Rodrigues

Abstract: In this article, we consider machine learning algorithms to accurately predict two variables associated with the $Q$-voter model in complex networks, i.e., (i) the consensus time and (ii) the frequency of opinion changes. Leveraging nine topological measures of the underlying networks, we verify that the clustering coefficient (C) and information centrality (IC) emerge as the most important predictors for these outcomes. Notably, the machine learning algorithms demonstrate accuracy across three distinct initialization methods of the $Q$-voter model, including random selection and the involvement of high- and low-degree agents with positive opinions. By unraveling the intricate interplay between network structure and dynamics, this research sheds light on the underlying mechanisms responsible for polarization effects and other dynamic patterns in social systems. Adopting a holistic approach that comprehends the complexity of network systems, this study offers insights into the intricate dynamics associated with polarization effects and paves the way for investigating the structure and dynamics of complex systems through modern machine learning methods.

Submitted 13 October, 2023; originally announced October 2023.

Comments: 32 pages, 10 figures

Journal ref: Journal of Statistical Mechanics: Theory and Experiment (JSTAT), 2023

arXiv:2308.10650 [pdf, other]

Deep Evidential Learning for Bayesian Quantile Regression

Authors: Frederik Boe Hüttel, Filipe Rodrigues, Francisco Câmara Pereira

Abstract: It is desirable to have accurate uncertainty estimation from a single deterministic forward-pass model, as traditional methods for uncertainty quantification are computationally expensive. However, this is difficult because single forward-pass models do not sample weights during inference and often make assumptions about the target distribution, such as assuming it is Gaussian. This can be restrictive in regression tasks, where the mean and standard deviation are inadequate to model the target distribution accurately. This paper proposes a deep Bayesian quantile regression model that can estimate the quantiles of a continuous target distribution without the Gaussian assumption. The proposed method is based on evidential learning, which allows the model to capture aleatoric and epistemic uncertainty with a single deterministic forward-pass model. This makes the method efficient and scalable to large models and datasets. We demonstrate that the proposed method achieves calibrated uncertainties on non-Gaussian distributions, disentanglement of aleatoric and epistemic uncertainty, and robustness to out-of-distribution samples.

Submitted 21 August, 2023; originally announced August 2023.

arXiv:2306.13200 [pdf, other]

Improving Log-Cumulant Based Estimation of Roughness Information in SAR imagery

Authors: Jeova Farias Sales Rocha Neto, Francisco Alixandre Avila Rodrigues

Abstract: Synthetic Aperture Radar (SAR) image understanding is crucial in remote sensing applications, but it is hindered by its intrinsic noise contamination, called speckle. Sophisticated statistical models, such as the $\mathcal{G}^0$ family of distributions, have been applied to SAR data, and many of the current advancements in processing this imagery have been accomplished through extracting information from these models. In this paper, we propose improvements to parameter estimation in $\mathcal{G}^0$ distributions using the Method of Log-Cumulants. First, using Bayesian modeling, we construct estimators that regularly produce reliable roughness estimates under both $\mathcal{G}^0_A$ and $\mathcal{G}^0_I$ models. Second, we make use of an approximation of the Trigamma function to compute the estimated roughness in constant time, making it considerably faster than the existing method for this task. Finally, we show how we can use this method to achieve fast and reliable SAR image understanding based on roughness information.

Submitted 22 June, 2023; originally announced June 2023.

arXiv:2305.09129 [pdf, other]

Graph Reinforcement Learning for Network Control via Bi-Level Optimization

Authors: Daniele Gammelli, James Harrison, Kaidi Yang, Marco Pavone, Filipe Rodrigues, Francisco C. Pereira

Abstract: Optimization problems over dynamic networks have been extensively studied and widely used in the past decades to formulate numerous real-world problems. However, (1) traditional optimization-based approaches do not scale to large networks, and (2) the design of good heuristics or approximation algorithms often requires significant manual trial-and-error. In this work, we argue that data-driven strategies can automate this process and learn efficient algorithms without compromising optimality. To do so, we present network control problems through the lens of reinforcement learning and propose a graph network-based framework to handle a broad class of problems. Instead of naively computing actions over high-dimensional graph elements, e.g., edges, we propose a bi-level formulation where we (1) specify a desired next state via RL, and (2) solve a convex program to best achieve it, leading to drastically improved scalability and performance. We further highlight a collection of desirable features to system designers, investigate design decisions, and present experiments on real-world control problems showing the utility, scalability, and flexibility of our framework.

Submitted 15 May, 2023; originally announced May 2023.

Comments: 9 pages, 4 figures

arXiv:2303.16859 [pdf, other]

doi 10.1088/2632-072X/acf6a4

Group polarization, influence, and domination in online interaction networks: A case study of the 2022 Brazilian elections

Authors: Ruben Interian, Francisco A. Rodrigues

Abstract: In this work, we investigate the evolution of polarization, influence, and domination in online interaction networks. Twitter data collected before and during the 2022 Brazilian elections is used as a case study. From a theoretical perspective, we develop a methodology called d-modularity that allows discovering the contribution of specific groups to network polarization using the well-known modularity measure. While the overall network modularity (somewhat unexpectedly) decreased, the proposed group-oriented approach allows concluding that the contribution of the right-leaning community to this modularity increased, remaining very high during the analyzed period. Our methodology is general enough to be used in any situation where the contribution of specific groups to overall network modularity and polarization needs to be investigated. Moreover, using the concept of partial domination, we are able to compare the reach of sets of influential profiles from different groups and their ability to accomplish coordinated communication inside their groups and across segments of the entire network during some specific time window. We show that in the whole network, the left-leaning high-influential information spreaders dominated, reaching a substantial fraction of users with fewer spreaders. However, when comparing domination inside the groups, the results are reversed. Right-leaning spreaders dominate their communities using few nodes, proving the most capable of accomplishing coordinated communication. The results provide evidence of extreme isolation and the ease of accomplishing coordinated communication that characterized right-leaning communities during the 2022 Brazilian elections.

Submitted 29 March, 2023; originally announced March 2023.

MSC Class: 05C69; 05C90 ACM Class: J.4; G.2.2

arXiv:2303.15489 [pdf]

Railway Network Delay Evolution: A Heterogeneous Graph Neural Network Approach

Authors: Zhongcan Li, ** Huang, Chao Wen, Filipe Rodrigues

Abstract: Railway operations involve different types of entities (stations, trains, etc.), making the existing graph/network models with homogeneous nodes (i.e., the same kind of nodes) incapable of capturing the interactions between the entities. This paper aims to develop a heterogeneous graph neural network (HetGNN) model, which can address different types of nodes (i.e., heterogeneous nodes), to investigate the train delay evolution on railway networks. To this end, a graph architecture combining the HetGNN model and the GraphSAGE homogeneous GNN (HomoGNN), called SAGE-Het, is proposed. The aim is to capture the interactions between trains, trains and stations, and stations and other stations on delay evolution based on different edges. In contrast to the traditional methods that require the inputs to have constant dimensions (e.g., in rectangular or grid-like arrays) or only allow homogeneous nodes in the graph, SAGE-Het allows for flexible inputs and heterogeneous nodes. Data from two sub-networks of the China railway network are used to test the performance and robustness of the proposed SAGE-Het model. The experimental results show that SAGE-Het exhibits better performance than the existing delay prediction methods and some advanced HetGNNs used for other prediction tasks, and that it outperforms the other baseline methods under all tested prediction time horizons (10/20/30 min ahead). The influence of train interactions on delay propagation is also investigated based on the proposed model. The results show that train interactions become subtle when the train headways increase. This finding directly contributes to decision-making in situations where conflict-resolution or train-canceling actions are needed.

Submitted 27 March, 2023; originally announced March 2023.

Comments: 29 pages; 8 figures; 7 tables

arXiv:2302.14833 [pdf, other]

Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning

Authors: Carolin Schmidt, Daniele Gammelli, Francisco Camara Pereira, Filipe Rodrigues

Abstract: Autonomous Mobility-on-Demand (AMoD) systems are an evolving mode of transportation in which a centrally coordinated fleet of self-driving vehicles dynamically serves travel requests. The control of these systems is typically formulated as a large network optimization problem, and reinforcement learning (RL) has recently emerged as a promising approach to solve the open challenges in this space. Recent centralized RL approaches focus on learning from online data, ignoring the per-sample cost of interactions within real-world transportation systems. To address these limitations, we propose to formalize the control of AMoD systems through the lens of offline reinforcement learning and learn effective control strategies using solely offline data, which is readily available to current mobility operators. We further investigate design decisions and provide empirical evidence based on data from real-world mobility systems showing how offline learning makes it possible to recover AMoD control policies that (i) exhibit performance on par with online methods, (ii) allow for sample-efficient online fine-tuning and (iii) eliminate the need for complex simulation environments. Crucially, this paper demonstrates that offline RL is a promising paradigm for the application of RL-based solutions within economically-critical systems, such as mobility systems.

Submitted 25 August, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

arXiv:2301.06418 [pdf, other]

Mind the Gap: Modelling Difference Between Censored and Uncensored Electric Vehicle Charging Demand

Authors: Frederik Boe Hüttel, Filipe Rodrigues, Francisco Câmara Pereira

Abstract: Electric vehicle charging demand models, with charging records as input, will inherently be biased toward the supply of available chargers. These models often fail to account for demand lost from occupied charging stations and competitors. The lost demand suggests that the actual demand is likely higher than the charging records reflect, i.e., the true demand is latent (unobserved), and the observations are censored. As a result, machine learning models that rely on these observed records for forecasting charging demand may be limited in their application in future infrastructure expansion and supply management, as they do not estimate the true demand for charging. We propose using censorship-aware models to model charging demand to address this limitation. These models incorporate censorship in their loss functions and learn the true latent demand distribution from observed charging records. We study how occupied charging stations and competing services censor demand using GPS trajectories from cars in Copenhagen, Denmark. We find that censorship occurs up to $61\%$ of the time in some areas of the city. We use the observed charging demand from our study to estimate the true demand and find that censorship-aware models provide better prediction and uncertainty estimation of actual demand than censorship-unaware models. We suggest that future charging models based on charging records should account for censoring to expand the application areas of machine learning models in supply management and infrastructure expansion.

Submitted 30 May, 2023; v1 submitted 16 January, 2023; originally announced January 2023.
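
The abstract above says censorship-aware models build censoring into the loss function. One common way to do that is a Tobit-style Gaussian likelihood, sketched below. This is an assumption made for illustration, since the abstract does not state which loss the authors use; the function names and the toy data are hypothetical. An uncensored record contributes the usual Gaussian log-density, while a right-censored record (for example, a charger that was occupied) only tells us that the true demand was at least the observed value, so it contributes the log-probability of exceeding that value.

```python
import math


def gaussian_log_pdf(y, mu, sigma):
    """Log-density of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def gaussian_log_sf(y, mu, sigma):
    """Log of P(Y >= y) for Y ~ N(mu, sigma^2).

    Uses erfc directly; it can underflow for very large (y - mu) / sigma.
    """
    z = (y - mu) / sigma
    return math.log(0.5 * math.erfc(z / math.sqrt(2.0)))


def censored_nll(observations, mu, sigma):
    """Mean negative log-likelihood over (value, is_right_censored) pairs."""
    total = 0.0
    for y, is_censored in observations:
        if is_censored:
            total -= gaussian_log_sf(y, mu, sigma)   # true value is >= y
        else:
            total -= gaussian_log_pdf(y, mu, sigma)  # true value is exactly y
    return total / len(observations)


# Hypothetical records: (observed demand, was the observation censored?)
records = [(3.0, False), (5.0, True), (4.0, False)]
print(censored_nll(records, mu=4.0, sigma=1.0))
```

Minimising this quantity over `mu` and `sigma` (or over the parameters of a model that outputs them) pushes the fitted distribution above the censored observations instead of treating them as exact values.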

arXiv:2210.05737 [pdf, other]

Context-aware Bayesian Mixed Multinomial Logit Model

Authors: Mirosława Łukawska, Anders Fjendbo Jensen, Filipe Rodrigues

Abstract: The mixed multinomial logit model assumes constant preference parameters of a decision-maker throughout different choice situations, which may be considered too strong for certain choice modelling applications. This paper proposes an effective approach to model context-dependent intra-respondent heterogeneity, thereby introducing the concept of the Context-aware Bayesian mixed multinomial logit model, where a neural network maps contextual information to interpretable shifts in the preference parameters of each individual in each choice occasion. The proposed model offers several key advantages. First, it supports both continuous and discrete variables, as well as complex non-linear interactions between both types of variables. Secondly, each context specification is considered jointly as a whole by the neural network rather than each variable being considered independently. Finally, since the neural network parameters are shared across all decision-makers, it can leverage information from other decision-makers to infer the effect of a particular context on a particular decision-maker. Even though the context-aware Bayesian mixed multinomial logit model allows for flexible interactions between attributes, the increase in computational complexity is minor, compared to the mixed multinomial logit model. We illustrate the concept and interpretation of the proposed model in a simulation study. We furthermore present a real-world case study from the travel behaviour domain - a bicycle route choice model, based on a large-scale, crowdsourced dataset of GPS trajectories including 119,448 trips made by 8,555 cyclists.

Submitted 29 March, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

arXiv:2208.04667 [pdf, other]

Representation learning of rare temporal conditions for travel time prediction

Authors: Niklas Petersen, Filipe Rodrigues, Francisco Pereira

Abstract: Predicting travel time under rare temporal conditions (e.g., public holidays, school vacation period, etc.) constitutes a challenge due to the limitation of historical data. If at all available, historical data often form a heterogeneous time series due to the high probability of other changes over long periods of time (e.g., road works, introduced traffic calming initiatives, etc.). This is especially prominent in cities and suburban areas. We present a vector-space model for encoding rare temporal conditions that allows coherent representation learning across different temporal conditions. We show increased performance for travel time prediction over different baselines when utilizing the vector-space encoding for representing the temporal setting.

Submitted 9 August, 2022; originally announced August 2022.

arXiv:2204.05059 [pdf, other]

doi 10.1016/j.chaos.2022.112306

Forecasting new diseases in low-data settings using transfer learning

Authors: Kirstin Roster, Colm Connaughton, Francisco A. Rodrigues

Abstract: Recent infectious disease outbreaks, such as the COVID-19 pandemic and the Zika epidemic in Brazil, have demonstrated both the importance and difficulty of accurately forecasting novel infectious diseases. When new diseases first emerge, we have little knowledge of the transmission process, the level and duration of immunity to reinfection, or other parameters required to build realistic epidemiological models. Time series forecasts and machine learning, while less reliant on assumptions about the disease, require large amounts of data that are also not available in early stages of an outbreak. In this study, we examine how knowledge of related diseases can help make predictions of new diseases in data-scarce environments using transfer learning. We implement both an empirical and a theoretical approach. Using empirical data from Brazil, we compare how well different machine learning models transfer knowledge between two different disease pairs: (i) dengue and Zika, and (ii) influenza and COVID-19. In the theoretical analysis, we generate data using different transmission and recovery rates with an SIR compartmental model, and then compare the effectiveness of different transfer learning methods. We find that transfer learning offers the potential to improve predictions, even beyond a model based on data from the target disease, though the appropriate source disease must be chosen carefully. While imperfect, these models offer an additional input for decision makers during pandemic response.

Submitted 7 April, 2022; originally announced April 2022.

arXiv:2203.02954 [pdf, other]

On the importance of stationarity, strong baselines and benchmarks in transport prediction problems

Authors: Filipe Rodrigues

Abstract: Over the last years, the transportation community has witnessed a tremendous amount of research contributions on new deep learning approaches for spatio-temporal forecasting. These contributions tend to emphasize the modeling of spatial correlations, while neglecting the fairly stable and recurrent nature of human mobility patterns. In this short paper, we show that a naive baseline method based on the average weekly pattern and linear regression can achieve comparable results to many state-of-the-art deep learning approaches for spatio-temporal forecasting in transportation, or even outperform them on several datasets, thus contrasting the importance of stationarity and recurrent patterns in the data with the importance of spatial correlations. Furthermore, we establish 9 different reference benchmarks that can be used to compare new approaches for spatio-temporal forecasting, and provide a discussion on best practices and the direction that the field is taking.

Submitted 6 March, 2022; originally announced March 2022.

Comments: 6 pages

arXiv:2201.10307 [pdf, other]

Unboxing the graph: Neural Relational Inference for Mobility Prediction

Authors: Mathias Niemann Tygesen, Francisco C. Pereira, Filipe Rodrigues

Abstract: Predicting the supply and demand of transport systems is vital for efficient traffic management, control, optimization, and planning. For example, predicting where from/to and when people intend to travel by taxi can support fleet managers to distribute resources; better predicting traffic speeds/congestion allows for pro-active control measures or for users to better choose their paths. Making spatio-temporal predictions is known to be a hard task, but recently Graph Neural Networks (GNNs) have been widely applied on non-Euclidean spatial data. However, most GNN models require a predefined graph, and so far, researchers rely on heuristics to generate this graph for the model to use. In this paper, we use Neural Relational Inference to learn the optimal graph for the model. Our approach has several advantages: 1) a Variational Auto Encoder structure allows for the graph to be dynamically determined by the data, potentially changing through time; 2) the encoder structure allows the use of external data in the generation of the graph; 3) it is possible to place Bayesian priors on the generated graphs to encode domain knowledge. We conduct experiments on two datasets, namely the NYC Yellow Taxi and the PEMS road traffic datasets. In both datasets, we outperform benchmarks and show performance comparable to state-of-the-art. Furthermore, we do an in-depth analysis of the learned graphs, providing insights on what kinds of connections GNNs use for spatio-temporal predictions in the transport domain.

Submitted 25 January, 2022; originally announced January 2022.

arXiv:2201.07866 [pdf]

A Practical Approach of Actions for FAIRification Workflows

Authors: Natalia Queiroz de Oliveira, Vânia Borges, Henrique F. Rodrigues, Maria Luiza Machado Campos, Giseli Rabello Lopes

Abstract: Since their proposal in 2016, the FAIR principles have been largely discussed by different communities and initiatives involved in the development of infrastructures to enhance support for data findability, accessibility, interoperability, and reuse. One of the challenges in implementing these principles lies in defining a well-delimited process with organized and detailed actions. This paper presents a workflow of actions that is being adopted in the VODAN BR pilot for generating FAIR (meta)data for COVID-19 research. It provides the understanding of each step of the process, establishing their contribution. In this work, we also evaluate potential tools to (semi)automatize (meta)data treatment whenever possible. Although defined for a particular use case, it is expected that this workflow can be applied for other epidemical research and in other domains, benefiting the entire scientific community.

Submitted 19 January, 2022; originally announced January 2022.

Comments: Preprint. Submitted to MTSR2021 on 25th October 2021. 12 pages. To be published in "Metadata and Semantic Research"

arXiv:2110.06140 [pdf, other]

EEG functional connectivity and deep learning for automatic diagnosis of brain disorders: Alzheimer's disease and schizophrenia

Authors: Caroline L. Alves, Aruane M. Pineda, Kirstin Roster, Christiane Thielemann, Francisco A. Rodrigues

Abstract: Mental disorders are among the leading causes of disability worldwide. The first step in treating these conditions is to obtain an accurate diagnosis, but the absence of established clinical tests makes this task challenging. Machine learning algorithms can provide a possible solution to this problem, as we describe in this work. We present a method for the automatic diagnosis of mental disorders based on the matrix of connections obtained from EEG time series and deep learning. We show that our approach can classify patients with Alzheimer's disease and schizophrenia with a high level of accuracy. The comparison with the traditional cases that use raw EEG time series shows that our method provides the highest precision. Therefore, the application of deep neural networks on data from brain connections is a very promising method for the diagnosis of neurological disorders.

Submitted 7 October, 2021; originally announced October 2021.

Comments: 10 pages, 5 figures, 9 tables

arXiv:2108.07856 [pdf, other]

OncoPetNet: A Deep Learning based AI system for mitotic figure counting on H&E stained whole slide digital images in a large veterinary diagnostic lab setting

Authors: Michael Fitzke, Derick Whitley, Wilson Yau, Fernando Rodrigues Jr, Vladimir Fadeev, Cindy Bacmeister, Chris Carter, Jeffrey Edwards, Matthew P. Lungren, Mark Parkinson

Abstract: Background: Histopathology is an important modality for the diagnosis and management of many diseases in modern healthcare, and plays a critical role in cancer care. Pathology samples can be large and require multi-site sampling, leading to upwards of 20 slides for a single tumor, and the human-expert tasks of site selection and quantitative assessment of mitotic figures are time-consuming and subjective. Automating these tasks in the setting of a digital pathology service presents significant opportunities to improve workflow efficiency and augment human experts in practice. Approach: Multiple state-of-the-art deep learning techniques for histopathology image classification and mitotic figure detection were used in the development of OncoPetNet. Additionally, model-free approaches were used to increase speed and accuracy. The robust and scalable inference engine leverages PyTorch's performance optimizations as well as specifically developed speed-up techniques in inference. Results: The proposed system demonstrated significantly improved mitotic counting performance for 41 cancer cases across 14 cancer types compared to human expert baselines. In 21.9% of cases, use of OncoPetNet led to a change in tumor grading compared to human expert evaluation. In deployment, an effective 0.27 min/slide inference was achieved in a high throughput veterinary diagnostic pathology service across 2 centers processing 3,323 digital whole slide images daily.
Conclusion: This work represents the first successful automated deployment of deep learning systems for real-time expert-level performance on important histopathology tasks at scale in a high volume clinical practice. The resulting impact outlines important considerations for model development, deployment, clinical decision making, and informs best practices for implementation of deep learning systems in digital histopathology practices.

Submitted 17 August, 2021; originally announced August 2021.

arXiv:2108.00858 [pdf, other]

Predictive and Prescriptive Performance of Bike-Sharing Demand Forecasts for Inventory Management

Authors: Daniele Gammelli, Yihua Wang, Dennis Prak, Filipe Rodrigues, Stefan Minner, Francisco Camara Pereira

Abstract: Bike-sharing systems are a rapidly developing mode of transportation and provide an efficient alternative to passive, motorized personal mobility. The asymmetric nature of bike demand causes the need for rebalancing bike stations, which is typically done during night time. To determine the optimal starting inventory level of a station for a given day, a User Dissatisfaction Function (UDF) models user pickups and returns as non-homogeneous Poisson processes with piece-wise linear rates. In this paper, we devise a deep generative model directly applicable in the UDF by introducing a variational Poisson recurrent neural network model (VP-RNN) to forecast future pickup and return rates. We empirically evaluate our approach against both traditional and learning-based forecasting methods on real trip travel data from the city of New York, USA, and show how our model outperforms benchmarks in terms of system efficiency and demand satisfaction. By explicitly focusing on the combination of decision-making algorithms with learning-based forecasting methods, we highlight a number of shortcomings in the literature. Crucially, we show how more accurate predictions do not necessarily translate into better inventory decisions. By providing insights into the interplay between forecasts, model assumptions, and decisions, we point out that forecasts and decision models should be carefully evaluated and harmonized to optimally control shared mobility systems.

Submitted 28 July, 2021; originally announced August 2021.

Comments: 28 pages, 6 figures

arXiv:2106.12905 [pdf, other]

Neural Networks for Dengue Prediction: A Systematic Review

Authors: Kirstin Roster, Francisco A. Rodrigues

Abstract: Due to a lack of treatments and a universal vaccine, early forecasts of Dengue are an important tool for disease control. Neural networks are powerful predictive models that have made contributions to many areas of public health. In this systematic review, we provide an introduction to the neural networks relevant to Dengue forecasting and review their applications in the literature. The objective is to help inform model design for future work. Following the PRISMA guidelines, we conduct a systematic search of studies that use neural networks to forecast Dengue in human populations. We summarize the relative performance of neural networks and comparator models, model architectures and hyper-parameters, as well as choices of input features. Nineteen papers were included. Most studies implement shallow neural networks using historical Dengue incidence and meteorological input features. Prediction horizons tend to be short. Building on the strengths of neural networks, most studies use granular observations at the city or sub-national level. Performance of neural networks relative to comparators such as Support Vector Machines varies across study contexts. The studies suggest that neural networks can provide good predictions of Dengue and should be included in the set of candidate models. The use of convolutional, recurrent, or deep networks is relatively unexplored but offers promising avenues for further research, as does the use of a broader set of input features such as social media or mobile phone data.

Submitted 22 June, 2021; originally announced June 2021.

Comments: 16 pages, 6 figures, 1 table

arXiv:2106.10940 [pdf, other]

Deep Spatio-Temporal Forecasting of Electrical Vehicle Charging Demand

Authors: Frederik Boe Hüttel, Inon Peled, Filipe Rodrigues, Francisco C. Pereira

Abstract: Electric vehicles can offer a low carbon emission solution to reverse rising emission trends. However, this requires that the energy used to meet the demand is green. To meet this requirement, accurate forecasting of the charging demand is vital. Short and long-term charging demand forecasting will allow for better optimisation of the power grid and future infrastructure expansions. In this paper, we propose to use publicly available data to forecast the electric vehicle charging demand. To model the complex spatial-temporal correlations between charging stations, we argue that Temporal Graph Convolution Models are the most suitable to capture the correlations. The proposed Temporal Graph Convolutional Networks provide the most accurate forecasts for short and long-term forecasting compared with other forecasting methods.

Submitted 21 June, 2021; originally announced June 2021.

arXiv:2104.11434 [pdf, other]

Graph Neural Network Reinforcement Learning for Autonomous Mobility-on-Demand Systems

Authors: Daniele Gammelli, Kaidi Yang, James Harrison, Filipe Rodrigues, Francisco C. Pereira, Marco Pavone

Abstract: Autonomous mobility-on-demand (AMoD) systems represent a rapidly developing mode of transportation wherein travel requests are dynamically handled by a coordinated fleet of robotic, self-driving vehicles. Given a graph representation of the transportation network - one where, for example, nodes represent areas of the city, and edges the connectivity between them - we argue that the AMoD control problem is naturally cast as a node-wise decision-making problem. In this paper, we propose a deep reinforcement learning framework to control the rebalancing of AMoD systems through graph neural networks. Crucially, we demonstrate that graph neural networks enable reinforcement learning agents to recover behavior policies that are significantly more transferable, generalizable, and scalable than policies learned through other approaches. Empirically, we show how the learned policies exhibit promising zero-shot transfer capabilities when faced with critical portability tasks such as inter-city generalization, service area expansion, and adaptation to potentially complex urban topologies.

Submitted 16 August, 2021; v1 submitted 23 April, 2021; originally announced April 2021.

arXiv:2104.06819 [pdf, other]

Short-term bus travel time prediction for transfer synchronization with intelligent uncertainty handling

Authors: Niklas Christoffer Petersen, Anders Parslov, Filipe Rodrigues

Abstract: This paper presents two novel approaches for uncertainty estimation adapted and extended for the multi-link bus travel time problem. The uncertainty is modeled directly as part of recurrent artificial neural networks, but using two fundamentally different approaches: one based on Deep Quantile Regression (DQR) and the other on Bayesian Recurrent Neural Networks (BRNN). Both models predict multiple time steps into the future, but handle the time-dependent uncertainty estimation differently. We present a sampling technique in order to aggregate quantile estimates for link level travel time to yield the multi-link travel time distribution needed for a vehicle to travel from its current position to a specific downstream stop point or transfer site. To motivate the relevance of uncertainty-aware models in the domain, we focus on the connection assurance application as a case study: An expert system to determine whether a bus driver should hold and wait for a connecting service, or break the connection and reduce its own delay. Our results show that the DQR-model performs overall best for the 80%, 90% and 95% prediction intervals, both for a 15 minute time horizon into the future (t + 1), but also for the 30 and 45 minutes time horizon (t + 2 and t + 3), with a constant, but very small underestimation of the uncertainty interval (1-4 pp.). However, we also show that the BRNN model can still outperform the DQR for specific cases.
Lastly, we demonstrate how a simple decision support system can take advantage of our uncertainty-aware travel time models to prioritize the difference in travel time uncertainty for bus holding at strategic points, thus reducing the introduced delay for the connection assurance application.

Submitted 14 April, 2021; originally announced April 2021.

arXiv:2104.01214 [pdf, other]

Modeling Censored Mobility Demand through Quantile Regression Neural Networks

Authors: Frederik Boe Hüttel, Inon Peled, Filipe Rodrigues, Francisco C. Pereira

Abstract: Shared mobility services require accurate demand models for effective service planning. On the one hand, modeling the full probability distribution of demand is advantageous because the entire uncertainty structure preserves valuable information for decision-making. On the other hand, demand is often observed through the usage of the service itself, so that the observations are censored, as they are inherently limited by available supply. Since the 1980s, various works on Censored Quantile Regression models have performed well under such conditions. Further, in the last two decades, several papers have proposed to implement these models flexibly through Neural Networks. However, the models in current works estimate the quantiles individually, thus incurring a computational overhead and ignoring valuable relationships between the quantiles. We address this gap by extending current Censored Quantile Regression models to learn multiple quantiles at once and apply these to synthetic baseline datasets and datasets from two shared mobility providers in the Copenhagen metropolitan area in Denmark. The results show that our extended models yield fewer quantile crossings and less computational overhead without compromising model performance.

Submitted 9 July, 2022; v1 submitted 2 April, 2021; originally announced April 2021.

Comments: 13 pages, 9 figures, 5 tables

arXiv:2101.12252 [pdf]

doi 10.1016/j.trc.2022.103552

Gaussian Process Latent Class Choice Models

Authors: Georges Sfeir, Filipe Rodrigues, Maya Abou-Zeid

Abstract: We present a Gaussian Process - Latent Class Choice Model (GP-LCCM) to integrate a non-parametric class of probabilistic machine learning within discrete choice models (DCMs). Gaussian Processes (GPs) are kernel-based algorithms that incorporate expert knowledge by assuming priors over latent functions rather than priors over parameters, which makes them more flexible in addressing nonlinear problems. By integrating a Gaussian Process within an LCCM structure, we aim at improving discrete representations of unobserved heterogeneity. The proposed model would assign individuals probabilistically to behaviorally homogeneous clusters (latent classes) using GPs and simultaneously estimate class-specific choice models by relying on random utility models. Furthermore, we derive and implement an Expectation-Maximization (EM) algorithm to jointly estimate/infer the hyperparameters of the GP kernel function and the class-specific choice parameters by relying on a Laplace approximation and gradient-based numerical optimization methods, respectively. The model is tested on two different mode choice applications and compared against different LCCM benchmarks. Results show that GP-LCCM allows for a more complex and flexible representation of heterogeneity and improves both in-sample fit and out-of-sample predictive power.
Moreover, behavioral and economic interpretability is maintained at the class-specific choice model level while local interpretation of the latent classes can still be achieved, although the non-parametric characteristic of GPs lessens the transparency of the model.

Submitted 28 January, 2021; originally announced January 2021.

arXiv:2009.04822 [pdf, other]

Generalized Multi-Output Gaussian Process Censored Regression

Authors: Daniele Gammelli, Kasper Pryds Rolsted, Dario Pacino, Filipe Rodrigues

Abstract: When modelling censored observations, a typical approach in current regression methods is to use a censored-Gaussian (i.e. Tobit) model to describe the conditional output distribution. In this paper, as in the case of missing data, we argue that exploiting correlations between multiple outputs can enable models to better address the bias introduced by censored data. To do so, we introduce a heteroscedastic multi-output Gaussian process model which combines the non-parametric flexibility of GPs with the ability to leverage information from correlated outputs under input-dependent noise conditions. To address the resulting inference intractability, we further devise a variational bound to the marginal log-likelihood suitable for stochastic optimization. We empirically evaluate our model against other generative models for censored data on both synthetic and real world tasks and further show how it can be generalized to deal with arbitrary likelihood functions. Results show how the added flexibility allows our model to better estimate the underlying non-censored (i.e. true) process under potentially complex censoring dynamics.

Submitted 4 May, 2022; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: 17 pages, 6 figures, 3 tables

arXiv:2007.02739 [pdf]

doi 10.1016/j.jocm.2021.100320

Semi-nonparametric Latent Class Choice Model with a Flexible Class Membership Component: A Mixture Model Approach

Authors: Georges Sfeir, Maya Abou-Zeid, Filipe Rodrigues, Francisco Camara Pereira, Isam Kaysi

Abstract: This study presents a semi-nonparametric Latent Class Choice Model (LCCM) with a flexible class membership component. The proposed model formulates the latent classes using mixture models as an alternative approach to the traditional random utility specification with the aim of comparing the two approaches on various measures including prediction accuracy and representation of heterogeneity in the choice process. Mixture models are parametric model-based clustering techniques that have been widely used in areas such as machine learning, data mining and pattern recognition for clustering and classification problems. An Expectation-Maximization (EM) algorithm is derived for the estimation of the proposed model. Using two different case studies on travel mode choice behavior, the proposed model is compared to traditional discrete choice models on the basis of parameter estimates' signs, value of time, statistical goodness-of-fit measures, and cross-validation tests. Results show that mixture models improve the overall performance of latent class choice models by providing better out-of-sample prediction accuracy in addition to better representations of heterogeneity without weakening the behavioral and economic interpretability of the choice models.

Submitted 6 July, 2020; originally announced July 2020.

arXiv:2006.05256 [pdf, other]

Recurrent Flow Networks: A Recurrent Latent Variable Model for Density Modelling of Urban Mobility

Authors: Daniele Gammelli, Filipe Rodrigues

Abstract: Mobility-on-demand (MoD) systems represent a rapidly developing mode of transportation wherein travel requests are dynamically handled by a coordinated fleet of vehicles. Crucially, the efficiency of an MoD system highly depends on how well supply and demand distributions are aligned in spatio-temporal space (i.e., to satisfy user demand, cars have to be available in the correct place and at the desired time). To do so, we argue that predictive models should aim to explicitly disentangle between temporal and spatial variability in the evolution of urban mobility demand. However, current approaches typically ignore this distinction by either treating both sources of variability jointly, or completely ignoring their presence in the first place. In this paper, we propose recurrent flow networks (RFN), where we explore the inclusion of (i) latent random variables in the hidden state of recurrent neural networks to model temporal variability, and (ii) normalizing flows to model the spatial distribution of mobility demand. We demonstrate how predictive models explicitly disentangling between spatial and temporal variability exhibit several desirable properties, and empirically show how this enables the generation of distributions matching potentially complex urban topologies.

Submitted 4 May, 2022; v1 submitted 9 June, 2020; originally announced June 2020.

Comments: 16 pages, 6 figures

arXiv:2004.05426 [pdf, other]

Scaling Bayesian inference of mixed multinomial logit models to very large datasets

Authors: Filipe Rodrigues

Abstract: Variational inference methods have been shown to lead to significant improvements in the computational efficiency of approximate Bayesian inference in mixed multinomial logit models when compared to standard Markov-chain Monte Carlo (MCMC) methods without compromising accuracy. However, despite their demonstrated efficiency gains, existing methods still suffer from important limitations that prevent them from scaling to very large datasets, while providing the flexibility to allow for rich prior distributions and to capture complex posterior distributions. In this paper, we propose an Amortized Variational Inference approach that leverages stochastic backpropagation, automatic differentiation and GPU-accelerated computation, for effectively scaling Bayesian inference in Mixed Multinomial Logit models to very large datasets. Moreover, we show how normalizing flows can be used to increase the flexibility of the variational posterior approximations. Through an extensive simulation study, we empirically show that the proposed approach is able to achieve computational speedups of multiple orders of magnitude over traditional MSLE and MCMC approaches for large datasets without compromising estimation accuracy.

Submitted 11 April, 2020; originally announced April 2020.

Comments: 12 pages, 3 figures

arXiv:2004.05137 [pdf, other]

doi 10.13140/RG.2.2.15224.80644

Energy Predictive Models for Convolutional Neural Networks on Mobile Platforms

Authors: Crefeda Faviola Rodrigues, Graham Riley, Mikel Lujan

Abstract: Energy use is a key concern when deploying deep learning models on mobile and embedded platforms. Current studies develop energy predictive models based on application-level features to provide researchers a way to estimate the energy consumption of their deep learning models. This information is useful for building resource-aware models that can make efficient use of the hardware resources. However, previous works on predictive modelling provide little insight into the trade-offs involved in the choice of features on the final predictive model accuracy and model complexity. To address this issue, we provide a comprehensive analysis of building regression-based predictive models for deep learning on mobile devices, based on empirical measurements gathered from the SyNERGY framework. Our predictive modelling strategy is based on two types of predictive models used in the literature: individual layers and layer-type. Our analysis of predictive models shows that simple layer-type features achieve a model complexity of 4 to 32 times less for convolutional layer predictions for a similar accuracy compared to predictive models using more complex features adopted by previous approaches. To obtain an overall energy estimate of the inference phase, we build layer-type predictive models for the fully-connected and pooling layers using 12 representative Convolutional Neural Networks (ConvNets) on the Jetson TX1 and the Snapdragon 820 using software backends such as OpenBLAS, Eigen and CuDNN. We obtain an accuracy between 76% and 85% and a model complexity of 1 for the overall energy prediction of the test ConvNets across different hardware-software combinations.

Submitted 10 April, 2020; originally announced April 2020.

Comments: 9 pages, 4 Figures

ACM Class: C.4; B.0; I.4; I.2

arXiv:2001.07402 [pdf, other]

Estimating Latent Demand of Shared Mobility through Censored Gaussian Processes

Authors: Daniele Gammelli, Inon Peled, Filipe Rodrigues, Dario Pacino, Haci A. Kurtaran, Francisco C. Pereira

Abstract: Transport demand is highly dependent on supply, especially for shared transport services where availability is often limited. As observed demand cannot be higher than available supply, historical transport data typically represents a biased, or censored, version of the true underlying demand pattern. Without explicitly accounting for this inherent distinction, predictive models of demand would necessarily represent a biased version of true demand, thus less effectively predicting the needs of service users. To counter this problem, we propose a general method for censorship-aware demand modeling, for which we devise a censored likelihood function. We apply this method to the task of shared mobility demand prediction by incorporating the censored likelihood within a Gaussian Process model, which can flexibly approximate arbitrary functional forms. Experiments on artificial and real-world datasets show how taking into account the limiting effect of supply on demand is essential in the process of obtaining an unbiased predictive model of user demand behavior.

Submitted 17 February, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

Comments: 21 pages, 10 figures

arXiv:1910.00544 [pdf, other]

A machine learning approach to predicting dynamical observables from network structure

Authors: Francisco A. Rodrigues, Thomas Peron, Colm Connaughton, Jurgen Kurths, Yamir Moreno

Abstract: Estimating the outcome of a given dynamical process from structural features is a key unsolved challenge in network science. The goal is hindered by difficulties associated with nonlinearities, correlations and feedbacks between the structure and dynamics of complex systems. In this work, we develop an approach based on machine learning algorithms that is shown to provide an answer to this challenge. Specifically, we show that it is possible to estimate the outbreak size of a disease starting from a single node as well as the degree of synchronicity of a system made up of Kuramoto oscillators. In doing so, we show which topological features of the network are key for this estimation, and provide a ranking of the importance of network metrics with higher accuracy than previously done. Our approach is general and can be applied to any dynamical process running on top of complex networks. Likewise, our work constitutes an important step towards the application of machine learning methods to unravel dynamical patterns emerging in complex networked systems.

Submitted 1 October, 2019; originally announced October 2019.

Comments: 5 pages including 6 figures

arXiv:1906.03855 [pdf, other]

Bayesian Automatic Relevance Determination for Utility Function Specification in Discrete Choice Models

Authors: Filipe Rodrigues, Nicola Ortelli, Michel Bierlaire, Francisco Pereira

Abstract: Specifying utility functions is a key step towards applying the discrete choice framework for understanding the behaviour processes that govern user choices. However, identifying the utility function specifications that best model and explain the observed choices can be a very challenging and time-consuming task. This paper seeks to help modellers by leveraging the Bayesian framework and the concept of automatic relevance determination (ARD), in order to automatically determine an optimal utility function specification from an exponentially large set of possible specifications in a purely data-driven manner. Based on recent advances in approximate Bayesian inference, a doubly stochastic variational inference is developed, which allows the proposed DCM-ARD model to scale to very large and high-dimensional datasets. Using semi-artificial choice data, the proposed approach is shown to very accurately recover the true utility function specifications that govern the observed choices. Moreover, when applied to real choice data, DCM-ARD is shown to be able to discover high quality specifications that can outperform previous ones from the literature according to multiple criteria, thereby demonstrating its practical applicability.

Submitted 10 June, 2019; originally announced June 2019.

Comments: 21 pages, 2 figures, 11 tables

arXiv:1904.10370 [pdf]

A survey on Big Data and Machine Learning for Chemistry

Authors: Jose F Rodrigues Jr, Larisa Florea, Maria C F de Oliveira, Dermot Diamond, Osvaldo N Oliveira Jr

Abstract: Herein we review aspects of leading-edge research and innovation in chemistry which exploits big data and machine learning (ML), two computer science fields that combine to yield machine intelligence. ML can accelerate the solution of intricate chemical problems and even solve problems that otherwise would not be tractable. But the potential benefits of ML come at the cost of big data production; that is, the algorithms, in order to learn, demand large volumes of data of various natures and from different sources, from materials properties to sensor data. In the survey, we propose a roadmap for future developments, with emphasis on materials discovery and chemical sensing, and within the context of the Internet of Things (IoT), both prominent research fields for ML in the context of big data. In addition to providing an overview of recent advances, we elaborate upon the conceptual and practical limitations of big data and ML applied to chemistry, outlining processes, discussing pitfalls, and reviewing cases of success and failure.

Submitted 23 April, 2019; originally announced April 2019.

MSC Class: 74Exx; 74Fxx; 97Rxx

arXiv:1904.08353 [pdf, other]

Towards Robust Deep Reinforcement Learning for Traffic Signal Control: Demand Surges, Incidents and Sensor Failures

Authors: Filipe Rodrigues, Carlos Lima Azevedo

Abstract: Reinforcement learning (RL) constitutes a promising solution for alleviating the problem of traffic congestion. In particular, deep RL algorithms have been shown to produce adaptive traffic signal controllers that outperform conventional systems. However, in order to be reliable in highly dynamic urban areas, such controllers need to be robust with respect to a series of exogenous sources of uncertainty. In this paper, we develop an open-source callback-based framework for promoting the flexible evaluation of different deep RL configurations under a traffic simulation environment. With this framework, we investigate how deep RL-based adaptive traffic controllers perform under different scenarios, namely under demand surges caused by special events, capacity reductions from incidents and sensor failures. We extract several key insights for the development of robust deep RL algorithms for traffic control and propose concrete designs to mitigate the impact of the considered exogenous uncertainties.

Submitted 22 July, 2019; v1 submitted 17 April, 2019; originally announced April 2019.

Comments: 8 pages

arXiv:1903.02791 [pdf, other]

doi 10.1016/j.eswa.2018.11.028

Multi-output Bus Travel Time Prediction with Convolutional LSTM Neural Network

Authors: Niklas Christoffer Petersen, Filipe Rodrigues, Francisco Camara Pereira

Abstract: Accurate and reliable travel time predictions in public transport networks are essential for delivering an attractive service that is able to compete with other modes of transport in urban areas. The traditional application of this information, where arrival and departure predictions are displayed on digital boards, is highly visible in the city landscape of most modern metropolises. More recently, the same information has become critical as input for smart-phone trip planners in order to alert passengers about unreachable connections, alternative route choices and prolonged travel times. More sophisticated Intelligent Transport Systems (ITS) include the predictions of connection assurance, i.e. to hold back services in case a connecting service is delayed. In order to operate such systems, and to ensure the confidence of passengers in the systems, the information provided must be accurate and reliable. Traditional methods have trouble with this as congestion, and thus travel time variability, increases in cities, consequently making travel time predictions in urban areas a non-trivial task. This paper presents a system for bus travel time prediction that leverages the non-static spatio-temporal correlations present in urban bus networks, allowing the discovery of complex patterns not captured by traditional methods. The underlying model is a multi-output, multi-time-step, deep neural network that uses a combination of convolutional and long short-term memory (LSTM) layers. The method is empirically evaluated and compared to other popular approaches for link travel time prediction and currently available services, including the currently deployed model in Copenhagen, Denmark. We find that the proposed model significantly outperforms all the other methods we compare with, and is able to detect small irregular peaks in bus travel times very quickly.

Submitted 7 March, 2019; originally announced March 2019.

Journal ref: Expert Systems with Applications, Volume 120, 15 April 2019, Pages 426-435

arXiv:1902.00716 [pdf, other]

doi 10.1088/1367-2630/ab687c

Centrality anomalies in complex networks as a result of model over-simplification

Authors: Luiz G. A. Alves, Alberto Aleta, Francisco A. Rodrigues, Yamir Moreno, Luis A. Nunes Amaral

Abstract: Tremendous advances have been made in our understanding of the properties and evolution of complex networks. These advances were initially driven by information-poor empirical networks and theoretical analysis of unweighted and undirected graphs. Recently, information-rich empirical data on complex networks have supported the development of more sophisticated models that include edge directionality and weight properties, and multiple layers. Many studies still focus on unweighted undirected descriptions of networks, prompting an essential question: how to identify when a model is simpler than it must be? Here, we argue that the presence of centrality anomalies in complex networks is a result of model over-simplification. Specifically, we investigate the well-known anomaly in betweenness centrality for transportation networks, according to which highly connected nodes are not necessarily the most central. Using a broad class of network models with weights and spatial constraints and four large data sets of transportation networks, we show that the unweighted projection of the structure of these networks can exhibit a significant fraction of anomalous nodes compared to a random null model. However, the weighted projection of these networks, compared with an appropriate null model, significantly reduces the fraction of anomalies observed, suggesting that centrality anomalies are a symptom of model over-simplification. Because lack of information-rich data is a common challenge when dealing with complex networks and can cause anomalies that misestimate the role of nodes in the system, we argue that sufficiently sophisticated models be used when anomalies are detected.

Submitted 13 March, 2020; v1 submitted 2 February, 2019; originally announced February 2019.

Comments: 14 pages, including 9 figures. APS style. Accepted for publication in New Journal of Physics

Journal ref: New Journal of Physics 23, 013043 (2020)

arXiv:1812.08755 [pdf, other]

doi 10.1109/TPAMI.2016.2635136

A Bayesian Additive Model for Understanding Public Transport Usage in Special Events

Authors: Filipe Rodrigues, Stanislav S. Borysov, Bernardete Ribeiro, Francisco C. Pereira

Abstract: Public special events, like sports games, concerts and festivals are well known to create disruptions in transportation systems, often catching the operators by surprise. Although these are usually planned well in advance, their impact is difficult to predict, even when organisers and transportation operators coordinate. The problem highly increases when several events happen concurrently. To solve these problems, costly processes, heavily reliant on manual search and personal experience, are usual practice in large cities like Singapore, London or Tokyo. This paper presents a Bayesian additive model with Gaussian process components that combines smart card records from public transport with context information about events that is continuously mined from the Web. We develop an efficient approximate inference algorithm using expectation propagation, which allows us to predict the total number of public transportation trips to the special event areas, thereby contributing to a more adaptive transportation system. Furthermore, for multiple concurrent event scenarios, the proposed algorithm is able to disaggregate gross trip counts into their most likely components related to specific events and routine behavior. Using real data from Singapore, we show that the presented model outperforms the best baseline model by up to 26% in R2 and also has explanatory power for its individual components.

Submitted 20 December, 2018; originally announced December 2018.

Comments: 14 pages, IEEE Transactions on Pattern Analysis and Machine Intelligence (Volume: 39 , Issue: 11 , Nov. 1 2017)

Journal ref: Rodrigues, F., Borysov, S. S., Ribeiro, B., & Pereira, F. C. (2017). A Bayesian additive model for understanding public transport usage in special events. IEEE transactions on pattern analysis and machine intelligence, 39(11), 2113-2126

arXiv:1812.08739 [pdf, other]

doi 10.1109/TITS.2018.2817879

Multi-Output Gaussian Processes for Crowdsourced Traffic Data Imputation

Authors: Filipe Rodrigues, Kristian Henrickson, Francisco C. Pereira

Abstract: Traffic speed data imputation is a fundamental challenge for data-driven transport analysis. In recent years, with the ubiquity of GPS-enabled devices and the widespread use of crowdsourcing alternatives for the collection of traffic data, transportation professionals increasingly look to such user-generated data for many analysis, planning, and decision support applications. However, due to the mechanics of the data collection process, crowdsourced traffic data such as probe-vehicle data is highly prone to missing observations, making accurate imputation crucial for the success of any application that makes use of that type of data. In this article, we propose the use of multi-output Gaussian processes (GPs) to model the complex spatial and temporal patterns in crowdsourced traffic data. While the Bayesian nonparametric formalism of GPs allows us to model observation uncertainty, the multi-output extension based on convolution processes effectively enables us to capture complex spatial dependencies between nearby road segments. Using 6 months of crowdsourced traffic speed data or "probe vehicle data" for several locations in Copenhagen, the proposed approach is empirically shown to significantly outperform popular state-of-the-art imputation methods.

Submitted 8 June, 2019; v1 submitted 20 December, 2018; originally announced December 2018.

Comments: 10 pages, IEEE Transactions on Intelligent Transportation Systems, 2018

arXiv:1812.08733 [pdf, other]

doi 10.1016/j.trc.2018.08.007

Heteroscedastic Gaussian processes for uncertainty modeling in large-scale crowdsourced traffic data

Authors: Filipe Rodrigues, Francisco C. Pereira

Abstract: Accurately modeling traffic speeds is a fundamental part of efficient intelligent transportation systems. Nowadays, with the widespread deployment of GPS-enabled devices, it has become possible to crowdsource the collection of speed information to road users (e.g. through mobile applications or dedicated in-vehicle devices). Despite its rather wide spatial coverage, crowdsourced speed data also brings very important challenges, such as the highly variable measurement noise in the data due to a variety of driving behaviors and sample sizes. When not properly accounted for, this noise can severely compromise any application that relies on accurate traffic data. In this article, we propose the use of heteroscedastic Gaussian processes (HGP) to model the time-varying uncertainty in large-scale crowdsourced traffic data. Furthermore, we develop an HGP conditioned on sample size and traffic regime (SRC-HGP), which makes use of sample size information (probe vehicles per minute) as well as previously observed speeds, in order to more accurately model the uncertainty in observed speeds. Using 6 months of crowdsourced traffic data from Copenhagen, we empirically show that the proposed heteroscedastic models produce significantly better predictive distributions when compared to current state-of-the-art methods for both speed imputation and short-term forecasting tasks.

Submitted 20 December, 2018; originally announced December 2018.

Comments: 22 pages, Transportation Research Part C: Emerging Technologies (Elsevier)

Journal ref: Rodrigues, F., & Pereira, F. C. (2018). Heteroscedastic Gaussian processes for uncertainty modeling in large-scale crowdsourced traffic data. Transportation Research Part C: Emerging Technologies, 95, 636-651

arXiv:1810.12260 [pdf]

Wireless Terahertz System Architectures for Networks Beyond 5G

Authors: Alexandros-Apostolos A. Boulogeorgos, Angeliki Alexiou, Dimitrios Kritharidis, Alexandros Katsiotis, Georgia Ntouni, Joonas Kokkoniemi, Janne Lethtomaki, Markku Juntti, Dessy Yankova, Ahmed Mokhtar, Jean-Charles Point, Jose Machado, Robert Elschner, Colja Schubert, Thomas Merkle, Ricardo Ferreira, Francisco Rodrigues, Jose Lima

Abstract: The present white paper focuses on the system requirements of TERRANOVA. It first details the key use cases for the TERRANOVA technology and presents the description of the network architecture. In more detail, the use cases are classified into two categories, namely backhaul & fronthaul and access and small cell backhaul. The first category refers to fibre extender, point-to-point and redundancy applications, whereas the latter is designed to support backup connection for small and medium-sized enterprises (SMEs), internet of things (IoT) dense environments, data centres, indoor wireless access, ad hoc networks, and last mile access. Then, it provides the network architecture for the TERRANOVA system as well as the network elements that need to be deployed. The use cases are matched to specific technical scenarios, namely outdoor fixed point-to-point (P2P), outdoor/indoor individual point-to-multipoint (P2MP), and outdoor/indoor "quasi"-omnidirectional, and the key performance requirements of each scenario are identified. Likewise, we present the breakthrough novel technology concepts, including the joint design of baseband signal processing for the complete optical and wireless link, the development of broadband and spectrally efficient RF-frontends for frequencies >275 GHz, as well as channel modelling, waveforms, antenna array and multiple-access schemes design, which we are going to use in order to satisfy the presented requirements. Next, an overview of the required new functionalities in both the physical (PHY) layer and the medium access control (MAC) layer in the TERRANOVA system architecture is given. Finally, the individual enablers of the TERRANOVA system are combined to develop particular candidate architectures for each of the three technical scenarios.

Submitted 29 October, 2018; originally announced October 2018.

Comments: 73 pages, 31 figures, 7 tables. arXiv admin note: text overlap with arXiv:1503.00697 by other authors

arXiv:1808.08798 [pdf, other]

Beyond expectation: Deep joint mean and quantile regression for spatio-temporal problems

Authors: Filipe Rodrigues, Francisco C. Pereira

Abstract: Spatio-temporal problems are ubiquitous and of vital importance in many research fields. Despite the potential already demonstrated by deep learning methods in modeling spatio-temporal data, typical approaches tend to focus solely on conditional expectations of the output variables being modeled. In this paper, we propose a multi-output multi-quantile deep learning approach for jointly modeling several conditional quantiles together with the conditional expectation as a way to provide a more complete "picture" of the predictive density in spatio-temporal problems. Using two large-scale datasets from the transportation domain, we empirically demonstrate that, by approaching the quantile regression problem from a multi-task learning perspective, it is possible to solve the embarrassing quantile crossings problem, while simultaneously significantly outperforming state-of-the-art quantile regression methods. Moreover, we show that jointly modeling the mean and several conditional quantiles not only provides a rich description of the predictive density that can capture heteroscedastic properties at a negligible computational overhead, but also leads to improved predictions of the conditional expectation due to the extra information and a regularization effect induced by the added quantiles.

Submitted 27 August, 2018; originally announced August 2018.

Comments: 12 pages, 9 figures

arXiv:1808.05902 [pdf, other]

doi 10.1109/TPAMI.2017.2648786

Learning Supervised Topic Models for Classification and Regression from Crowds

Authors: Filipe Rodrigues, Mariana Lourenço, Bernardete Ribeiro, Francisco Pereira

Abstract: The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deems learning under a single-annotator assumption unrealistic or impractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages of the proposed model over state-of-the-art approaches.

Submitted 17 August, 2018; originally announced August 2018.

Comments: 14 pages

Journal ref: Rodrigues, F., Lourenco, M., Ribeiro, B. and Pereira, F.C., 2017. Learning supervised topic models for classification and regression from crowds. IEEE transactions on pattern analysis and machine intelligence, 39(12), pp.2409-2422

arXiv:1808.05535 [pdf, other]

doi 10.1016/j.inffus.2018.07.007

Combining time-series and textual data for taxi demand prediction in event areas: a deep learning approach

Authors: Filipe Rodrigues, Ioulia Markou, Francisco Pereira

Abstract: Accurate time-series forecasting is vital for numerous areas of application such as transportation, energy, finance, economics, etc. However, while modern techniques are able to explore large sets of temporal data to build forecasting models, they typically neglect valuable information that is often available in the form of unstructured text. Although this data is in a radically different format, it often contains contextual explanations for many of the patterns that are observed in the temporal data. In this paper, we propose two deep learning architectures that leverage word embeddings, convolutional layers and attention mechanisms for combining text information with time-series data. We apply these approaches to the problem of taxi demand forecasting in event areas. Using publicly available taxi data from New York, we empirically show that by fusing these two complementary cross-modal sources of information, the proposed models are able to significantly reduce the error in the forecasts.

Submitted 16 August, 2018; originally announced August 2018.

Comments: 20 pages, 6 figures

Journal ref: Rodrigues, F., Markou, I., Pereira, F. Combining time-series and textual data for taxi demand prediction in event areas: a deep learning approach. In Information Fusion, Elsevier, 2018

arXiv:1808.02931 [pdf, other]

doi 10.1103/PhysRevE.99.032301

Mobility helps problem-solving systems to avoid Groupthink

Authors: Paulo F. Gomes, Sandro M. Reia, Francisco A. Rodrigues, José F. Fontanari

Abstract: Groupthink occurs when everyone in a group starts thinking alike, as when people put unlimited faith in a leader. Avoiding this phenomenon is a ubiquitous challenge to problem-solving enterprises and typical countermeasures involve the mobility of group members. Here we use an agent-based model of imitative learning to study the influence of the mobility of the agents on the time they require to find the global maxima of NK-fitness landscapes. The agents cooperate by exchanging information on their fitness and use this information to copy the fittest agent in their influence neighborhoods, which are determined by face-to-face interaction networks. The influence neighborhoods are variable since the agents perform random walks in a two-dimensional space. We find that mobility is slightly harmful for solving easy problems, i.e. problems that do not exhibit suboptimal solutions or local maxima. For difficult problems, however, mobility can prevent the imitative search from being trapped in suboptimal solutions and guarantees a better performance than the independent search for any system size.

Submitted 7 January, 2019; v1 submitted 7 August, 2018; originally announced August 2018.

Journal ref: Phys. Rev. E 99, 032301 (2019)

arXiv:1808.02848 [pdf, other]

Pattern Recognition Approach to Violin Shapes of MIMO database

Authors: Thomas Peron, Francisco A. Rodrigues, Luciano da F. Costa

Abstract: Since the landmarks established by the Cremonese school in the 16th century, the history of violin design has been marked by experimentation. While great effort has been invested since the early 19th century by the scientific community in researching violin acoustics, substantially less attention has been given to the statistical characterization of how the violin shape evolved over time. In this paper we study the morphology of violins retrieved from the Musical Instrument Museums Online (MIMO) database, the largest freely accessible platform providing information about instruments held in public museums. From the violin images, we derive a set of measurements that reflect relevant geometrical features of the instruments. The application of Principal Component Analysis (PCA) uncovered similarities between violin makers and their respective copyists, as well as among luthiers belonging to the same family lineage, in the context of historical narrative. Combined with a time-windowed approach, thin-plate spline visualizations revealed that the average violin outline has remained mostly stable over time, not adhering to any particular trends of design across different periods in music history.

Submitted 8 August, 2018; originally announced August 2018.

Showing 1–50 of 73 results for author: Rodrigues, F