-
A Review of Global Sensitivity Analysis Methods and a comparative case study on Digit Classification
Authors:
Zahra Sadeghi,
Stan Matwin
Abstract:
Global sensitivity analysis (GSA) aims to detect influential input factors that lead a model to arrive at a certain decision and is a significant approach for mitigating the computational burden of processing high-dimensional data. In this paper, we provide a comprehensive review and comparison of global sensitivity analysis methods. Additionally, we propose a methodology for evaluating the efficacy of these methods by conducting a case study on the MNIST digit dataset. Our study goes through the underlying mechanism of widely used GSA methods and highlights their efficacy through a comprehensive methodology.
Submitted 22 June, 2024;
originally announced June 2024.
-
Gravity-Informed Deep Learning Framework for Predicting Ship Traffic Flow and Invasion Risk of Non-Indigenous Species via Ballast Water Discharge
Authors:
Ruixin Song,
Gabriel Spadon,
Ronald Pelot,
Stan Matwin,
Amilcar Soares
Abstract:
Invasive species in water bodies pose a major threat to the environment and biodiversity globally. Due to increased transportation and trade, non-native species have been introduced to new environments, causing damage to ecosystems and leading to economic losses in agriculture, forestry, and fisheries. Therefore, there is a pressing need for risk assessment and management techniques to mitigate the impact of these invasions. This study aims to develop a new physics-inspired model to forecast maritime shipping traffic and thus inform risk assessment of invasive species spread through global transportation networks. Inspired by the gravity model for international trade, our model considers various factors that influence the likelihood and impact of vessel activities, such as shipping flux density, distance between ports, trade flow, and centrality measures of transportation hubs. Additionally, by analyzing the risk network of invasive species, we provide a comprehensive framework for assessing the invasion threat level given a pair of origin and destination. Accordingly, this paper introduces transformers to gravity models to rebuild the short- and long-term dependencies that make the risk analysis feasible. Thus, we introduce a physics-inspired framework that achieves an 89% segmentation accuracy for existing and non-existing trajectories and an 84.8% accuracy for the number of vessels flowing between key port areas, representing more than 10% improvement over the traditional deep-gravity model. Along these lines, this research contributes to a better understanding of invasive species risk assessment. It allows policymakers, conservationists, and stakeholders to prioritize management actions by identifying high-risk invasion pathways. Besides, our model is versatile and can include new data sources, making it suitable for assessing species invasion risks in a changing global landscape.
Submitted 29 January, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
-
Causal Generative Explainers using Counterfactual Inference: A Case Study on the Morpho-MNIST Dataset
Authors:
Will Taylor-Melanson,
Zahra Sadeghi,
Stan Matwin
Abstract:
In this paper, we propose leveraging causal generative learning as an interpretable tool for explaining image classifiers. Specifically, we present a generative counterfactual inference approach to study the influence of visual features (i.e., pixels) as well as causal factors through generative learning. To this end, we first uncover the most influential pixels on a classifier's decision by varying the value of a causal attribute via counterfactual inference and computing both Shapley and contrastive explanations for counterfactual images with these different attribute values. We then establish a Monte-Carlo mechanism using the generator of a causal generative model in order to adapt Shapley explainers to produce feature importances for the human-interpretable attributes of a causal dataset in the case where a classifier has been trained exclusively on the images of the dataset. Finally, we present optimization methods for creating counterfactual explanations of classifiers by means of counterfactual inference, proposing straightforward approaches for both differentiable and arbitrary classifiers. We exploit the Morpho-MNIST causal dataset as a case study for exploring our proposed methods for generating counterfactual explanations. We employ visual explanation methods from the OmnixAI open-source toolkit to compare them with our proposed methods. By employing quantitative metrics to measure the interpretability of counterfactual explanations, we find that our proposed methods of counterfactual explanation offer more interpretable explanations compared to those generated from OmnixAI. This finding suggests that our methods are well-suited for generating highly interpretable counterfactual explanations on causal datasets.
Submitted 20 January, 2024;
originally announced January 2024.
-
Improving Dribbling, Passing, and Marking Actions in Soccer Simulation 2D Games Using Machine Learning
Authors:
Nader Zare,
Omid Amini,
Aref Sayareh,
Mahtab Sarvmaili,
Arad Firouzkouhi,
Stan Matwin,
Amilcar Soares
Abstract:
The RoboCup competition was started in 1997, and is known as the oldest RoboCup league. The RoboCup 2D Soccer Simulation League is a stochastic, partially observable soccer environment in which 24 autonomous agents play on two opposing teams. In this paper, we detail the main strategies and functionalities of CYRUS, the RoboCup 2021 2D Soccer Simulation League champions. The new functionalities presented and discussed in this work are (i) Multi Action Dribble, (ii) Pass Prediction and (iii) Marking Decision. The Multi Action Dribbling strategy enabled CYRUS to succeed more often and to be safer when dribbling actions were performed during a game. The Pass Prediction enhanced our gameplay by predicting our teammate's passing behavior, anticipating and making our agents collaborate better towards scoring goals. Finally, the Marking Decision addressed the multi-agent matching problem to improve CYRUS defensive strategy by finding an optimal solution to mark opponents' players.
Submitted 7 January, 2024;
originally announced January 2024.
-
Probabilistic Feature Augmentation for AIS-Based Multi-Path Long-Term Vessel Trajectory Forecasting
Authors:
Gabriel Spadon,
Jay Kumar,
Derek Eden,
Josh van Berkel,
Tom Foster,
Amilcar Soares,
Ronan Fablet,
Stan Matwin,
Ronald Pelot
Abstract:
Maritime transportation is paramount in achieving global economic growth, entailing concurrent ecological obligations in sustainability and safeguarding endangered marine species, most notably preserving large whale populations. In this regard, the Automatic Identification System (AIS) data plays a significant role by offering real-time streaming data on vessel movement, allowing enhanced traffic monitoring. This study explores using AIS data to prevent vessel-to-whale collisions by forecasting long-term vessel trajectories from engineered AIS data sequences. For such a task, we have developed an encoder-decoder model architecture using Bidirectional Long Short-Term Memory Networks (Bi-LSTM) to predict the next 12 hours of vessel trajectories using 1 to 3 hours of AIS data as input. We feed the model with probabilistic features engineered from historical AIS data that refer to each trajectory's potential route and destination. The model then predicts the vessel's trajectory, considering these additional features by leveraging convolutional layers for spatial feature learning and a position-aware attention mechanism that increases the importance of recent timesteps of a sequence during temporal feature learning. The probabilistic features have an F1 Score of approximately 85% and 75% for each feature type, respectively, demonstrating their effectiveness in augmenting information to the neural network. We test our model on the Gulf of St. Lawrence, a region known to be the habitat of North Atlantic Right Whales (NARW). Our model achieved a high R2 score of over 98% using various techniques and features. It stands out among other approaches as it can make complex decisions during turnings and path selection. Our study highlights the potential of data engineering and trajectory forecasting models for marine life species preservation.
Submitted 2 May, 2024; v1 submitted 29 October, 2023;
originally announced October 2023.
-
Pyrus Base: An Open Source Python Framework for the RoboCup 2D Soccer Simulation
Authors:
Nader Zare,
Aref Sayareh,
Omid Amini,
Mahtab Sarvmaili,
Arad Firouzkouhi,
Stan Matwin,
Amilcar Soares
Abstract:
Soccer, also known as football in some parts of the world, involves two teams of eleven players whose objective is to score more goals than the opposing team. To simulate this game and attract scientists from all over the world to conduct research and participate in an annual computer-based soccer world cup, Soccer Simulation 2D (SS2D) was one of the leagues initiated in the RoboCup competition. In every SS2D game, two teams of 11 players and one coach connect to the RoboCup Soccer Simulation Server and compete against each other. Over the past few years, several C++ base codes have been employed to control agents' behavior and their communication with the server. Although C++ base codes have laid the foundation for the SS2D, developing them requires an advanced level of C++ programming. The complexity of the C++ language is a limiting disadvantage of C++ base codes for all users, especially for beginners. To overcome the challenges of C++ base codes and provide a powerful baseline for developing machine learning concepts, we introduce Pyrus, the first Python base code for SS2D. Pyrus is developed to encourage researchers to efficiently develop their ideas and integrate machine learning algorithms into their teams. Pyrus base is open-source code, and it is publicly available under the MIT License on GitHub.
Submitted 21 July, 2023;
originally announced July 2023.
-
Observation Denoising in CYRUS Soccer Simulation 2D Team For RoboCup 2023
Authors:
Aref Sayareh,
Nader Zare,
Omid Amini,
Arad Firouzkouhi,
Mahtab Sarvmaili,
Stan Matwin
Abstract:
The RoboCup competitions hold various leagues, and the Soccer Simulation 2D League is a major one among them. A Soccer Simulation 2D (SS2D) match involves two teams, each with 11 players and a coach, competing against each other. The players can only communicate with the Soccer Simulation Server during the game. This paper presents the latest research of the CYRUS soccer simulation 2D team, the champion of RoboCup 2021. We will explain our denoising idea powered by long short-term memory networks (LSTM) and deep neural networks (DNN). The CYRUS team uses the CYRUS2D base code that was developed based on the Helios and Gliders bases.
Submitted 27 May, 2023;
originally announced May 2023.
-
Evolutionary Augmentation Policy Optimization for Self-supervised Learning
Authors:
Noah Barrett,
Zahra Sadeghi,
Stan Matwin
Abstract:
Self-supervised Learning (SSL) is a machine learning algorithm for pretraining Deep Neural Networks (DNNs) without requiring manually labeled data. The central idea of this learning technique is based on an auxiliary stage, also known as a pretext task, in which labeled data are created automatically through data augmentation and exploited for pretraining the DNN. However, the effect of each pretext task is not well studied or compared in the literature. In this paper, we study the contribution of augmentation operators to the performance of self-supervised learning algorithms in a constrained setting. We propose an evolutionary search method for optimizing the data augmentation pipeline in pretext tasks and measure the impact of augmentation operators in several SOTA SSL algorithms. By encoding different combinations of augmentation operators in chromosomes, we seek the optimal augmentation policies through an evolutionary optimization mechanism. We further introduce methods for analyzing and explaining the performance of optimized SSL algorithms. Our results indicate that our proposed method can find solutions that outperform the classification accuracy of SSL algorithms, which confirms the influence of augmentation policy choice on the overall performance of SSL algorithms. We also compare optimal SSL solutions found by our evolutionary search mechanism and show the effect of batch size in the pretext task on two visual datasets.
Submitted 2 August, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
Cyrus2D base: Source Code Base for RoboCup 2D Soccer Simulation League
Authors:
Nader Zare,
Omid Amini,
Aref Sayareh,
Mahtab Sarvmaili,
Arad Firouzkouhi,
Saba Ramezani Rad,
Stan Matwin,
Amilcar Soares
Abstract:
Soccer Simulation 2D League is one of the major leagues of RoboCup competitions. In a Soccer Simulation 2D (SS2D) game, two teams of 11 players and one coach compete against each other. Several base codes have been released for the RoboCup soccer simulation 2D (RCSS2D) community that have promoted the application of multi-agent and AI algorithms in this field. In this paper, we introduce "Cyrus2D Base", which is derived from the base code of the RCSS2D 2021 champion. We merged Gliders2D base V2.6 with the newest version of the Helios base. We applied several features of Cyrus2021 to improve the performance and capabilities of this base alongside a Data Extractor to facilitate the implementation of machine learning in the field. We have tested this base code in different teams and scenarios, and the obtained results demonstrate significant improvements in the defensive and offensive strategy of the team.
Submitted 15 November, 2022;
originally announced November 2022.
-
A semi-supervised methodology for fishing activity detection using the geometry behind the trajectory of multiple vessels
Authors:
Martha Dais Ferreira,
Gabriel Spadon,
Amilcar Soares,
Stan Matwin
Abstract:
Automatic Identification System (AIS) messages are useful for tracking vessel activity across oceans worldwide using radio links and satellite transceivers. Such data plays a significant role in tracking vessel activity and mapping mobility patterns such as those found in fishing. Accordingly, this paper proposes a geometric-driven semi-supervised approach for fishing activity detection from AIS data. Through the proposed methodology, we show how to explore the information included in the messages to extract features describing the geometry of the vessel route. To this end, we leverage the unsupervised nature of cluster analysis to label the trajectory geometry, highlighting the changes in the vessel's moving pattern, which tend to indicate fishing activity. The labels obtained by the proposed unsupervised approach are used to detect fishing activities, which we approach as a time-series classification task. In this context, we propose a solution using recurrent neural networks on AIS data streams, achieving an overall $F$-score of roughly 87% on the whole trajectories of 50 different unseen fishing vessels. Such results are accompanied by a broad benchmark study assessing the performance of different Recurrent Neural Network (RNN) architectures. In conclusion, this work contributes by proposing a thorough process that includes data preparation, labeling, data modeling, and model validation. Therefore, we present a novel solution for mobility pattern detection that relies upon unfolding the trajectory in time and observing its inherent geometry.
Submitted 22 August, 2022; v1 submitted 12 July, 2022;
originally announced July 2022.
-
CYRUS Soccer Simulation 2D Team Description Paper 2021
Authors:
Nader Zare,
Aref Sayareh,
Mahtab Sarvmaili,
Omid Amini,
Amilcar Soares,
Stan Matwin
Abstract:
In this report, we briefly present the technical procedure and simulation steps for the 2D soccer simulation of team Cyrus. In this document, we emphasize how the prediction of teammates' behavior is performed. In our proposed method, the agent receives the noisy inputs from the server and predicts the ball holder's full state behavior. Taking advantage of this approach for choosing the optimal view angle shows an 11.30% improvement in the expected win rate.
Submitted 5 June, 2022;
originally announced June 2022.
-
CYRUS Soccer Simulation 2D Team Description Paper 2022
Authors:
Nader Zare,
Arad Firouzkouhi,
Omid Amini,
Mahtab Sarvmaili,
Aref Sayareh,
Saba Ramezani Rad,
Stan Matwin,
Amilcar Soares
Abstract:
Soccer Simulation 2D League is one of the major leagues of RoboCup competitions. In a Soccer Simulation 2D (SS2D) game, two teams of 11 players and one coach compete against each other. The players are only allowed to communicate with the server, which is called the Soccer Simulation Server. This paper introduces the previous and current research of the CYRUS soccer simulation team, the champion of RoboCup 2021. We will present our idea about improving Unmarking Decisioning and Positioning by using a Pass Prediction Deep Neural Network. Based on our experimental results, this idea proved to be effective in increasing the winning rate of Cyrus against opponents.
Submitted 22 May, 2022;
originally announced May 2022.
-
AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation
Authors:
Farshid Varno,
Marzie Saghayi,
Laya Rafiee Sevyeri,
Sharut Gupta,
Stan Matwin,
Mohammad Havaei
Abstract:
In Federated Learning (FL), a number of clients or devices collaborate to train a model without sharing their data. Models are optimized locally at each client and further communicated to a central hub for aggregation. While FL is an appealing decentralized training paradigm, heterogeneity among data from different clients can cause the local optimization to drift away from the global objective. In order to estimate and therefore remove this drift, variance reduction techniques have been incorporated into FL optimization recently. However, these approaches inaccurately estimate the clients' drift and ultimately fail to remove it properly. In this work, we propose an adaptive algorithm that accurately estimates drift across clients. In comparison to previous works, our approach necessitates less storage and communication bandwidth, as well as lower compute costs. Additionally, our proposed methodology induces stability by constraining the norm of estimates for client drift, making it more practical for large scale FL. Experimental findings demonstrate that the proposed algorithm converges significantly faster and achieves higher accuracy than the baselines across various FL benchmarks.
Submitted 24 July, 2023; v1 submitted 27 April, 2022;
originally announced April 2022.
-
Unfolding AIS transmission behavior for vessel movement modeling on noisy data leveraging machine learning
Authors:
Gabriel Spadon,
Martha D. Ferreira,
Amilcar Soares,
Stan Matwin
Abstract:
The oceans are a source of an impressive mixture of complex data that could be used to uncover relationships yet to be discovered. Such data comes from the oceans and their surface, such as Automatic Identification System (AIS) messages used for tracking vessels' trajectories. AIS messages are transmitted over radio or satellite at ideally periodic time intervals but vary irregularly over time. As such, this paper aims to model the AIS message transmission behavior through neural networks for forecasting upcoming AIS messages' content from multiple vessels, particularly in a simultaneous approach despite messages' temporal irregularities as outliers. We present a set of experiments comprising multiple algorithms for forecasting tasks with horizon sizes of varying lengths. Deep learning models (e.g., neural networks) revealed themselves to adequately preserve vessels' spatial awareness regardless of temporal irregularity. We show how convolutional layers, feed-forward networks, and recurrent neural networks can improve such tasks by working together. Experimenting with short, medium, and large-sized sequences of messages, our model achieved a Relative Percentage Difference (the lower, the better) of 36/37/38%, whereas we observed 92/45/96% on Elman's RNN, 51/52/40% on the GRU, and 129/98/61% on the LSTM. These results support our model as a driver for improving the prediction of vessel routes when analyzing multiple vessels of diverging types simultaneously under temporally noisy data.
Submitted 5 July, 2022; v1 submitted 24 February, 2022;
originally announced February 2022.
-
Survey of Generative Methods for Social Media Analysis
Authors:
Stan Matwin,
Aristides Milios,
Paweł Prałat,
Amilcar Soares,
François Théberge
Abstract:
This survey draws a broad-stroke, panoramic picture of the State of the Art (SoTA) of the research in generative methods for the analysis of social media data. It fills a void, as the existing survey articles are either much narrower in their scope or are dated. We included two important aspects that currently gain importance in mining and modeling social media: dynamics and networks. Social dynamics are important for understanding the spreading of influence or diseases, formation of friendships, the productivity of teams, etc. Networks, on the other hand, may capture various complex relationships providing additional insight and identifying important patterns that would otherwise go unnoticed.
Submitted 13 December, 2021;
originally announced December 2021.
-
Continuous Control with Deep Reinforcement Learning for Autonomous Vessels
Authors:
Nader Zare,
Bruno Brandoli,
Mahtab Sarvmaili,
Amilcar Soares,
Stan Matwin
Abstract:
Maritime autonomous transportation has played a crucial role in the globalization of the world economy. Deep Reinforcement Learning (DRL) has been applied to automatic path planning to simulate vessel collision avoidance situations in open seas. End-to-end approaches that learn complex mappings directly from the input have poor generalization to reach the targets in different environments. In this work, we present a new strategy called state-action rotation to improve the agent's performance in unseen situations by rotating the obtained experience (state-action-state) and preserving it in the replay buffer. We designed our model based on Deep Deterministic Policy Gradient, local view maker, and planner. Our agent uses two deep Convolutional Neural Networks to estimate the policy and action-value functions. The proposed model was exhaustively trained and tested in maritime scenarios with real maps from cities such as Montreal and Halifax. Experimental results show that the state-action rotation on top of the CVN consistently improves the rate of arrival to a destination (RATD) by up to 11.96% with respect to the Vessel Navigator with Planner and Local View (VNPLV), and achieves superior performance in unseen mappings by up to 30.82%. Our proposed approach exhibits advantages in terms of robustness when tested in a new environment, supporting the idea that generalization can be achieved by using state-action rotation.
Submitted 26 June, 2021;
originally announced June 2021.
-
Artificial Intelligence for Emotion-Semantic Trending and People Emotion Detection During COVID-19 Social Isolation
Authors:
Hamed Jelodar,
Rita Orji,
Stan Matwin,
Swarna Weerasinghe,
Oladapo Oyebode,
Yongli Wang
Abstract:
Taking advantage of social media platforms, such as Twitter, this paper provides an effective framework for emotion detection among those who are quarantined. Early detection of emotional feelings and their trends helps implement timely intervention strategies. Given the limitations of medical diagnosis of early emotional change signs during the quarantine period, artificial intelligence models provide effective mechanisms for uncovering early signs, symptoms and escalating trends. The novelty of the approach presented herein is a multitask methodological framework of text data processing, implemented as a pipeline for meaningful emotion detection and analysis, based on the Plutchik/Ekman approach to emotion detection and trend detection. We present an evaluation of the framework and a pilot system. Results confirm the effectiveness of the proposed framework for topic trends and emotion detection of COVID-19 tweets. Our findings revealed that Stay-At-Home restrictions result in people expressing both negative and positive emotional semantics on Twitter. Semantic trends of safety issues related to staying at home rapidly decreased within the 28 days, and negative feelings related to friends dying and quarantined life increased on some days. These findings have the potential to impact public health policy decisions through monitoring trends of emotional feelings of those who are quarantined. The framework presented here has the potential to assist in such monitoring by serving as an online emotion detection toolkit.
Submitted 16 January, 2021;
originally announced January 2021.
-
Pay Attention to Evolution: Time Series Forecasting with Deep Graph-Evolution Learning
Authors:
Gabriel Spadon,
Shenda Hong,
Bruno Brandoli,
Stan Matwin,
Jose F. Rodrigues-Jr,
Jimeng Sun
Abstract:
Time-series forecasting is one of the most active research topics in artificial intelligence. Applications in real-world time series should consider two factors for achieving reliable predictions: modeling dynamic dependencies among multiple variables and adjusting the model's intrinsic hyperparameters. A still open gap in that literature is that statistical and ensemble learning approaches systematically present lower predictive performance than deep learning methods. They generally disregard the data sequence aspect entangled with multivariate data represented in more than one time series. Conversely, this work presents a novel neural network architecture for time-series forecasting that combines the power of graph evolution with deep recurrent learning on distinct data distributions; we named our method Recurrent Graph Evolution Neural Network (ReGENN). The idea is to infer multiple multivariate relationships between co-occurring time-series by assuming that the temporal data depends not only on inner variables and intra-temporal relationships (i.e., observations from itself) but also on outer variables and inter-temporal relationships (i.e., observations from other-selves). An extensive set of experiments was conducted comparing ReGENN with dozens of ensemble methods and classical statistical ones, showing sound improvement of up to 64.87% over the competing algorithms. Furthermore, we present an analysis of the intermediate weights arising from ReGENN, showing that by looking at inter- and intra-temporal relationships simultaneously, time-series forecasting is substantially improved when attention is paid to how multiple multivariate data synchronously evolve.
Submitted 26 May, 2021; v1 submitted 28 August, 2020;
originally announced August 2020.
-
COVID-19 Pandemic: Identifying Key Issues using Social Media and Natural Language Processing
Authors:
Oladapo Oyebode,
Chinenye Ndulue,
Dinesh Mulchandani,
Banuchitra Suruliraj,
Ashfaq Adib,
Fidelia Anulika Orji,
Evangelos Milios,
Stan Matwin,
Rita Orji
Abstract:
The COVID-19 pandemic has affected people's lives in many ways. Social media data can reveal public perceptions and experience with respect to the pandemic, and also reveal factors that hamper or support efforts to curb global spread of the disease. In this paper, we analyzed COVID-19-related comments collected from six social media platforms using Natural Language Processing (NLP) techniques. We identified relevant opinionated keyphrases and their respective sentiment polarity (negative or positive) from over 1 million randomly selected comments, and then categorized them into broader themes using thematic analysis. Our results uncover 34 negative themes out of which 17 are economic, socio-political, educational, and political issues. 20 positive themes were also identified. We discuss the negative issues and suggest interventions to tackle them based on the positive themes and research evidence.
Submitted 23 August, 2020;
originally announced August 2020.
-
SemEval-2020 Task 5: Counterfactual Recognition
Authors:
Xiaoyu Yang,
Stephen Obadinma,
Huasha Zhao,
Qiong Zhang,
Stan Matwin,
Xiaodan Zhu
Abstract:
We present a counterfactual recognition (CR) task, the shared Task 5 of SemEval-2020. Counterfactuals describe potential outcomes (consequents) produced by actions or circumstances that did not happen or cannot happen and are counter to the facts (antecedent). Counterfactual thinking is an important characteristic of the human cognitive system; it connects antecedents and consequents with causal relations. Our task provides a benchmark for counterfactual recognition in natural language with two subtasks. Subtask-1 aims to determine whether a given sentence is a counterfactual statement or not. Subtask-2 requires the participating systems to extract the antecedent and consequent in a given counterfactual statement. During the SemEval-2020 official evaluation period, we received 27 submissions to Subtask-1 and 11 to Subtask-2. The data, baseline code, and leaderboard can be found at https://competitions.codalab.org/competitions/21691. The data and baseline code are also available at https://zenodo.org/record/3932442.
Submitted 2 August, 2020;
originally announced August 2020.
-
Learn Faster and Forget Slower via Fast and Stable Task Adaptation
Authors:
Farshid Varno,
Lucas May Petry,
Lisa Di Jorio,
Stan Matwin
Abstract:
Training Deep Neural Networks (DNNs) is still highly time-consuming and compute-intensive. It has been shown that adapting a pretrained model may significantly accelerate this process. With a focus on classification, we show that current fine-tuning techniques make the pretrained models catastrophically forget the transferred knowledge even before anything about the new task is learned. Such rapid knowledge loss undermines the merits of transfer learning and may result in a much slower convergence rate compared to when the maximum amount of knowledge is exploited. We investigate the source of this problem from different perspectives and, to alleviate it, introduce Fast And Stable Task-adaptation (FAST), an easy-to-apply fine-tuning algorithm. The paper provides a novel geometric perspective on how the loss landscapes of source and target tasks are linked in different transfer learning strategies. We empirically show that, compared to prevailing fine-tuning practices, FAST learns the target task faster and forgets the source task slower.
Submitted 29 November, 2020; v1 submitted 2 July, 2020;
originally announced July 2020.
-
Analyzing the Impact of Foursquare and Streetlight Data with Human Demographics on Future Crime Prediction
Authors:
Fateha Khanam Bappee,
Lucas May Petry,
Amilcar Soares,
Stan Matwin
Abstract:
Finding the factors contributing to criminal activities and their consequences is essential to improve quantitative crime research. To respond to this concern, we examine an extensive set of features from different perspectives and explanations. Our study aims to build data-driven models for predicting future crime occurrences. In this paper, we propose the use of streetlight infrastructure and Foursquare data along with demographic characteristics for improving future crime incident prediction. We evaluate the classification performance based on various feature combinations as well as with the baseline model. Our proposed model was tested on each smallest geographic region in Halifax, Canada. Our findings demonstrate the effectiveness of integrating diverse sources of data to gain satisfactory classification performance.
Submitted 12 June, 2020;
originally announced June 2020.
-
Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation
Authors:
Xiang Jiang,
Qicheng Lao,
Stan Matwin,
Mohammad Havaei
Abstract:
We present an approach for unsupervised domain adaptation---with a strong focus on practical considerations of within-domain class imbalance and between-domain class distribution shift---from a class-conditioned domain alignment perspective. Current methods for class-conditioned domain alignment aim to explicitly minimize a loss function based on pseudo-label estimations of the target domain. However, these methods suffer from pseudo-label bias in the form of error accumulation. We propose a method that removes the need for explicit optimization of model parameters from pseudo-labels directly. Instead, we present a sampling-based implicit alignment approach, where the sample selection procedure is implicitly guided by the pseudo-labels. Theoretical analysis reveals the existence of a domain-discriminator shortcut in misaligned classes, which is addressed by the proposed implicit alignment approach to facilitate domain-adversarial learning. Empirical results and ablation studies confirm the effectiveness of the proposed approach, especially in the presence of within-domain class imbalance and between-domain class distribution shift.
Submitted 8 June, 2020;
originally announced June 2020.
-
Give more data, awareness and control to individual citizens, and they will help COVID-19 containment
Authors:
Mirco Nanni,
Gennady Andrienko,
Albert-László Barabási,
Chiara Boldrini,
Francesco Bonchi,
Ciro Cattuto,
Francesca Chiaromonte,
Giovanni Comandé,
Marco Conti,
Mark Coté,
Frank Dignum,
Virginia Dignum,
Josep Domingo-Ferrer,
Paolo Ferragina,
Fosca Giannotti,
Riccardo Guidotti,
Dirk Helbing,
Kimmo Kaski,
Janos Kertesz,
Sune Lehmann,
Bruno Lepri,
Paul Lukowicz,
Stan Matwin,
David Megías Jiménez,
Anna Monreale
, et al. (14 additional authors not shown)
Abstract:
The rapid dynamics of COVID-19 call for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in phase 2 of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large-scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens' privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoid location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens' "personal data stores", to be shared separately and selectively, voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy-preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion, and in turn this enables both contact tracing and the early detection of outbreak hotspots on a more finely-granulated geographic scale. Our recommendation is two-fold. First, to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates - if and when they want, for specific aims - with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.
Submitted 16 April, 2020; v1 submitted 10 April, 2020;
originally announced April 2020.
-
Challenges in Vessel Behavior and Anomaly Detection: From Classical Machine Learning to Deep Learning
Authors:
Lucas May Petry,
Amilcar Soares,
Vania Bogorny,
Bruno Brandoli,
Stan Matwin
Abstract:
The global expansion of maritime activities and the development of the Automatic Identification System (AIS) have driven the advances in maritime monitoring systems in the last decade. Monitoring vessel behavior is fundamental to safeguard maritime operations, protecting other vessels sailing the ocean and the marine fauna and flora. Given the enormous volume of vessel data continually being generated, real-time analysis of vessel behaviors is only possible because of decision support systems provided with event and anomaly detection methods. However, current works on vessel event detection are ad-hoc methods able to handle only a single or a few predefined types of vessel behavior. Most of the existing approaches do not learn from the data and require the definition of queries and rules for describing each behavior. In this paper, we discuss challenges and opportunities in classical machine learning and deep learning for vessel event and anomaly detection. We hope to motivate the research of novel methods and tools, since addressing these challenges is an essential step towards actual intelligent maritime monitoring systems.
Submitted 7 April, 2020;
originally announced April 2020.
-
Using Deep Reinforcement Learning Methods for Autonomous Vessels in 2D Environments
Authors:
Mohammad Etemad,
Nader Zare,
Mahtab Sarvmaili,
Amilcar Soares,
Bruno Brandoli Machado,
Stan Matwin
Abstract:
Unmanned Surface Vehicle (USV) technology is an exciting topic that essentially deploys an algorithm to safely and efficiently perform a mission. Although reinforcement learning is a well-known approach to modeling such a task, instability and divergence may occur when combining off-policy learning and function approximation. In this work, we used deep reinforcement learning combining Q-learning with a neural representation to avoid instability. Our methodology uses deep Q-learning and combines it with a rolling wave planning approach on agile methodology. Our method contains two critical parts in order to perform missions in an unknown environment. The first is a path planner that is responsible for generating a potential effective path to a destination without considering the details of the route. The second is a decision-making module that is responsible for short-term decisions on avoiding obstacles during the near future steps of USV exploitation within the context of the value function. Simulations were performed using two algorithms: a basic vanilla vessel navigator (VVN) as a baseline and an improved one for the vessel navigator with a planner and local view (VNPLV). Experimental results show that the proposed method enhanced the performance of VVN by 55.31 on average for long-distance missions. Our model successfully demonstrated obstacle avoidance by means of deep reinforcement learning using planning adaptive paths in unknown environments.
Submitted 23 March, 2020;
originally announced March 2020.
-
Wise Sliding Window Segmentation: A classification-aided approach for trajectory segmentation
Authors:
Mohammad Etemad,
Zahra Etemad,
Amilcar Soares,
Vania Bogorny,
Stan Matwin,
Luis Torgo
Abstract:
Large amounts of mobility data are being generated from many different sources, and several data mining methods have been proposed for this data. One of the most critical steps for trajectory data mining is segmentation. This task can be seen as a pre-processing step in which a trajectory is divided into several meaningful consecutive sub-sequences. This process is necessary because trajectory patterns may not hold over the entire trajectory but only on parts of it. In this work, we propose a supervised trajectory segmentation algorithm, called Wise Sliding Window Segmentation (WS-II). It processes the trajectory coordinates to find behavioral changes in space and time, generating an error signal that is further used to train a binary classifier for segmenting trajectory data. This algorithm is flexible and can be used in different domains. We evaluate our method over three real datasets from different domains (meteorology, fishing, and individuals' movements), and compare it with four other trajectory segmentation algorithms: OWS, GRASP-UTS, CB-SMoT, and SPD. We observed that the proposed algorithm achieves the highest performance for all datasets with statistically significant differences in terms of the harmonic mean of purity and coverage.
Submitted 23 March, 2020;
originally announced March 2020.
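The WS-II abstract above reports results as the harmonic mean of purity and coverage. The following is a minimal sketch of how such a score could be computed from per-point segment labels. The definitions of `purity` and `coverage` used here (the average share of the dominant segment, taken in each direction) are an assumption for illustration and may differ from the exact definitions used by the authors; all function names are hypothetical.

```python
from collections import Counter


def purity(true_ids, pred_ids):
    """Average, over predicted segments, of the share of points that
    belong to the most frequent ground-truth segment (assumed definition)."""
    groups = {}
    for t, p in zip(true_ids, pred_ids):
        groups.setdefault(p, []).append(t)
    shares = [Counter(ts).most_common(1)[0][1] / len(ts) for ts in groups.values()]
    return sum(shares) / len(shares)


def coverage(true_ids, pred_ids):
    """Average, over ground-truth segments, of the share of points captured
    by the single best-matching predicted segment (assumed definition)."""
    return purity(pred_ids, true_ids)


def harmonic_mean(p, c):
    """Harmonic mean of two non-negative scores; 0.0 if both are zero."""
    return 0.0 if p + c == 0 else 2 * p * c / (p + c)


# Hypothetical trajectory of six points: ground truth splits it 3/3,
# the predicted segmentation splits it 2/4.
true_ids = [0, 0, 0, 1, 1, 1]
pred_ids = [0, 0, 1, 1, 1, 1]
score = harmonic_mean(purity(true_ids, pred_ids), coverage(true_ids, pred_ids))
```

The harmonic mean penalizes a segmentation that scores well on only one of the two measures, for example one that over-segments to inflate purity at the cost of coverage.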
-
Black Box Explanation by Learning Image Exemplars in the Latent Feature Space
Authors:
Riccardo Guidotti,
Anna Monreale,
Stan Matwin,
Dino Pedreschi
Abstract:
We present an approach to explain the decisions of black box models for image classification. While using the black box to label images, our explanation method exploits the latent feature space learned through an adversarial autoencoder. The proposed method first generates exemplar images in the latent feature space and learns a decision tree classifier. Then, it selects and decodes exemplars respecting local decision rules. Finally, it visualizes them in a manner that shows to the user how the exemplars can be modified to either stay within their class, or to become counter-factuals by "morphing" into another class. Since we focus on black box decision systems for image classification, the explanation obtained from the exemplars also provides a saliency map highlighting the areas of the image that contribute to its classification, and areas of the image that push it into another class. We present the results of an experimental evaluation on three datasets and two black box models. Besides providing the most useful and interpretable explanations, we show that the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
Submitted 27 January, 2020;
originally announced February 2020.
-
Performance of a Deep Neural Network at Detecting North Atlantic Right Whale Upcalls
Authors:
Oliver S. Kirsebom,
Fabio Frazao,
Yvan Simard,
Nathalie Roy,
Stan Matwin,
Samuel Giard
Abstract:
Passive acoustics provides a powerful tool for monitoring the endangered North Atlantic right whale (Eubalaena glacialis), but robust detection algorithms are needed to handle diverse and variable acoustic conditions and differences in recording techniques and equipment. Here, we investigate the potential of deep neural networks for addressing this need. ResNet, an architecture commonly used for image recognition, is trained to recognize the time-frequency representation of the characteristic North Atlantic right whale upcall. The network is trained on several thousand examples recorded at various locations in the Gulf of St. Lawrence in 2018 and 2019, using different equipment and deployment techniques. Used as a detection algorithm on fifty 30-minute recordings from the years 2015-2017 containing over one thousand upcalls, the network achieves recalls up to 80%, while maintaining a precision of 90%. Importantly, the performance of the network improves as more variance is introduced into the training dataset, whereas the opposite trend is observed using a conventional linear discriminant analysis approach. Our work demonstrates that deep neural networks can be trained to identify North Atlantic right whale upcalls under diverse and variable conditions with a performance that compares favorably to that of existing algorithms.
Submitted 29 February, 2020; v1 submitted 24 January, 2020;
originally announced January 2020.
-
Multimodal Deep Learning for Mental Disorders Prediction from Audio Speech Samples
Authors:
Habibeh Naderi,
Behrouz Haji Soleimani,
Stan Matwin
Abstract:
Key features of mental illnesses are reflected in speech. Our research focuses on designing a multimodal deep learning structure that automatically extracts salient features from recorded speech samples for predicting various mental disorders including depression, bipolar disorder, and schizophrenia. We adopt a variety of pre-trained models to extract embeddings from both audio and text segments. We use several state-of-the-art embedding techniques including BERT, FastText, and Doc2VecC for the text representation learning and WaveNet and VGG-ish models for audio encoding. We also leverage large auxiliary emotion-labeled text and audio corpora to train emotion-specific embeddings and use transfer learning to address the shortage of annotated multimodal data. All these embeddings are then combined into a joint representation in a multimodal fusion layer, and finally a recurrent neural network is used to predict the mental disorder. Our results show that mental disorders can be predicted with acceptable accuracy through multimodal analysis of clinical interviews.
Submitted 13 April, 2020; v1 submitted 3 September, 2019;
originally announced September 2019.
-
Unsupervised Behavior Change Detection in Multidimensional Data Streams for Maritime Traffic Monitoring
Authors:
Lucas May Petry,
Amilcar Soares,
Vania Bogorny,
Stan Matwin
Abstract:
The worldwide growth of maritime traffic and the development of the Automatic Identification System (AIS) have led to advances in monitoring systems for preventing vessel accidents and detecting illegal activities. In this work, we describe research gaps and challenges in machine learning for vessel behavior change and event detection, considering several constraints imposed by real-time data streams and the maritime monitoring domain. As a starting point, we investigate how unsupervised and semi-supervised change detection methods may be employed for identifying shifts in vessel behavior, aiming to detect and label unusual events.
Submitted 14 August, 2019;
originally announced August 2019.
-
Marine Mammal Species Classification using Convolutional Neural Networks and a Novel Acoustic Representation
Authors:
Mark Thomas,
Bruce Martin,
Katie Kowarski,
Briand Gaudet,
Stan Matwin
Abstract:
Research into automated systems for detecting and classifying marine mammals in acoustic recordings is expanding internationally due to the necessity to analyze large collections of data for conservation purposes. In this work, we present a Convolutional Neural Network that is capable of classifying the vocalizations of three species of whales, non-biological sources of noise, and a fifth class pertaining to ambient noise. In this way, the classifier is capable of detecting the presence and absence of whale vocalizations in an acoustic recording. Through transfer learning, we show that the classifier is capable of learning high-level representations and can generalize to additional species. We also propose a novel representation of acoustic signals that builds upon the commonly used spectrogram representation by way of interpolating and stacking multiple spectrograms produced using different Short-time Fourier Transform (STFT) parameters. The proposed representation is particularly effective for the task of marine mammal species classification where the acoustic events we are attempting to classify are sensitive to the parameters of the STFT.
Submitted 30 July, 2019;
originally announced July 2019.
-
Efficient Neural Task Adaptation by Maximum Entropy Initialization
Authors:
Farshid Varno,
Behrouz Haji Soleimani,
Marzie Saghayi,
Lisa Di Jorio,
Stan Matwin
Abstract:
Transferring knowledge from one neural network to another has been shown to be helpful for learning tasks with few training examples. Prevailing fine-tuning methods could potentially contaminate pre-trained features with comparably high-energy random noise. This noise mainly stems from a careless replacement of task-specific parameters. We theoretically analyze such knowledge contamination for classification tasks and propose a practical and easy-to-apply method to trap and minimize the contaminant. In our approach, the entropy of the output estimates is maximized initially and the first back-propagated error is stalled at the output of the last layer. Our proposed method not only outperforms traditional fine-tuning, but also significantly speeds up the convergence of the learner. It is robust to randomness and independent of the choice of architecture. Overall, our experiments show that the power of transfer learning has been substantially underestimated so far.
Submitted 11 July, 2019; v1 submitted 25 May, 2019;
originally announced May 2019.
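The abstract above states that the entropy of the output estimates is maximized initially. One simple way to obtain that property, shown here purely as an illustrative assumption and not as the authors' actual procedure, is to initialize the new task-specific output layer with zero weights and biases: every logit is then zero, the softmax output is uniform, and its entropy equals log K for K classes. All names and values below are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def linear(features, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


num_classes = 5
features = [0.3, -1.2, 2.0]  # stand-in for pre-trained features

# Zero-initialized output layer (assumption): all logits are 0.
zero_w = [[0.0] * len(features) for _ in range(num_classes)]
zero_b = [0.0] * num_classes

probs = softmax(linear(features, zero_w, zero_b))
max_entropy = math.log(num_classes)  # upper bound for a K-class distribution
```

With this initialization the first forward pass carries no arbitrary class preference from random weights, which is one reading of the contamination concern described in the abstract.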
-
When a Tweet is Actually Sexist. A more Comprehensive Classification of Different Online Harassment Categories and The Challenges in NLP
Authors:
Sima Sharifirad,
Stan Matwin
Abstract:
Sexism is very common in social media and makes the boundaries of freedom tighter for feminist and female users. There is still no comprehensive classification of sexism attracting natural language processing techniques. Categorizing sexism in social media as either hostile or benevolent is so general that it simply ignores the other types of sexism occurring in these media. This paper proposes more comprehensive and in-depth categories of online harassment in social media (e.g., Twitter), namely "Indirect harassment", "Information threat", "Sexual harassment", "Physical harassment" and "Not sexist", and addresses the challenge of labeling them, along with presenting the classification results for the categories. It is preliminary work applying machine learning to learn the concept of sexism, and it distinguishes itself by looking at more precise categories of sexism in social media.
Submitted 27 February, 2019;
originally announced February 2019.
-
Recurrent Neural Networks with Stochastic Layers for Acoustic Novelty Detection
Authors:
Duong Nguyen,
Oliver S. Kirsebom,
Fábio Frazão,
Ronan Fablet,
Stan Matwin
Abstract:
In this paper, we adapt Recurrent Neural Networks with Stochastic Layers, which are the state of the art for generating text, music and speech, to the problem of acoustic novelty detection. By integrating uncertainty into the hidden states, this type of network is able to learn the distribution of complex sequences. Because the learned distribution can be calculated explicitly in terms of probability, we can evaluate how likely an observation is and then detect low-probability events as novel. The model is robust, highly unsupervised, end-to-end and requires minimal preprocessing, feature engineering or hyperparameter tuning. An experiment on a benchmark dataset shows that our model outperforms the state-of-the-art acoustic novelty detectors.
Submitted 13 February, 2019;
originally announced February 2019.
-
How is Your Mood When Writing Sexist tweets? Detecting the Emotion Type and Intensity of Emotion Using Natural Language Processing Techniques
Authors:
Sima Sharifirad,
Borna Jafarpour,
Stan Matwin
Abstract:
Online social platforms have been the battlefield of users with different emotions and attitudes toward each other in recent years. While sexism has been considered a category of hateful speech in the literature, there is no comprehensive definition and categorization of sexism amenable to natural language processing techniques. Categorizing sexism as either benevolent or hostile is so broad that it easily ignores the other categories of sexism on social media. Sharifirad and Matwin (2018) proposed a well-defined categorization of sexism, inspired by social science and intended for natural language processing techniques, comprising indirect harassment, information threat, sexual harassment and physical harassment. In this article, we take advantage of a newly released dataset from SemEval-2018 Task 1: Affect in Tweets, to show the type and intensity of emotion in each category. We train, test and evaluate different classification methods on the SemEval-2018 dataset and choose the classifier with the highest accuracy for testing on each category of sexist tweets, in order to learn the mental and affectual state of the user who tweets in each category. It is a worthwhile avenue to explore because not all the tweets are directly sexist and they carry different emotions from the users. To the best of our knowledge, this is the first work to experiment with affect detection in this depth on sexist tweets, and the first to demonstrate the power of such in-depth sentiment analysis on sexist tweets.
Submitted 28 January, 2019;
originally announced February 2019.
-
2-D Embedding of Large and High-dimensional Data with Minimal Memory and Computational Time Requirements
Authors:
Witold Dzwinel,
Rafal Wcislo,
Stan Matwin
Abstract:
With the advent of the big data era, interactive visualization of large data sets consisting of M*10^5+ high-dimensional feature vectors of length N (N ~ 10^3+) is an indispensable tool for exploratory data analysis. The state-of-the-art data embedding (DE) methods of N-D data into 2-D (3-D) visually perceptible space (e.g., based on the t-SNE concept) are too computationally demanding to be efficiently employed for interactive data analytics of large and high-dimensional datasets. Herein we present a simple method, ivhd (interactive visualization of high-dimensional data tool), which radically outperforms the modern data-embedding algorithms in both computational and memory loads, while retaining high quality of N-D data embedding in 2-D (3-D). We show that the DE problem is equivalent to the nearest neighbor nn-graph visualization, where only the indices of a few nearest neighbors of each data sample have to be known, and a binary distance between data samples (0 to the nearest and 1 to the other samples) is defined. These improvements reduce the time complexity and memory load from O(M log M) to O(M), and ensure a minimal O(M) proportionality coefficient as well. We demonstrate the high efficiency, quality and robustness of ivhd on popular benchmark datasets such as MNIST, 20NG, NORB and RCV1.
Submitted 4 February, 2019;
originally announced February 2019.
-
Improving the Interpretability of Deep Neural Networks with Knowledge Distillation
Authors:
Xuan Liu,
Xiaoguang Wang,
Stan Matwin
Abstract:
Deep Neural Networks have achieved huge success across a wide spectrum of applications, from language modeling and computer vision to speech recognition. However, good performance alone is no longer sufficient to satisfy the needs of practical deployment, where interpretability is demanded for cases involving ethics and mission-critical applications. The complexity of Deep Neural Network models makes it hard to understand and reason about their predictions, which hinders their further progress. To tackle this problem, we apply the Knowledge Distillation technique to distill Deep Neural Networks into decision trees in order to attain good performance and interpretability simultaneously. We formulate the problem at hand as a multi-output regression problem, and the experiments demonstrate that the student model achieves significantly better accuracy (about 1% to 5%) than vanilla decision trees at the same tree depth. The experiments are implemented on the TensorFlow platform to make them scalable to big datasets. To the best of our knowledge, we are the first to distill Deep Neural Networks into vanilla decision trees on multi-class datasets.
Submitted 28 December, 2018;
originally announced December 2018.
-
On feature selection and evaluation of transportation mode prediction strategies
Authors:
Mohammad Etemad,
Amilcar Soares Junior,
Stan Matwin
Abstract:
Transportation mode prediction is a fundamental task for decision making in smart cities and traffic management systems. Traffic policies designed based on trajectory mining can save money and time for authorities and the public. They may reduce fuel consumption and commute time and, moreover, may provide more pleasant moments for residents and tourists. Since the number of features that may be used to predict a user's transportation mode can be substantial, finding a subset of features that maximizes a performance measure is worth investigating. In this work, we explore wrapper and information retrieval methods to find the best subset of trajectory features. After finding the best classifier and the best feature subset, we compared our results with two related papers that applied deep learning methods, and the results showed that our framework achieved better performance. Furthermore, two types of cross-validation approaches were investigated, and the performance results show that the random cross-validation method provides optimistic results.
Submitted 5 September, 2018; v1 submitted 9 August, 2018;
originally announced August 2018.
-
On the Importance of Attention in Meta-Learning for Few-Shot Text Classification
Authors:
Xiang Jiang,
Mohammad Havaei,
Gabriel Chartrand,
Hassan Chouaib,
Thomas Vincent,
Andrew Jesson,
Nicolas Chapados,
Stan Matwin
Abstract:
Current deep-learning-based text classification methods are limited in their ability to achieve fast learning and generalization when data is scarce. We address this problem by integrating a meta-learning procedure that uses the knowledge learned across many tasks as an inductive bias towards better natural language understanding. Based on the Model-Agnostic Meta-Learning framework (MAML), we introduce the Attentive Task-Agnostic Meta-Learning (ATAML) algorithm for text classification. The essential difference between MAML and ATAML is in the separation of task-agnostic representation learning and task-specific attentive adaptation. The proposed ATAML is designed to encourage task-agnostic representation learning by way of task-agnostic parameterization and to facilitate task-specific adaptation via attention mechanisms. We provide evidence to show that the attention mechanism in ATAML has a synergistic effect on learning performance. In comparisons with models trained from random initialization, pretrained models and meta-trained MAML, our proposed ATAML method generalizes better on single-label and multi-label classification tasks on the miniRCV1 and miniReuters-21578 datasets.
Submitted 3 June, 2018;
originally announced June 2018.
-
Predicting Crime Using Spatial Features
Authors:
Fateha Khanam Bappee,
Amilcar Soares Junior,
Stan Matwin
Abstract:
Our study aims to build a machine learning model for crime prediction using geospatial features for different categories of crime. The reverse geocoding technique is applied to retrieve OpenStreetMap (OSM) spatial data. This study also proposes finding hotpoints extracted from crime hotspot areas found by Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN). A spatial distance feature is then computed based on the position of different hotpoints for various types of crime, and this value is used as a feature for classifiers. We test the engineered features on crime data from the Royal Canadian Mounted Police of Halifax, NS. We observed a significant performance improvement in crime prediction using the newly generated spatial features.
Submitted 12 March, 2018;
originally announced March 2018.
-
Predicting Transportation Modes of GPS Trajectories using Feature Engineering and Noise Removal
Authors:
Mohammad Etemad,
Amilcar Soares Junior,
Stan Matwin
Abstract:
Understanding transportation mode from GPS (Global Positioning System) traces is an essential topic in the data mobility domain. In this paper, a framework is proposed to predict transportation modes. This framework follows a sequence of five steps: (i) data preparation, where GPS points are grouped in trajectory samples; (ii) point features generation; (iii) trajectory features extraction; (iv) noise removal; (v) normalization. We show that the extraction of the new point features (bearing rate, the rate of rate of change of the bearing rate) and the global and local trajectory features, like medians and percentiles, enables many classifiers to achieve high accuracy (96.5%) and F1 (96.3%) scores. We also show that the noise removal task affects the performance of all the models tested. Finally, the empirical tests in which we compare this work against state-of-the-art transportation mode prediction strategies show that our framework is competitive and outperforms most of them.
Submitted 27 February, 2018;
originally announced February 2018.
-
One Single Deep Bidirectional LSTM Network for Word Sense Disambiguation of Text Data
Authors:
Ahmad Pesaranghader,
Ali Pesaranghader,
Stan Matwin,
Marina Sokolova
Abstract:
Due to recent technical and scientific advances, we have a wealth of information hidden in unstructured text data such as offline/online narratives, research articles, and clinical reports. Because of the innate ambiguity of these data, mining them properly calls for a Word Sense Disambiguation (WSD) algorithm, which can avoid a number of difficulties in the Natural Language Processing (NLP) pipeline. However, considering the large number of ambiguous words in one language or technical domain, we may encounter limiting constraints for proper deployment of existing WSD models. This paper attempts to address the problem of one-classifier-per-one-word WSD algorithms by proposing a single Bidirectional Long Short-Term Memory (BLSTM) network which, by considering senses and context sequences, works on all ambiguous words collectively. Evaluating on the SensEval-3 benchmark, we show that the result of our model is comparable with top-performing WSD algorithms. We also discuss how applying additional modifications alleviates the model fault and the need for more training data.
Submitted 25 February, 2018;
originally announced February 2018.
-
Interpretable Deep Convolutional Neural Networks via Meta-learning
Authors:
Xuan Liu,
Xiaoguang Wang,
Stan Matwin
Abstract:
Model interpretability is a requirement in many applications in which crucial decisions are made by users relying on a model's outputs. The recent movement for "algorithmic fairness" also stipulates explainability, and therefore interpretability, of learning models. And yet the most successful contemporary Machine Learning approaches, the Deep Neural Networks, produce models that are highly non-interpretable. We attempt to address this challenge by proposing a technique called CNN-INTE to interpret deep Convolutional Neural Networks (CNN) via meta-learning. In this work, we interpret a specific hidden layer of the deep CNN model on the MNIST image dataset. We use a clustering algorithm in a two-level structure to find the meta-level training data and Random Forest as base learning algorithms to generate the meta-level test data. The interpretation results are displayed visually via diagrams, which clearly indicate how a specific test instance is classified. Our method achieves global interpretation for all the test instances without sacrificing the accuracy obtained by the original deep CNN model. This means our model is faithful to the deep CNN model, which leads to reliable interpretations.
Submitted 18 August, 2018; v1 submitted 2 February, 2018;
originally announced February 2018.
-
TrajectoryNet: An Embedded GPS Trajectory Representation for Point-based Classification Using Recurrent Neural Networks
Authors:
Xiang Jiang,
Erico N de Souza,
Ahmad Pesaranghader,
Baifan Hu,
Daniel L. Silver,
Stan Matwin
Abstract:
Understanding and discovering knowledge from GPS (Global Positioning System) traces of human activities is an essential topic in mobility-based urban computing. We propose TrajectoryNet, a neural network architecture for point-based trajectory classification, to infer real-world human transportation modes from GPS traces. To overcome the challenge of capturing the underlying latent factors in the low-dimensional and heterogeneous feature space imposed by GPS data, we develop a novel representation that embeds the original feature space into another space that can be understood as a form of basis expansion. We also enrich the feature space via segment-based information and use Maxout activations to improve the predictive power of Recurrent Neural Networks (RNNs). We achieve over 98% classification accuracy when detecting four types of transportation modes, outperforming existing models without additional sensory data or location-based prior knowledge.
Submitted 30 August, 2017; v1 submitted 7 May, 2017;
originally announced May 2017.
-
Studying Positive Speech on Twitter
Authors:
Marina Sokolova,
Vera Sazonova,
Kanyi Huang,
Rudraneel Chakraboty,
Stan Matwin
Abstract:
We present results of empirical studies on positive speech on Twitter. By positive speech we understand speech that works for the betterment of a given situation, in this case relations between different communities in a conflict-prone country. We worked with four Twitter data sets. Through semi-manual opinion mining, we found that positive speech accounted for less than 1% of the data. In fully automated studies, we tested two approaches: unsupervised statistical analysis, and supervised text classification based on distributed word representation. We discuss benefits and challenges of those approaches and report empirical evidence obtained in the study.
Submitted 24 February, 2017;
originally announced February 2017.
-
Reflexive Regular Equivalence for Bipartite Data
Authors:
Aaron Gerow,
Mingyang Zhou,
Stan Matwin,
Feng Shi
Abstract:
Bipartite data is common in data engineering and brings unique challenges, particularly when it comes to clustering tasks that impose strong structural assumptions. This work presents an unsupervised method for assessing similarity in bipartite data. Similar to some co-clustering methods, the method is based on regular equivalence in graphs. The algorithm uses spectral properties of a bipartite adjacency matrix to estimate similarity in both dimensions. The method is reflexive in that similarity in one dimension is used to inform similarity in the other. Reflexive regular equivalence can also use the structure of transitivities (in a network sense), the contribution of which is controlled by the algorithm's only free parameter, $α$. The method is completely unsupervised and can be used to validate assumptions of co-similarity, which are required but often untested, in co-clustering analyses. Three variants of the method with different normalizations are tested on synthetic data. The method is found to be robust to noise and well-suited to asymmetric co-similar structure, making it particularly informative for cluster analysis and recommendation in bipartite data of unknown structure. In experiments, the convergence and speed of the algorithm are found to be stable for different levels of noise. Real-world data from a network of malaria genes are analyzed, where the similarity produced by the reflexive method is shown to outperform other measures in correctly classifying genes.
Submitted 16 February, 2017;
originally announced February 2017.
-
Topic Modelling and Event Identification from Twitter Textual Data
Authors:
Marina Sokolova,
Kanyi Huang,
Stan Matwin,
Joshua Ramisch,
Vera Sazonova,
Renee Black,
Chris Orwa,
Sidney Ochieng,
Nanjira Sambuli
Abstract:
The tremendous growth of social media content on the Internet has inspired the development of text analytics to understand and solve real-life problems. Leveraging statistical topic modelling helps researchers and practitioners better comprehend textual content and provides useful information for further analysis. Statistical topic modelling becomes especially important when we work with large volumes of dynamic text, e.g., Facebook or Twitter datasets. In this study, we summarize the message content of four data sets of Twitter messages relating to challenging social events in Kenya. We use Latent Dirichlet Allocation (LDA) topic modelling to analyze the content. Our study uses two evaluation measures, Normalized Mutual Information (NMI) and topic coherence analysis, to select the best LDA models. The obtained LDA results show that the tool can be effectively used to extract discussion topics and summarize them for further manual analysis.
Submitted 8 August, 2016;
originally announced August 2016.
-
YOURPRIVACYPROTECTOR, A recommender system for privacy settings in social networks
Authors:
Kambiz Ghazinour,
Stan Matwin,
Marina Sokolova
Abstract:
Ensuring privacy of users of social networks is probably an unsolvable conundrum. At the same time, an informed use of the existing privacy options by the social network participants may alleviate - or even prevent - some of the more drastic privacy-averse incidents. Unfortunately, recent surveys show that an average user is either not aware of these options or does not use them, probably due to their perceived complexity. It is therefore reasonable to believe that tools assisting users with two tasks, 1) understanding their social net behavior in terms of their privacy settings and broad privacy categories, and 2) recommending reasonable privacy options, will be valuable for everyday privacy practice in a social network context. This paper presents YourPrivacyProtector, a recommender system that shows how simple machine learning techniques may provide useful assistance in these two tasks to Facebook users. We support our claim with empirical results from applying YourPrivacyProtector to two groups of Facebook users.
Submitted 5 February, 2016;
originally announced February 2016.
-
Sanitization of Call Detail Records via Differentially-private Summaries
Authors:
Mohammad Alaggan,
Sébastien Gambs,
Stan Matwin,
Eriko Souza,
Mohammed Tuhin
Abstract:
In this work, we initiate the study of human mobility from sanitized call detail records (CDRs). Such data can be extremely valuable for solving important societal issues such as the improvement of urban transportation or the understanding of the spread of diseases. One of the fundamental building blocks for such a study is the computation of mobility patterns summarizing how individuals move during a given period from one area (e.g., cellular tower or administrative district) to another. However, such knowledge cannot be published directly, as it has been demonstrated that access to this type of data enables the (re-)identification of individuals. To address this issue and to foster the development of such applications in a privacy-preserving manner, we propose in this paper a novel approach in which CDRs are summarized in the form of a differentially-private Bloom filter for the purpose of privately counting the number of mobile service users moving from one area (region) to another in a given time frame. Our sanitization method is both time and space efficient, and ensures differential privacy while solving the shortcomings of a solution recently proposed for this problem. We also report on experiments conducted with the proposed solution using a real-life CDR dataset. The results obtained show that our method achieves - in most cases - a performance similar to another method (linear counting sketch) that does not provide any privacy guarantees. Thus, we conclude that our method maintains a high utility while providing strong privacy guarantees.
Submitted 31 December, 2014; v1 submitted 29 December, 2014;
originally announced December 2014.