-
2D Convolutional Neural Network for Event Reconstruction in IceCube DeepCore
Authors:
J. H. Peterson,
M. Prado Rodriguez,
K. Hanson
Abstract:
IceCube DeepCore is an extension of the IceCube Neutrino Observatory designed to measure GeV scale atmospheric neutrino interactions for the purpose of neutrino oscillation studies. Distinguishing muon neutrinos from other flavors and reconstructing inelasticity are especially difficult tasks at GeV scale energies in IceCube DeepCore due to sparse instrumentation. Convolutional neural networks (CNNs) have been found to have better success at neutrino event reconstruction than conventional likelihood-based methods. In this contribution, we present a new CNN model that exploits time and depth translational symmetry in IceCube DeepCore data and present the model's performance, specifically for flavor identification and inelasticity reconstruction.
Submitted 30 July, 2023;
originally announced July 2023.
-
The Bell-Touchard Counting process
Authors:
Thomas Freud,
Pablo M. Rodriguez
Abstract:
The Poisson process is one of the simplest stochastic processes defined in continuous time. It has interesting mathematical properties that lead, in many situations, to mathematically tractable applications. One of the limitations of the Poisson process is the rare events hypothesis, which is the hypothesis of unitary jumps within an infinitesimal window of time. Although that restriction may be avoided by the compound Poisson process, in most situations we don't have a closed expression for the probability distribution of the increments of such processes, leaving us options such as working with probability generating functions, numerical analysis and simulations. It is with this motivation in mind, inspired by the recent developments of discrete distributions, that we propose a new counting process based on the Bell-Touchard probability distribution, naming it the Bell-Touchard process. We verify that the process is a compound Poisson process, a multiple Poisson process and that it is closed under convolution and decomposition operations. Besides, we show that the Bell-Touchard process arises naturally from the composition of two Poisson processes. Moreover, we propose two generalizations, namely the compound Bell-Touchard process and the non-homogeneous Bell-Touchard process, showing that the latter arises from the composition of a non-homogeneous Poisson process with a homogeneous Poisson process. We emphasize that, since previous works have shown that the Bell-Touchard probability distribution can be used quite effectively for modelling count data, the Bell-Touchard process and its generalizations may contribute to the formulation of mathematically tractable models where the rare events hypothesis is not suitable.
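The composition of two Poisson processes mentioned in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names, the rates, and the choice to evaluate an outer process at the random "time" given by an inner process are illustrative and not taken from the paper, and no claim is made about how these rates map to the Bell-Touchard parameters.

```python
import random


def poisson_count(rate, horizon, rng):
    """Count arrivals of a homogeneous Poisson process with the given rate
    in [0, horizon], by summing exponential inter-arrival times."""
    count = 0
    clock = rng.expovariate(rate)
    while clock <= horizon:
        count += 1
        clock += rng.expovariate(rate)
    return count


def composed_count(inner_rate, outer_rate, t, rng):
    """Sample N2(N1(t)): the outer process evaluated at the value of the
    inner process at time t (hypothetical helper for illustration)."""
    inner = poisson_count(inner_rate, t, rng)      # N1(t)
    return poisson_count(outer_rate, inner, rng)   # N2 at "time" N1(t)


def sample_mean_and_variance(values):
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, var


rng = random.Random(0)
samples = [composed_count(2.0, 1.5, 3.0, rng) for _ in range(20000)]
mean, var = sample_mean_and_variance(samples)
```

For a plain Poisson count the variance equals the mean; in this composed process the sample variance comes out larger than the sample mean, which is the kind of multiple-jump behaviour the rare events hypothesis rules out.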
Submitted 24 May, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
Overcoming challenges in leveraging GANs for few-shot data augmentation
Authors:
Christopher Beckham,
Issam Laradji,
Pau Rodriguez,
David Vazquez,
Derek Nowrouzezahrai,
Christopher Pal
Abstract:
In this paper, we explore the use of GAN-based few-shot data augmentation as a method to improve few-shot classification performance. We explore ways in which a GAN can be fine-tuned for such a task (one of which is class-incremental), and carry out a rigorous empirical investigation into how well these models can improve few-shot classification. We identify issues related to the difficulty of training such generative models under a purely supervised regime with very few examples, as well as issues regarding the evaluation protocols of existing works. We also find that in this regime, classification accuracy is highly sensitive to how the classes of the dataset are randomly split. Therefore, we propose a semi-supervised fine-tuning approach as a more pragmatic way forward to address these problems.
Submitted 8 August, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
CVPR 2020 Continual Learning in Computer Vision Competition: Approaches, Results, Current Challenges and Future Directions
Authors:
Vincenzo Lomonaco,
Lorenzo Pellegrini,
Pau Rodriguez,
Massimo Caccia,
Qi She,
Yu Chen,
Quentin Jodelet,
Ruiping Wang,
Zheda Mai,
David Vazquez,
German I. Parisi,
Nikhil Churamani,
Marc Pickett,
Issam Laradji,
Davide Maltoni
Abstract:
In the last few years, we have witnessed a renewed and fast-growing interest in continual learning with deep neural networks, with the shared objective of making current AI systems more adaptive, efficient and autonomous. However, despite the significant and undoubted progress of the field in addressing the issue of catastrophic forgetting, benchmarking different continual learning approaches is a difficult task by itself. In fact, given the proliferation of different settings, training and evaluation protocols, metrics and nomenclature, it is often tricky to properly characterize a continual learning algorithm, relate it to other solutions and gauge its real-world applicability. The first Continual Learning in Computer Vision challenge, held at CVPR in 2020, was one of the first opportunities to evaluate different continual learning algorithms on common hardware with a large set of shared evaluation metrics and 3 different settings based on the realistic CORe50 video benchmark. In this paper, we report the main results of the competition, which drew more than 79 registered teams, 11 finalists and $2300 in prizes. We also summarize the winning approaches, current challenges and future research directions.
Submitted 14 September, 2020;
originally announced September 2020.
-
Language models and Automated Essay Scoring
Authors:
Pedro Uria Rodriguez,
Amir Jafari,
Christopher M. Ormerod
Abstract:
In this paper, we present a new comparative study on automatic essay scoring (AES). The current state-of-the-art natural language processing (NLP) neural network architectures are used in this work to achieve above human-level accuracy on the publicly available Kaggle AES dataset. We compare two powerful language models, BERT and XLNet, and describe all the layers and network architectures in these models. We elucidate the network architectures of BERT and XLNet using clear notation and diagrams and explain the advantages of transformer architectures over traditional recurrent neural network architectures. Linear algebra notation is used to clarify the functions of transformers and attention mechanisms. We compare the results with more traditional methods, such as bag of words (BOW) and long short term memory (LSTM) networks.
Submitted 18 September, 2019;
originally announced September 2019.
-
TADAM: Task dependent adaptive metric for improved few-shot learning
Authors:
Boris N. Oreshkin,
Pau Rodriguez,
Alexandre Lacoste
Abstract:
Few-shot learning has become essential for producing models that generalize from few examples. In this work, we identify that metric scaling and metric task conditioning are important to improve the performance of few-shot algorithms. Our analysis reveals that simple metric scaling completely changes the nature of few-shot algorithm parameter updates. Metric scaling provides improvements up to 14% in accuracy for certain metrics on the mini-Imagenet 5-way 5-shot classification task. We further propose a simple and effective way of conditioning a learner on the task sample set, resulting in learning a task-dependent metric space. Moreover, we propose and empirically test a practical end-to-end optimization procedure based on auxiliary task co-training to learn a task-dependent metric space. The resulting few-shot learning model based on the task-dependent scaled metric achieves state of the art on mini-Imagenet. We confirm these results on another few-shot dataset that we introduce in this paper based on CIFAR100. Our code is publicly available at https://github.com/ElementAI/TADAM.
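The effect of metric scaling described in the abstract can be sketched as a softmax over negatively scaled distances to class prototypes. The code below is a minimal illustration, not the authors' implementation: the function name, the squared Euclidean distance, the prototype values, and the scale values are all assumptions chosen for the example.

```python
import math


def scaled_metric_probs(query, prototypes, alpha):
    """Class probabilities from softmax(-alpha * distance), where distance is
    the squared Euclidean distance from the query to each class prototype.
    `alpha` is the metric scaling factor (hypothetical name)."""
    dists = [sum((q - p) ** 2 for q, p in zip(query, proto))
             for proto in prototypes]
    logits = [-alpha * d for d in dists]
    top = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


query = [0.0, 0.0]
prototypes = [[0.0, 1.0], [2.0, 0.0]]      # illustrative 2-way episode

soft = scaled_metric_probs(query, prototypes, alpha=0.1)
sharp = scaled_metric_probs(query, prototypes, alpha=1.0)
```

With a small scale the two classes receive nearly equal probability, while a larger scale concentrates probability on the nearest prototype. This changes the magnitude of the loss gradients, which is one way to read the abstract's remark that scaling changes the nature of parameter updates.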
Submitted 25 January, 2019; v1 submitted 23 May, 2018;
originally announced May 2018.