
Showing 1–44 of 44 results for author: Mocanu, D C

  1. arXiv:2406.18373  [pdf, other]

    cs.CL cs.SD eess.AS

    Dynamic Data Pruning for Automatic Speech Recognition

    Authors: Qiao Xiao, Pingchuan Ma, Adriana Fernandez-Lopez, Boqian Wu, Lu Yin, Stavros Petridis, Mykola Pechenizkiy, Maja Pantic, Decebal Constantin Mocanu, Shiwei Liu

    Abstract: The recent success of Automatic Speech Recognition (ASR) is largely attributed to the ever-growing amount of training data. However, this trend has made model training prohibitively costly and imposed computational demands. While data pruning has been proposed to mitigate this issue by identifying a small subset of relevant data, its application in ASR has been barely explored, and existing works…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  2. arXiv:2406.06495  [pdf, other]

    cs.LG

    Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity

    Authors: Calarina Muslimani, Bram Grooten, Deepak Ranganatha Sastry Mamillapalli, Mykola Pechenizkiy, Decebal Constantin Mocanu, Matthew E. Taylor

    Abstract: For autonomous agents to successfully integrate into human-centered environments, agents should be able to learn from and adapt to humans in their native settings. Preference-based reinforcement learning (PbRL) is a promising approach that learns reward functions from human preferences. This enables RL agents to adapt their behavior based on human desires. However, humans live in a world full of d…

    Submitted 10 June, 2024; originally announced June 2024.

  3. arXiv:2403.14684  [pdf, other]

    cs.CV cs.LG

    FOCIL: Finetune-and-Freeze for Online Class Incremental Learning by Training Randomly Pruned Sparse Experts

    Authors: Murat Onur Yildirim, Elif Ceren Gok Yildirim, Decebal Constantin Mocanu, Joaquin Vanschoren

    Abstract: Class incremental learning (CIL) in an online continual learning setting strives to acquire knowledge on a series of novel classes from a data stream, using each data point only once for training. This is more realistic compared to offline modes, where it is assumed that all data from novel class(es) is readily available. Current online CIL approaches store a subset of the previous data which crea…

    Submitted 13 March, 2024; originally announced March 2024.

  4. arXiv:2312.15339  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    MaDi: Learning to Mask Distractions for Generalization in Visual Deep Reinforcement Learning

    Authors: Bram Grooten, Tristan Tomilin, Gautham Vasan, Matthew E. Taylor, A. Rupam Mahmood, Meng Fang, Mykola Pechenizkiy, Decebal Constantin Mocanu

    Abstract: The visual world provides an abundance of information, but many input pixels received by agents often contain distracting stimuli. Autonomous agents need the ability to distinguish useful information from task-irrelevant perceptions, enabling them to generalize to unseen environments with new distractions. Existing works approach this problem using data augmentation or large auxiliary networks wit…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: Accepted as a full paper (oral) at AAMAS 2024. Code is available at https://github.com/bramgrooten/mask-distractions and a 40-second video is at https://youtu.be/2oImF0h1k48

  5. arXiv:2312.04727  [pdf, other]

    cs.CV

    E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D Medical Image Segmentation

    Authors: Boqian Wu, Qiao Xiao, Shiwei Liu, Lu Yin, Mykola Pechenizkiy, Decebal Constantin Mocanu, Maurice Van Keulen, Elena Mocanu

    Abstract: Deep neural networks have evolved as the leading approach in 3D medical image segmentation due to their outstanding performance. However, the ever-increasing model size and computation cost of deep neural networks have become the primary barrier to deploying them on real-world resource-limited hardware. In pursuit of improving performance and efficiency, we propose a 3D medical image segmentation…

    Submitted 7 December, 2023; originally announced December 2023.

  6. arXiv:2308.14831  [pdf, other]

    cs.LG cs.CV

    Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates

    Authors: Murat Onur Yildirim, Elif Ceren Gok Yildirim, Ghada Sokar, Decebal Constantin Mocanu, Joaquin Vanschoren

    Abstract: Continual learning (CL) refers to the ability of an intelligent system to sequentially acquire and retain knowledge from a stream of data with as little computational overhead as possible. To this end; regularization, replay, architecture, and parameter isolation approaches were introduced to the literature. Parameter isolation using a sparse network which enables to allocate distinct parts of the…

    Submitted 4 December, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

  7. arXiv:2306.12230  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Fantastic Weights and How to Find Them: Where to Prune in Dynamic Sparse Training

    Authors: Aleksandra I. Nowak, Bram Grooten, Decebal Constantin Mocanu, Jacek Tabor

    Abstract: Dynamic Sparse Training (DST) is a rapidly evolving area of research that seeks to optimize the sparse initialization of a neural network by adapting its topology during training. It has been shown that under specific conditions, DST is able to outperform dense models. The key components of this framework are the pruning and growing criteria, which are repeatedly applied during the training proces…

    Submitted 29 November, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023

  8. arXiv:2305.18382  [pdf, other]

    cs.LG

    Adaptive Sparsity Level during Training for Efficient Time Series Forecasting with Transformers

    Authors: Zahra Atashgahi, Mykola Pechenizkiy, Raymond Veldhuis, Decebal Constantin Mocanu

    Abstract: Efficient time series forecasting has become critical for real-world applications, particularly with deep neural networks (DNNs). Efficiency in DNNs can be achieved through sparse connectivity and reducing the model size. However, finding the sparsity level automatically during training remains challenging due to the heterogeneity in the loss-sparsity tradeoffs across the datasets. In this paper,…

    Submitted 12 June, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

  9. arXiv:2303.07200  [pdf, other]

    cs.NE cs.AI cs.LG

    Supervised Feature Selection with Neuron Evolution in Sparse Neural Networks

    Authors: Zahra Atashgahi, Xuhao Zhang, Neil Kichler, Shiwei Liu, Lu Yin, Mykola Pechenizkiy, Raymond Veldhuis, Decebal Constantin Mocanu

    Abstract: Feature selection that selects an informative subset of variables from data not only enhances the model interpretability and performance but also alleviates the resource demands. Recently, there has been growing attention on feature selection using neural networks. However, existing methods usually suffer from high computational costs when applied to high-dimensional datasets. In this paper, inspi…

    Submitted 14 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

  10. arXiv:2302.06548  [pdf, other]

    cs.LG cs.AI

    Automatic Noise Filtering with Dynamic Sparse Training in Deep Reinforcement Learning

    Authors: Bram Grooten, Ghada Sokar, Shibhansh Dohare, Elena Mocanu, Matthew E. Taylor, Mykola Pechenizkiy, Decebal Constantin Mocanu

    Abstract: Tomorrow's robots will need to distinguish useful information from noise when performing different tasks. A household robot for instance may continuously receive a plethora of information about the home, but needs to focus on just a small subset to successfully execute its current chore. Filtering distracting inputs that contain irrelevant data has received little attention in the reinforcement le…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: Accepted as a full paper at AAMAS 2023

  11. arXiv:2212.09840  [pdf, other]

    cs.LG cs.AI

    Dynamic Sparse Network for Time Series Classification: Learning What to "see"

    Authors: Qiao Xiao, Boqian Wu, Yu Zhang, Shiwei Liu, Mykola Pechenizkiy, Elena Mocanu, Decebal Constantin Mocanu

    Abstract: The receptive field (RF), which determines the region of time series to be "seen" and used, is critical to improve the performance for time series classification (TSC). However, the variation of signal scales across and within time series data, makes it challenging to decide on proper RF sizes for TSC. In this paper, we propose a dynamic sparse network (DSN) with sparse connections for TSC, whic…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: Accepted at Neural Information Processing Systems (NeurIPS 2022)

  12. arXiv:2211.15335  [pdf, other]

    cs.LG

    You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets

    Authors: Tianjin Huang, Tianlong Chen, Meng Fang, Vlado Menkovski, Jiaxu Zhao, Lu Yin, Yulong Pei, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy, Shiwei Liu

    Abstract: Recent works have impressively demonstrated that there exists a subnetwork in randomly initialized convolutional neural networks (CNNs) that can match the performance of the fully trained dense networks at initialization, without any optimization of the weights of the network (i.e., untrained networks). However, the presence of such untrained subnetworks in graph neural networks (GNNs) still remai…

    Submitted 4 February, 2024; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted by the LoG conference 2022 as a spotlight

    Journal ref: LoG 2022 (Oral & Best Paper Award)

  13. arXiv:2211.14627  [pdf, other]

    cs.LG cs.CV

    Where to Pay Attention in Sparse Training for Feature Selection?

    Authors: Ghada Sokar, Zahra Atashgahi, Mykola Pechenizkiy, Decebal Constantin Mocanu

    Abstract: A new line of research for feature selection based on neural networks has recently emerged. Despite its superiority to classical methods, it requires many training iterations to converge and detect informative features. The computational time becomes prohibitively long for datasets with a large number of samples or a very high dimensional feature space. In this paper, we present a new efficient un…

    Submitted 26 November, 2022; originally announced November 2022.

    Comments: Accepted at Neural Information Processing Systems (NeurIPS) 2022

  14. arXiv:2207.03932  [pdf, other]

    cs.LG cs.AI

    Memory-free Online Change-point Detection: A Novel Neural Network Approach

    Authors: Zahra Atashgahi, Decebal Constantin Mocanu, Raymond Veldhuis, Mykola Pechenizkiy

    Abstract: Change-point detection (CPD), which detects abrupt changes in the data distribution, is recognized as one of the most significant tasks in time series analysis. Despite the extensive literature on offline CPD, unsupervised online CPD still suffers from major challenges, including scalability, hyperparameter tuning, and learning constraints. To mitigate some of these challenges, in this paper, we p…

    Submitted 6 December, 2023; v1 submitted 8 July, 2022; originally announced July 2022.

  15. arXiv:2205.15322  [pdf, other]

    cs.LG cs.AI

    Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training

    Authors: Lu Yin, Vlado Menkovski, Meng Fang, Tianjin Huang, Yulong Pei, Mykola Pechenizkiy, Decebal Constantin Mocanu, Shiwei Liu

    Abstract: Recent works on sparse neural network training (sparse training) have shown that a compelling trade-off between performance and efficiency can be achieved by training intrinsically sparse neural networks from scratch. Existing sparse training methods usually strive to find the best sparse subnetwork possible in one single run, without involving any expensive dense or pre-training steps. For instan…

    Submitted 18 August, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: 17 pages, 5 figures, accepted by the 38th Conference on Uncertainty in Artificial Intelligence (UAI)

  16. arXiv:2202.02643  [pdf, other]

    cs.LG cs.AI cs.CV

    The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

    Authors: Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy

    Abstract: Random pruning is arguably the most naive way to attain sparsity in neural networks, but has been deemed uncompetitive by either post-training pruning or sparse training. In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding, that random pruning at initialization can be quite powerful for the sparse training of modern neural networks. Without any delicate pru…

    Submitted 5 February, 2022; originally announced February 2022.

    Comments: Published as a conference paper at ICLR 2022. Code is available at https://github.com/VITA-Group/Random_Pruning

  17. arXiv:2110.05329  [pdf, other]

    cs.LG cs.AI

    Avoiding Forgetting and Allowing Forward Transfer in Continual Learning via Sparse Networks

    Authors: Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy

    Abstract: Using task-specific components within a neural network in continual learning (CL) is a compelling strategy to address the stability-plasticity dilemma in fixed-capacity models without access to past data. Current methods focus only on selecting a sub-network for a new task that reduces forgetting of past tasks. However, this selection could limit the forward transfer of relevant past knowledge tha…

    Submitted 6 July, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: Accepted at European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2022)

  18. arXiv:2106.14568  [pdf, other]

    cs.LG cs.CV

    Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity

    Authors: Shiwei Liu, Tianlong Chen, Zahra Atashgahi, Xiaohan Chen, Ghada Sokar, Elena Mocanu, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu

    Abstract: The success of deep ensembles on improving predictive performance, uncertainty estimation, and out-of-distribution robustness has been extensively studied in the machine learning literature. Albeit the promising results, naively training multiple deep neural networks and combining their predictions at inference leads to prohibitive computational costs and memory requirements. Recently proposed eff…

    Submitted 7 February, 2022; v1 submitted 28 June, 2021; originally announced June 2021.

    Comments: Published in the International Conference on Learning Representations (ICLR 2022)

    Journal ref: Proceedings of the International Conference on Learning Representations (ICLR 2022)

  19. arXiv:2106.10404  [pdf, other]

    cs.LG cs.CV

    Sparse Training via Boosting Pruning Plasticity with Neuroregeneration

    Authors: Shiwei Liu, Tianlong Chen, Xiaohan Chen, Zahra Atashgahi, Lu Yin, Huanyu Kou, Li Shen, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu

    Abstract: Works on lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) have raised a lot of attention currently on post-training pruning (iterative magnitude pruning), and before-training pruning (pruning at initialization). The former method suffers from an extremely large computation cost and the latter usually struggles with insufficient performance. In comparison, during-training prun…

    Submitted 6 February, 2022; v1 submitted 18 June, 2021; originally announced June 2021.

    Comments: Published at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021). Code can be found at https://github.com/Shiweiliuiiiiiii/GraNet

    Journal ref: Conference on Neural Information Processing Systems (NeurIPS 2021)

  20. arXiv:2106.04217  [pdf, other]

    cs.LG cs.AI

    Dynamic Sparse Training for Deep Reinforcement Learning

    Authors: Ghada Sokar, Elena Mocanu, Decebal Constantin Mocanu, Mykola Pechenizkiy, Peter Stone

    Abstract: Deep reinforcement learning (DRL) agents are trained through trial-and-error interactions with the environment. This leads to a long training time for dense neural networks to achieve good performance. Hence, prohibitive computation and memory resources are consumed. Recently, learning efficient DRL agents has received increasing attention. Yet, current methods focus on accelerating inference time…

    Submitted 5 May, 2022; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: Published in the Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI-22)

  21. arXiv:2103.01636  [pdf, other]

    cs.AI cs.LG cs.MA cs.NE

    Sparse Training Theory for Scalable and Efficient Agents

    Authors: Decebal Constantin Mocanu, Elena Mocanu, Tiago Pinto, Selima Curci, Phuong H. Nguyen, Madeleine Gibescu, Damien Ernst, Zita A. Vale

    Abstract: A fundamental task for artificial intelligence is learning. Deep Neural Networks have proven to cope perfectly with all learning paradigms, i.e. supervised, unsupervised, and reinforcement learning. Nevertheless, traditional deep learning approaches make use of cloud computing facilities and do not scale well to autonomous agents with low computational resources. Even in the cloud, they suffer fro…

    Submitted 2 March, 2021; originally announced March 2021.

    Journal ref: 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021)

  22. arXiv:2102.02887  [pdf, other]

    cs.LG cs.AI cs.CV

    Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training

    Authors: Shiwei Liu, Lu Yin, Decebal Constantin Mocanu, Mykola Pechenizkiy

    Abstract: In this paper, we introduce a new perspective on training deep neural networks capable of state-of-the-art performance without the need for the expensive over-parameterization by proposing the concept of In-Time Over-Parameterization (ITOP) in sparse training. By starting from a random sparse network and continuously exploring sparse connectivities during training, we can perform an Over-Parameter…

    Submitted 15 June, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: 16 pages; 10 figures; Published in Proceedings of the 38th International Conference on Machine Learning. Code can be found at https://github.com/Shiweiliuiiiiiii/In-Time-Over-Parameterization

    Journal ref: Proceedings of the 38th International Conference on Machine Learning (2021)

  23. arXiv:2102.01732  [pdf, other]

    cs.LG cs.NE

    Truly Sparse Neural Networks at Scale

    Authors: Selima Curci, Decebal Constantin Mocanu, Mykola Pechenizkiy

    Abstract: Recently, sparse training methods have started to be established as a de facto approach for training and inference efficiency in artificial neural networks. Yet, this efficiency is just in theory. In practice, everyone uses a binary mask to simulate sparsity since the typical deep learning software and hardware are optimized for dense matrix operations. In this paper, we take an orthogonal approac…

    Submitted 12 July, 2022; v1 submitted 2 February, 2021; originally announced February 2021.

    Comments: 30 pages, 17 figures

  24. arXiv:2101.12136  [pdf, other]

    cs.LG cs.AI cs.CV

    Self-Attention Meta-Learner for Continual Learning

    Authors: Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy

    Abstract: Continual learning aims to provide intelligent agents capable of learning multiple tasks sequentially with neural networks. One of its main challenging, catastrophic forgetting, is caused by the neural networks non-optimal ability to learn in non-stationary distributions. In most settings of the current approaches, the agent starts from randomly initialized parameters and is optimized to master th…

    Submitted 28 January, 2021; originally announced January 2021.

    Journal ref: 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2021)

  25. arXiv:2101.09048  [pdf, other]

    cs.LG cs.AI

    Selfish Sparse RNN Training

    Authors: Shiwei Liu, Decebal Constantin Mocanu, Yulong Pei, Mykola Pechenizkiy

    Abstract: Sparse neural networks have been widely applied to reduce the computational demands of training and deploying over-parameterized deep neural networks. For inference acceleration, methods that discover a sparse network from a pre-trained dense network (dense-to-sparse training) work effectively. Recently, dynamic sparse training (DST) has been proposed to train sparse neural networks without pre-tr…

    Submitted 15 June, 2021; v1 submitted 22 January, 2021; originally announced January 2021.

    Comments: Published in Proceedings of the 38th International Conference on Machine Learning. Code can be found at https://github.com/Shiweiliuiiiiiii/Selfish-RNN

    Journal ref: Proceedings of the 38th International Conference on Machine Learning (2021)

  26. arXiv:2101.06162  [pdf, other]

    cs.LG cs.AI cs.CV

    Learning Invariant Representation for Continual Learning

    Authors: Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy

    Abstract: Continual learning aims to provide intelligent agents that are capable of learning continually a sequence of tasks, building on previously learned knowledge. A key challenge in this learning paradigm is catastrophically forgetting previously learned tasks when the agent faces a new one. Current rehearsal-based methods show their success in mitigating the catastrophic forgetting problem by replayin…

    Submitted 15 January, 2021; originally announced January 2021.

    Comments: Accepted at the AAAI Meta-Learning for Computer Vision Workshop (2021)

  27. arXiv:2012.00560  [pdf, other]

    cs.LG stat.ML

    Quick and Robust Feature Selection: the Strength of Energy-efficient Sparse Training for Autoencoders

    Authors: Zahra Atashgahi, Ghada Sokar, Tim van der Lee, Elena Mocanu, Decebal Constantin Mocanu, Raymond Veldhuis, Mykola Pechenizkiy

    Abstract: Major complications arise from the recent increase in the amount of high-dimensional data, including high computational costs and memory requirements. Feature selection, which identifies the most relevant and informative attributes of a dataset, has been introduced as a solution to this problem. Most of the existing feature selection methods are computationally inefficient; inefficient algorithms…

    Submitted 13 September, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: 29 pages

  28. SpaceNet: Make Free Space For Continual Learning

    Authors: Ghada Sokar, Decebal Constantin Mocanu, Mykola Pechenizkiy

    Abstract: The continual learning (CL) paradigm aims to enable neural networks to learn tasks continually in a sequential fashion. The fundamental challenge in this learning paradigm is catastrophic forgetting previously learned tasks when the model is optimized for a new task, especially when their data is not accessible. Current architectural-based methods aim at alleviating the catastrophic forgetting pro…

    Submitted 14 April, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

    Comments: Published in Neurocomputing Journal

    Journal ref: Neurocomputing, 439: 1-11, 2021

  29. arXiv:2006.14085  [pdf, other]

    cs.LG stat.ML

    Topological Insights into Sparse Neural Networks

    Authors: Shiwei Liu, Tim Van der Lee, Anil Yaman, Zahra Atashgahi, Davide Ferraro, Ghada Sokar, Mykola Pechenizkiy, Decebal Constantin Mocanu

    Abstract: Sparse neural networks are effective approaches to reduce the resource requirements for the deployment of deep neural networks. Recently, the concept of adaptive sparse connectivity, has emerged to allow training sparse neural networks from scratch by optimizing the sparse structure during training. However, comparing different sparse topologies and determining how sparse topologies evolve during…

    Submitted 4 July, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: 17 pages, 17 pages

  30. Novelty Producing Synaptic Plasticity

    Authors: Anil Yaman, Giovanni Iacca, Decebal Constantin Mocanu, George Fletcher, Mykola Pechenizkiy

    Abstract: A learning process with the plasticity property often requires reinforcement signals to guide the process. However, in some tasks (e.g. maze-navigation), it is very difficult (or impossible) to measure the performance of an agent (i.e. a fitness value) to provide reinforcements since the position of the goal is not known. This requires finding the correct behavior among a vast number of possible b…

    Submitted 10 February, 2020; originally announced February 2020.

  31. arXiv:1906.11626  [pdf, ps, other]

    cs.NE cs.LG

    On improving deep learning generalization with adaptive sparse connectivity

    Authors: Shiwei Liu, Decebal Constantin Mocanu, Mykola Pechenizkiy

    Abstract: Large neural networks are very successful in various tasks. However, with limited data, the generalization capabilities of deep neural networks are also very limited. In this paper, we empirically start showing that intrinsically sparse neural networks with adaptive sparse connectivity, which by design have a strict parameter budget during the training phase, have better generalization capabilitie…

    Submitted 27 June, 2019; originally announced June 2019.

    Comments: ICML 2019 Workshop on Understanding and Improving Generalization in Deep Learning

  32. Evolving Plasticity for Autonomous Learning under Changing Environmental Conditions

    Authors: Anil Yaman, Giovanni Iacca, Decebal Constantin Mocanu, Matt Coler, George Fletcher, Mykola Pechenizkiy

    Abstract: A fundamental aspect of learning in biological neural networks is the plasticity property which allows them to modify their configurations during their lifetime. Hebbian learning is a biologically plausible mechanism for modeling the plasticity property in artificial neural networks (ANNs), based on the local interactions of neurons. However, the emergence of a coherent global learning behavior fr…

    Submitted 7 December, 2020; v1 submitted 2 April, 2019; originally announced April 2019.

    Comments: Evolutionary Computation Journal

    Journal ref: Evolutionary Computation, 1-25, 2020

  33. Learning with Delayed Synaptic Plasticity

    Authors: Anil Yaman, Giovanni Iacca, Decebal Constantin Mocanu, George Fletcher, Mykola Pechenizkiy

    Abstract: The plasticity property of biological neural networks allows them to perform learning and optimize their behavior by changing their configuration. Inspired by biology, plasticity can be modeled in artificial neural networks by using Hebbian learning rules, i.e. rules that update synapses based on the neuron activations and reinforcement signals. However, the distal reward problem arises when the r…

    Submitted 17 April, 2019; v1 submitted 22 March, 2019; originally announced March 2019.

    Comments: GECCO2019

  34. A Brain-inspired Algorithm for Training Highly Sparse Neural Networks

    Authors: Zahra Atashgahi, Joost Pieterse, Shiwei Liu, Decebal Constantin Mocanu, Raymond Veldhuis, Mykola Pechenizkiy

    Abstract: Sparse neural networks attract increasing interest as they exhibit comparable performance to their dense counterparts while being computationally efficient. Pruning the dense neural networks is among the most widely used methods to obtain a sparse neural network. Driven by the high training cost of such methods that can be unaffordable for a low-resource device, training sparse neural networks spa…

    Submitted 10 November, 2022; v1 submitted 17 March, 2019; originally announced March 2019.

  35. arXiv:1901.09208  [pdf, other]

    cs.NE

    Intrinsically Sparse Long Short-Term Memory Networks

    Authors: Shiwei Liu, Decebal Constantin Mocanu, Mykola Pechenizkiy

    Abstract: Long Short-Term Memory (LSTM) has achieved state-of-the-art performances on a wide range of tasks. Its outstanding performance is guaranteed by the long-term memory ability which matches the sequential data perfectly and the gating structure controlling the information flow. However, LSTMs are prone to be memory-bandwidth limited in realistic applications and need an unbearable period of training…

    Submitted 26 January, 2019; originally announced January 2019.

    Comments: 9 pages, 8 figures and 4 tables

  36. arXiv:1901.09181  [pdf, other]

    cs.NE cs.LG stat.ML

    Sparse evolutionary Deep Learning with over one million artificial neurons on commodity hardware

    Authors: Shiwei Liu, Decebal Constantin Mocanu, Amarsagar Reddy Ramapuram Matavalam, Yulong Pei, Mykola Pechenizkiy

    Abstract: Artificial Neural Networks (ANNs) have emerged as hot topics in the research community. Despite the success of ANNs, it is challenging to train and deploy modern ANNs on commodity hardware due to the ever-increasing model size and the unprecedented growth in the data volumes. Particularly for microarray data, the very-high dimensionality and the small number of samples make it difficult for machin…

    Submitted 15 January, 2021; v1 submitted 26 January, 2019; originally announced January 2019.

    Comments: 16 pages

  37. arXiv:1804.07645  [pdf, other]

    cs.CV cs.LG stat.ML

    One-Shot Learning using Mixture of Variational Autoencoders: a Generalization Learning approach

    Authors: Decebal Constantin Mocanu, Elena Mocanu

    Abstract: Deep learning, even if it is very successful nowadays, traditionally needs very large amounts of labeled data to perform excellent on the classification task. In an attempt to solve this problem, the one-shot learning paradigm, which makes use of just one labeled sample per class and prior knowledge, becomes increasingly important. In this paper, we propose a new one-shot learning method, dubbed M…

    Submitted 18 April, 2018; originally announced April 2018.

    Journal ref: 17th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2018)

  38. Limited Evaluation Cooperative Co-evolutionary Differential Evolution for Large-scale Neuroevolution

    Authors: Anil Yaman, Decebal Constantin Mocanu, Giovanni Iacca, George Fletcher, Mykola Pechenizkiy

    Abstract: Many real-world control and classification tasks involve a large number of features. When artificial neural networks (ANNs) are used for modeling these tasks, the network architectures tend to be large. Neuroevolution is an effective approach for optimizing ANNs; however, there are two bottlenecks that make their application challenging in case of high-dimensional networks using direct encoding. F…

    Submitted 6 May, 2018; v1 submitted 19 April, 2018; originally announced April 2018.

  39. arXiv:1707.05878  [pdf, other]

    cs.LG cs.AI math.OC

    On-line Building Energy Optimization using Deep Reinforcement Learning

    Authors: Elena Mocanu, Decebal Constantin Mocanu, Phuong H. Nguyen, Antonio Liotta, Michael E. Webber, Madeleine Gibescu, J. G. Slootweg

    Abstract: Unprecedented high volumes of data are becoming available with the growth of the advanced metering infrastructure. These are expected to benefit planning and operation of the future power system, and to help the customers transition from a passive to an active role. In this paper, we explore for the first time in the smart grid context the benefits of using Deep Reinforcement Learning, a hybrid ty…

    Submitted 18 July, 2017; originally announced July 2017.

  40. Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science

    Authors: Decebal Constantin Mocanu, Elena Mocanu, Peter Stone, Phuong H. Nguyen, Madeleine Gibescu, Antonio Liotta

    Abstract: Through the success of deep learning in various domains, artificial neural networks are currently among the most used artificial intelligence methods. Taking inspiration from the network properties of biological neural networks (e.g. sparsity, scale-freeness), we argue that (contrary to general practice) artificial neural networks, too, should not have fully-connected layers. Here we propose spars…

    Submitted 20 June, 2018; v1 submitted 15 July, 2017; originally announced July 2017.

    Comments: 18 pages

    Journal ref: Nature Communications, 2018

  41. arXiv:1610.05555  [pdf, other]

    cs.LG cs.NE

    Online Contrastive Divergence with Generative Replay: Experience Replay without Storing Data

    Authors: Decebal Constantin Mocanu, Maria Torres Vega, Eric Eaton, Peter Stone, Antonio Liotta

    Abstract: Conceived in the early 1990s, Experience Replay (ER) has been shown to be a successful mechanism to allow online learning algorithms to reuse past experiences. Traditionally, ER can be applied to all machine learning paradigms (i.e., unsupervised, supervised, and reinforcement learning). Recently, ER has contributed to improving the performance of deep reinforcement learning. Yet, its application…

    Submitted 18 October, 2016; originally announced October 2016.

  42. arXiv:1604.07322  [pdf, other]

    cs.MM cs.AI

    Predictive No-Reference Assessment of Video Quality

    Authors: Maria Torres Vega, Decebal Constantin Mocanu, Antonio Liotta

    Abstract: Among the various means to evaluate the quality of video streams, No-Reference (NR) methods have low computation and may be executed on thin clients. Thus, NR algorithms would be perfect candidates in cases of real-time quality assessment, automated quality control and, particularly, in adaptive mobile streaming. Yet, existing NR approaches are often inaccurate, in comparison to Full-Reference (FR…

    Submitted 27 April, 2016; v1 submitted 25 April, 2016; originally announced April 2016.

    Comments: 13 pages, 8 figures, IEEE Selected Topics on Signal Processing

  43. A topological insight into restricted Boltzmann machines

    Authors: Decebal Constantin Mocanu, Elena Mocanu, Phuong H. Nguyen, Madeleine Gibescu, Antonio Liotta

    Abstract: Restricted Boltzmann Machines (RBMs) and models derived from them have been successfully used as basic building blocks in deep artificial neural networks for automatic features extraction, unsupervised weights initialization, but also as density estimators. Thus, their generative and discriminative capabilities, but also their computational time are instrumental to a wide range of applications. Ou…

    Submitted 18 July, 2016; v1 submitted 20 April, 2016; originally announced April 2016.

    Comments: Machine Learning, ISSN 1573-0565, 2016, http://link.springer.com/article/10.1007/s10994-016-5570-z

  44. Estimating 3D Trajectories from 2D Projections via Disjunctive Factored Four-Way Conditional Restricted Boltzmann Machines

    Authors: Decebal Constantin Mocanu, Haitham Bou Ammar, Luis Puig, Eric Eaton, Antonio Liotta

    Abstract: Estimation, recognition, and near-future prediction of 3D trajectories based on their two dimensional projections available from one camera source is an exceptionally difficult problem due to uncertainty in the trajectories and environment, high dimensionality of the specific trajectory states, lack of enough labeled data and so on. In this article, we propose a solution to solve this problem base…

    Submitted 29 April, 2017; v1 submitted 20 April, 2016; originally announced April 2016.

    Comments: Pattern Recognition, ISSN 0031-3203, Elsevier, 2017