Search | arXiv e-print repository

TRIM: A Design Space Exploration Model for Deep Neural Networks Inference and Training Accelerators

Authors: Yangjie Qi, Shuo Zhang, Tarek M. Taha

Abstract: There is increasing demand for specialized hardware for training deep neural networks, both in edge/IoT environments and in high-performance computing systems. The design space of such hardware is very large due to the wide range of processing architectures, deep neural network configurations, and dataflow options. This makes developing deep neural network processors quite complex, especially for training. We present TRIM, an infrastructure to help hardware architects explore the design space of deep neural network accelerators for both inference and training in the early design stages. The model evaluates at the whole network level, considering both inter-layer and intra-layer activities. Given applications, essential hardware specifications, and a design goal, TRIM can quickly explore different hardware design options, select the optimal dataflow and guide new hardware architecture design. We validated TRIM with FPGA-based implementation of deep neural network accelerators and ASIC-based architectures. We also show how to use TRIM to explore the design space through several case studies. TRIM is a powerful tool to help architects evaluate different hardware choices to develop efficient inference and training architecture design.

Submitted 1 May, 2022; v1 submitted 17 May, 2021; originally announced May 2021.

arXiv:2004.03747 [pdf]

COVID_MTNet: COVID-19 Detection with Multi-Task Deep Learning Approaches

Authors: Md Zahangir Alom, M M Shaifur Rahman, Mst Shamima Nasrin, Tarek M. Taha, Vijayan K. Asari

Abstract: COVID-19 is currently one of the most life-threatening problems around the world. Fast and accurate detection of COVID-19 infection is essential to identify patients, make better decisions, and ensure treatment, which will help save lives. In this paper, we propose a fast and efficient way to identify COVID-19 patients with multi-task deep learning (DL) methods. Both X-ray and CT scan images are considered to evaluate the proposed technique. We employ our Inception Residual Recurrent Convolutional Neural Network with Transfer Learning (TL) approach for COVID-19 detection and our NABLA-N network model for segmenting the regions infected by COVID-19. The detection model shows around 84.67% testing accuracy on X-ray images and 98.78% accuracy on CT images. A novel quantitative analysis strategy is also proposed in this paper to determine the percentage of infected regions in X-ray and CT images. The qualitative and quantitative results are promising for COVID-19 detection and infected region localization.

Submitted 18 April, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

Comments: 11 pages, 15 figures

arXiv:1906.12338 [pdf]

High Speed Cognitive Domain Ontologies for Asset Allocation Using Loihi Spiking Neurons

Authors: Chris Yakopcic, Nayim Rahman, Tanvir Atahary, Tarek M. Taha, Alex Beigh, Scott Douglass

Abstract: Cognitive agents are typically utilized in autonomous systems for automated decision making. These systems interact in real time with their environment and are generally heavily power constrained. Thus, there is a strong need for a real-time agent running on a low-power platform. The agent examined is the Cognitively Enhanced Complex Event Processing (CECEP) architecture. This is an autonomous decision support tool that reasons like humans and enables enhanced agent-based decision-making. It has applications in a large variety of domains including autonomous systems, operations research, intelligence analysis, and data mining. One of the key components of CECEP is the mining of knowledge from a repository described as a Cognitive Domain Ontology (CDO). One problem that is often tasked to CDOs is asset allocation. Given the number of possible solutions in this allocation problem, determining the optimal solution via CDO can be very time consuming. In this work we show that a grid of isolated spiking neurons is capable of generating solutions to this problem very quickly, although some degree of approximation is required to achieve the speedup. However, the approximate spiking approach presented in this work was able to complete all allocation simulations with greater than 99.9% accuracy. To show the feasibility of low power implementation, this algorithm was executed using the Intel Loihi manycore neuromorphic processor. Given the vast increase in speed (greater than 1000 times in larger allocation problems), as well as the reduction in computational requirements, the presented algorithm is ideal for moving asset allocation to low power, portable, embedded hardware.

Submitted 28 June, 2019; originally announced June 2019.

Comments: Accepted and to appear in the proceedings of the 2019 IEEE-INNS International Joint Conference on Neural Networks. Citation: C. Yakopcic, T. Atahary, N. Rahman, T. M. Taha, A. Beigh, and S. Douglass, High Speed Approximate Cognitive Domain Ontologies for Asset Allocation Using Loihi Spiking Neurons, IEEE-INNS International Joint Conference on Neural Networks, Budapest, Hungary, July, 2019

arXiv:1904.11126 [pdf]

Skin Cancer Segmentation and Classification with NABLA-N and Inception Recurrent Residual Convolutional Networks

Authors: Md Zahangir Alom, Theus Aspiras, Tarek M. Taha, Vijayan K. Asari

Abstract: In the last few years, Deep Learning (DL) has been showing superior performance in different modalities of biomedical image analysis. Several DL architectures have been proposed for classification, segmentation, and detection tasks in medical imaging and computational pathology. In this paper, we propose a new DL architecture, the NABLA-N network, with better feature fusion techniques in decoding units for dermoscopic image segmentation tasks. The NABLA-N network has several advances for segmentation tasks. First, this model ensures better feature representation for semantic segmentation with a combination of low to high-level feature maps. Second, this network shows better quantitative and qualitative results with the same or fewer network parameters compared to other methods. In addition, the Inception Recurrent Residual Convolutional Neural Network (IRRCNN) model is used for skin cancer classification. The proposed NABLA-N network and IRRCNN models are evaluated for skin cancer segmentation and classification on the benchmark datasets from the International Skin Imaging Collaboration 2018 (ISIC-2018). The experimental results show superior performance on segmentation tasks compared to the Recurrent Residual U-Net (R2U-Net). The classification model shows around 87% testing accuracy for dermoscopic skin cancer classification on the ISIC-2018 dataset.

Submitted 24 April, 2019; originally announced April 2019.

Comments: 7 pages, 7 figures, 2 Tables

arXiv:1904.09075 [pdf]

Advanced Deep Convolutional Neural Network Approaches for Digital Pathology Image Analysis: a comprehensive evaluation with different use cases

Authors: Md Zahangir Alom, Theus Aspiras, Tarek M. Taha, Vijayan K. Asari, TJ Bowen, Dave Billiter, Simon Arkell

Abstract: Deep Learning (DL) approaches have been providing state-of-the-art performance in different modalities in the field of medical imaging, including Digital Pathology Image Analysis (DPIA). Out of many different DL approaches, the Deep Convolutional Neural Network (DCNN) technique provides superior performance for classification, segmentation, and detection tasks. Most DPIA problems can be addressed with classification, segmentation, and detection approaches. In addition, pre- and post-processing methods are sometimes applied to solve specific types of problems. Recently, different DCNN models including Inception residual recurrent CNN (IRRCNN), Densely Connected Recurrent Convolution Network (DCRCN), Recurrent Residual U-Net (R2U-Net), and an R2U-Net based regression model (UD-Net) have been proposed and provide state-of-the-art performance for different computer vision and medical image analysis tasks. However, these advanced DCNN models have not been explored for solving different problems related to DPIA. In this study, we have applied these DCNN techniques to different DPIA problems and evaluated them on different publicly available benchmark datasets for seven different tasks in digital pathology including lymphoma classification, Invasive Ductal Carcinoma (IDC) detection, nuclei segmentation, epithelium segmentation, tubule segmentation, lymphocyte detection, and mitosis detection. The experimental results are evaluated with different performance metrics such as sensitivity, specificity, accuracy, F1-score, Receiver Operating Characteristics (ROC) curve, dice coefficient (DC), and Mean Squared Error (MSE). The results demonstrate superior performance for classification, segmentation, and detection tasks compared to existing machine learning and DCNN based approaches.

Submitted 19 April, 2019; originally announced April 2019.

Comments: 25 pages, 28 figures, 9 tables

arXiv:1811.04241 [pdf]

Breast Cancer Classification from Histopathological Images with Inception Recurrent Residual Convolutional Neural Network

Authors: Md Zahangir Alom, Chris Yakopcic, Tarek M. Taha, Vijayan K. Asari

Abstract: The Deep Convolutional Neural Network (DCNN) is one of the most powerful and successful deep learning approaches. DCNNs have already provided superior performance in different modalities of medical imaging including breast cancer classification, segmentation, and detection. Breast cancer is one of the most common and dangerous cancers impacting women worldwide. In this paper, we have proposed a method for breast cancer classification with the Inception Recurrent Residual Convolutional Neural Network (IRRCNN) model. The IRRCNN is a powerful DCNN model that combines the strength of the Inception Network (Inception-v4), the Residual Network (ResNet), and the Recurrent Convolutional Neural Network (RCNN). The IRRCNN shows superior performance against equivalent Inception Networks, Residual Networks, and RCNNs for object recognition tasks. In this paper, the IRRCNN approach is applied for breast cancer classification on two publicly available datasets including BreakHis and Breast Cancer Classification Challenge 2015. The experimental results are compared against the existing machine learning and deep learning-based approaches with respect to image-based, patch-based, image-level, and patient-level classification. The IRRCNN model provides superior classification performance in terms of sensitivity, Area Under the Curve (AUC), the ROC curve, and global accuracy compared to existing approaches for both datasets.

Submitted 10 November, 2018; originally announced November 2018.

Comments: 15 pages, 9 figures, 9 tables

arXiv:1811.03447 [pdf]

Microscopic Nuclei Classification, Segmentation and Detection with improved Deep Convolutional Neural Network (DCNN) Approaches

Authors: Md Zahangir Alom, Chris Yakopcic, Tarek M. Taha, Vijayan K. Asari

Abstract: Due to cellular heterogeneity, cell nuclei classification, segmentation, and detection from pathological images are challenging tasks. In the last few years, Deep Convolutional Neural Network (DCNN) approaches have shown state-of-the-art (SOTA) performance on histopathological imaging in different studies. In this work, we have proposed different advanced DCNN models and evaluated them for nuclei classification, segmentation, and detection. First, the Densely Connected Recurrent Convolutional Network (DCRN) model is used for nuclei classification. Second, Recurrent Residual U-Net (R2U-Net) is applied for nuclei segmentation. Third, the R2U-Net regression model, which is named UD-Net, is used for nuclei detection from pathological images. The experiments are conducted with different datasets including the Routine Colon Cancer (RCC) classification and detection dataset and the Nuclei Segmentation Challenge 2018 dataset. The experimental results show that the proposed DCNN models provide superior performance compared to the existing approaches for nuclei classification, segmentation, and detection tasks. The results are evaluated with different performance metrics including precision, recall, Dice Coefficient (DC), Mean Squared Error (MSE), F1-score, and overall accuracy. We have achieved around 3.4% and 4.5% better F1-score for nuclei classification and detection tasks compared to a recently published DCNN based method. In addition, R2U-Net shows around 92.15% testing accuracy in terms of DC. These improved methods will help pathological practice with better quantitative analysis of nuclei in Whole Slide Images (WSI), which ultimately will help in better understanding different types of cancer in the clinical workflow.

Submitted 8 November, 2018; originally announced November 2018.

Comments: 18 pages, 16 figures, 3 Tables
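
For readers unfamiliar with the Dice Coefficient (DC) quoted in the abstract above, the following is a minimal sketch of the standard definition for binary masks, DC = 2|A ∩ B| / (|A| + |B|). The function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions made for illustration; they are not taken from the paper, whose exact evaluation code is not described in this listing.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice coefficient between two binary masks (hypothetical helper)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    # Number of pixels marked as foreground in both masks.
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

A value of 1.0 means the predicted and reference masks overlap exactly, and 0.0 means they share no foreground pixels.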

arXiv:1804.01198 [pdf, ps, other]

New recursive approximations for variable-order fractional operators with applications

Authors: M. A. Zaky, E. H. Doha, T. M. Taha, D. Baleanu

Abstract: To broaden the range of applicability of variable-order fractional differential models, reliable numerical approaches are needed to solve the model equation. In this paper, we develop Laguerre spectral collocation methods for solving variable-order fractional initial value problems on the half line. Specifically, we derive three-term recurrence relations to efficiently calculate the variable-order fractional integrals and derivatives of the modified generalized Laguerre polynomials, which lead to the corresponding fractional differentiation matrices that will be used to construct the collocation methods. Comparison with other existing methods shows the superior accuracy of the proposed spectral collocation methods.

Submitted 3 April, 2018; originally announced April 2018.

Comments: 12 pages and 1 figure

MSC Class: 42C05; 65D99; 35R11; 65N35

Journal ref: Mathematical Modelling and Analysis 2018

arXiv:1803.01164 [pdf]

The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches

Authors: Md Zahangir Alom, Tarek M. Taha, Christopher Yakopcic, Stefan Westberg, Paheding Sidike, Mst Shamima Nasrin, Brian C Van Esesn, Abdul A S. Awwal, Vijayan K. Asari

Abstract: Deep learning has demonstrated tremendous success in a variety of application domains in the past few years. This new field of machine learning has been growing rapidly and has been applied in most application domains, along with some new modalities of application, opening new opportunities. Different methods have been proposed for different categories of learning approaches, including supervised, semi-supervised, and unsupervised learning. The experimental results show state-of-the-art performance of deep learning over traditional machine learning approaches in the fields of Image Processing, Computer Vision, Speech Recognition, Machine Translation, Art, Medical imaging, Medical information processing, Robotics and control, Bio-informatics, Natural Language Processing (NLP), Cyber security, and many more. This report presents a brief survey on the development of DL approaches, including Deep Neural Network (DNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) including Long Short Term Memory (LSTM) and Gated Recurrent Units (GRU), Auto-Encoder (AE), Deep Belief Network (DBN), Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DRL). In addition, we have included recent developments of proposed advanced variant DL techniques based on the mentioned DL approaches. Furthermore, DL approaches that have been explored and evaluated in different application domains are also included in this survey. We have also included recently developed frameworks, SDKs, and benchmark datasets that are used for implementing and evaluating deep learning approaches. Some surveys have been published on Deep Learning in Neural Networks [1, 38] and a survey on RL [234]. However, those papers have not discussed the individual advanced techniques for training large scale deep learning models and the recently developed method of generative models [1].

Submitted 12 September, 2018; v1 submitted 3 March, 2018; originally announced March 2018.

Comments: 39 pages, 46 figures, 3 tables. arXiv admin note: text overlap with arXiv:1408.3264, arXiv:1411.4046

arXiv:1802.06955 [pdf]

Recurrent Residual Convolutional Neural Network based on U-Net (R2U-Net) for Medical Image Segmentation

Authors: Md Zahangir Alom, Mahmudul Hasan, Chris Yakopcic, Tarek M. Taha, Vijayan K. Asari

Abstract: Deep learning (DL) based semantic segmentation methods have been providing state-of-the-art performance in the last few years. More specifically, these techniques have been successfully applied to medical image classification, segmentation, and detection tasks. One deep learning technique, U-Net, has become one of the most popular for these applications. In this paper, we propose a Recurrent Convolutional Neural Network (RCNN) based on U-Net as well as a Recurrent Residual Convolutional Neural Network (RRCNN) based on U-Net models, which are named RU-Net and R2U-Net respectively. The proposed models utilize the power of U-Net, Residual Network, as well as RCNN. There are several advantages of these proposed architectures for segmentation tasks. First, a residual unit helps when training deep architecture. Second, feature accumulation with recurrent residual convolutional layers ensures better feature representation for segmentation tasks. Third, it allows us to design a better U-Net architecture with the same number of network parameters and better performance for medical image segmentation. The proposed models are tested on three benchmark datasets covering blood vessel segmentation in retina images, skin cancer segmentation, and lung lesion segmentation. The experimental results show superior performance on segmentation tasks compared to equivalent models including U-Net and residual U-Net (ResU-Net).

Submitted 29 May, 2018; v1 submitted 19 February, 2018; originally announced February 2018.

Comments: 12 pages, 21 figures, 3 Tables

arXiv:1802.02615 [pdf]

Effective Quantization Approaches for Recurrent Neural Networks

Authors: Md Zahangir Alom, Adam T Moody, Naoya Maruyama, Brian C Van Essen, Tarek M. Taha

Abstract: Deep learning, and in particular Recurrent Neural Networks (RNN), have shown superior accuracy in a large variety of tasks including machine translation, language understanding, and movie frame generation. However, these deep learning approaches are very expensive in terms of computation. In most cases, Graphic Processing Units (GPUs) are used for large scale implementations. Meanwhile, energy efficient RNN approaches are proposed for deploying solutions on special purpose hardware including Field Programmable Gate Arrays (FPGAs) and mobile platforms. In this paper, we propose an effective quantization approach for Recurrent Neural Network (RNN) techniques including Long Short Term Memory (LSTM), Gated Recurrent Units (GRU), and Convolutional Long Short Term Memory (ConvLSTM). We have implemented different quantization methods including Binary Connect {-1, 1}, Ternary Connect {-1, 0, 1}, and Quaternary Connect {-1, -0.5, 0.5, 1}. These proposed approaches are evaluated on different datasets for sentiment analysis on IMDB and video frame predictions on the moving MNIST dataset. The experimental results are compared against the full precision versions of the LSTM, GRU, and ConvLSTM. They show promising results for both sentiment analysis and video frame prediction.

Submitted 7 February, 2018; originally announced February 2018.

Comments: 8 pages, 23 figures. Submitted to International Joint Conference on Neural Networks (IJCNN) 2018
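
As a rough illustration of the three level sets named in the abstract above, the sketch below maps real-valued weights to the nearest allowed level. This assumes deterministic nearest-level rounding on weights already scaled to roughly [-1, 1]; the abstract does not say how the paper assigns levels (for example, whether thresholds or stochastic rounding are used), so the helper `quantize` is hypothetical and should not be read as the authors' method.

```python
import numpy as np

# Level sets quoted in the abstract.
BINARY = np.array([-1.0, 1.0])
TERNARY = np.array([-1.0, 0.0, 1.0])
QUATERNARY = np.array([-1.0, -0.5, 0.5, 1.0])

def quantize(weights, levels):
    """Map each weight to the nearest value in `levels` (hypothetical helper)."""
    w = np.asarray(weights, dtype=float)
    # Distance from every weight to every level, then pick the closest level.
    idx = np.abs(w[..., None] - levels).argmin(axis=-1)
    return levels[idx]

weights = [0.9, -0.2, 0.1]
binary_w = quantize(weights, BINARY)
ternary_w = quantize(weights, TERNARY)
quaternary_w = quantize(weights, QUATERNARY)
```

With more levels the quantized weights track the originals more closely, at the cost of more bits per weight (1 bit for binary, 2 bits for ternary or quaternary).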

arXiv:1802.02608 [pdf]

Deep Versus Wide Convolutional Neural Networks for Object Recognition on Neuromorphic System

Authors: Md Zahangir Alom, Theodore Josue, Md Nayim Rahman, Will Mitchell, Chris Yakopcic, Tarek M. Taha

Abstract: In the last decade, special purpose computing systems, such as Neuromorphic computing, have become very popular in the field of computer vision and machine learning for classification tasks. In 2015, IBM released the TrueNorth Neuromorphic system, kick-starting a new era of Neuromorphic computing. Alternatively, Deep Learning approaches such as Deep Convolutional Neural Networks (DCNN) show almost human-level accuracies for detection and classification tasks. In 2016, IBM released a deep learning framework for DCNNs called Energy Efficient Deep Neuromorphic Networks (Eedn). Eedn shows promise for delivering high accuracies across a number of different benchmarks, while consuming very low power, using IBM's TrueNorth chip. However, much remains unexplored about using the Eedn framework for classification tasks on a Neuromorphic system. In this paper, we have empirically evaluated the performance of different DCNN architectures implemented within the Eedn framework. The goal of this work was to discover the most efficient way to implement DCNN models for object classification tasks using the TrueNorth system. We performed our experiments using benchmark data sets such as MNIST, COIL 20, and COIL 100. The experimental results show very promising classification accuracies with very low power consumption on IBM's NS1e Neurosynaptic system. The results show that for datasets with large numbers of classes, wider networks perform better when compared to deep networks of nearly the same core complexity on IBM's TrueNorth system.

Submitted 7 February, 2018; originally announced February 2018.

Comments: 8 pages, 14 figures. Submitted to International Joint Conference on Neural Networks (IJCNN) 2018

arXiv:1712.09888 [pdf]

Improved Inception-Residual Convolutional Neural Network for Object Recognition

Authors: Md Zahangir Alom, Mahmudul Hasan, Chris Yakopcic, Tarek M. Taha, Vijayan K. Asari

Abstract: Machine learning and computer vision have driven many of the greatest advances in the modeling of Deep Convolutional Neural Networks (DCNNs). Nowadays, most of the research has been focused on improving recognition accuracy with better DCNN models and learning approaches. The recurrent convolutional approach is not applied very much, other than in a few DCNN architectures. On the other hand, Inception-v4 and Residual networks have promptly become popular in the computer vision community. In this paper, we introduce a new DCNN model called the Inception Recurrent Residual Convolutional Neural Network (IRRCNN), which utilizes the power of the Recurrent Convolutional Neural Network (RCNN), the Inception network, and the Residual network. This approach improves the recognition accuracy of the Inception-residual network with the same number of network parameters. In addition, this proposed architecture generalizes the Inception network, the RCNN, and the Residual network with significantly improved training accuracy. We have empirically evaluated the performance of the IRRCNN model on different benchmarks including CIFAR-10, CIFAR-100, TinyImageNet-200, and CU3D-100. The experimental results show higher recognition accuracy against most of the popular DCNN models including the RCNN. We have also investigated the performance of the IRRCNN approach against the Equivalent Inception Network (EIN) and the Equivalent Inception Residual Network (EIRN) counterpart on the CIFAR-100 dataset. We report around 4.53%, 4.49% and 3.56% improvement in classification accuracy compared with the RCNN, EIN, and EIRN on the CIFAR-100 dataset respectively. Furthermore, the experiment has been conducted on the TinyImageNet-200 and CU3D-100 datasets where the IRRCNN provides better testing accuracy compared to the Inception Recurrent CNN (IRCNN), the EIN, and the EIRN.

Submitted 28 December, 2017; originally announced December 2017.

Comments: 17 pages, 15 figures, 4 tables

arXiv:1712.09872 [pdf]

Handwritten Bangla Character Recognition Using The State-of-Art Deep Convolutional Neural Networks

Authors: Md Zahangir Alom, Peheding Sidike, Mahmudul Hasan, Tark M. Taha, Vijayan K. Asari

Abstract: In spite of advances in object recognition technology, Handwritten Bangla Character Recognition (HBCR) remains largely unsolved due to the presence of many ambiguous handwritten characters and excessively cursive Bangla handwriting. Even the best existing recognizers do not lead to satisfactory performance for practical applications related to Bangla character recognition and have much lower performance than those developed for English alpha-numeric characters. To improve the performance of HBCR, we herein present the application of state-of-the-art Deep Convolutional Neural Networks (DCNN) including VGG Network, All Convolution Network (All-Conv Net), Network in Network (NiN), Residual Network, FractalNet, and DenseNet for HBCR. The deep learning approaches have the advantage of extracting and using feature information, improving the recognition of 2D shapes with a high degree of invariance to translation, scaling and other distortions. We systematically evaluated the performance of DCNN models on the publicly available Bangla handwritten character dataset called CMATERdb and achieved superior recognition accuracy when using DCNN models. This improvement would help in building an automatic HBCR system for practical applications.

Submitted 10 February, 2018; v1 submitted 28 December, 2017; originally announced December 2017.

Comments: 12 pages, 22 figures, 5 tables. arXiv admin note: text overlap with arXiv:1705.02680

arXiv:1705.02680 [pdf, other]

Handwritten Bangla Digit Recognition Using Deep Learning

Authors: Md Zahangir Alom, Paheding Sidike, Tarek M. Taha, Vijayan K. Asari

Abstract: In spite of the advances in pattern recognition technology, Handwritten Bangla Character Recognition (HBCR) (such as alpha-numeric and special characters) remains largely unsolved due to the presence of many perplexing characters and excessive cursive in Bangla handwriting. Even the best existing recognizers do not lead to satisfactory performance for practical applications. To improve the performance of Handwritten Bangla Digit Recognition (HBDR), we herein present a new approach based on deep neural networks, which have recently shown excellent performance in many pattern recognition and machine learning applications but have not been thoroughly attempted for HBDR. We introduce Bangla digit recognition techniques based on Deep Belief Network (DBN), Convolutional Neural Networks (CNN), CNN with dropout, CNN with dropout and Gaussian filters, and CNN with dropout and Gabor filters. These networks have the advantage of extracting and using feature information, improving the recognition of two dimensional shapes with a high degree of invariance to translation, scaling and other pattern distortions. We systematically evaluated the performance of our method on the publicly available Bangla numeral image database named CMATERdb 3.1.1. From experiments, we achieved a 98.78% recognition rate using the proposed method, CNN with Gabor features and dropout, which outperforms the state-of-the-art algorithms for HBDR.

Submitted 7 May, 2017; originally announced May 2017.

Comments: 12 pages, 10 figures, 3 tables

arXiv:1704.07709 [pdf, other]

Inception Recurrent Convolutional Neural Network for Object Recognition

Authors: Md Zahangir Alom, Mahmudul Hasan, Chris Yakopcic, Tarek M. Taha

Abstract: Deep convolutional neural networks (DCNNs) are an influential tool for solving various problems in the machine learning and computer vision fields. In this paper, we introduce a new deep learning model called an Inception-Recurrent Convolutional Neural Network (IRCNN), which utilizes the power of an inception network combined with recurrent layers in DCNN architecture. We have empirically evaluated the recognition performance of the proposed IRCNN model using different benchmark datasets such as MNIST, CIFAR-10, CIFAR-100, and SVHN. Experimental results show similar or higher recognition accuracy when compared to most of the popular DCNNs including the RCNN. Furthermore, we have investigated IRCNN performance against equivalent Inception Networks and Inception-Residual Networks using the CIFAR-100 dataset. We report about 3.5%, 3.47% and 2.54% improvement in classification accuracy when compared to the RCNN, equivalent Inception Networks, and Inception-Residual Networks on the augmented CIFAR-100 dataset respectively.

Submitted 25 April, 2017; originally announced April 2017.

Comments: 11 pages, 10 figures, 2 tables

arXiv:1704.06370 [pdf]

Robust Multi-view Pedestrian Tracking Using Neural Networks

Authors: Md Zahangir Alom, Tarek M. Taha

Abstract: In this paper, we present a real-time robust multi-view pedestrian detection and tracking system for video surveillance using neural networks, which can be used in dynamic environments. The proposed system consists of two phases: multi-view pedestrian detection and tracking. First, pedestrian detection utilizes background subtraction to segment the foreground blob. An adaptive background subtraction method, in which each pixel of the input image is modeled as a mixture of Gaussians and an on-line approximation is used to update the model, is applied to extract the foreground region. The Gaussian distributions are then evaluated to determine which are most likely to result from a background process. This method produces a steady, real-time tracker for outdoor environments that consistently handles changes in lighting conditions and long-term scene changes. Second, tracking is performed in two phases: pedestrian classification and tracking of the individual subject. A sliding window is applied to the foreground binary image to select an input window, which is used to select the input image patches from the actual input frame. A neural network is used for classification with PHOG features. Finally, a Kalman filter is applied to calculate the subsequent step for tracking, which aims at finding the exact position of pedestrians in an input image. The experimental results show that the proposed approach yields promising performance on multi-view pedestrian detection and tracking on different benchmark datasets.

Submitted 20 April, 2017; originally announced April 2017.

Comments: 8 pages, 3 figures

arXiv:1606.04609 [pdf]

High Throughput Neural Network based Embedded Streaming Multicore Processors

Authors: Raqibul Hasan, Tarek M. Taha, Chris Yakopcic, David J. Mountain

Abstract: With power consumption becoming a critical processor design issue, specialized architectures for low power processing are becoming popular. Several studies have shown that neural networks can be used for signal processing and pattern recognition applications. This study examines the design of memristor based multicore neural processors that would be used primarily to process data directly from sensors. Additionally, we have examined the design of SRAM based neural processors for the same task. Full system evaluation of the multicore processors based on these specialized cores was performed, taking I/O and routing circuits into consideration. The area and power benefits were compared with traditional multicore RISC processors. Our results show that the memristor based architectures can provide an energy efficiency between three and five orders of magnitude greater than that of RISC processors for the benchmarks examined.

Submitted 14 June, 2016; originally announced June 2016.

Comments: 8 pages. arXiv admin note: text overlap with arXiv:1603.07400

arXiv:1302.6515 [pdf]

Hybrid Crossbar Architecture for a Memristor Based Memory

Authors: Chris Yakopcic, Tarek M. Taha

Abstract: This paper describes a new memristor crossbar architecture that is proposed for use in a high density cache design. This design has less than 10% of the write energy consumption of a simple memristor crossbar. Also, it has up to 4 times the bit density of an STT-MRAM system and up to 11 times the bit density of an SRAM architecture. The proposed architecture is analyzed using a detailed SPICE analysis that accounts for the resistance of the wires in the memristor structure. Additionally, the memristor model used in this work has been matched to specific device characterization data to provide accurate results in terms of energy, area, and timing.

Submitted 9 April, 2013; v1 submitted 26 February, 2013; originally announced February 2013.

Showing 1–19 of 19 results for author: Taha, T M