Search | arXiv e-print repository

Precise Apple Detection and Localization in Orchards using YOLOv5 for Robotic Harvesting Systems

Abstract: The advancement of agricultural robotics holds immense promise for transforming fruit harvesting practices, particularly within the apple industry. The accurate detection and localization of fruits are pivotal for the successful implementation of robotic harvesting systems. In this paper, we propose a novel approach to apple detection and position estimation utilizing an object detection model, YOLOv5. Our primary objective is to develop a robust system capable of identifying apples in complex orchard environments and providing precise location information. To achieve this, we curated an autonomously labeled dataset comprising diverse apple tree images, which was utilized for both training and evaluation purposes. Through rigorous experimentation, we compared the performance of our YOLOv5-based system with other popular object detection models, including SSD. Our results demonstrate that the YOLOv5 model outperforms its counterparts, achieving an impressive apple detection accuracy of approximately 85%. We believe that our proposed system's accurate apple detection and position estimation capabilities represent a significant advancement in agricultural robotics, laying the groundwork for more efficient and sustainable fruit harvesting practices.

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2404.06733 [pdf, other]

Incremental XAI: Memorable Understanding of AI with Incremental Explanations

Authors: Jessica Y. Bo, Pan Hao, Brian Y. Lim

Abstract: Many explainable AI (XAI) techniques strive for interpretability by providing concise salient information, such as sparse linear factors. However, users either only see inaccurate global explanations, or highly-varying local explanations. We propose to provide more detailed explanations by leveraging the human cognitive capacity to accumulate knowledge by incrementally receiving more details. Focusing on linear factor explanations (factors $\times$ values = outcome), we introduce Incremental XAI to automatically partition explanations for general and atypical instances by providing Base + Incremental factors to help users read and remember more faithful explanations. Memorability is improved by reusing base factors and reducing the number of factors shown in atypical cases. In modeling, formative, and summative user studies, we evaluated the faithfulness, memorability and understandability of Incremental XAI against baseline explanation methods. This work contributes towards more usable explanation that users can better ingrain to facilitate intuitive engagement with AI.

Submitted 10 April, 2024; originally announced April 2024.

Comments: CHI 2024

arXiv:2403.10423 [pdf, ps, other]

Quantization Avoids Saddle Points in Distributed Optimization

Authors: Yanan Bo, Yongqiang Wang

Abstract: Distributed nonconvex optimization underpins key functionalities of numerous distributed systems, ranging from power systems, smart buildings, cooperative robots, vehicle networks to sensor networks. Recently, it has also emerged as a promising solution to handle the enormous growth in data and model sizes in deep learning. A fundamental problem in distributed nonconvex optimization is avoiding convergence to saddle points, which significantly degrade optimization accuracy. We discover that the process of quantization, which is necessary for all digital communications, can be exploited to enable saddle-point avoidance. More specifically, we propose a stochastic quantization scheme and prove that it can effectively escape saddle points and ensure convergence to a second-order stationary point in distributed nonconvex optimization. With an easily adjustable quantization granularity, the approach allows a user to control the number of bits sent per iteration and, hence, to aggressively reduce the communication overhead. Numerical experimental results using distributed optimization and learning problems on benchmark datasets confirm the effectiveness of the approach.

Submitted 15 March, 2024; originally announced March 2024.

Comments: Accepted as a Research Article to Proceedings of the National Academy of Sciences (PNAS)
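
The abstract above does not spell out its stochastic quantization scheme, so the following is only a generic illustration of the idea of randomized rounding, not the authors' method. The function name `stochastic_quantize` and the parameter `step` (the quantization granularity) are hypothetical. The sketch rounds a value up or down to a multiple of `step` at random, with probabilities chosen so that the quantized value equals the input in expectation.

```python
import math
import random


def stochastic_quantize(x, step, rng=random):
    """Round x to a neighbouring multiple of `step`, chosen at random.

    Illustrative assumption: plain unbiased randomized rounding, so that
    the expected output equals x. A coarser `step` means fewer bits.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    lower = math.floor(x / step) * step
    # Probability of rounding up grows with the distance from `lower`.
    p_up = (x - lower) / step
    return lower + step if rng.random() < p_up else lower


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [stochastic_quantize(0.3, 1.0, rng) for _ in range(10000)]
    # Each sample is 0.0 or 1.0; their average approximates the input 0.3.
    print(sum(samples) / len(samples))
```

The injected rounding noise is what makes such a quantizer random rather than deterministic; how that noise is shaped to escape saddle points is specific to the paper and is not reproduced here.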

arXiv:2401.01564 [pdf, other]

Deep Learning Based Superposition Coded Modulation for Hierarchical Semantic Communications over Broadcast Channels

Authors: Yufei Bo, Shuo Shao, Meixia Tao

Abstract: We consider multi-user semantic communications over broadcast channels. While most existing works consider that each receiver requires either the same or independent semantic information, this paper explores the scenario where the semantic information desired by different receivers is different but correlated. In particular, we investigate semantic communications over Gaussian broadcast channels where the transmitter has a common observable source but the receivers wish to recover hierarchical semantic information in adaptation to their channel conditions. Inspired by the capacity-achieving property of superposition codes, we propose a deep learning based superposition coded modulation (DeepSCM) scheme. Specifically, the hierarchical semantic information is first extracted and encoded into basic and enhanced feature vectors. A linear minimum mean square error (LMMSE) decorrelator is then developed to obtain a refinement from the enhanced features that is uncorrelated with the basic features. Finally, the basic features and their refinement are superposed for broadcasting after probabilistic modulation. Experiments are conducted for two-receiver image semantic broadcasting with coarse and fine classification as hierarchical semantic tasks. DeepSCM outperforms the benchmarking coded-modulation scheme without a superposition structure, especially with large channel disparity and high-order modulation. It also approaches the performance upper bound as if there were only one receiver.

Submitted 12 June, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

arXiv:2311.05014 [pdf, other]

Interpreting Pretrained Language Models via Concept Bottlenecks

Authors: Zhen Tan, Lu Cheng, Song Wang, Yuan Bo, Jundong Li, Huan Liu

Abstract: Pretrained language models (PLMs) have made significant strides in various natural language processing tasks. However, the lack of interpretability due to their "black-box" nature poses challenges for responsible implementation. Although previous studies have attempted to improve interpretability by using, e.g., attention weights in self-attention layers, these weights often lack clarity, readability, and intuitiveness. In this research, we propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans. For example, we learn the concept of "Food" and investigate how it influences the prediction of a model's sentiment towards a restaurant review. We introduce C$^3$M, which combines human-annotated and machine-generated concepts to extract hidden neurons designed to encapsulate semantically meaningful and task-specific concepts. Through empirical evaluations on real-world datasets, we show that our approach offers valuable insights to interpret PLM behavior, helps diagnose model failures, and enhances model robustness amidst noisy concept labels.

Submitted 8 November, 2023; originally announced November 2023.

arXiv:2310.06690 [pdf, other]

Joint Coding-Modulation for Digital Semantic Communications via Variational Autoencoder

Authors: Yufei Bo, Yiheng Duan, Shuo Shao, Meixia Tao

Abstract: Semantic communications have emerged as a new paradigm for improving communication efficiency by transmitting the semantic information of a source message that is most relevant to a desired task at the receiver. Most existing approaches typically utilize neural networks (NNs) to design end-to-end semantic communication systems, where NN-based semantic encoders output continuously distributed signals to be sent directly to the channel in an analog fashion. In this work, we propose a joint coding-modulation (JCM) framework for digital semantic communications using a variational autoencoder (VAE). Our approach learns the transition probability from source data to discrete constellation symbols, thereby avoiding the non-differentiability problem of digital modulation. Meanwhile, by jointly designing the coding and modulation processes, we can match the obtained modulation strategy with the operating channel condition. We also derive a matching loss function with information-theoretic meaning for end-to-end training. Experiments on image semantic communication validate the superiority of our proposed JCM framework over the state-of-the-art quantization-based digital semantic coding-modulation methods across a wide range of channel conditions, transmission rates, and modulation orders. Furthermore, its performance gap to analog semantic communication reduces as the modulation order increases while enjoying the hardware implementation convenience.

Submitted 29 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2305.04808 [pdf, other]

CAT: A Contextualized Conceptualization and Instantiation Framework for Commonsense Reasoning

Authors: Weiqi Wang, Tianqing Fang, Baixuan Xu, Chun Yi Louis Bo, Yangqiu Song, Lei Chen

Abstract: Commonsense reasoning, aiming at endowing machines with a human-like ability to make situational presumptions, is extremely challenging to generalize. Someone who barely knows about "meditation" but is knowledgeable about "singing" can still infer that "meditation makes people relaxed" from the existing knowledge that "singing makes people relaxed" by first conceptualizing "singing" as a "relaxing event" and then instantiating that event to "meditation." This process, known as conceptual induction and deduction, is fundamental to commonsense reasoning while lacking both labeled data and methodologies to enhance commonsense modeling. To fill such a research gap, we propose CAT (Contextualized ConceptuAlization and InsTantiation), a semi-supervised learning framework that integrates event conceptualization and instantiation to conceptualize commonsense knowledge bases at scale. Extensive experiments show that our framework achieves state-of-the-art performances on two conceptualization tasks, and the acquired abstract commonsense knowledge can significantly improve commonsense inference modeling. Our code, data, and fine-tuned models are publicly available at https://github.com/HKUST-KnowComp/CAT.

Submitted 10 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: ACL2023 Main Conference

arXiv:2211.07889 [pdf, other]

Pretraining ECG Data with Adversarial Masking Improves Model Generalizability for Data-Scarce Tasks

Authors: Jessica Y. Bo, Hen-Wei Huang, Alvin Chan, Giovanni Traverso

Abstract: Medical datasets often face the problem of data scarcity, as ground truth labels must be generated by medical professionals. One mitigation strategy is to pretrain deep learning models on large, unlabelled datasets with self-supervised learning (SSL). Data augmentations are essential for improving the generalizability of SSL-trained models, but they are typically handcrafted and tuned manually. We use an adversarial model to generate masks as augmentations for 12-lead electrocardiogram (ECG) data, where masks learn to occlude diagnostically-relevant regions of the ECGs. Compared to random augmentations, adversarial masking reaches better accuracy when transferring to two diverse downstream objectives: arrhythmia classification and gender classification. Compared to a state-of-the-art ECG augmentation method 3KG, adversarial masking performs better in data-scarce regimes, demonstrating the generalizability of our model.

Submitted 14 November, 2022; originally announced November 2022.

Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 9 pages

arXiv:2208.05704 [pdf, other]

Learning Based Joint Coding-Modulation for Digital Semantic Communication Systems

Authors: Yufei Bo, Yiheng Duan, Shuo Shao, Meixia Tao

Abstract: In learning-based semantic communications, neural networks have replaced different building blocks in traditional communication systems. However, digital modulation still remains a challenge for neural networks. The intrinsic mechanism of neural network based digital modulation is mapping the continuous output of the neural network encoder into discrete constellation symbols, which is a non-differentiable function that cannot be trained with existing gradient descent algorithms. To overcome this challenge, in this paper we develop a joint coding-modulation scheme for digital semantic communications with BPSK modulation. In our method, the neural network outputs the likelihood of each constellation point, instead of having a concrete mapping. A random code rather than a deterministic code is hence used, which preserves more information for the symbols with a close likelihood on each constellation point. The joint coding-modulation design can match the modulation process with channel states, and hence improve the performance of digital semantic communications. Experimental results show that our method outperforms existing digital modulation methods in semantic communications over a wide range of SNR, and outperforms the neural network based analog modulation method in the low SNR regime.

Submitted 6 November, 2022; v1 submitted 11 August, 2022; originally announced August 2022.

arXiv:2207.04586 [pdf, other]

PF4Microservices: A decomposion scheme for microservices based on Problem Frames

Authors: Zhi Li, Yitao Bo, Hongbin Xiao

Abstract: In recent years, microservice architecture has become a popular architectural style in software engineering, with its natural support for DevOps and continuous delivery, as well as its scalability and extensibility, which drive industry practitioners to migrate to microservice architecture. However, there are many challenges in adopting a microservice architecture, the most important of which is how to properly decompose a monolithic system into microservices. Currently, microservice decomposition decisions for monolithic systems rely on subjective human experience, which is a costly, time-consuming process with high uncertainty of results. To address this problem, this paper proposes a method for microservice decomposition using Jackson Problem Frames. In this method, requirements of the system are analysed, descriptions of the interactions between the proposed software and its environment are obtained, multiple problem diagrams are constructed, and then the problem diagrams are merged by analyzing the correlation and similarity between them, resulting in a microservice decomposition scheme. A case study is also conducted based on a smart parking system. The results of the study show that the method can perform microservice decomposition based on requirements and the software environment, reducing the decision-making burden of developers, with reasonable decomposition results.

Submitted 10 July, 2022; originally announced July 2022.

Comments: 7 pages

arXiv:2202.05822 [pdf, other]

CLIPasso: Semantically-Aware Object Sketching

Authors: Yael Vinker, Ehsan Pajouheshgar, Jessica Y. Bo, Roman Christian Bachmann, Amit Haim Bermano, Daniel Cohen-Or, Amir Zamir, Ariel Shamir

Abstract: Abstraction is at the heart of sketching due to the simple and minimal nature of line drawings. Abstraction entails identifying the essential visual properties of an object or scene, which requires semantic understanding and prior knowledge of high-level concepts. Abstract depictions are therefore challenging for artists, and even more so for machines. We present CLIPasso, an object sketching method that can achieve different levels of abstraction, guided by geometric and semantic simplifications. While sketch generation methods often rely on explicit sketch datasets for training, we utilize the remarkable ability of CLIP (Contrastive-Language-Image-Pretraining) to distill semantic concepts from sketches and images alike. We define a sketch as a set of Bézier curves and use a differentiable rasterizer to optimize the parameters of the curves directly with respect to a CLIP-based perceptual loss. The abstraction degree is controlled by varying the number of strokes. The generated sketches demonstrate multiple levels of abstraction while maintaining recognizability, underlying structure, and essential visual components of the subject drawn.

Submitted 16 May, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

Comments: https://clipasso.github.io/clipasso/
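
The CLIPasso abstract describes a sketch as a set of Bézier curves whose parameters are optimized. As a minimal sketch of that representation only, the code below evaluates points on one stroke. It assumes cubic curves with four 2D control points, which the abstract does not state, and the names `cubic_bezier_point` and `sample_stroke` are hypothetical. The differentiable rasterizer and the CLIP-based loss are not shown.

```python
def cubic_bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1].

    Assumption for illustration: each stroke has four control points.
    """
    u = 1.0 - t
    # Bernstein basis weights for the four control points.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    return tuple(
        b0 * a + b1 * b + b2 * c + b3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def sample_stroke(control_points, num_samples=16):
    """Return evenly spaced (in t) points along one stroke."""
    if num_samples < 2:
        raise ValueError("num_samples must be at least 2")
    p0, p1, p2, p3 = control_points
    return [
        cubic_bezier_point(p0, p1, p2, p3, i / (num_samples - 1))
        for i in range(num_samples)
    ]


if __name__ == "__main__":
    stroke = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
    for point in sample_stroke(stroke, num_samples=5):
        print(point)
```

In this representation a whole sketch would simply be a list of such control-point tuples, and the number of entries in that list corresponds to the number of strokes that the abstract uses to control abstraction.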

arXiv:2108.08212 [pdf, other]

Confidence Adaptive Regularization for Deep Learning with Noisy Labels

Authors: Yangdi Lu, Yang Bo, Wenbo He

Abstract: Recent studies on the memorization effects of deep neural networks on noisy labels show that the networks first fit the correctly-labeled training samples before memorizing the mislabeled samples. Motivated by this early-learning phenomenon, we propose a novel method to prevent memorization of the mislabeled samples. Unlike the existing approaches which use the model output to identify or ignore the mislabeled samples, we introduce an indicator branch to the original model and enable the model to produce a confidence value for each sample. The confidence values are incorporated in our loss function which is learned to assign large confidence values to correctly-labeled samples and small confidence values to mislabeled samples. We also propose an auxiliary regularization term to further improve the robustness of the model. To improve the performance, we gradually correct the noisy labels with a well-designed target estimation strategy. We provide the theoretical analysis and conduct the experiments on synthetic and real-world datasets, demonstrating that our approach achieves comparable results to the state-of-the-art methods.

Submitted 5 September, 2021; v1 submitted 18 August, 2021; originally announced August 2021.

arXiv:2103.12814 [pdf, other]

Co-matching: Combating Noisy Labels by Augmentation Anchoring

Authors: Yangdi Lu, Yang Bo, Wenbo He

Abstract: Deep learning with noisy labels is challenging as deep neural networks have the high capacity to memorize the noisy labels. In this paper, we propose a learning algorithm called Co-matching, which balances the consistency and divergence between two networks by augmentation anchoring. Specifically, we have one network generate an anchoring label from its prediction on a weakly-augmented image. Meanwhile, we force its peer network, taking the strongly-augmented version of the same image as input, to generate a prediction close to the anchoring label. We then update the two networks simultaneously by selecting small-loss instances to minimize both the unsupervised matching loss (i.e., a measure of the consistency of the two networks) and the supervised classification loss (i.e., a measure of the classification performance). Besides, the unsupervised matching loss makes our method less reliant on noisy labels, which prevents memorization of noisy labels. Experiments on three benchmark datasets demonstrate that Co-matching achieves results comparable to the state-of-the-art methods.

Submitted 23 March, 2021; originally announced March 2021.

Comments: 13 pages, 10 figures. arXiv admin note: text overlap with arXiv:2003.02752 by other authors

arXiv:2103.10567 [pdf, other]

CLTA: Contents and Length-based Temporal Attention for Few-shot Action Recognition

Authors: Yang Bo, Yangdi Lu, Wenbo He

Abstract: Few-shot action recognition has attracted increasing attention due to the difficulty in acquiring properly labelled training samples. Current works have shown that preserving spatial information and comparing video descriptors are crucial for few-shot action recognition. However, the importance of preserving temporal information is not well discussed. In this paper, we propose a Contents and Length-based Temporal Attention (CLTA) model, which learns customized temporal attention for the individual video to tackle the few-shot action recognition problem. CLTA utilizes the Gaussian likelihood function as the template to generate temporal attention and trains the learning matrices to study the mean and standard deviation based on both frame contents and length. We show that even a backbone that is not fine-tuned, with an ordinary softmax classifier, can still achieve similar or better results compared to the state-of-the-art few-shot action recognition with precisely captured temporal attention.

Submitted 18 March, 2021; originally announced March 2021.

Comments: 8 pages, 4 figures
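
The CLTA abstract says temporal attention follows a Gaussian template whose mean and standard deviation are learned from frame contents and video length. The following is a minimal sketch of the template step only, assuming the mean and standard deviation are already given on a normalized [0, 1] time axis. That normalization, the function name `gaussian_temporal_attention`, and the uniform fallback are illustrative assumptions, not details from the paper.

```python
import math


def gaussian_temporal_attention(num_frames, mean, std):
    """Return per-frame attention weights shaped like a Gaussian.

    Assumptions for illustration: `mean` and `std` live on a time axis
    normalised to [0, 1], and weights are rescaled to sum to 1.
    """
    if num_frames <= 0 or std <= 0:
        raise ValueError("num_frames and std must be positive")
    # Frame centres on the normalised axis, so clips of any length compare.
    positions = [(t + 0.5) / num_frames for t in range(num_frames)]
    scores = [math.exp(-0.5 * ((p - mean) / std) ** 2) for p in positions]
    total = sum(scores)
    if total == 0.0:
        # Every score underflowed (very small std); fall back to uniform.
        return [1.0 / num_frames] * num_frames
    return [s / total for s in scores]


if __name__ == "__main__":
    for weight in gaussian_temporal_attention(5, mean=0.5, std=0.2):
        print(round(weight, 3))
```

A video descriptor could then be formed as the weighted sum of per-frame features using these weights; in the paper the two Gaussian parameters are produced by trained matrices rather than passed in by hand.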

arXiv:2009.14502 [pdf, other]

Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks

Authors: Yoonho Boo, Sungho Shin, Jungwook Choi, Wonyong Sung

Abstract: The quantization of deep neural networks (QDNNs) has been actively studied for deployment in edge devices. Recent studies employ the knowledge distillation (KD) method to improve the performance of quantized networks. In this study, we propose stochastic precision ensemble training for QDNNs (SPEQ). SPEQ is a knowledge distillation training scheme; however, the teacher is formed by sharing the model parameters of the student network. We obtain the soft labels of the teacher by changing the bit precision of the activation stochastically at each layer of the forward-pass computation. The student model is trained with these soft labels to reduce the activation quantization noise. The cosine similarity loss is employed, instead of the KL-divergence, for KD training. As the teacher model changes continuously by random bit-precision assignment, it exploits the effect of stochastic ensemble KD. SPEQ outperforms the existing quantization training methods in various tasks, such as image classification, question-answering, and transfer learning without the need for cumbersome teacher networks.

Submitted 30 September, 2020; originally announced September 2020.
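
The SPEQ abstract states that a cosine similarity loss replaces the KL-divergence for distillation. A minimal sketch of such a loss is given below, assuming it takes the form one minus the cosine similarity between student and teacher output vectors. The exact formulation, and whether it is applied to logits or probabilities, is not given in the abstract, and the name `cosine_similarity_loss` is hypothetical.

```python
import math


def cosine_similarity_loss(student_out, teacher_out, eps=1e-12):
    """Return 1 - cos(student, teacher).

    Assumed form for illustration: 0 when the two vectors point the same
    way, 1 when orthogonal, 2 when opposite.
    """
    if len(student_out) != len(teacher_out):
        raise ValueError("output vectors must have the same length")
    dot = sum(s * t for s, t in zip(student_out, teacher_out))
    norm_s = math.sqrt(sum(s * s for s in student_out))
    norm_t = math.sqrt(sum(t * t for t in teacher_out))
    # eps guards against division by zero for all-zero vectors.
    return 1.0 - dot / max(norm_s * norm_t, eps)


if __name__ == "__main__":
    print(cosine_similarity_loss([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(cosine_similarity_loss([1.0, 0.0], [0.0, 1.0]))
```

Unlike the KL-divergence, this quantity depends only on the direction of the two vectors and not on their scale.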

arXiv:2006.00530 [pdf, other]

Quantized Neural Networks: Characterization and Holistic Optimization

Authors: Yoonho Boo, Sungho Shin, Wonyong Sung

Abstract: Quantized deep neural networks (QDNNs) are necessary for low-power, high throughput, and embedded applications. Previous studies mostly focused on developing optimization methods for the quantization of given models. However, quantization sensitivity depends on the model architecture. Therefore, the model selection needs to be a part of the QDNN design process. Also, the characteristics of weight and activation quantization are quite different. This study proposes a holistic approach for the optimization of QDNNs, which contains QDNN training methods as well as quantization-friendly architecture design. Synthesized data is used to visualize the effects of weight and activation quantization. The results indicate that deeper models are more prone to activation quantization, while wider models improve the resiliency to both weight and activation quantization. This study can provide insight into better optimization of QDNNs.

Submitted 31 May, 2020; originally announced June 2020.

arXiv:2002.00343 [pdf, other]

SQWA: Stochastic Quantized Weight Averaging for Improving the Generalization Capability of Low-Precision Deep Neural Networks

Authors: Sungho Shin, Yoonho Boo, Wonyong Sung

Abstract: Designing a deep neural network (DNN) with good generalization capability is a complex process especially when the weights are severely quantized. Model averaging is a promising approach for achieving the good generalization capability of DNNs, especially when the loss surface for training contains many sharp minima. We present a new quantized neural network optimization approach, stochastic quantized weight averaging (SQWA), to design low-precision DNNs with good generalization capability using model averaging. The proposed approach includes (1) floating-point model training, (2) direct quantization of weights, (3) capturing multiple low-precision models during retraining with cyclical learning rates, (4) averaging the captured models, and (5) re-quantizing the averaged model and fine-tuning it with low-learning rates. Additionally, we present a loss-visualization technique on the quantized weight domain to clearly elucidate the behavior of the proposed method. Visualization results indicate that a quantized DNN (QDNN) optimized with the proposed approach is located near the center of the flat minimum in the loss surface. With SQWA training, we achieved state-of-the-art results for 2-bit QDNNs on CIFAR-100 and ImageNet datasets. Although we only employed a uniform quantization scheme for the sake of implementation in VLSI or low-precision neural processing units, the performance achieved exceeded those of previous studies employing non-uniform quantization.

Submitted 2 February, 2020; originally announced February 2020.

arXiv:2001.05663 [pdf]

NbO2-based memristive neurons for burst-based perceptron

Authors: Yeheng Bo, Peng Zhang, Ziqing Luo, Shuai Li, Juan Song, Xinjun Liu

Abstract: Neuromorphic computing using spike-based learning has broad prospects in reducing computing power. Memristive neurons composed of two locally active memristors have been used to mimic the dynamical behaviors of biological neurons. In this work, the dynamic operating conditions of NbO2-based memristive neurons and their transformation boundaries between spiking and bursting are comprehensively investigated. Furthermore, the underlying mechanism of bursting is analyzed and the controllability of the number of spikes during each burst period is demonstrated. Finally, pattern classification and information transmission in a perceptron neural network, using the number of spikes per bursting period to encode information, are proposed. The results show a promising approach for the practical implementation of neuristors in spiking neural networks.

Submitted 10 April, 2020; v1 submitted 16 January, 2020; originally announced January 2020.

arXiv:1909.01688 [pdf, other]

Knowledge distillation for optimization of quantized deep neural networks

Authors: Sungho Shin, Yoonho Boo, Wonyong Sung

Abstract: Knowledge distillation (KD) is a very popular method for model size reduction. Recently, the technique has been exploited for training quantized deep neural networks (QDNNs) as a way to restore the performance sacrificed by word-length reduction. KD, however, employs additional hyper-parameters, such as temperature, coefficient, and the size of the teacher network, for QDNN training. We analyze the effect of these hyper-parameters on QDNN optimization with KD. We find that these hyper-parameters are inter-related, and also introduce a simple and effective technique that reduces the coefficient during training. With KD employing the proposed hyper-parameters, we achieve test accuracies of 92.7% and 67.0% on Resnet20 with 2-bit ternary weights for the CIFAR-10 and CIFAR-100 data sets, respectively.

Submitted 23 October, 2019; v1 submitted 4 September, 2019; originally announced September 2019.

arXiv:1707.03684 [pdf, other]

Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations

Authors: Yoonho Boo, Wonyong Sung

Abstract: Deep neural networks (DNNs) usually demand a large number of operations for real-time inference. In particular, fully-connected layers contain a large number of weights, so they usually need many off-chip memory accesses for inference. We propose a weight compression method for deep neural networks, which allows values of +1 or -1 only at predetermined positions of the weights so that decoding using a table can be conducted easily. For example, the structured sparse (8,2) coding allows at most two non-zero values among eight weights. This method not only enables multiplication-free DNN implementations but also compresses the weight storage by up to 32x compared to floating-point networks. Weight distribution normalization and gradual pruning techniques are applied to mitigate the performance degradation. The experiments are conducted with fully-connected deep neural networks and convolutional neural networks.

Submitted 1 July, 2017; originally announced July 2017.

Comments: This paper is accepted in SIPS 2017

arXiv:1702.08171 [pdf, ps, other]

Fixed-point optimization of deep neural networks with adaptive step size retraining

Authors: Sungho Shin, Yoonho Boo, Wonyong Sung

Abstract: Fixed-point optimization of deep neural networks plays an important role in hardware-based design and low-power implementations. Many deep neural networks show fairly good performance even with 2- or 3-bit precision when quantized weights are fine-tuned by retraining. We propose an improved fixed-point optimization algorithm that estimates the quantization step size dynamically during retraining. In addition, a gradual quantization scheme is also tested, which sequentially applies fixed-point optimizations from high to low precision. The experiments are conducted for feed-forward deep neural networks (FFDNNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs).

Submitted 27 February, 2017; originally announced February 2017.

Comments: This paper is accepted in ICASSP 2017

Showing 1–21 of 21 results for author: Boo, Y