Search | arXiv e-print repository

Do deep neural networks utilize the weight space efficiently?

Authors: Onur Can Koyun, Behçet Uğur Töreyin

Abstract: Deep learning models like Transformers and Convolutional Neural Networks (CNNs) have revolutionized various domains, but their parameter-intensive nature hampers deployment in resource-constrained settings. In this paper, we introduce a novel concept utilizes column space and row space of weight matrices, which allows for a substantial reduction in model parameters without compromising performance… ▽ More Deep learning models like Transformers and Convolutional Neural Networks (CNNs) have revolutionized various domains, but their parameter-intensive nature hampers deployment in resource-constrained settings. In this paper, we introduce a novel concept utilizes column space and row space of weight matrices, which allows for a substantial reduction in model parameters without compromising performance. Leveraging this paradigm, we achieve parameter-efficient deep learning models.. Our approach applies to both Bottleneck and Attention layers, effectively halving the parameters while incurring only minor performance degradation. Extensive experiments conducted on the ImageNet dataset with ViT and ResNet50 demonstrate the effectiveness of our method, showcasing competitive performance when compared to traditional models. This approach not only addresses the pressing demand for parameter efficient deep learning solutions but also holds great promise for practical deployment in real-world scenarios. △ Less

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2305.08265 [pdf, other]

Vehicle Detection and Classification without Residual Calculation: Accelerating HEVC Image Decoding with Random Perturbation Injection

Authors: Muhammet Sebul Beratoğlu, Behçet Uğur Töreyin

Abstract: In the field of video analytics, particularly traffic surveillance, there is a growing need for efficient and effective methods for processing and understanding video data. Traditional full video decoding techniques can be computationally intensive and time-consuming, leading researchers to explore alternative approaches in the compressed domain. This study introduces a novel random perturbation-b… ▽ More In the field of video analytics, particularly traffic surveillance, there is a growing need for efficient and effective methods for processing and understanding video data. Traditional full video decoding techniques can be computationally intensive and time-consuming, leading researchers to explore alternative approaches in the compressed domain. This study introduces a novel random perturbation-based compressed domain method for reconstructing images from High Efficiency Video Coding (HEVC) bitstreams, specifically designed for traffic surveillance applications. To the best of our knowledge, our method is the first to propose substituting random perturbations for residual values, creating a condensed representation of the original image while retaining information relevant to video understanding tasks, particularly focusing on vehicle detection and classification as key use cases. By not using residual data, our proposed method significantly reduces the data needed in the image reconstruction process, allowing for more efficient storage and transmission of information. This is particularly important when considering the vast amount of video data involved in surveillance applications. Applied to the public BIT-Vehicle dataset, we demonstrate a significant increase in the reconstruction speed compared to the traditional full decoding approach, with our proposed method being approximately 56% faster than the pixel domain method. Additionally, we achieve a detection accuracy of 99.9%, on par with the pixel domain method, and a classification accuracy of 96.84%, only 0.98% lower than the pixel domain method. Furthermore, we showcase the significant reduction in data size, leading to more efficient storage and transmission. Our research establishes the potential of compressed domain methods in traffic surveillance applications, where speed and data size are critical factors. △ Less

Submitted 5 August, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

Comments: 10 pages 4 figures

MSC Class: 68T20 ACM Class: E.4; I.4.5; H.3.3; I.5.4

arXiv:2203.12976 [pdf, other]

doi 10.1016/j.image.2022.116675

Focus-and-Detect: A Small Object Detection Framework for Aerial Images

Authors: Onur Can Koyun, Reyhan Kevser Keser, İbrahim Batuhan Akkaya, Behçet Uğur Töreyin

Abstract: Despite recent advances, object detection in aerial images is still a challenging task. Specific problems in aerial images makes the detection problem harder, such as small objects, densely packed objects, objects in different sizes and with different orientations. To address small object detection problem, we propose a two-stage object detection framework called "Focus-and-Detect". The first stag… ▽ More Despite recent advances, object detection in aerial images is still a challenging task. Specific problems in aerial images makes the detection problem harder, such as small objects, densely packed objects, objects in different sizes and with different orientations. To address small object detection problem, we propose a two-stage object detection framework called "Focus-and-Detect". The first stage which consists of an object detector network supervised by a Gaussian Mixture Model, generates clusters of objects constituting the focused regions. The second stage, which is also an object detector network, predicts objects within the focal regions. Incomplete Box Suppression (IBS) method is also proposed to overcome the truncation effect of region search approach. Results indicate that the proposed two-stage framework achieves an AP score of 42.06 on VisDrone validation dataset, surpassing all other state-of-the-art small object detection methods reported in the literature, to the best of authors' knowledge. △ Less

Submitted 24 March, 2022; originally announced March 2022.

Comments: 12 pages, 6 figures

Journal ref: Signal Processing: Image Communication, Volume 104, May 2022, 116675

arXiv:2202.02056 [pdf, other]

Unsupervised Behaviour Analysis of News Consumption in Turkish Media

Authors: Didem Makaroglu, Altan Cakir, Behcet Ugur Toreyin

Abstract: Clickstream data, which come with a massive volume generated by human activities on websites, have become a prominent feature for identifying readers' characteristics by newsrooms after the digitization of news outlets. Although the nature of clickstream data has a similar logic within websites, it has inherent limitations in recognizing human behaviours when looking from a broad perspective, whic… ▽ More Clickstream data, which come with a massive volume generated by human activities on websites, have become a prominent feature for identifying readers' characteristics by newsrooms after the digitization of news outlets. Although the nature of clickstream data has a similar logic within websites, it has inherent limitations in recognizing human behaviours when looking from a broad perspective, which brings the need to limit the problem in niche areas. This study investigates the anonymized readers' click activities on the organizations' websites to identify news consumption patterns following referrals from Twitter,who incidentally reach but propensity is mainly routed news content. Methodologies for ensemble cluster analysis with mixed-type embedding strategies are applied and compared to find similar reader groups and interests independent of time. Various internal validation perspectives are used to determine the optimality of the quality of clusters, where the Calinski Harabasz Index (CHI) is found to give a generalizable result. Our findings demonstrate that clustering a mixed-type dataset approaches the optimal internal validation scores, which we define to discriminate the clusters and algorithms considering applied strategies when embedded by Uniform Manifold Approximation and Projection (UMAP) and using a consensus function as a key to access the most applicable hyperparameter configurations in the given ensemble rather than using consensus function results directly. Evaluation of the resulting clusters highlights specific clusters repeatedly present in the separated monthly samples by Adjusted Mutual Information scores greater than 0.5, which provide insights to the news organizations and overcome the degradation of the modeling behaviours due to the change in the interest over time. △ Less

Submitted 8 October, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

Comments: Submitted to Big Data Research

arXiv:2105.00695 [pdf, other]

ResVGAE: Going Deeper with Residual Modules for Link Prediction

Authors: Indrit Nallbani, Reyhan Kevser Keser, Aydin Ayanzadeh, Nurullah Çalık, Behçet Uğur Töreyin

Abstract: Graph autoencoders are efficient at embedding graph-based data sets. Most graph autoencoder architectures have shallow depths which limits their ability to capture meaningful relations between nodes separated by multi-hops. In this paper, we propose Residual Variational Graph Autoencoder, ResVGAE, a deep variational graph autoencoder model with multiple residual modules. We show that our multiple… ▽ More Graph autoencoders are efficient at embedding graph-based data sets. Most graph autoencoder architectures have shallow depths which limits their ability to capture meaningful relations between nodes separated by multi-hops. In this paper, we propose Residual Variational Graph Autoencoder, ResVGAE, a deep variational graph autoencoder model with multiple residual modules. We show that our multiple residual modules, a convolutional layer with residual connection, improve the average precision of the graph autoencoders. Experimental results suggest that our proposed model with residual modules outperforms the models without residual modules and achieves similar results when compared with other state-of-the-art methods. △ Less

Submitted 4 August, 2022; v1 submitted 3 May, 2021; originally announced May 2021.

arXiv:2103.00053 [pdf, other]

doi 10.1016/j.eswa.2022.119040

PURSUhInT: In Search of Informative Hint Points Based on Layer Clustering for Knowledge Distillation

Authors: Reyhan Kevser Keser, Aydin Ayanzadeh, Omid Abdollahi Aghdam, Caglar Kilcioglu, Behcet Ugur Toreyin, Nazim Kemal Ure

Abstract: One of the most efficient methods for model compression is hint distillation, where the student model is injected with information (hints) from several different layers of the teacher model. Although the selection of hint points can drastically alter the compression performance, conventional distillation approaches overlook this fact and use the same hint points as in the early studies. Therefore,… ▽ More One of the most efficient methods for model compression is hint distillation, where the student model is injected with information (hints) from several different layers of the teacher model. Although the selection of hint points can drastically alter the compression performance, conventional distillation approaches overlook this fact and use the same hint points as in the early studies. Therefore, we propose a clustering based hint selection methodology, where the layers of teacher model are clustered with respect to several metrics and the cluster centers are used as the hint points. Our method is applicable for any student network, once it is applied on a chosen teacher network. The proposed approach is validated in CIFAR-100 and ImageNet datasets, using various teacher-student pairs and numerous hint distillation methods. Our results show that hint points selected by our algorithm results in superior compression performance compared to state-of-the-art knowledge distillation algorithms on the same student models and datasets. △ Less

Submitted 3 November, 2022; v1 submitted 26 February, 2021; originally announced March 2021.

Comments: Our codes are published on Code Ocean, where the link to our codes is: https://codeocean.com/capsule/4245746/tree/v1

Journal ref: Expert Systems with Applications, Volume 213, Part B, March 2023, 119040

arXiv:1902.01824 [pdf, other]

Deep Convolutional Generative Adversarial Networks Based Flame Detection in Video

Authors: Süleyman Aslan, Uğur Güdükbay, B. Uğur Töreyin, A. Enis Çetin

Abstract: Real-time flame detection is crucial in video based surveillance systems. We propose a vision-based method to detect flames using Deep Convolutional Generative Adversarial Neural Networks (DCGANs). Many existing supervised learning approaches using convolutional neural networks do not take temporal information into account and require substantial amount of labeled data. In order to have a robust r… ▽ More Real-time flame detection is crucial in video based surveillance systems. We propose a vision-based method to detect flames using Deep Convolutional Generative Adversarial Neural Networks (DCGANs). Many existing supervised learning approaches using convolutional neural networks do not take temporal information into account and require substantial amount of labeled data. In order to have a robust representation of sequences with and without flame, we propose a two-stage training of a DCGAN exploiting spatio-temporal flame evolution. Our training framework includes the regular training of a DCGAN with real spatio-temporal images, namely, temporal slice images, and noise vectors, and training the discriminator separately using the temporal flame images without the generator. Experimental results show that the proposed method effectively detects flame in video with negligible false positive rates in real-time. △ Less

Submitted 5 February, 2019; originally announced February 2019.

arXiv:1812.10256 [pdf, other]

doi 10.1007/s11517-021-02388-w

A Whole Slide Image Grading Benchmark and Tissue Classification for Cervical Cancer Precursor Lesions with Inter-Observer Variability

Authors: Abdulkadir Albayrak, Asli Unlu, Nurullah Calik, Abdulkerim Capar, Gokhan Bilgin, Behcet Ugur Toreyin, Bahar Muezzinoglu, Ilknur Turkmen, Lutfiye Durak-Ata

Abstract: The cervical cancer develo** from the precancerous lesions caused by the Human Papilloma Virus (HPV) has been one of the preventable cancers with the help of periodic screening. There are two types of grading conventions widely accepted among pathologists. On the other hand, inter-observer variability is an important issue for final diagnosis. In this paper, a whole-slide image grading benchmark… ▽ More The cervical cancer develo** from the precancerous lesions caused by the Human Papilloma Virus (HPV) has been one of the preventable cancers with the help of periodic screening. There are two types of grading conventions widely accepted among pathologists. On the other hand, inter-observer variability is an important issue for final diagnosis. In this paper, a whole-slide image grading benchmark for cervical cancer precursor lesions is introduced. The papillae of the cervical epithelium and overlap** cell problems are handled and a tissue classification method with a novel morphological feature exploiting the relative orientation between the BM and the major axis of all nuclei is developed and its performance is evaluated. Besides, the inter-observer variability is also revealed by a thorough comparison among pathologists' decisions, as well as, the final diagnosis. △ Less

Submitted 26 December, 2018; originally announced December 2018.

arXiv:1101.4749 [pdf, ps, other]

doi 10.1117/1.3595426

Online Adaptive Decision Fusion Framework Based on Entropic Projections onto Convex Sets with Application to Wildfire Detection in Video

Authors: Osman Gunay, Behcet Ugur Toreyin, Kivanc Kose, A. Enis Cetin

Abstract: In this paper, an Entropy functional based online Adaptive Decision Fusion (EADF) framework is developed for image analysis and computer vision applications. In this framework, it is assumed that the compound algorithm consists of several sub-algorithms each of which yielding its own decision as a real number centered around zero, representing the confidence level of that particular sub-algorithm.… ▽ More In this paper, an Entropy functional based online Adaptive Decision Fusion (EADF) framework is developed for image analysis and computer vision applications. In this framework, it is assumed that the compound algorithm consists of several sub-algorithms each of which yielding its own decision as a real number centered around zero, representing the confidence level of that particular sub-algorithm. Decision values are linearly combined with weights which are updated online according to an active fusion method based on performing entropic projections onto convex sets describing sub-algorithms. It is assumed that there is an oracle, who is usually a human operator, providing feedback to the decision fusion method. A video based wildfire detection system is developed to evaluate the performance of the algorithm in handling the problems where data arrives sequentially. In this case, the oracle is the security guard of the forest lookout tower verifying the decision of the combined algorithm. Simulation results are presented. The EADF framework is also tested with a standard dataset. △ Less

Submitted 25 January, 2011; originally announced January 2011.

Comments: 10 pages, 7 figures

Showing 1–9 of 9 results for author: Toreyin, B U