
arXiv:2406.19814 [pdf, other]

Extract More from Less: Efficient Fine-Grained Visual Recognition in Low-Data Regimes

Authors: Dmitry Demidov, Abduragim Shtanchaev, Mihail Mihaylov, Mohammad Almansoori

Abstract: The emerging task of fine-grained image classification in low-data regimes assumes the presence of low inter-class variance and large intra-class variation along with a highly limited amount of training samples per class. However, traditional ways of separately dealing with fine-grained categorisation and extremely scarce data may be inefficient under both these harsh conditions presented together. In this paper, we present a novel framework, called AD-Net, aiming to enhance deep neural network performance on this challenge by leveraging the power of Augmentation and Distillation techniques. Specifically, our approach is designed to refine learned features through self-distillation on augmented samples, mitigating harmful overfitting. We conduct comprehensive experiments on popular fine-grained image classification benchmarks where our AD-Net demonstrates consistent improvement over traditional fine-tuning and state-of-the-art low-data techniques. Remarkably, with the smallest data available, our framework shows an outstanding relative accuracy increase of up to 45% compared to standard ResNet-50 and up to 27% compared to the closest SOTA runner-up. We emphasise that our approach is practically architecture-independent and adds zero extra cost at inference time. Additionally, we provide an extensive study on the impact of every framework's component, highlighting the importance of each in achieving optimal performance. Source code and trained models are publicly available at github.com/demidovd98/fgic_lowd.

Submitted 28 June, 2024; originally announced June 2024.

Comments: Main paper and Appendices
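
The abstract above mentions self-distillation on augmented samples. As a rough illustration of what a distillation term can look like in general, here is a minimal sketch of the common temperature-softened KL-divergence loss between teacher and student logits. It is not taken from AD-Net: the function names `softmax` and `distillation_loss`, the temperature value, and the choice of KL divergence are all assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Hypothetical helper for illustration only; the T^2 factor is the usual
    scaling that keeps gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher probabilities
    q = softmax(student_logits, temperature)  # student probabilities
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# In a self-distillation setup the "teacher" and "student" could be the same
# network evaluated on two differently augmented views of one image.
view_a = [2.0, 0.5, -1.0]  # made-up logits for one augmented view
view_b = [1.5, 0.7, -0.8]  # made-up logits for another view
loss = distillation_loss(view_b, view_a)
```

The loss is zero when both views produce identical logits and grows as the two predicted distributions diverge.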

arXiv:2401.01164 [pdf, other]

doi 10.1007/978-3-031-45676-3_36

Distilling Local Texture Features for Colorectal Tissue Classification in Low Data Regimes

Authors: Dmitry Demidov, Roba Al Majzoub, Amandeep Kumar, Fahad Khan

Abstract: Multi-class colorectal tissue classification is a challenging problem that is typically addressed in a setting where it is assumed that ample training data is available. However, manual annotation of fine-grained colorectal tissue samples of multiple classes, especially the rare ones like stromal tumor and anal cancer, is laborious and expensive. To address this, we propose a knowledge distillation-based approach, named KD-CTCNet, that effectively captures local texture information from few tissue samples, through a distillation loss, to improve the standard CNN features. The resulting enriched feature representation achieves improved classification performance specifically in low data regimes. Extensive experiments on two public datasets of colorectal tissues reveal the merits of the proposed contributions, with a consistent gain achieved over different approaches across low data settings. The code and models are publicly available on GitHub.

Submitted 2 January, 2024; originally announced January 2024.

Journal ref: Machine Learning in Medical Imaging (MLMI) 2023

arXiv:2305.07102 [pdf, other]

doi 10.5220/0011611100003417

Salient Mask-Guided Vision Transformer for Fine-Grained Classification

Authors: Dmitry Demidov, Muhammad Hamza Sharif, Aliakbar Abdurahimov, Hisham Cholakkal, Fahad Shahbaz Khan

Abstract: Fine-grained visual classification (FGVC) is a challenging computer vision problem, where the task is to automatically recognise objects from subordinate categories. One of its main difficulties is capturing the most discriminative inter-class variances among visually similar classes. Recently, methods with Vision Transformer (ViT) have demonstrated noticeable achievements in FGVC, generally by employing the self-attention mechanism with additional resource-consuming techniques to distinguish potentially discriminative regions while disregarding the rest. However, such approaches may struggle to effectively focus on truly discriminative regions due to only relying on the inherent self-attention mechanism, resulting in the classification token likely aggregating global information from less-important background patches. Moreover, due to the immense lack of datapoints, classifiers may fail to find the most helpful inter-class distinguishing features, since other unrelated but distinctive background regions may be falsely recognised as being valuable. To this end, we introduce a simple yet effective Salient Mask-Guided Vision Transformer (SM-ViT), where the discriminability of the standard ViT's attention maps is boosted through salient masking of potentially discriminative foreground regions. Extensive experiments demonstrate that with the standard training procedure our SM-ViT achieves state-of-the-art performance on popular FGVC benchmarks among existing ViT-based approaches while requiring fewer resources and lower input image resolution.

Submitted 11 May, 2023; originally announced May 2023.

Comments: Accepted by VISAPP 2023 (Best Student Paper Award)

Journal ref: VISAPP 2023

arXiv:2211.15479 [pdf, other]

Object Detection in Aerial Imagery

Authors: Dmitry Demidov, Rushali Grandhe, Salem AlMarri

Abstract: Object detection in natural images has achieved remarkable results over the years. However, similar progress has not yet been observed in aerial object detection due to several challenges, such as high-resolution images, instance scale variation, and class imbalance. We show the performance of two-stage, one-stage and attention-based object detectors on the iSAID dataset. Furthermore, we describe the modifications and analysis performed for the different models: a) in the two-stage detector, we introduced a weighted attention-based FPN, a class-balanced sampler and a density prediction head; b) in the one-stage detector, we used weighted focal loss and introduced an FPN; c) in the attention-based detector, we compare single- and multi-scale attention and demonstrate the effect of different backbones. Finally, we show a comparative study highlighting the pros and cons of different models in the aerial imagery setting.

Submitted 15 November, 2022; originally announced November 2022.

Comments: Technical report

arXiv:2202.09056 [pdf, other]

Efficient solution of 3D elasticity problems with smoothed aggregation algebraic multigrid and block arithmetics

Authors: Denis Demidov

Abstract: Efficient solution of 3D elasticity problems is an important part of many industrial and scientific applications. Smoothed aggregation algebraic multigrid using rigid body modes for the tentative prolongation operator construction is an efficient and robust choice for the solution of linear systems arising from the discretization of elasticity equations. The system matrices on every level of the multigrid hierarchy have block structure, so using block representation and block arithmetics should significantly improve the solver efficiency. However, the tentative prolongation operator construction may only be done using scalar representation. The paper proposes a couple of practical approaches for enabling the use of block arithmetics with smoothed aggregation algebraic multigrid based on the open-source AMGCL library. It is shown on the example of two real-world model problems that the suggested improvements may speed up the solution by 50% and reduce the memory requirements for the preconditioner by 30%. The implementation is straightforward and only requires a minimal amount of code.

Submitted 18 February, 2022; originally announced February 2022.

Comments: 10 pages, 2 figures

MSC Class: 35-04; 65-04; 65Y05; 65Y10; 65Y15; 97N80

arXiv:2108.02054 [pdf, ps, other]

Partial Reuse AMG Setup Cost Amortization Strategy for the Solution of Non-Steady State Problems

Authors: D. E. Demidov

Abstract: The partial reuse algebraic multigrid (AMG) setup cost amortization strategy is presented for the solution of non-steady state problems. The transfer operators are reused from the previous time steps, and the system matrices and the smoother operators are rebuilt on each of the AMG hierarchy levels. It is shown on the example of modelling a two-fluid dam break scenario that the strategy may decrease the AMG preconditioner setup cost by 40% to 200%. The total compute time is decreased by up to 20%, but the specific outcome depends on the fraction of time that the setup step initially takes.

Submitted 4 August, 2021; originally announced August 2021.

MSC Class: 35-04; 65-04; 65Y05; 65Y10; 65Y15; 97N80

arXiv:2006.06052 [pdf, other]

doi 10.1016/j.jocs.2020.101285

Accelerating linear solvers for Stokes problems with C++ metaprogramming

Authors: Denis Demidov, Lin Mu, Bin Wang

Abstract: The efficient solution of large sparse saddle point systems is very important in computational fluid mechanics. The discontinuous Galerkin finite element methods have become increasingly popular for incompressible flow problems but their application is limited due to high computational cost. We describe the C++ programming techniques that may help to accelerate linear solvers for such problems. The approach is based on the policy-based design pattern and partial template specialization, and is implemented in the open source AMGCL library. The efficiency is demonstrated with the example of accelerating an iterative solver of a discontinuous Galerkin finite element method for the Stokes problem. The implementation allows selecting algorithmic components of the solver by adjusting template parameters without any changes to the codebase. It is possible to switch the system matrix to use small statically sized blocks to store the nonzero values, or use a mixed precision solution, which results in up to 4 times speedup, and reduces the memory footprint of the algorithm by about 40%. We evaluate both monolithic and composite preconditioning strategies for the 3 benchmark problems. The performance of the proposed solution is compared with a multithreaded direct Pardiso solver and a parallel iterative PETSc solver.

Submitted 22 December, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

MSC Class: 35-04; 65-04; 65Y05; 65Y10; 65Y15; 97N80

arXiv:1902.07920 [pdf, other]

doi 10.1016/j.physa.2019.123199

What is the central bank of Wikipedia?

Authors: Denis Demidov, Klaus M. Frahm, Dima L. Shepelyansky

Abstract: We analyze the influence and interactions of 60 largest world banks for 195 world countries using the reduced Google matrix algorithm for the English Wikipedia network with 5 416 537 articles. While the top asset rank positions are taken by the banks of China, with China Industrial and Commercial Bank of China at the first place, we show that the network influence is dominated by USA banks with Goldman Sachs being the central bank. We determine the network structure of interactions of banks and countries and PageRank sensitivity of countries to selected banks. We also present GPU oriented code which significantly accelerates the numerical computations of reduced Google matrix.

Submitted 21 February, 2019; originally announced February 2019.

Journal ref: Physica A 542, 123199 (2020)
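
The entry above relies on the Google matrix and PageRank. For orientation, the following is a minimal sketch of ordinary PageRank by power iteration on a tiny hand-made link graph. It does not implement the reduced Google matrix algorithm the abstract refers to, and it is unrelated to the paper's GPU code; the graph, the damping factor `alpha = 0.85`, and the function name `pagerank` are assumptions chosen for this example.

```python
def pagerank(links, alpha=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for G = alpha * S + (1 - alpha) / N.

    `links` maps each node to the list of nodes it links to; every link
    target must also appear as a key. Nodes without outgoing links
    (dangling nodes) spread their rank uniformly over all nodes.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        dangling = sum(rank[v] for v in nodes if not links[v])
        # Teleportation term plus the uniform share from dangling nodes.
        new = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        for v in nodes:
            for w in links[v]:
                new[w] += alpha * rank[v] / len(links[v])
        if sum(abs(new[v] - rank[v]) for v in nodes) < tol:
            return new
        rank = new
    return rank


# Toy three-page graph (hypothetical): A links to B and C, B to C, C to A.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
toy_rank = pagerank(toy_links)
```

In this toy graph page C receives links from both A and B, so it ends up ranked above B; the ranks always sum to one.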

arXiv:1811.05704 [pdf, other]

doi 10.1134/S1995080219050056

AMGCL: an Efficient, Flexible, and Extensible Algebraic Multigrid Implementation

Authors: Denis Demidov

Abstract: The paper presents AMGCL, an open-source C++ library implementing the algebraic multigrid method (AMG) for solution of large sparse linear systems of equations, usually arising from discretization of partial differential equations on an unstructured grid. The library supports both shared and distributed memory computation, can utilize modern massively parallel processors via OpenMP, OpenCL, or CUDA technologies, has minimal dependencies, and is easily extensible. The design principles behind AMGCL are discussed and it is shown that the code performance is on par with alternative implementations.

Submitted 14 November, 2018; originally announced November 2018.

MSC Class: 35-04; 65-04; 65Y05; 65Y10; 65Y15; 97N80

arXiv:1710.03940 [pdf, other]

doi 10.1134/S1995080220040071

Subdomain Deflation Combined with Local AMG: a Case Study Using AMGCL Library

Authors: Denis Demidov, Riccardo Rossi

Abstract: The paper proposes a combination of the subdomain deflation method and local algebraic multigrid as a scalable distributed memory preconditioner that is able to solve large linear systems of equations. The implementation of the algorithm is made available for the community as part of an open source AMGCL library. The solution targets both homogeneous (CPU-only) and heterogeneous (CPU/GPU) systems, employing hybrid MPI/OpenMP approach in the former and a combination of MPI, OpenMP, and CUDA in the latter cases. The use of OpenMP minimizes the number of MPI processes, thus reducing the communication overhead of the deflation method and improving both weak and strong scalability of the preconditioner. The examples of scalar, Poisson-like, systems as well as non-scalar problems, stemming out of the discretization of the Navier-Stokes equations, are considered in order to estimate performance of the implemented algorithm. A comparison with a traditional global AMG preconditioner based on a well-established Trilinos ML package is provided.

Submitted 26 October, 2018; v1 submitted 11 October, 2017; originally announced October 2017.

Comments: 21 pages, 7 figures

ACM Class: D.1.3; G.1.0; G.1.8

arXiv:1704.05392 [pdf]

Synergy of all-purpose static solver and temporal reasoning tools in dynamic integrated expert systems

Authors: Galina Rybina, Alexey Mozgachev, Dmitry Demidov

Abstract: The paper discusses scientific and technological problems of dynamic integrated expert systems development. Extensions of problem-oriented methodology for dynamic integrated expert systems development are considered. Attention is paid to the temporal knowledge representation and processing.

Submitted 16 April, 2017; originally announced April 2017.

Comments: 8 pages, 3 figures

Journal ref: "Informatsionno-izmeritelnye i upravlyayushchie sistemy" (Information-measuring and Control Systems) no.8, vol.12, 2014. pp 27-33. ISSN 2070-0814

arXiv:1212.6326 [pdf, other]

doi 10.1137/120903683

Programming CUDA and OpenCL: A Case Study Using Modern C++ Libraries

Authors: Denis Demidov, Karsten Ahnert, Karl Rupp, Peter Gottschling

Abstract: We present a comparison of several modern C++ libraries providing high-level interfaces for programming multi- and many-core architectures on top of CUDA or OpenCL. The comparison focuses on the solution of ordinary differential equations and is based on odeint, a framework for the solution of systems of ordinary differential equations. Odeint is designed in a very flexible way and may be easily adapted for effective use of libraries such as Thrust, MTL4, VexCL, or ViennaCL, using CUDA or OpenCL technologies. We found that CUDA and OpenCL work equally well for problems of large sizes, while OpenCL has higher overhead for smaller problems. Furthermore, we show that modern high-level libraries allow to effectively use the computational resources of many-core GPUs or multi-core CPUs without much knowledge of the underlying technologies.

Submitted 26 April, 2013; v1 submitted 27 December, 2012; originally announced December 2012.

Comments: 21 pages, 4 figures, submitted to SIAM Journal of Scientific Computing and accepted

Showing 1–12 of 12 results for author: Demidov, D