
Showing 1–28 of 28 results for author: Konushin, A

Searching in archive cs.
  1. arXiv:2406.15020

    cs.CV

    A3D: Does Diffusion Dream about 3D Alignment?

    Authors: Savva Ignatyev, Nina Konovalova, Daniil Selikhanovych, Nikolay Patakin, Oleg Voynov, Dmitry Senushkin, Alexander Filippov, Anton Konushin, Peter Wonka, Evgeny Burnaev

    Abstract: We tackle the problem of text-driven 3D generation from a geometry alignment perspective. We aim at the generation of multiple objects which are consistent in terms of semantics and geometry. Recent methods based on Score Distillation have succeeded in distilling the knowledge from 2D diffusion models to high-quality objects represented by 3D neural radiance fields. These methods handle multiple t…

    Submitted 21 June, 2024; originally announced June 2024.

  2. arXiv:2404.16718

    eess.IV cs.AI cs.CV cs.LG

    Features Fusion for Dual-View Mammography Mass Detection

    Authors: Arina Varlamova, Valery Belotsky, Grigory Novikov, Anton Konushin, Evgeny Sidorov

    Abstract: Detection of malignant lesions on mammography images is extremely important for early breast cancer diagnosis. In clinical practice, images are acquired from two different angles, and radiologists can fully utilize information from both views, simultaneously locating the same lesion. However, for automatic detection approaches such information fusion remains a challenge. In this paper, we propose…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted at ISBI 2024 (21st IEEE International Symposium on Biomedical Imaging)

  3. TETRIS: Towards Exploring the Robustness of Interactive Segmentation

    Authors: Andrey Moskalenko, Vlad Shakhuro, Anna Vorontsova, Anton Konushin, Anton Antonov, Alexander Krapukhin, Denis Shepelev, Konstantin Soshin

    Abstract: Interactive segmentation methods rely on user inputs to iteratively update the selection mask. A click specifying the object of interest is arguably the simplest and most intuitive interaction type, and thereby the most common choice for interactive segmentation. However, user clicking patterns in the interactive segmentation context remain unexplored. Accordingly, interactive segmentation evaluatio…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI2024

    MSC Class: 68T45; ACM Class: I.4.6

  4. arXiv:2311.14405

    cs.CV

    OneFormer3D: One Transformer for Unified Point Cloud Segmentation

    Authors: Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich

    Abstract: Semantic, instance, and panoptic segmentation of 3D point clouds have been addressed using task-specific models of distinct design. Thereby, the similarity of all segmentation tasks and the implicit relationship between them have not been utilized effectively. This paper presents a unified, simple, and effective model addressing all these tasks jointly. The model, named OneFormer3D, performs insta…

    Submitted 24 November, 2023; originally announced November 2023.

  5. arXiv:2306.02878

    cs.CV

    Single-Stage 3D Geometry-Preserving Depth Estimation Model Training on Dataset Mixtures with Uncalibrated Stereo Data

    Authors: Nikolay Patakin, Mikhail Romanov, Anna Vorontsova, Mikhail Artemyev, Anton Konushin

    Abstract: Nowadays, robotics, AR, and 3D modeling applications attract considerable attention to single-view depth estimation (SVDE) as it allows estimating scene geometry from a single RGB image. Recent works have demonstrated that the accuracy of an SVDE method hugely depends on the diversity and volume of the training data. However, RGB-D datasets obtained via depth capturing or 3D reconstruction are typ…

    Submitted 5 June, 2023; originally announced June 2023.

    Journal ref: CVPR 2022

  6. arXiv:2305.19000

    cs.CV cs.LG

    Independent Component Alignment for Multi-Task Learning

    Authors: Dmitry Senushkin, Nikolay Patakin, Arseny Kuznetsov, Anton Konushin

    Abstract: In a multi-task learning (MTL) setting, a single model is trained to tackle a diverse set of tasks jointly. Despite rapid progress in the field, MTL remains challenging due to optimization issues such as conflicting and dominating gradients. In this work, we propose using a condition number of a linear system of gradients as a stability criterion of an MTL optimization. We theoretically demonstrat…

    Submitted 30 May, 2023; originally announced May 2023.

    Journal ref: CVPR2023

  7. arXiv:2302.06353

    cs.CV

    Contour-based Interactive Segmentation

    Authors: Danil Galeev, Polina Popenova, Anna Vorontsova, Anton Konushin

    Abstract: Recent advances in interactive segmentation (IS) allow speeding up and simplifying image editing and labeling greatly. The majority of modern IS approaches accept user input in the form of clicks. However, using clicks may require too many user interactions, especially when selecting small objects, minor parts of an object, or a group of objects of the same type. In this paper, we consider such a…

    Submitted 5 December, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    ACM Class: I.4.6

  8. arXiv:2302.02871

    cs.CV

    Top-Down Beats Bottom-Up in 3D Instance Segmentation

    Authors: Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich

    Abstract: Most 3D instance segmentation methods exploit a bottom-up strategy, typically including resource-exhaustive post-processing. For point grouping, bottom-up methods rely on prior assumptions about the objects in the form of hyperparameters, which are domain-specific and need to be carefully tuned. On the contrary, we address 3D instance segmentation with a TD3D: the pioneering cluster-free, fully-co…

    Submitted 11 September, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

  9. arXiv:2302.02858

    cs.CV

    TR3D: Towards Real-Time Indoor 3D Object Detection

    Authors: Danila Rukhovich, Anna Vorontsova, Anton Konushin

    Abstract: Recently, sparse 3D convolutions have changed 3D object detection. Performing on par with the voting-based approaches, 3D CNNs are memory-efficient and scale to large scenes better. However, there is still room for improvement. With a conscious, practice-oriented approach to problem-solving, we analyze the performance of such methods and localize the weaknesses. Applying modifications that resolve…

    Submitted 5 December, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

  10. arXiv:2210.04572

    cs.CV

    Floorplan-Aware Camera Poses Refinement

    Authors: Anna Sokolova, Filipp Nikitin, Anna Vorontsova, Anton Konushin

    Abstract: Processing large indoor scenes is a challenging task, as scan registration and camera trajectory estimation methods accumulate errors across time. As a result, the quality of reconstructed scans is insufficient for some applications, such as visual-based localization and navigation, where the correct position of walls is crucial. For many indoor scenes, there exists an image of a technical floor…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: IROS 2022

  11. arXiv:2112.00322

    cs.CV

    FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection

    Authors: Danila Rukhovich, Anna Vorontsova, Anton Konushin

    Abstract: Recently, promising applications in robotics and augmented reality have attracted considerable attention to 3D object detection from point clouds. In this paper, we present FCAF3D - a first-in-class fully convolutional anchor-free indoor 3D object detection method. It is a simple yet effective method that uses a voxel representation of a point cloud and processes voxels with sparse convolutions. F…

    Submitted 24 March, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

  12. arXiv:2106.01178

    cs.CV

    ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

    Authors: Danila Rukhovich, Anna Vorontsova, Anton Konushin

    Abstract: In this paper, we introduce the task of multi-view RGB-based 3D object detection as an end-to-end optimization problem. To address this problem, we propose ImVoxelNet, a novel fully convolutional method of 3D object detection based on monocular or multi-view RGB images. The number of monocular images in each multi-view input can vary during training and inference; actually, this number might be…

    Submitted 15 October, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

  13. arXiv:2102.06583

    cs.CV

    Reviving Iterative Training with Mask Guidance for Interactive Segmentation

    Authors: Konstantin Sofiiuk, Ilia A. Petrov, Anton Konushin

    Abstract: Recent works on click-based interactive segmentation have demonstrated state-of-the-art results by using various inference-time optimization schemes. These methods are considerably more computationally expensive compared to feedforward approaches, as they require performing backward passes through a network during inference and are hard to deploy on mobile frameworks that usually support only forw…

    Submitted 12 February, 2021; originally announced February 2021.

  14. arXiv:2101.04927

    cs.CV

    Road images augmentation with synthetic traffic signs using neural networks

    Authors: Anton Konushin, Boris Faizov, Vlad Shakhuro

    Abstract: Traffic sign recognition is a well-researched problem in computer vision. However, state-of-the-art methods work only for frequent sign classes, which are well represented in training datasets. We consider the task of rare traffic sign detection and classification. We aim to solve that problem by using synthetic training data. Such training data is obtained by embedding synthetic images of si…

    Submitted 13 January, 2021; originally announced January 2021.

    Comments: The paper was submitted to the journal "Computer Optics" and is currently under review

  15. arXiv:2009.12419

    cs.CV cs.LG eess.IV

    Towards General Purpose Geometry-Preserving Single-View Depth Estimation

    Authors: Mikhail Romanov, Nikolay Patatkin, Anna Vorontsova, Sergey Nikolenko, Anton Konushin, Dmitry Senyushkin

    Abstract: Single-view depth estimation (SVDE) plays a crucial role in scene understanding for AR applications, 3D modeling, and robotics, providing the geometry of a scene based on a single image. Recent works have shown that a successful solution strongly relies on the diversity and volume of training data. This data can be sourced from stereo movies and photos. However, they do not provide geometrically c…

    Submitted 9 February, 2021; v1 submitted 25 September, 2020; originally announced September 2020.

  16. arXiv:2006.10451

    cs.CV

    Learning High-Resolution Domain-Specific Representations with a GAN Generator

    Authors: Danil Galeev, Konstantin Sofiiuk, Danila Rukhovich, Mikhail Romanov, Olga Barinova, Anton Konushin

    Abstract: In recent years, generative models of visual data have made great progress, and now they are able to produce images of high quality and diversity. In this work we study representations learnt by a GAN generator. First, we show that these representations can be easily projected onto a semantic segmentation map using a lightweight decoder. We find that such semantic projection can be learnt from just…

    Submitted 18 June, 2020; originally announced June 2020.

  17. arXiv:2006.00809

    cs.CV

    Foreground-aware Semantic Representations for Image Harmonization

    Authors: Konstantin Sofiiuk, Polina Popenova, Anton Konushin

    Abstract: Image harmonization is an important step in photo editing to achieve visual consistency in composite images by adjusting the appearances of foreground to make it compatible with background. Previous approaches to harmonize composites are based on training of encoder-decoder networks from scratch, which makes it challenging for a neural network to learn a high-level representation of objects. We pr…

    Submitted 1 June, 2020; originally announced June 2020.

  18. arXiv:2005.08607

    cs.CV

    Decoder Modulation for Indoor Depth Completion

    Authors: Dmitry Senushkin, Mikhail Romanov, Ilia Belikov, Anton Konushin, Nikolay Patakin

    Abstract: Depth completion recovers a dense depth map from sensor measurements. Current methods are mostly tailored for very sparse depth measurements from LiDARs in outdoor settings, while for indoor scenes Time-of-Flight (ToF) or structured light sensors are mostly used. These sensors provide semi-dense maps, with dense measurements in some regions and almost empty in others. We propose a new model that t…

    Submitted 8 February, 2021; v1 submitted 18 May, 2020; originally announced May 2020.

  19. arXiv:2005.05708

    cs.CV

    IterDet: Iterative Scheme for Object Detection in Crowded Environments

    Authors: Danila Rukhovich, Konstantin Sofiiuk, Danil Galeev, Olga Barinova, Anton Konushin

    Abstract: Deep learning-based detectors usually produce a redundant set of object bounding boxes including many duplicate detections of the same object. These boxes are then filtered using non-maximum suppression (NMS) in order to select exactly one bounding box per object of interest. This greedy scheme is simple and provides sufficient accuracy for isolated objects but often fails in crowded environments,…

    Submitted 29 January, 2021; v1 submitted 12 May, 2020; originally announced May 2020.

  20. arXiv:2001.10331

    cs.CV

    f-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation

    Authors: Konstantin Sofiiuk, Ilia Petrov, Olga Barinova, Anton Konushin

    Abstract: Deep neural networks have become a mainstream approach to interactive segmentation. As we show in our experiments, while for some images a trained network provides an accurate segmentation result with just a few clicks, for some unknown objects it cannot achieve a satisfactory result even with a large amount of user input. Recently proposed backpropagating refinement (BRS) scheme introduces an optimiza…

    Submitted 25 August, 2020; v1 submitted 28 January, 2020; originally announced January 2020.

  21. arXiv:1912.05405

    cs.CV

    Training Deep SLAM on Single Frames

    Authors: Igor Slinko, Anna Vorontsova, Dmitry Zhukov, Olga Barinova, Anton Konushin

    Abstract: Learning-based visual odometry and SLAM methods demonstrate a steady improvement over the past years. However, collecting ground truth poses to train these methods is difficult and expensive. This could be resolved by training in an unsupervised mode, but there is still a large gap between the performance of unsupervised and supervised methods. In this work, we focus on generating synthetic data for deep…

    Submitted 11 December, 2019; originally announced December 2019.

  22. arXiv:1910.04755

    cs.CV cs.RO

    Measuring robustness of Visual SLAM

    Authors: David Prokhorov, Dmitry Zhukov, Olga Barinova, Anna Vorontsova, Anton Konushin

    Abstract: Simultaneous localization and mapping (SLAM) is an essential component of robotic systems. In this work we perform a feasibility study of RGB-D SLAM for the task of indoor robot navigation. Recent visual SLAM methods, e.g. ORBSLAM2 \cite{mur2017orb}, demonstrate really impressive accuracy, but the experiments in the papers are usually conducted on just a few sequences, which makes it difficult to r…

    Submitted 10 October, 2019; originally announced October 2019.

  23. arXiv:1909.12146

    cs.CV

    DISCOMAN: Dataset of Indoor SCenes for Odometry, Mapping And Navigation

    Authors: Pavel Kirsanov, Airat Gaskarov, Filipp Konokhov, Konstantin Sofiiuk, Anna Vorontsova, Igor Slinko, Dmitry Zhukov, Sergey Bykov, Olga Barinova, Anton Konushin

    Abstract: We present a novel dataset for training and benchmarking semantic SLAM methods. The dataset consists of 200 long sequences, each one containing 3000-5000 data frames. We generate the sequences using realistic home layouts. For that we sample trajectories that simulate motions of a simple home robot, and then render the frames along the trajectories. Each data frame contains a) RGB images generated…

    Submitted 26 September, 2019; originally announced September 2019.

    Comments: 8 pages, 7 figures

  24. arXiv:1909.07829

    cs.CV

    AdaptIS: Adaptive Instance Selection Network

    Authors: Konstantin Sofiiuk, Olga Barinova, Anton Konushin

    Abstract: We present Adaptive Instance Selection network architecture for class-agnostic instance segmentation. Given an input image and a point $(x, y)$, it generates a mask for the object located at $(x, y)$. The network adapts to the input point with the help of AdaIN layers, thus producing different masks for different objects on the same image. AdaptIS generates pixel-accurate object masks, therefore it…

    Submitted 17 September, 2019; originally announced September 2019.

    Comments: Accepted at ICCV 2019

  25. Perceptual Image Anomaly Detection

    Authors: Nina Tuluptceva, Bart Bakker, Irina Fedulova, Anton Konushin

    Abstract: We present a novel method for image anomaly detection, where algorithms that use samples drawn from some distribution of "normal" data aim to detect out-of-distribution (abnormal) samples. Our approach includes a combination of encoder and generator for mapping an image distribution to a predefined latent distribution and vice versa. It leverages Generative Adversarial Networks to learn these dat…

    Submitted 28 February, 2020; v1 submitted 12 September, 2019; originally announced September 2019.

    Comments: The final authenticated publication is available online at https://doi.org/10.1007/978-3-030-41404-7_12

    Journal ref: In: Palaiahnakote S., Sanniti di Baja G., Wang L., Yan W. (eds) Pattern Recognition. ACPR 2019. Lecture Notes in Computer Science, vol 12046. Springer, Cham

  26. arXiv:1907.07227

    cs.CV

    Scene Motion Decomposition for Learnable Visual Odometry

    Authors: Igor Slinko, Anna Vorontsova, Filipp Konokhov, Olga Barinova, Anton Konushin

    Abstract: Optical Flow (OF) and depth are commonly used for visual odometry since they provide sufficient information about camera ego-motion in a rigid scene. We reformulate the problem of ego-motion estimation as a problem of motion estimation of a 3D-scene with respect to a static camera. The entire scene motion can be represented as a combination of motions of its visible points. Using OF and depth we e…

    Submitted 16 July, 2019; originally announced July 2019.

    Journal ref: CVPR 2019 Workshop

  27. Double Refinement Network for Efficient Indoor Monocular Depth Estimation

    Authors: Nikita Durasov, Mikhail Romanov, Valeriya Bubnova, Pavel Bogomolov, Anton Konushin

    Abstract: Monocular depth estimation is the task of obtaining a measure of distance for each pixel using a single image. It is an important problem in computer vision and is usually solved using neural networks. Though recent works in this area have shown significant improvement in accuracy, the state-of-the-art methods tend to require massive amounts of memory and time to process an image. The main purpose…

    Submitted 4 April, 2019; v1 submitted 20 November, 2018; originally announced November 2018.

    Journal ref: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

  28. arXiv:1710.06512

    cs.CV

    Pose-based Deep Gait Recognition

    Authors: Anna Sokolova, Anton Konushin

    Abstract: Human gait or walking manner is a biometric feature that allows identification of a person when other biometric features such as the face or iris are not visible. In this paper, we present a new pose-based convolutional neural network model for gait recognition. Unlike many methods that consider the full-height silhouette of a moving person, we consider the motion of points in the areas around hum…

    Submitted 8 February, 2018; v1 submitted 17 October, 2017; originally announced October 2017.