Search | arXiv e-print repository

Visual place recognition for aerial imagery: A survey

Authors: Ivan Moskalenko, Anastasiia Kornilova, Gonzalo Ferrer

Abstract: Aerial imagery and its direct application to visual localization is an essential problem for many Robotics and Computer Vision tasks. While Global Navigation Satellite Systems (GNSS) are the standard default solution for the aerial localization problem, they are subject to a number of limitations, such as signal instability or solution unreliability, that make this option less desirable. Consequently, visual geolocalization is emerging as a viable alternative. However, adapting the Visual Place Recognition (VPR) task to aerial imagery presents significant challenges, including weather variations and repetitive patterns. Current VPR reviews largely neglect the specific context of aerial data. This paper introduces a methodology tailored for evaluating VPR techniques specifically in the domain of aerial imagery, providing a comprehensive assessment of various methods and their performance. We not only compare various VPR methods, but also demonstrate the importance of selecting appropriate zoom and overlap levels when constructing map tiles to achieve maximum efficiency of VPR algorithms in the case of aerial imagery. The code is available on our GitHub repository: https://github.com/prime-slam/aero-vloc.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2403.17550 [pdf, other]

DeepMIF: Deep Monotonic Implicit Fields for Large-Scale LiDAR 3D Mapping

Authors: Kutay Yılmaz, Matthias Nießner, Anastasiia Kornilova, Alexey Artemov

Abstract: Recently, significant progress has been achieved in sensing real large-scale outdoor 3D environments, particularly by using modern acquisition equipment such as LiDAR sensors. Unfortunately, they are fundamentally limited in their ability to produce dense, complete 3D scenes. To address this issue, recent learning-based methods integrate neural implicit representations and optimizable feature grids to approximate surfaces of 3D scenes. However, naively fitting samples along raw LiDAR rays leads to noisy 3D mapping results due to the nature of sparse, conflicting LiDAR measurements. In this work, we depart from fitting LiDAR data exactly, instead letting the network optimize a non-metric monotonic implicit field defined in 3D space. To fit our field, we design a learning system integrating a monotonicity loss that enables optimizing neural monotonic fields and leverages recent progress in large-scale 3D mapping. Our algorithm achieves high-quality dense 3D mapping performance as captured by multiple quantitative and perceptual measures and visual results obtained for Mai City, Newer College, and KITTI benchmarks. The code of our approach will be made publicly available.

Submitted 26 March, 2024; originally announced March 2024.

Comments: 8 pages, 6 figures

arXiv:2403.16318 [pdf, other]

AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans

Authors: Cedric Perauer, Laurenz Adrian Heidrich, Haifan Zhang, Matthias Nießner, Anastasiia Kornilova, Alexey Artemov

Abstract: Recently, progress in acquisition equipment such as LiDAR sensors has enabled sensing increasingly spacious outdoor 3D environments. Making sense of such 3D acquisitions requires fine-grained scene understanding, such as constructing instance-based 3D scene segmentations. Commonly, a neural network is trained for this task; however, this requires access to a large, densely annotated dataset, which is widely known to be challenging to obtain. To address this issue, in this work we propose to predict instance segmentations for 3D scenes in an unsupervised way, without relying on ground-truth annotations. To this end, we construct a learning framework consisting of two components: (1) a pseudo-annotation scheme for generating initial unsupervised pseudo-labels; and (2) a self-training algorithm for instance segmentation to fit robust, accurate instances from initial noisy proposals. To enable generating 3D instance mask proposals, we construct a weighted proxy-graph by connecting 3D points with edges integrating multi-modal image- and point-based self-supervised features, and perform graph-cuts to isolate individual pseudo-instances. We then build on a state-of-the-art point-based architecture and train a 3D instance segmentation model, resulting in significant refinement of initial proposals. To scale to 3D scenes of arbitrary complexity, we design our algorithm to operate on local 3D point chunks and construct a merging step to generate scene-level instance segmentations. Experiments on the challenging SemanticKITTI benchmark demonstrate the potential of our approach, where it attains 13.3% higher Average Precision and 9.1% higher F1 score compared to the best-performing baseline. The code will be made publicly available at https://github.com/artonson/autoinst.

Submitted 24 March, 2024; originally announced March 2024.

Comments: 9 pages, 7 figures

ACM Class: I.4.6; I.2.9

arXiv:2401.11438 [pdf, ps, other]

Tensor force impact on shell evolution in neutron-rich Si and Ni isotopes

Authors: S. V. Sidorov, A. S. Kornilova, T. Yu. Tretyakova

Abstract: The influence of the tensor interaction of nucleons on the characteristics of neutron-rich silicon and nickel isotopes was studied in this work. Tensor forces are taken into account within the framework of the Hartree-Fock approach with the Skyrme interaction. It is shown that the addition of the tensor component of the interaction improves the description of the splitting between different single-particle states and leads to a decrease in nucleon-nucleon pairing correlations in silicon and nickel nuclei. Special attention was given to the role of isovector tensor forces relevant for the interaction of like nucleons.

Submitted 21 January, 2024; originally announced January 2024.

Comments: 13 pages, 7 figures, accepted in Chinese Physics C

arXiv:2308.09066 [pdf, ps, other]

Uplift Modeling: from Causal Inference to Personalization

Authors: Felipe Moraes, Hugo Manuel Proença, Anastasiia Kornilova, Javier Albert, Dmitri Goldenberg

Abstract: Uplift modeling is a collection of machine learning techniques for estimating causal effects of a treatment at the individual or subgroup levels. Over the last years, causality and uplift modeling have become key trends in personalization at online e-commerce platforms, enabling the selection of the best treatment for each user in order to maximize the target business metric. Uplift modeling can be particularly useful for personalized promotional campaigns, where the potential benefit caused by a promotion needs to be weighed against the potential costs. In this tutorial we will cover basic concepts of causality and introduce the audience to state-of-the-art techniques in uplift modeling. We will discuss the advantages and the limitations of different approaches and dive into the unique setup of constrained uplift modeling. Finally, we will present real-life applications and discuss challenges in implementing these models in production.

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2304.05342 [pdf, other]

TT-SDF2PC: Registration of Point Cloud and Compressed SDF Directly in the Memory-Efficient Tensor Train Domain

Authors: Alexey I. Boyko, Anastasiia Kornilova, Rahim Tariverdizadeh, Mirfarid Musavian, Larisa Markeeva, Ivan Oseledets, Gonzalo Ferrer

Abstract: This paper addresses the following research question: "can one compress a detailed 3D representation and use it directly for point cloud registration?". Map compression of the scene can be achieved by the tensor train (TT) decomposition of the signed distance function (SDF) representation. The amount of data reduction is regulated by the so-called TT-ranks. Using this representation, we propose an algorithm, TT-SDF2PC, that is capable of directly registering a point cloud (PC) to the compressed SDF by making use of efficient calculations of its derivatives in the TT domain, saving computations and memory. We compare TT-SDF2PC with SOTA local and global registration methods on a synthetic dataset and a real dataset and show on-par performance while requiring significantly fewer resources.

Submitted 11 April, 2023; originally announced April 2023.

arXiv:2304.01055 [pdf, other]

Eigen-Factors an Alternating Optimization for Back-end Plane SLAM of 3D Point Clouds

Authors: Gonzalo Ferrer, Dmitrii Iarosh, Anastasiia Kornilova

Abstract: Modern depth sensors can generate a huge number of 3D points in a few seconds to be later processed by Localization and Mapping algorithms. Ideally, these algorithms should efficiently handle large point clouds under the assumption that using more points implies more information available. The Eigen Factors (EF) is a new algorithm that solves SLAM by using planes as the main geometric primitive. To do so, EF exhaustively calculates the error of all points at complexity $O(1)$, thanks to the Summation matrix $S$ of homogeneous points. The solution of EF is highly efficient: i) the state variables are only the sensor poses (the trajectory), while the plane parameters are estimated previously in closed form, and ii) EF alternating optimization uses a Newton-Raphson method by a direct analytical calculation of the gradient and the Hessian, which turns out to be a block diagonal matrix. Since we need to differentiate over eigenvalues and matrix elements, we have developed an intuitive methodology to calculate partial derivatives in the manifold of rigid body transformations $SE(3)$, which could be applied to unrelated problems that require analytical derivatives of certain complexity. We evaluate EF and other state-of-the-art plane SLAM back-end algorithms in a synthetic environment. The evaluation is extended to the ICL dataset (RGBD) and the LiDAR KITTI dataset. Code is publicly available at https://github.com/prime-slam/EF-plane-SLAM.

Submitted 4 September, 2023; v1 submitted 3 April, 2023; originally announced April 2023.
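
The constant-cost error evaluation mentioned in the Eigen-Factors abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the error is the sum of squared point-to-plane distances and that a plane is written as (n_x, n_y, n_z, d) with a unit normal, so that the total error equals plane^T S plane, where S is the sum of p p^T over homogeneous points p. The function names are hypothetical.

```python
# Sketch only: assumes a sum-of-squared-distances error and a plane given as
# (n_x, n_y, n_z, d) with unit normal. Names are illustrative, not from the paper.

def summation_matrix(points):
    """Accumulate S = sum of p p^T over homogeneous points p = (x, y, z, 1)."""
    S = [[0.0] * 4 for _ in range(4)]
    for x, y, z in points:
        p = (x, y, z, 1.0)
        for i in range(4):
            for j in range(4):
                S[i][j] += p[i] * p[j]
    return S


def plane_error(S, plane):
    """Evaluate plane^T S plane; the cost does not depend on the point count."""
    return sum(plane[i] * S[i][j] * plane[j] for i in range(4) for j in range(4))


points = [(0.0, 0.0, 1.0), (1.0, 2.0, 3.0)]
S = summation_matrix(points)                    # built once, in one pass
error = plane_error(S, (0.0, 0.0, 1.0, -2.0))   # candidate plane z = 2
```

Once S is accumulated, any number of candidate planes can be scored against the same points without revisiting them, which is the property the abstract attributes to the Summation matrix.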

arXiv:2303.05162 [pdf, other]

EVOLIN Benchmark: Evaluation of Line Detection and Association

Authors: Kirill Ivanov, Gonzalo Ferrer, Anastasiia Kornilova

Abstract: Lines are interesting geometrical features commonly seen in indoor and urban environments. A complete benchmark is still missing in which one can evaluate lines from a sequential stream of images at all stages: line detection, line association, and pose error. To fill this gap, we present a complete and exhaustive benchmark for visual lines in a SLAM front-end, both for RGB and RGBD, by providing a plethora of complementary metrics. We have also labelled data from well-known SLAM datasets in order to provide poses and accurately annotated lines in one place. In particular, we have evaluated 17 line detection algorithms, 5 line association methods, and the resultant pose error for aligning a pair of frames with several combinations of detector-association. We have packaged all methods and evaluation metrics and made them publicly available on the web page https://prime-slam.github.io/evolin/.

Submitted 31 July, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

arXiv:2303.05123 [pdf, other]

Dominating Set Database Selection for Visual Place Recognition

Authors: Anastasiia Kornilova, Ivan Moskalenko, Timofei Pushkin, Fakhriddin Tojiboev, Rahim Tariverdizadeh, Gonzalo Ferrer

Abstract: This paper presents an approach for creating a visual place recognition (VPR) database for localization in indoor environments from RGBD scanning sequences. The proposed approach is formulated as a minimization problem in terms of a dominating set algorithm on a graph constructed from spatial information, and is referred to as DominatingSet. Our algorithm shows better scene coverage in comparison to other methodologies that are used for database creation. Also, we demonstrate that, using DominatingSet, the database size can be 250-1400 times smaller than the original scanning sequence while maintaining a recall rate of more than 80% on testing sequences. We evaluated our algorithm on 7-scenes and BundleFusion datasets and an additionally recorded sequence in a highly repetitive office setting. In addition, the database selection can produce weakly-supervised labels for fine-tuning neural place recognition algorithms to particular settings, further improving their accuracy. The paper also presents a fully automated pipeline for VPR database creation from RGBD scanning sequences, as well as a set of metrics for VPR database evaluation. The code and released data are available on our web page: https://prime-slam.github.io/place-recognition-db/

Submitted 21 January, 2024; v1 submitted 9 March, 2023; originally announced March 2023.
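
To make the dominating-set formulation in the abstract above concrete, here is a minimal greedy sketch. It is an assumption-laden illustration, not the DominatingSet algorithm from the paper: it assumes frames are graph vertices, that an edge joins two frames observing overlapping parts of the scene, and it uses the standard greedy heuristic for a small dominating set. The function name `greedy_dominating_set` is hypothetical.

```python
# Sketch only: a standard greedy heuristic, not the paper's method.
# adjacency maps each frame id to the set of frame ids it overlaps with.

def greedy_dominating_set(adjacency):
    """Pick frames until every frame is chosen or adjacent to a chosen one."""
    uncovered = set(adjacency)
    chosen = []
    while uncovered:
        # Take the frame covering the most still-uncovered frames;
        # ties go to the smallest frame id because candidates are sorted.
        best = max(
            sorted(adjacency),
            key=lambda v: len(({v} | adjacency[v]) & uncovered),
        )
        chosen.append(best)
        uncovered -= {best} | adjacency[best]
    return chosen


# Hypothetical overlap graph: frame 0 overlaps frames 1-3, frames 4 and 5 overlap.
overlap = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}, 4: {5}, 5: {4}}
database_frames = greedy_dominating_set(overlap)
```

In this toy graph, two of the six frames are enough to cover the whole sequence, which mirrors the database-size reduction the abstract describes, although the reported 250-1400x figures come from the paper's own data and method.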

arXiv:2204.11337 [pdf, other]

An Item Response Theory Framework for Persuasion

Authors: Anastassia Kornilova, Daniel Argyle, Vladimir Eidelman

Abstract: In this paper, we apply Item Response Theory, popular in education and political science research, to the analysis of argument persuasiveness in language. We empirically evaluate the model's performance on three datasets, including a novel dataset in the area of political advocacy. We show the advantages of separating these components under several style and content representations, including evaluating the ability of the speaker embeddings generated by the model to parallel real-world observations about persuadability.

Submitted 24 April, 2022; originally announced April 2022.

arXiv:2204.10211 [pdf, other]

SmartPortraits: Depth Powered Handheld Smartphone Dataset of Human Portraits for State Estimation, Reconstruction and Synthesis

Authors: Anastasiia Kornilova, Marsel Faizullin, Konstantin Pakulev, Andrey Sadkov, Denis Kukushkin, Azat Akhmetyanov, Timur Akhtyamov, Hekmat Taherinejad, Gonzalo Ferrer

Abstract: We present a dataset of 1000 video sequences of human portraits recorded in real and uncontrolled conditions by using a handheld smartphone accompanied by an external high-quality depth camera. The collected dataset contains 200 people captured in different poses and locations and its main purpose is to bridge the gap between raw measurements obtained from a smartphone and downstream applications, such as state estimation, 3D reconstruction, view synthesis, etc. The sensors employed in data collection are the smartphone's camera and Inertial Measurement Unit (IMU), and an external Azure Kinect DK depth camera software-synchronized with sub-millisecond precision to the smartphone system. During the recording, the smartphone flash is used to provide a periodic secondary source of lighting. An accurate mask of the foremost person is provided, as well as its impact on the camera alignment accuracy. For evaluation purposes, we compare multiple state-of-the-art camera alignment methods by using a Motion Capture system. We provide a smartphone visual-inertial benchmark for portrait capturing, where we report results for multiple methods and motivate further use of the provided trajectories, available in the dataset, in view synthesis and 3D reconstruction tasks.

Submitted 21 April, 2022; originally announced April 2022.

Comments: Accepted to CVPR'2022

arXiv:2204.05799 [pdf, other]

EVOPS Benchmark: Evaluation of Plane Segmentation from RGBD and LiDAR Data

Authors: Anastasiia Kornilova, Dmitrii Iarosh, Denis Kukushkin, Nikolai Goncharov, Pavel Mokeev, Arthur Saliou, Gonzalo Ferrer

Abstract: This paper provides the EVOPS dataset for plane segmentation from 3D data, both from RGBD images and LiDAR point clouds. We have designed two annotation methodologies (RGBD and LiDAR) running on well-known and widely-used datasets for SLAM evaluation and we have provided a complete set of benchmarking tools including point, plane, and segmentation metrics. The data includes a total of 10k RGBD and 7k LiDAR frames over different selected scenes which consist of high-quality segmented planes. The experiments report the quality of SOTA methods for RGBD plane segmentation on our annotated data. We have also provided a learnable baseline for plane segmentation in LiDAR point clouds. All labeled data and benchmark tools used have been made publicly available at https://evops.netlify.app/.

Submitted 24 August, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

Comments: Accepted to IROS'2022

arXiv:2111.03552 [pdf, other]

doi 10.1109/JSEN.2022.3150973

SmartDepthSync: Open Source Synchronized Video Recording System of Smartphone RGB and Depth Camera Range Image Frames with Sub-millisecond Precision

Authors: Marsel Faizullin, Anastasiia Kornilova, Azat Akhmetyanov, Konstantin Pakulev, Andrey Sadkov, Gonzalo Ferrer

Abstract: Nowadays, smartphones can produce a synchronized (synced) stream of high-quality data, including RGB images, inertial measurements, and other data. Therefore, smartphones are becoming appealing sensor systems in the robotics community. Unfortunately, there is still the need for external supporting sensing hardware, such as a depth camera precisely synced with the smartphone sensors. In this paper, we propose a hardware-software recording system that presents a heterogeneous structure and contains a smartphone and an external depth camera for recording visual, depth, and inertial data that are mutually synchronized. The system is synced at the time and the frame levels: every RGB image frame from the smartphone camera is exposed at the same moment as a depth camera frame, with sub-millisecond precision. We provide a method and a tool for sync performance evaluation that can be applied to any pair of depth and RGB cameras. Our system could be replicated, modified, or extended by employing our open-sourced materials.

Submitted 13 September, 2022; v1 submitted 5 November, 2021; originally announced November 2021.

Comments: IEEE Sensors Journal paper

arXiv:2107.02625 [pdf, other]

Open-Source LiDAR Time Synchronization System by Mimicking GNSS-clock

Authors: Marsel Faizullin, Anastasiia Kornilova, Gonzalo Ferrer

Abstract: Data fusion algorithms that employ LiDAR measurements, such as Visual-LiDAR, LiDAR-Inertial, or Multiple LiDAR Odometry and simultaneous localization and mapping (SLAM), rely on precise timestamping schemes that grant synchronicity to data from LiDAR and other sensors. Poor synchronization performance, due to an incorrect timestamping procedure, may negatively affect the algorithms' state estimation results. To provide highly accurate and precise synchronization between the sensors, we introduce an open-source hardware-software system for time synchronization of a LiDAR with other sensors. It exploits a dedicated hardware LiDAR time synchronization interface by providing an emulated GNSS clock to this interface; no physical GNSS receiver is needed. The emulator is based on a general-purpose microcontroller and, due to concise hardware and software architecture, can be easily modified or extended for synchronization of sets of different sensors such as cameras, inertial measurement units (IMUs), wheel encoders, other LiDARs, etc. In the paper, we provide an example of such a system with synchronized LiDAR and IMU sensors. We conducted an evaluation of the sensors' synchronization accuracy and precision, and report 1 microsecond performance. We compared our results with timestamping provided by ROS software and by a LiDAR inner clocking scheme to underline clear advantages over these two baseline methods.

Submitted 13 September, 2022; v1 submitted 6 July, 2021; originally announced July 2021.

Comments: Accepted to IEEE ISPCS 2022 Conference (International Symposium on Precision Clock Synchronization for Measurement, Control and Communication)

arXiv:2107.00987 [pdf, other]

Sub-millisecond Video Synchronization of Multiple Android Smartphones

Authors: Azat Akhmetyanov, Anastasiia Kornilova, Marsel Faizullin, David Pozo, Gonzalo Ferrer

Abstract: This paper addresses the problem of building an affordable easy-to-setup synchronized multi-view camera system, which is in demand for many Computer Vision and Robotics applications in high-dynamic environments. In our work, we propose a solution for this problem: a publicly-available Android application for synchronized video recording on multiple smartphones with sub-millisecond accuracy. We present a generalized mathematical model of timestamping for Android smartphones and prove its applicability on 47 different physical devices. Also, we estimate the time drift parameter for those smartphones, which is less than 1.2 msec per minute for most of the considered devices, which makes a smartphone camera system a worthy analog of professional multi-view systems. Finally, we demonstrate Android-app performance on a camera system built from Android smartphones, quantitatively on a setup with lights and qualitatively on a panorama stitching task.

Submitted 26 August, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

Comments: Accepted to conference IEEE Sensors'2021 as Lecture presentation

arXiv:2106.11351 [pdf, other]

doi 10.1109/ECMR50962.2021.9568822

Be your own Benchmark: No-Reference Trajectory Metric on Registered Point Clouds

Authors: Anastasiia Kornilova, Gonzalo Ferrer

Abstract: This paper addresses the problem of assessing trajectory quality in conditions when no ground truth poses are available or when their accuracy is not enough for the specific task, for example, small-scale mapping in outdoor scenes. In our work, we propose a no-reference metric, Mutually Orthogonal Metric (MOM), that estimates the quality of the map from registered point clouds via the trajectory poses. MOM strongly correlates with the full-reference trajectory metric Relative Pose Error, making it a trajectory benchmarking tool on setups where 3D sensing technologies are employed. We provide a mathematical foundation for such correlation and confirm it statistically in synthetic environments. Furthermore, since our metric uses a subset of points from mutually orthogonal surfaces, we provide an algorithm for the extraction of such a subset and evaluate its performance in a synthetic CARLA environment and on the KITTI dataset. The code of the proposed metric is publicly available as a pip package.

Submitted 12 August, 2021; v1 submitted 21 June, 2021; originally announced June 2021.

Comments: Accepted for the 10th European Conference on Mobile Robots (ECMR 2021)

arXiv:2105.11179 [pdf, other]

doi 10.1007/978-3-030-89880-9_46

Smart mobile microscopy: towards fully-automated digitization

Authors: A. Kornilova, I. Kirilenko, D. Iarosh, V. Kutuev, M. Strutovsky

Abstract: Mobile microscopy is a newly formed field that emerged from a combination of optical microscopy capabilities and the spread, functionality, and ever-increasing computing resources of mobile devices. Although the idea of creating a system that would successfully merge a microscope, numerous computer vision methods, and a mobile device is regularly examined, the resulting implementations still require the presence of a qualified operator to control specimen digitization. In this paper, we address the task of surpassing this constraint and present a "smart" mobile microscope concept aimed at automatic digitization of the most valuable visual information about the specimen. We perform this through combining automated microscope setup control and classic techniques such as auto-focusing, in-focus filtering, and focus-stacking, adapted and optimized as parts of a mobile cross-platform library.

Submitted 24 May, 2021; originally announced May 2021.

arXiv:2101.10737 [pdf, other]

doi 10.1145/3437963.3441812

Mining the Stars: Learning Quality Ratings with User-facing Explanations for Vacation Rentals

Authors: Anastasiia Kornilova, Lucas Bernardi

Abstract: Online Travel Platforms are virtual two-sided marketplaces where guests search for accommodations and accommodation providers list their properties such as hotels and vacation rentals. The large majority of hotels are rated by official institutions with a number of stars indicating the quality of service they provide. It is a simple and effective mechanism that contributes to matching supply with demand by helping guests to find options meeting their criteria and accommodation suppliers to market their product to the right segment, directly impacting the number of transactions on the platform. Unfortunately, no similar rating system exists for the large majority of vacation rentals, making it difficult for guests to search and compare options and hard for vacation rental suppliers to market their product effectively. In this work we describe a machine-learned quality rating system for vacation rentals. The problem is challenging, mainly due to explainability requirements and the lack of ground truth. We present techniques to address these challenges and empirical evidence of their efficacy. Our system was successfully deployed and validated through Online Controlled Experiments performed in Booking.com, a large Online Travel Platform, and has been running for more than one year, impacting more than a million accommodations and millions of guests.

Submitted 27 January, 2021; v1 submitted 26 January, 2021; originally announced January 2021.

Comments: 8 pages

arXiv:2007.13701

doi 10.1109/ISBI48211.2021.9434133

Deep learning Framework for Mobile Microscopy

Authors: Anatasiia Kornilova, Mikhail Salnikov, Olga Novitskaya, Maria Begicheva, Egor Sevriugov, Kirill Shcherbakov, Valeriya Pronina, Dmitry V. Dylov

Abstract: Mobile microscopy is a promising technology to assist and accelerate disease diagnostics, with its widespread adoption being hindered by the mediocre quality of acquired images. Although some paired image translation and super-resolution approaches for mobile microscopy have emerged, a set of essential challenges, necessary for automating it in a high-throughput setting, still remain to be addressed. Issues such as in-focus/out-of-focus classification, fast scanning deblurring, and focus-stacking all have specific peculiarities when the data are recorded using a mobile device. In this work, we aspire to create a comprehensive pipeline by connecting a set of methods purposely tuned to mobile microscopy: (1) a CNN model for stable in-focus/out-of-focus classification, (2) a modified DeblurGAN architecture for image deblurring, (3) a FuseGAN model for combining in-focus parts from multiple images to boost the detail. We discuss the limitations of the existing solutions developed for professional clinical microscopes, propose corresponding improvements, and compare against other state-of-the-art mobile analytics solutions.

Submitted 18 February, 2021; v1 submitted 27 July, 2020; originally announced July 2020.

arXiv:1910.00523

doi 10.18653/v1/D19-5406

BillSum: A Corpus for Automatic Summarization of US Legislation

Authors: Anastassia Kornilova, Vlad Eidelman

Abstract: Automatic summarization methods have been studied in a variety of domains, including news and scientific articles. Yet, legislation has not previously been considered for this task, despite US Congress and state governments releasing tens of thousands of bills every year. In this paper, we introduce BillSum, the first dataset for summarization of US Congressional and California state bills (https://github.com/FiscalNote/BillSum). We explain the properties of the dataset that make it more challenging to process than other domains. Then, we benchmark extractive methods that consider neural sentence representations and traditional contextual features. Finally, we demonstrate that models built on Congressional bills can be used to summarize California bills, thus showing that methods developed on this dataset can transfer to states without human-written summaries.

Submitted 3 December, 2019; v1 submitted 1 October, 2019; originally announced October 2019.

arXiv:1806.05284

How Predictable is Your State? Leveraging Lexical and Contextual Information for Predicting Legislative Floor Action at the State Level

Authors: Vlad Eidelman, Anastassia Kornilova, Daniel Argyle

Abstract: Modeling U.S. Congressional legislation and roll-call votes has received significant attention in previous literature. However, while legislators across 50 state governments and D.C. propose over 100,000 bills each year, and on average enact over 30% of them, state-level analysis has received relatively less attention due in part to the difficulty in obtaining the necessary data. Since each state legislature is guided by its own procedures, politics and issues, however, it is difficult to qualitatively assess the factors that affect the likelihood of a legislative initiative succeeding. Herein, we present several methods for modeling the likelihood of a bill receiving floor action across all 50 states and D.C. We utilize the lexical content of over 1 million bills, along with contextual legislature- and legislator-derived features, to build our predictive models, allowing a comparison of the factors that are important to the lawmaking process. Furthermore, we show that these signals hold complementary predictive power, together achieving an average improvement in accuracy of 18% over state-specific baselines.

Submitted 13 June, 2018; originally announced June 2018.

Comments: In Proceedings of COLING 2018

arXiv:1805.08182

Party Matters: Enhancing Legislative Embeddings with Author Attributes for Vote Prediction

Authors: Anastassia Kornilova, Daniel Argyle, Vlad Eidelman

Abstract: Predicting how Congressional legislators will vote is important for understanding their past and future behavior. However, previous work on roll-call prediction has been limited to single-session settings, and thus did not consider generalization across sessions. In this paper, we show that metadata is crucial for modeling voting outcomes in new contexts, as changes between sessions lead to changes in the underlying data generation process. We show how augmenting bill text with the sponsors' ideologies in a neural network model can achieve an average 4% boost in accuracy over the previous state-of-the-art.

Submitted 21 May, 2018; originally announced May 2018.

Showing 1–22 of 22 results for author: Kornilova, A