Showing 1–46 of 46 results for author: Nikolenko, S

  1. arXiv:2407.01108  [pdf]

    cond-mat.mtrl-sci physics.app-ph

    Anomalous Behavior of the Dielectric and Pyroelectric Responses of Ferroelectric Fine-Grained Ceramics

    Authors: Oleksandr S. Pylypchuk, Serhii E. Ivanchenko, Mykola Y. Yelisieiev, Andrii S. Nikolenko, Victor I. Styopkin, Bohdan Pokhylko, Vladyslav Kushnir, Denis O. Stetsenko, Oleksii Bereznykov, Oksana V. Leschenko, Eugene A. Eliseev, Vladimir N. Poroshin, Nicholas V. Morozovsky, Victor V. Vainberg, Anna N. Morozovska

    Abstract: We revealed the anomalous temperature behavior of the giant dielectric permittivity and unusual frequency dependences of the pyroelectric response of the fine-grained ceramics prepared by the spark plasma sintering of the ferroelectric BaTiO3 nanoparticles. The temperature dependences of the electro-resistivity indicate the frequency-dependent transition in the electro-transport mechanisms between…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 34 pages including 12 figures and 2 Appendixes

  2. arXiv:2406.15035  [pdf, other]

    cs.CV

    Improving Interpretability and Robustness for the Detection of AI-Generated Images

    Authors: Tatiana Gaintseva, Laida Kushnareva, German Magai, Irina Piontkovskaya, Sergey Nikolenko, Martin Benning, Serguei Barannikov, Gregory Slabaugh

    Abstract: With growing abilities of generative models, artificial content detection becomes an increasingly important and difficult task. However, all popular approaches to this problem suffer from poor generalization across domains and generative models. In this work, we focus on the robustness of AI-generated image (AIGI) detectors. We analyze existing state-of-the-art AIGI detection methods based on froz…

    Submitted 21 June, 2024; originally announced June 2024.

  3. arXiv:2406.14347  [pdf, other]

    physics.chem-ph cs.LG stat.ML

    $\nabla^2$DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials

    Authors: Kuzma Khrabrov, Anton Ber, Artem Tsypin, Konstantin Ushenin, Egor Rumiantsev, Alexander Telepov, Dmitry Protasov, Ilya Shenbin, Anton Alekseev, Mikhail Shirokikh, Sergey Nikolenko, Elena Tutubalina, Artur Kadurin

    Abstract: Methods of computational quantum chemistry provide accurate approximations of molecular properties crucial for computer-aided drug discovery and other areas of chemical science. However, high computational complexity limits the scalability of their applications. Neural network potentials (NNPs) are a promising alternative to quantum chemistry methods, but they require large and diverse datasets fo…

    Submitted 20 June, 2024; originally announced June 2024.

  4. arXiv:2406.00198  [pdf, other]

    cs.IR cs.LG

    ImplicitSLIM and How it Improves Embedding-based Collaborative Filtering

    Authors: Ilya Shenbin, Sergey Nikolenko

    Abstract: We present ImplicitSLIM, a novel unsupervised learning approach for sparse high-dimensional data, with applications to collaborative filtering. Sparse linear methods (SLIM) and their variations show outstanding performance, but they are memory-intensive and hard to scale. ImplicitSLIM improves embedding-based models by extracting embeddings from SLIM-like models in a computationally cheap and memo…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: Published as a conference paper at ICLR 2024; authors' version

  5. arXiv:2311.11813  [pdf, other]

    cs.CL

    Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule

    Authors: Andrey Bout, Alexander Podolskiy, Sergey Nikolenko, Irina Piontkovskaya

    Abstract: Progress in neural grammatical error correction (GEC) is hindered by the lack of annotated training data. Sufficient amounts of high-quality manually annotated data are not available, so recent research has relied on generating synthetic data, pretraining on it, and then fine-tuning on real datasets; performance gains have been achieved either by ensembling or by using huge pretrained models such…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023

  6. arXiv:2311.08349  [pdf, other]

    cs.CL

    AI-generated text boundary detection with RoFT

    Authors: Laida Kushnareva, Tatiana Gaintseva, German Magai, Serguei Barannikov, Dmitry Abulkhanov, Kristian Kuznetsov, Eduard Tulchinskii, Irina Piontkovskaya, Sergey Nikolenko

    Abstract: Due to the rapid development of large language models, people increasingly often encounter texts that may start as written by a human but continue as machine-generated. Detecting the boundary between human-written and machine-generated parts of such texts is a challenging problem that has not received much attention in literature. We attempt to bridge this gap and examine several ways to adapt sta…

    Submitted 2 April, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

  7. arXiv:2311.08191  [pdf, other]

    cs.CL

    GEC-DePenD: Non-Autoregressive Grammatical Error Correction with Decoupled Permutation and Decoding

    Authors: Konstantin Yakovlev, Alexander Podolskiy, Andrey Bout, Sergey Nikolenko, Irina Piontkovskaya

    Abstract: Grammatical error correction (GEC) is an important NLP task that is currently usually solved with autoregressive sequence-to-sequence models. However, approaches of this class are inherently slow due to one-by-one token generation, so non-autoregressive alternatives are needed. In this work, we propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation ne…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: ACL 2023

  8. arXiv:2311.08143  [pdf, other]

    cs.CL

    Sinkhorn Transformations for Single-Query Postprocessing in Text-Video Retrieval

    Authors: Konstantin Yakovlev, Gregory Polyakov, Ilseyar Alimova, Alexander Podolskiy, Andrey Bout, Sergey Nikolenko, Irina Piontkovskaya

    Abstract: A recent trend in multimodal retrieval is related to postprocessing test set results via the dual-softmax loss (DSL). While this approach can bring significant improvements, it usually presumes that an entire matrix of test samples is available as DSL input. This work introduces a new postprocessing approach based on Sinkhorn transformations that outperforms DSL. Further, we propose a new postproc…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: SIGIR 2023

  9. arXiv:2310.06059  [pdf, other]

    cs.LG math.DS

    Early Warning Prediction with Automatic Labeling in Epilepsy Patients

    Authors: Peng Zhang, Ting Gao, Jin Guo, Jinqiao Duan, Sergey Nikolenko

    Abstract: Early warning for epilepsy patients is crucial for their safety and well-being, in particular to prevent or minimize the severity of seizures. Through the patients' EEG data, we propose a meta learning framework to improve the prediction of early ictal signals. The proposed bi-level optimization framework can help automatically label noisy data at the early ictal stage, as well as optimize the tra…

    Submitted 11 January, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 13 pages, 4 figures

  10. arXiv:2308.15952  [pdf, ps, other]

    cs.CL

    Benchmarking Multilabel Topic Classification in the Kyrgyz Language

    Authors: Anton Alekseev, Sergey I. Nikolenko, Gulnara Kabaeva

    Abstract: Kyrgyz is a very underrepresented language in terms of modern natural language processing resources. In this work, we present a new public benchmark for topic classification in Kyrgyz, introducing a dataset based on collected and annotated data from the news site 24.KG and presenting several baseline models for news classification in the multilabel setting. We train and evaluate both classical sta…

    Submitted 30 August, 2023; originally announced August 2023.

    Comments: Accepted to AIST 2023

  11. arXiv:2307.09141  [pdf, other]

    cs.AI

    Machine Learning for SAT: Restricted Heuristics and New Graph Representations

    Authors: Mikhail Shirokikh, Ilya Shenbin, Anton Alekseev, Sergey Nikolenko

    Abstract: Boolean satisfiability (SAT) is a fundamental NP-complete problem with many applications, including automated planning and scheduling. To solve large instances, SAT solvers have to rely on heuristics, e.g., choosing a branching variable in DPLL and CDCL solvers. Such heuristics can be improved with machine learning (ML) models; they can reduce the number of steps but usually hinder the running tim…

    Submitted 18 July, 2023; originally announced July 2023.

  12. arXiv:2306.04723  [pdf, other]

    cs.CL cs.AI cs.LG math.AT

    Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts

    Authors: Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva, Daniil Cherniavskii, Serguei Barannikov, Irina Piontkovskaya, Sergey Nikolenko, Evgeny Burnaev

    Abstract: Rapidly increasing quality of AI-generated content makes it difficult to distinguish between human and AI-generated texts, which may lead to undesirable consequences for society. Therefore, it becomes increasingly important to study the properties of human texts that are invariant over different text domains and varying proficiency of human writers, can be easily calculated for any language, and c…

    Submitted 31 October, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    MSC Class: 68T50

  13. arXiv:2305.11626  [pdf, other]

    cs.CL cs.SE

    CCT-Code: Cross-Consistency Training for Multilingual Clone Detection and Code Search

    Authors: Nikita Sorokin, Dmitry Abulkhanov, Sergey Nikolenko, Valentin Malykh

    Abstract: We consider the clone detection and information retrieval problems for source code, well-known tasks important for any programming language. Although it is also an important and interesting problem to find code snippets that operate identically but are written in different programming languages, to the best of our knowledge multilingual clone detection has not been studied in literature. In this w…

    Submitted 19 May, 2023; originally announced May 2023.

  14. arXiv:2305.11625  [pdf, other]

    cs.CL cs.SE

    Searching by Code: a New SearchBySnippet Dataset and SnippeR Retrieval Model for Searching by Code Snippets

    Authors: Ivan Sedykh, Dmitry Abulkhanov, Nikita Sorokin, Sergey Nikolenko, Valentin Malykh

    Abstract: Code search is an important and well-studied task, but it usually means searching for code by a text query. We argue that using a code snippet (and possibly an error traceback) as a query while looking for bugfixing instructions and code samples is a natural use case not covered by prior art. Moreover, existing datasets use code comments rather than full-text descriptions as text, making them unsu…

    Submitted 27 May, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: COLING 2024

  15. arXiv:2304.13393  [pdf, other]

    cs.IR cs.CV

    STIR: Siamese Transformer for Image Retrieval Postprocessing

    Authors: Aleksei Shabanov, Aleksei Tarasov, Sergey Nikolenko

    Abstract: Current metric learning approaches for image retrieval are usually based on learning a space of informative latent representations where simple approaches such as the cosine distance will work well. Recent state of the art methods such as HypViT move to more complex embedding spaces that may yield better results but are harder to scale to production environments. In this work, we first construct a…

    Submitted 27 April, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: 14 pages, 3 figures

    ACM Class: H.3.3

  16. arXiv:2211.17223  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS math.AT

    Topological Data Analysis for Speech Processing

    Authors: Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva, Daniil Cherniavskii, Serguei Barannikov, Irina Piontkovskaya, Sergey Nikolenko, Evgeny Burnaev

    Abstract: We apply topological data analysis (TDA) to speech classification problems and to the introspection of a pretrained speech model, HuBERT. To this end, we introduce a number of topological and algebraic features derived from Transformer attention maps and embeddings. We show that a simple linear classifier built on top of such features outperforms a fine-tuned classification head. In particular, we…

    Submitted 6 June, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: Accepted to INTERSPEECH 2023 conference

    Journal ref: Proc. INTERSPEECH 2023, pages 311--315

  17. Personality-Driven Social Multimedia Content Recommendation

    Authors: Qi Yang, Sergey Nikolenko, Alfred Huang, Aleksandr Farseev

    Abstract: Social media marketing plays a vital role in promoting brand and product values to wide audiences. In order to boost their advertising revenues, global media buying platforms such as Facebook Ads constantly reduce the reach of branded organic posts, pushing brands to spend more on paid media ads. In order to run organic and paid social media marketing efficiently, it is necessary to understand the…

    Submitted 25 July, 2022; originally announced July 2022.

  18. arXiv:2206.12514  [pdf, other]

    cs.CL

    DetIE: Multilingual Open Information Extraction Inspired by Object Detection

    Authors: Michael Vasilkovsky, Anton Alekseev, Valentin Malykh, Ilya Shenbin, Elena Tutubalina, Dmitriy Salikhov, Mikhail Stepnov, Andrey Chertok, Sergey Nikolenko

    Abstract: State of the art neural methods for open information extraction (OpenIE) usually extract triplets (or tuples) iteratively in an autoregressive or predicate-based manner in order not to produce duplicates. In this work, we propose a different approach to the problem that can be equally or more successful. Namely, we present a novel single-pass method for OpenIE inspired by object detection algorith…

    Submitted 24 June, 2022; originally announced June 2022.

    Comments: Accepted to the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)

  19. Near-Zero-Shot Suggestion Mining with a Little Help from WordNet

    Authors: Anton Alekseev, Elena Tutubalina, Sejeong Kwon, Sergey Nikolenko

    Abstract: In this work, we explore the constructive side of online reviews: advice, tips, requests, and suggestions that users provide about goods, venues, services, and other items of interest. To reduce training costs and annotation efforts needed to build a classifier for a specific label set, we present and evaluate several entailment-based zero-shot approaches to suggestion classification in a label-fu…

    Submitted 25 November, 2021; originally announced November 2021.

    Comments: Accepted to the 10th International Conference on Analysis of Images, Social Networks and Texts (AIST 2021)

    Journal ref: Analysis of Images, Social Networks and Texts. AIST 2021. Lecture Notes in Computer Science, vol 13217. Springer, Cham

  20. arXiv:2102.06268  [pdf, other]

    cond-mat.mtrl-sci cond-mat.mes-hall physics.app-ph physics.chem-ph

    Paraelectric KH$_2$PO$_4$ Nanocrystals in Monolithic Mesoporous Silica: Structure and Lattice Dynamics

    Authors: Yaroslav Shchur, Andriy V. Kityk, Viktor V. Strelchuk, Andrii S. Nikolenko, Nazariy A. Andrushchak, Patrick Huber, Anatolii S. Andrushchak

    Abstract: Combining dielectric crystals with mesoporous solids allows a versatile design of functional nanomaterials, where the porous host provides a mechanical rigid scaffold structure and the molecular filling adds the functionalization. Here, we report a study of the complex lattice dynamics of a SiO$_2$:KH$_2$PO$_4$ nanocomposite consisting of a monolithic, mesoporous silica glass host with KH$_2$PO…

    Submitted 11 February, 2021; originally announced February 2021.

    Comments: 10 pages, 4 figures, in press

    Journal ref: Journal of Alloys and Compounds (2021)

  21. arXiv:2009.12419  [pdf, other]

    cs.CV cs.LG eess.IV

    Towards General Purpose Geometry-Preserving Single-View Depth Estimation

    Authors: Mikhail Romanov, Nikolay Patatkin, Anna Vorontsova, Sergey Nikolenko, Anton Konushin, Dmitry Senyushkin

    Abstract: Single-view depth estimation (SVDE) plays a crucial role in scene understanding for AR applications, 3D modeling, and robotics, providing the geometry of a scene based on a single image. Recent works have shown that a successful solution strongly relies on the diversity and volume of training data. This data can be sourced from stereo movies and photos. However, they do not provide geometrically c…

    Submitted 9 February, 2021; v1 submitted 25 September, 2020; originally announced September 2020.

  22. Improving unsupervised neural aspect extraction for online discussions using out-of-domain classification

    Authors: Anton Alekseev, Elena Tutubalina, Valentin Malykh, Sergey Nikolenko

    Abstract: Deep learning architectures based on self-attention have recently achieved and surpassed state of the art results in the task of unsupervised aspect extraction and topic modeling. While models such as neural attention-based aspect extraction (ABAE) have been successfully applied to user-generated texts, they are less coherent when applied to traditional data sources such as news articles and newsg…

    Submitted 17 June, 2020; originally announced June 2020.

    Comments: Journal of Intelligent & Fuzzy Systems, pre-press, https://content.iospress.com/articles/journal-of-intelligent-and-fuzzy-systems/ifs179908

  23. The Russian Drug Reaction Corpus and Neural Models for Drug Reactions and Effectiveness Detection in User Reviews

    Authors: Elena Tutubalina, Ilseyar Alimova, Zulfat Miftahutdinov, Andrey Sakhovskiy, Valentin Malykh, Sergey Nikolenko

    Abstract: The Russian Drug Reaction Corpus (RuDReC) is a new partially annotated corpus of consumer reviews in Russian about pharmaceutical products for the detection of health-related named entities and the effectiveness of pharmaceutical products. The corpus itself consists of two parts, the raw one and the labelled one. The raw part includes 1.4 million health-related user-generated texts collected from…

    Submitted 7 April, 2020; originally announced April 2020.

    Comments: 9 pages, 9 tables, 4 figures

    Journal ref: Bioinformatics, 2020

  24. arXiv:2003.08791  [pdf, other]

    cs.CV cs.LG eess.IV

    High-Resolution Daytime Translation Without Domain Labels

    Authors: Ivan Anokhin, Pavel Solovev, Denis Korzhenkov, Alexey Kharlamov, Taras Khakhulin, Alexey Silvestrov, Sergey Nikolenko, Victor Lempitsky, Gleb Sterkin

    Abstract: Modeling daytime changes in high resolution photographs, e.g., re-rendering the same scene under different illuminations typical for day, night, or dawn, is a challenging image manipulation task. We present the high-resolution daytime translation (HiDT) model for this task. HiDT combines a generative image-to-image model and a new upsampling scheme that allows to apply image translation at high re…

    Submitted 23 March, 2020; v1 submitted 19 March, 2020; originally announced March 2020.

    Comments: accepted to CVPR 2020

  25. arXiv:2002.08130  [pdf]

    cond-mat.mes-hall

    KH2PO4 + Host Matrix (Alumina / SiO$_2$) Nanocomposite: Raman Scattering Insight

    Authors: Ya. Shchur, A. S. Andrushchak, V. V. Strelchuk, A. S. Nikolenko, V. T. Adamiv, N. A. Andrushchak, P. Göring, P. Huber, A. V. Kityk

    Abstract: We report on the synthesis and Raman scattering characterization of composite materials based on the host nanoporous matrices filled with nanostructured KH2PO4 (KDP) crystal. Silica (SiO2) and anodized aluminium oxide (AAO) were used as host matrices with various pore diameters, inter-pore spacing and morphology. The structure of the nanocomposites was investigated by X-ray diffraction and scanning…

    Submitted 19 February, 2020; originally announced February 2020.

    Comments: 4 pages, 2 figures; 21st International Conference on Transparent Optical Networks (ICTON) 2019

  26. RecVAE: a New Variational Autoencoder for Top-N Recommendations with Implicit Feedback

    Authors: Ilya Shenbin, Anton Alekseev, Elena Tutubalina, Valentin Malykh, Sergey I. Nikolenko

    Abstract: Recent research has shown the advantages of using autoencoders based on deep neural networks for collaborative filtering. In particular, the recently proposed Mult-VAE model, which used the multinomial likelihood variational autoencoders, has shown excellent results for top-N recommendations. In this work, we propose the Recommender VAE (RecVAE) model that originates from our research on regulariz…

    Submitted 23 December, 2019; originally announced December 2019.

    Comments: In The Thirteenth ACM International Conference on Web Search and Data Mining (WSDM '20), February 3-7, 2020, Houston, TX, USA. ACM, New York, NY, USA, 9 pages

  27. arXiv:1909.11512  [pdf, other]

    cs.LG cs.CR cs.CV

    Synthetic Data for Deep Learning

    Authors: Sergey I. Nikolenko

    Abstract: Synthetic data is an increasingly popular tool for training deep learning models, especially in computer vision but also in other areas. In this work, we attempt to provide a comprehensive survey of the various directions in the development and application of synthetic data. First, we discuss synthetic datasets for basic computer vision problems, both low-level (e.g., optical flow estimation) and…

    Submitted 25 September, 2019; originally announced September 2019.

    Comments: 156 pages, 24 figures, 719 references

  28. arXiv:1908.07069  [pdf, other]

    cs.IR cs.CL cs.LG

    CommentsRadar: Dive into Unique Data on All Comments on the Web

    Authors: Sergey Nikolenko, Elena Tutubalina, Zulfat Miftahutdinov, Eugene Beloded

    Abstract: We introduce an entity-centric search engine CommentsRadar that pairs entity queries with articles and user opinions covering a wide range of topics from top commented sites. The engine aggregates articles and comments for these articles, extracts named entities, links them together and with knowledge base entries, performs sentiment analysis, and aggregates the results, aiming to mine for temporal trends…

    Submitted 16 August, 2019; originally announced August 2019.

  29. arXiv:1908.02511  [pdf, other]

    cs.LG cs.AI cs.CV

    Free-Lunch Saliency via Attention in Atari Agents

    Authors: Dmitry Nikulin, Anastasia Ianina, Vladimir Aliev, Sergey Nikolenko

    Abstract: We propose a new approach to visualize saliency maps for deep neural network models and apply it to deep reinforcement learning agents trained on Atari environments. Our method adds an attention module that we call FLS (Free Lunch Saliency) to the feature extractor from an established baseline (Mnih et al., 2015). This addition results in a trainable model that can produce saliency maps, i.e., vis…

    Submitted 30 October, 2019; v1 submitted 7 August, 2019; originally announced August 2019.

    Comments: 2019 ICCV Workshop on Interpreting and Explaining Visual Artificial Intelligence Models. 15 pages, 14 figures, 5 tables

  30. arXiv:1907.04399  [pdf, other]

    cs.NI cs.DS

    New Competitiveness Bounds for the Shared Memory Switch

    Authors: Ivan Bochkov, Alex Davydow, Nikita Gaevoy, Sergey I. Nikolenko

    Abstract: We consider one of the simplest and best known buffer management architectures: the shared memory switch with multiple output queues and uniform packets. It was one of the first models studied by competitive analysis, with the Longest Queue Drop (LQD) buffer management policy shown to be at least $\sqrt{2}$- and at most $2$-competitive; a general lower bound of $4/3$ has been proven for all determ…

    Submitted 9 July, 2019; originally announced July 2019.

    Comments: 23 pages, 8 figures

    MSC Class: 68W27

  31. arXiv:1905.01743  [pdf, other]

    eess.IV cs.CV

    Breast Tumor Cellularity Assessment using Deep Neural Networks

    Authors: Alexander Rakhlin, Aleksei Tiulpin, Alexey A. Shvets, Alexandr A. Kalinin, Vladimir I. Iglovikov, Sergey Nikolenko

    Abstract: Breast cancer is one of the main causes of death worldwide. Histopathological cellularity assessment of residual tumors in post-surgical tissues is used to analyze a tumor's response to a therapy. Correct cellularity assessment increases the chances of getting an appropriate treatment and facilitates the patient's survival. In current clinical practice, tumor cellularity is manually estimated by p…

    Submitted 3 September, 2019; v1 submitted 5 May, 2019; originally announced May 2019.

  32. arXiv:1901.07829  [pdf, other]

    cs.CL cs.AI

    AspeRa: Aspect-based Rating Prediction Model

    Authors: Sergey I. Nikolenko, Elena Tutubalina, Valentin Malykh, Ilya Shenbin, Anton Alekseev

    Abstract: We propose a novel end-to-end Aspect-based Rating Prediction model (AspeRa) that estimates user rating based on review texts for the items and at the same time discovers coherent aspects of reviews that can be used to explain predictions or profile users. The AspeRa model uses max-margin losses for joint item and user embedding learning and a dual-headed architecture; it significantly outperforms…

    Submitted 23 January, 2019; originally announced January 2019.

    Comments: accepted to ECIR 2019

  33. arXiv:1901.06345  [pdf, ps, other]

    cs.CV

    Adapting Convolutional Neural Networks for Geographical Domain Shift

    Authors: Pavel Ostyakov, Sergey I. Nikolenko

    Abstract: We present the winning solution for the Inclusive Images Competition organized as part of the Conference on Neural Information Processing Systems (NeurIPS 2018) Competition Track. The competition was organized to study ways to cope with domain shift in image processing, specifically geographical shift: the training and two test sets in the competition had different geographical distributions. Our…

    Submitted 18 January, 2019; originally announced January 2019.

  34. arXiv:1811.12823  [pdf, other]

    cs.LG cs.AI cs.DB stat.ML

    Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

    Authors: Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, Artur Kadurin, Simon Johansson, Hongming Chen, Sergey Nikolenko, Alan Aspuru-Guzik, Alex Zhavoronkov

    Abstract: Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models in the downstream tasks. While there are plenty of generative models, it is unclear how to compare an…

    Submitted 28 October, 2020; v1 submitted 29 November, 2018; originally announced November 2018.

  35. Sequence Learning with RNNs for Medical Concept Normalization in User-Generated Texts

    Authors: Elena Tutubalina, Zulfat Miftahutdinov, Sergey Nikolenko, Valentin Malykh

    Abstract: In this work, we consider the medical concept normalization problem, i.e., the problem of mapping a disease mention in free-form text to a concept in a controlled vocabulary, usually to the standard thesaurus in the Unified Medical Language System (UMLS). This task is challenging since medical terminology is very different when coming from health care professionals or from the general public in th…

    Submitted 29 November, 2018; v1 submitted 28 November, 2018; originally announced November 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

    Report number: ML4H/2018/117

    Journal ref: Journal of Biomedical Informatics. - 2018. - Vol.84, Is.. - P.93-102

  36. arXiv:1811.11067  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning State Representations in Complex Systems with Multimodal Data

    Authors: Pavel Solovev, Vladimir Aliev, Pavel Ostyakov, Gleb Sterkin, Elizaveta Logacheva, Stepan Troeshestov, Roman Suvorov, Anton Mashikhin, Oleg Khomenko, Sergey I. Nikolenko

    Abstract: Representation learning becomes especially important for complex systems with multimodal data sources such as cameras or sensors. Recent advances in reinforcement learning and optimal control make it possible to design control algorithms on these latent representations, but the field still lacks a large-scale standard dataset for unified comparison. In this work, we present a large-scale dataset a…

    Submitted 15 January, 2019; v1 submitted 27 November, 2018; originally announced November 2018.

    Comments: Fixed references

  37. arXiv:1811.07630  [pdf, other]

    cs.CV cs.LG cs.NE

    SEIGAN: Towards Compositional Image Generation by Simultaneously Learning to Segment, Enhance, and Inpaint

    Authors: Pavel Ostyakov, Roman Suvorov, Elizaveta Logacheva, Oleg Khomenko, Sergey I. Nikolenko

    Abstract: We present a novel approach to image manipulation and understanding by simultaneously learning to segment object masks, paste objects to another background image, and remove them from original images. For this purpose, we develop a novel generative model for compositional image generation, SEIGAN (Segment-Enhance-Inpaint Generative Adversarial Network), which learns these three operations together…

    Submitted 15 January, 2019; v1 submitted 19 November, 2018; originally announced November 2018.

  38. arXiv:1809.04403  [pdf, other]

    cs.CV cs.LG

    Label Denoising with Large Ensembles of Heterogeneous Neural Networks

    Authors: Pavel Ostyakov, Elizaveta Logacheva, Roman Suvorov, Vladimir Aliev, Gleb Sterkin, Oleg Khomenko, Sergey I. Nikolenko

    Abstract: Despite recent advances in computer vision based on various convolutional architectures, video understanding remains an important challenge. In this work, we present and discuss a top solution for the large-scale video classification (labeling) problem introduced as a Kaggle competition based on the YouTube-8M dataset. We show and compare different approaches to preprocessing, data augmentation, m…

    Submitted 15 January, 2019; v1 submitted 12 September, 2018; originally announced September 2018.

  39. arXiv:1510.04235  [pdf, other]

    cs.NI

    BASEL (Buffering Architecture SpEcification Language)

    Authors: Kirill Kogan, Danushka Menikkumbura, Gustavo Petri, Youngtae Noh, Sergey Nikolenko, Patrick Eugster

    Abstract: Buffering architectures and policies for their efficient management constitute one of the core ingredients of a network architecture. In this work we introduce a new specification language, BASEL, that allows to express virtual buffering architectures and management policies representing a variety of economic models. BASEL does not require the user to implement policies in a high-level language; r…

    Submitted 14 October, 2015; originally announced October 2015.

    Comments: 11 pages, 11 figures, 2 tables

  40. arXiv:1211.2756  [pdf, other]

    q-bio.QM cs.CE cs.DS q-bio.GN

    BayesHammer: Bayesian clustering for error correction in single-cell sequencing

    Authors: Sergey I. Nikolenko, Anton I. Korobeynikov, Max A. Alekseyev

    Abstract: Error correction of sequenced reads remains a difficult task, especially in single-cell sequencing projects with extremely non-uniform coverage. While existing error correction tools designed for standard (multi-cell) sequencing data usually come up short in single-cell sequencing projects, algorithms actually used for single-cell error correction have been so far very simplistic. We introduce s…

    Submitted 12 November, 2012; originally announced November 2012.

    Journal ref: BMC Genomics 14(Suppl 1) (2013), pp. S7

  41. arXiv:1204.5443  [pdf, other]

    cs.NI

    FIFO Queueing Policies for Packets with Heterogeneous Processing

    Authors: Kirill Kogan, Alejandro López-Ortiz, Sergey I. Nikolenko, Alexander V. Sirotkin, Denis Tugaryov

    Abstract: We consider the problem of managing a bounded size First-In-First-Out (FIFO) queue buffer, where each incoming unit-sized packet requires several rounds of processing before it can be transmitted out. Our objective is to maximize the total number of successfully transmitted packets. We consider both push-out (when the policy is permitted to drop already admitted packets) and non-push-out cases. In…

    Submitted 24 April, 2012; originally announced April 2012.

    Comments: 15 pages

  42. arXiv:1202.5755  [pdf, other]

    cs.NI cs.PF

    Balancing Work and Size with Bounded Buffers

    Authors: Kirill Kogan, Alejandro Lopez-Ortiz, Sergey I. Nikolenko, Gabriel Scalosub, Michael Segal

    Abstract: We consider the fundamental problem of managing a bounded size queue buffer where traffic consists of packets of varying size, where each packet requires several rounds of processing before it can be transmitted from the queue buffer. The goal in such an environment is to maximize the overall size of packets that are successfully transmitted. This model is motivated by the ever-growing ubiquity of…

    Submitted 5 September, 2013; v1 submitted 26 February, 2012; originally announced February 2012.

    Comments: 22 pages, 7 figures

  43. arXiv:0802.2863  [pdf, ps, other]

    cs.CC cs.CR

    New Combinatorial Complete One-Way Functions

    Authors: Arist Kojevnikov, Sergey I. Nikolenko

    Abstract: In 2003, Leonid A. Levin presented the idea of a combinatorial complete one-way function and a sketch of the proof that Tiling represents such a function. In this paper, we present two new one-way functions based on semi-Thue string rewriting systems and a version of the Post Correspondence Problem and prove their completeness. Besides, we present an alternative proof of Levin's result. We also…

    Submitted 20 February, 2008; originally announced February 2008.

    Journal ref: In Proceedings of the 25th Annual Symposium on the Theoretical Aspects of Computer Science - STACS 2008, Bordeaux, France (2008)

  44. arXiv:math/0606335  [pdf]

    math.AG

    Chow ring structure made simple

    Authors: S. Nikolenko, N. Semenov

    Abstract: We show how to translate the task of computing the multiplicative structure of a Chow ring of a projective homogeneous variety into an easily understandable combinatorial task of calculating in the corresponding polynomial ring. The algorithms are also presented as a Maple package. Then we proceed to compute the multiplicative structure of the Chow rings for projective homogeneous varieties E6/P…

    Submitted 14 June, 2006; originally announced June 2006.

    Comments: 17 pages

    MSC Class: 14M15

  45. arXiv:math/0502382  [pdf, ps, other]

    math.AG

    Motivic decomposition of anisotropic varieties of type F_4 into generalized Rost motives

    Authors: S. Nikolenko, N. Semenov, K. Zainoulline

    Abstract: This is an extended version of the previous preprint dated February 2005. We prove that the Chow motive of an anisotropic projective homogeneous variety of type F4 is isomorphic to the direct sum of twisted copies of a generalized Rost motive. In particular, we provide an explicit construction of a generalized Rost motive for a generically splitting variety for a symbol in K_3^M(k)/3. We also…

    Submitted 22 September, 2005; v1 submitted 17 February, 2005; originally announced February 2005.

    Comments: 20 pages, XYPIC

    Journal ref: J. of K-theory 3 (2009), no.1, 85-102.

  46. arXiv:cs/0301012  [pdf, ps, other]

    cs.CC

    Hard satisfiable formulas for DPLL-type algorithms

    Authors: Sergey I. Nikolenko

    Abstract: We address lower bounds on the time complexity of algorithms solving the propositional satisfiability problem. Namely, we consider two DPLL-type algorithms, enhanced with the unit clause and pure literal heuristics. Exponential lower bounds for solving satisfiability on provably satisfiable formulas are proven.

    Submitted 15 January, 2003; originally announced January 2003.

    Comments: 9 pages

    ACM Class: F.2.2