Anomalous Behavior of the Dielectric and Pyroelectric Responses of Ferroelectric Fine-Grained Ceramics

Authors: Oleksandr S. Pylypchuk, Serhii E. Ivanchenko, Mykola Y. Yelisieiev, Andrii S. Nikolenko, Victor I. Styopkin, Bohdan Pokhylko, Vladyslav Kushnir, Denis O. Stetsenko, Oleksii Bereznykov, Oksana V. Leschenko, Eugene A. Eliseev, Vladimir N. Poroshin, Nicholas V. Morozovsky, Victor V. Vainberg, Anna N. Morozovska

Abstract: We revealed the anomalous temperature behavior of the giant dielectric permittivity and unusual frequency dependences of the pyroelectric response of the fine-grained ceramics prepared by the spark plasma sintering of the ferroelectric BaTiO3 nanoparticles. The temperature dependences of the electro-resistivity indicate the frequency-dependent transition in the electro-transport mechanisms between the lower and higher conductivity states accompanied by the maximum in the temperature dependence of the loss angle tangent. The pyroelectric thermal-wave probing revealed the existence of the spatially inhomogeneous counter-polarized ferroelectric state at the opposite surfaces of the ceramic sample. We described the anomalous temperature behavior of the giant dielectric response and losses using the core-shell model for ceramic grains and a modified Maxwell-Wagner approach. We assume that core shells and grain boundaries, which contain a high concentration of space charge carriers due to the presence of graphite inclusions in the inter-grain space, can effectively screen weakly conductive ferroelectric grain cores. The superparaelectric-like state with a giant dielectric response can appear in the paraelectric shells and inter-grain space due to the step-like thermal activation of localized polarons in the spatial regions, agreeing with the experimentally observed frequency-dependent transition of the electro-transport mechanism. The obtained results can be the key for the description of complex electrophysical properties inherent to the strongly inhomogeneous media with electrically coupled insulating ferroelectric nanoregions and semiconducting superparaelectric-like regions.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 34 pages including 12 figures and 2 Appendixes
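
The Maxwell-Wagner mechanism mentioned in the abstract above can be illustrated with the classical two-layer series capacitor. This is a minimal sketch of the textbook model only, not the authors' modified core-shell approach; the layer fractions, permittivities, and conductivity are made-up illustrative values, and the function names are hypothetical.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m


def complex_permittivity(eps_rel, sigma, omega):
    """Relative complex permittivity eps - i*sigma/(omega*eps0) of a lossy layer."""
    return complex(eps_rel, -sigma / (omega * EPS0))


def two_layer_permittivity(frac_ins, eps_ins, eps_cond, sigma_cond, freq_hz):
    """Effective permittivity of an insulating and a conducting layer in series."""
    omega = 2.0 * math.pi * freq_hz
    e_ins = complex_permittivity(eps_ins, 0.0, omega)
    e_cond = complex_permittivity(eps_cond, sigma_cond, omega)
    # Series capacitors: thickness-weighted inverse permittivities add up.
    return 1.0 / (frac_ins / e_ins + (1.0 - frac_ins) / e_cond)


# Illustrative values (assumptions): 10% insulating layer, 90% conducting layer.
low = two_layer_permittivity(0.1, 1000.0, 1000.0, 1e-2, 1.0)
high = two_layer_permittivity(0.1, 1000.0, 1000.0, 1e-2, 1e12)
print(low.real, high.real)
```

At low frequency the conducting layer is effectively shorted, so the apparent permittivity approaches the insulating layer's value divided by its thickness fraction; at high frequency it falls back to the plain series value. This is the usual qualitative route to a "giant" apparent dielectric response in inhomogeneous media.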

arXiv:2406.15035 [pdf, other]

Improving Interpretability and Robustness for the Detection of AI-Generated Images

Authors: Tatiana Gaintseva, Laida Kushnareva, German Magai, Irina Piontkovskaya, Sergey Nikolenko, Martin Benning, Serguei Barannikov, Gregory Slabaugh

Abstract: With growing abilities of generative models, artificial content detection becomes an increasingly important and difficult task. However, all popular approaches to this problem suffer from poor generalization across domains and generative models. In this work, we focus on the robustness of AI-generated image (AIGI) detectors. We analyze existing state-of-the-art AIGI detection methods based on frozen CLIP embeddings and show how to interpret them, shedding light on how images produced by various AI generators differ from real ones. Next we propose two ways to improve robustness: based on removing harmful components of the embedding vector and based on selecting the best performing attention heads in the image encoder model. Our methods increase the mean out-of-distribution (OOD) classification score by up to 6% for cross-model transfer. We also propose a new dataset for AIGI detection and use it in our evaluation; we believe this dataset will help boost further research. The dataset and code are provided as a supplement.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.14347 [pdf, other]

$\nabla^2$DFT: A Universal Quantum Chemistry Dataset of Drug-Like Molecules and a Benchmark for Neural Network Potentials

Authors: Kuzma Khrabrov, Anton Ber, Artem Tsypin, Konstantin Ushenin, Egor Rumiantsev, Alexander Telepov, Dmitry Protasov, Ilya Shenbin, Anton Alekseev, Mikhail Shirokikh, Sergey Nikolenko, Elena Tutubalina, Artur Kadurin

Abstract: Methods of computational quantum chemistry provide accurate approximations of molecular properties crucial for computer-aided drug discovery and other areas of chemical science. However, high computational complexity limits the scalability of their applications. Neural network potentials (NNPs) are a promising alternative to quantum chemistry methods, but they require large and diverse datasets for training. This work presents a new dataset and benchmark called $\nabla^2$DFT that is based on nablaDFT. It contains twice as many molecular structures, three times more conformations, new data types and tasks, and state-of-the-art models. The dataset includes energies, forces, 17 molecular properties, Hamiltonian and overlap matrices, and a wavefunction object. All calculations were performed at the DFT level ($\omega$B97X-D/def2-SVP) for each conformation. Moreover, $\nabla^2$DFT is the first dataset that contains relaxation trajectories for a substantial number of drug-like molecules. We also introduce a novel benchmark for evaluating NNPs in molecular property prediction, Hamiltonian prediction, and conformational optimization tasks. Finally, we propose an extendable framework for training NNPs and implement 10 models within it.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.00198 [pdf, other]

ImplicitSLIM and How it Improves Embedding-based Collaborative Filtering

Authors: Ilya Shenbin, Sergey Nikolenko

Abstract: We present ImplicitSLIM, a novel unsupervised learning approach for sparse high-dimensional data, with applications to collaborative filtering. Sparse linear methods (SLIM) and their variations show outstanding performance, but they are memory-intensive and hard to scale. ImplicitSLIM improves embedding-based models by extracting embeddings from SLIM-like models in a computationally cheap and memory-efficient way, without explicit learning of heavy SLIM-like models. We show that ImplicitSLIM improves performance and speeds up convergence for both state of the art and classical collaborative filtering methods. The source code for ImplicitSLIM, related models, and applications is available at https://github.com/ilya-shenbin/ImplicitSLIM.

Submitted 31 May, 2024; originally announced June 2024.

Comments: Published as a conference paper at ICLR 2024; authors' version

arXiv:2311.11813 [pdf, other]

Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule

Authors: Andrey Bout, Alexander Podolskiy, Sergey Nikolenko, Irina Piontkovskaya

Abstract: Progress in neural grammatical error correction (GEC) is hindered by the lack of annotated training data. Sufficient amounts of high-quality manually annotated data are not available, so recent research has relied on generating synthetic data, pretraining on it, and then fine-tuning on real datasets; performance gains have been achieved either by ensembling or by using huge pretrained models such as XXL-T5 as the backbone. In this work, we explore an orthogonal direction: how to use available data more efficiently. First, we propose auxiliary tasks that exploit the alignment between the original and corrected sentences, such as predicting a sequence of corrections. We formulate each task as a sequence-to-sequence problem and perform multi-task training. Second, we discover that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance, so we set out to find the best training schedule. Together, these two ideas lead to significant improvements, producing results that improve state of the art with much smaller models; in particular, we outperform the best models based on T5-XXL (11B parameters) with a BART-based model (400M parameters).

Submitted 20 November, 2023; originally announced November 2023.

Comments: EMNLP 2023

arXiv:2311.08349 [pdf, other]

AI-generated text boundary detection with RoFT

Authors: Laida Kushnareva, Tatiana Gaintseva, German Magai, Serguei Barannikov, Dmitry Abulkhanov, Kristian Kuznetsov, Eduard Tulchinskii, Irina Piontkovskaya, Sergey Nikolenko

Abstract: Due to the rapid development of large language models, people increasingly often encounter texts that may start as written by a human but continue as machine-generated. Detecting the boundary between human-written and machine-generated parts of such texts is a challenging problem that has not received much attention in literature. We attempt to bridge this gap and examine several ways to adapt state of the art artificial text detection classifiers to the boundary detection setting. We push all detectors to their limits, using the Real or Fake text benchmark that contains short texts on several topics and includes generations of various language models. We use this diversity to deeply examine the robustness of all detectors in cross-domain and cross-model settings to provide baselines and insights for future research. In particular, we find that perplexity-based approaches to boundary detection tend to be more robust to peculiarities of domain-specific data than supervised fine-tuning of the RoBERTa model; we also find which features of the text confuse boundary detection algorithms and negatively influence their performance in cross-domain settings.

Submitted 2 April, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

arXiv:2311.08191 [pdf, other]

GEC-DePenD: Non-Autoregressive Grammatical Error Correction with Decoupled Permutation and Decoding

Authors: Konstantin Yakovlev, Alexander Podolskiy, Andrey Bout, Sergey Nikolenko, Irina Piontkovskaya

Abstract: Grammatical error correction (GEC) is an important NLP task that is currently usually solved with autoregressive sequence-to-sequence models. However, approaches of this class are inherently slow due to one-by-one token generation, so non-autoregressive alternatives are needed. In this work, we propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation network that outputs a self-attention weight matrix that can be used in beam search to find the best permutation of input tokens (with auxiliary {ins} tokens) and a decoder network based on a step-unrolled denoising autoencoder that fills in specific tokens. This allows us to find the token permutation after only one forward pass of the permutation network, avoiding autoregressive constructions. We show that the resulting network improves over previously known non-autoregressive methods for GEC and reaches the level of autoregressive methods that do not use language-specific synthetic data generation methods. Our results are supported by a comprehensive experimental validation on the CoNLL-2014 and Write&Improve+LOCNESS datasets and an extensive ablation study that supports our architectural and algorithmic choices.

Submitted 14 November, 2023; originally announced November 2023.

Comments: ACL 2023

arXiv:2311.08143 [pdf, other]

Sinkhorn Transformations for Single-Query Postprocessing in Text-Video Retrieval

Authors: Konstantin Yakovlev, Gregory Polyakov, Ilseyar Alimova, Alexander Podolskiy, Andrey Bout, Sergey Nikolenko, Irina Piontkovskaya

Abstract: A recent trend in multimodal retrieval is related to postprocessing test set results via the dual-softmax loss (DSL). While this approach can bring significant improvements, it usually presumes that an entire matrix of test samples is available as DSL input. This work introduces a new postprocessing approach based on Sinkhorn transformations that outperforms DSL. Further, we propose a new postprocessing setting that does not require access to multiple test queries. We show that our approach can significantly improve the results of state of the art models such as CLIP4Clip, BLIP, X-CLIP, and DRL, thus achieving a new state-of-the-art on several standard text-video retrieval datasets both with access to the entire test set and in the single-query setting.

Submitted 14 November, 2023; originally announced November 2023.

Comments: SIGIR 2023
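
As a rough illustration of the Sinkhorn idea named in the abstract above, the following sketch alternately normalizes the rows and columns of an exponentiated query-by-video similarity matrix. It is a generic Sinkhorn iteration under assumed settings (the temperature, iteration count, and toy similarity values are all hypothetical), not the paper's exact postprocessing procedure, and it covers only the setting where the whole test matrix is available.

```python
import math


def sinkhorn(sim, temperature=1.0, n_iters=50):
    """Alternately normalize rows and columns of exp(sim / temperature)."""
    k = [[math.exp(s / temperature) for s in row] for row in sim]
    n_rows, n_cols = len(k), len(k[0])
    for _ in range(n_iters):
        # Row step: each query's scores over all videos sum to 1.
        for i in range(n_rows):
            total = sum(k[i])
            k[i] = [v / total for v in k[i]]
        # Column step: each video's scores over all queries sum to 1.
        for j in range(n_cols):
            total = sum(k[i][j] for i in range(n_rows))
            for i in range(n_rows):
                k[i][j] /= total
    return k


# Toy text-to-video similarities (rows: queries, columns: videos).
sim = [[0.9, 0.8, 0.1],
       [0.7, 0.2, 0.3],
       [0.6, 0.5, 0.4]]
balanced = sinkhorn(sim)
row_sums = [sum(row) for row in balanced]
col_sums = [sum(balanced[i][j] for i in range(3)) for j in range(3)]
print(row_sums, col_sums)
```

For a square matrix with positive entries the iteration converges to a doubly stochastic matrix, which discourages a single "hub" video from dominating every query's ranking.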

arXiv:2310.06059 [pdf, other]

Early Warning Prediction with Automatic Labeling in Epilepsy Patients

Authors: Peng Zhang, Ting Gao, ** Guo, **qiao Duan, Sergey Nikolenko

Abstract: Early warning for epilepsy patients is crucial for their safety and well-being, in particular to prevent or minimize the severity of seizures. Through the patients' EEG data, we propose a meta learning framework to improve the prediction of early ictal signals. The proposed bi-level optimization framework can help automatically label noisy data at the early ictal stage, as well as optimize the training accuracy of the backbone model. To validate our approach, we conduct a series of experiments to predict seizure onset in various long-term windows, with LSTM and ResNet implemented as the baseline models. Our study demonstrates that not only the ictal prediction accuracy obtained by meta learning is significantly improved, but also the resulting model captures some intrinsic patterns of the noisy data that a single backbone model could not learn. As a result, the predicted probability generated by the meta network serves as a highly effective early warning indicator.

Submitted 11 January, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

Comments: 13 pages, 4 figures

arXiv:2308.15952 [pdf, ps, other]

Benchmarking Multilabel Topic Classification in the Kyrgyz Language

Authors: Anton Alekseev, Sergey I. Nikolenko, Gulnara Kabaeva

Abstract: Kyrgyz is a very underrepresented language in terms of modern natural language processing resources. In this work, we present a new public benchmark for topic classification in Kyrgyz, introducing a dataset based on collected and annotated data from the news site 24.KG and presenting several baseline models for news classification in the multilabel setting. We train and evaluate both classical statistical and neural models, reporting the scores, discussing the results, and proposing directions for future work.

Submitted 30 August, 2023; originally announced August 2023.

Comments: Accepted to AIST 2023

arXiv:2307.09141 [pdf, other]

Machine Learning for SAT: Restricted Heuristics and New Graph Representations

Authors: Mikhail Shirokikh, Ilya Shenbin, Anton Alekseev, Sergey Nikolenko

Abstract: Boolean satisfiability (SAT) is a fundamental NP-complete problem with many applications, including automated planning and scheduling. To solve large instances, SAT solvers have to rely on heuristics, e.g., choosing a branching variable in DPLL and CDCL solvers. Such heuristics can be improved with machine learning (ML) models; they can reduce the number of steps but usually hinder the running time because useful models are relatively large and slow. We suggest the strategy of making a few initial steps with a trained ML model and then releasing control to classical heuristics; this simplifies cold start for SAT solving and can decrease both the number of steps and overall runtime, but requires a separate decision of when to release control to the solver. Moreover, we introduce a modification of Graph-Q-SAT tailored to SAT problems converted from other domains, e.g., open shop scheduling problems. We validate the feasibility of our approach with random and industrial SAT problems.

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2306.04723 [pdf, other]

Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts

Authors: Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva, Daniil Cherniavskii, Serguei Barannikov, Irina Piontkovskaya, Sergey Nikolenko, Evgeny Burnaev

Abstract: Rapidly increasing quality of AI-generated content makes it difficult to distinguish between human and AI-generated texts, which may lead to undesirable consequences for society. Therefore, it becomes increasingly important to study the properties of human texts that are invariant over different text domains and varying proficiency of human writers, can be easily calculated for any language, and can robustly separate natural and AI-generated texts regardless of the generation model and sampling method. In this work, we propose such an invariant for human-written texts, namely the intrinsic dimensionality of the manifold underlying the set of embeddings for a given text sample. We show that the average intrinsic dimensionality of fluent texts in a natural language hovers around the value $9$ for several alphabet-based languages and around $7$ for Chinese, while the average intrinsic dimensionality of AI-generated texts for each language is $\approx 1.5$ lower, with a clear statistical separation between human-generated and AI-generated distributions. This property allows us to build a score-based artificial text detector. The proposed detector's accuracy is stable over text domains, generator models, and human writer proficiency levels, outperforming SOTA detectors in model-agnostic and cross-domain scenarios by a significant margin.

Submitted 31 October, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

MSC Class: 68T50
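
To make "intrinsic dimensionality of a point cloud" concrete, here is a small sketch using the Levina-Bickel maximum-likelihood estimator. The abstract above does not say which estimator the authors use, so this is a swapped-in, well-known alternative chosen for brevity; the synthetic data (a 2-D plane embedded in 5-D space) and the neighbor count `k` are assumptions for illustration, standing in for real token embeddings.

```python
import math
import random


def mle_intrinsic_dimension(points, k=15):
    """Levina-Bickel maximum-likelihood estimate, averaged over all points."""
    estimates = []
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        t_k = dists[k - 1]  # distance to the k-th nearest neighbor
        log_ratios = [math.log(t_k / dists[j]) for j in range(k - 1)]
        estimates.append((k - 1) / sum(log_ratios))
    return sum(estimates) / len(estimates)


rng = random.Random(0)
# A 2-D plane embedded linearly in 5-D space: the ambient dimension is 5,
# but every point is determined by just two free parameters (u, v).
plane = []
for _ in range(400):
    u, v = rng.random(), rng.random()
    plane.append((u, v, u + v, u - v, 0.5 * u))

estimate = mle_intrinsic_dimension(plane)
print(round(estimate, 2))
```

The estimate should land near 2 rather than 5, which is the sense in which a detector could compare the dimensionality of embedding sets from different text sources.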

arXiv:2305.11626 [pdf, other]

CCT-Code: Cross-Consistency Training for Multilingual Clone Detection and Code Search

Authors: Nikita Sorokin, Dmitry Abulkhanov, Sergey Nikolenko, Valentin Malykh

Abstract: We consider the clone detection and information retrieval problems for source code, well-known tasks important for any programming language. Although it is also an important and interesting problem to find code snippets that operate identically but are written in different programming languages, to the best of our knowledge multilingual clone detection has not been studied in literature. In this work, we formulate the multilingual clone detection problem and present XCD, a new benchmark dataset produced from the CodeForces submissions dataset. Moreover, we present a novel training procedure, called cross-consistency training (CCT), that we apply to train language models on source code in different programming languages. The resulting CCT-LM model, initialized with GraphCodeBERT and fine-tuned with CCT, achieves new state of the art, outperforming existing approaches on the POJ-104 clone detection benchmark with 95.67\% MAP and AdvTest code search benchmark with 47.18\% MRR; it also shows the best results on the newly created multilingual clone detection benchmark XCD across all programming languages.

Submitted 19 May, 2023; originally announced May 2023.

arXiv:2305.11625 [pdf, other]

Searching by Code: a New SearchBySnippet Dataset and SnippeR Retrieval Model for Searching by Code Snippets

Authors: Ivan Sedykh, Dmitry Abulkhanov, Nikita Sorokin, Sergey Nikolenko, Valentin Malykh

Abstract: Code search is an important and well-studied task, but it usually means searching for code by a text query. We argue that using a code snippet (and possibly an error traceback) as a query while looking for bugfixing instructions and code samples is a natural use case not covered by prior art. Moreover, existing datasets use code comments rather than full-text descriptions as text, making them unsuitable for this use case. We present a new SearchBySnippet dataset implementing the search-by-code use case based on StackOverflow data; we show that on SearchBySnippet, existing architectures fall short of a simple BM25 baseline even after fine-tuning. We present a new single encoder model SnippeR that outperforms several strong baselines on SearchBySnippet with a result of 0.451 Recall@10; we propose the SearchBySnippet dataset and SnippeR as a new important benchmark for code search evaluation.

Submitted 27 May, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

Comments: COLING 2024
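
The abstract above reports Recall@10. Definitions of Recall@k vary between papers, so the sketch below assumes one common reading (the fraction of a query's relevant documents found in the top k, averaged over queries) and adds mean reciprocal rank for comparison. The document IDs and rankings are toy data invented for illustration.

```python
def recall_at_k(ranked, relevant, k=10):
    """Fraction of the relevant documents that appear in the top-k results."""
    top = set(ranked[:k])
    return len(top & relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


# Each query: (ranked result list, set of relevant document IDs).
queries = [
    (["d3", "d1", "d7"], {"d1"}),
    (["d2", "d5", "d9"], {"d9", "d4"}),
]
mean_recall = sum(recall_at_k(r, rel, k=2) for r, rel in queries) / len(queries)
mrr = sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)
print(mean_recall, mrr)  # 0.5 and 5/12 (about 0.4167)
```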

arXiv:2304.13393 [pdf, other]

STIR: Siamese Transformer for Image Retrieval Postprocessing

Authors: Aleksei Shabanov, Aleksei Tarasov, Sergey Nikolenko

Abstract: Current metric learning approaches for image retrieval are usually based on learning a space of informative latent representations where simple approaches such as the cosine distance will work well. Recent state of the art methods such as HypViT move to more complex embedding spaces that may yield better results but are harder to scale to production environments. In this work, we first construct a simpler model based on triplet loss with hard negative mining that performs at the state of the art level but does not have these drawbacks. Second, we introduce a novel approach for image retrieval postprocessing called Siamese Transformer for Image Retrieval (STIR) that reranks several top outputs in a single forward pass. Unlike previously proposed Reranking Transformers, STIR does not rely on global/local feature extraction and directly compares a query image and a retrieved candidate at the pixel level using an attention mechanism. The resulting approach defines a new state of the art on standard image retrieval datasets: Stanford Online Products and DeepFashion In-shop. We also release the source code at https://github.com/OML-Team/open-metric-learning/tree/main/pipelines/postprocessing/ and an interactive demo of our approach at https://dapladoc-oml-postprocessing-demo-srcappmain-pfh2g0.streamlit.app/

Submitted 27 April, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

Comments: 14 pages, 3 figures

ACM Class: H.3.3
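
The abstract above mentions a triplet loss with hard negative mining. One common way to realize that idea is the "batch-hard" variant, sketched below: for each anchor, take the farthest same-label example and the closest different-label example in the batch. Whether the authors use this exact mining rule is not stated in the abstract, so treat it as an assumption; the margin and the toy 2-D embeddings are also illustrative.

```python
import math


def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Mean over anchors of max(0, d(hardest pos) - d(hardest neg) + margin)."""
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        pos = [math.dist(anchor, e) for j, e in enumerate(embeddings)
               if j != i and labels[j] == label]
        neg = [math.dist(anchor, e) for j, e in enumerate(embeddings)
               if labels[j] != label]
        if not pos or not neg:
            continue  # anchor has no valid triplet in this batch
        # Hardest positive is the farthest one; hardest negative is the closest.
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Poorly separated classes give a positive loss...
mixed = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.5), (3.0, 0.0)]
loss_mixed = batch_hard_triplet_loss(mixed, ["a", "a", "b", "b"])

# ...while well-separated clusters already satisfy the margin.
separated = [(0.0, 0.0), (0.1, 0.0), (5.0, 0.0), (5.1, 0.0)]
loss_separated = batch_hard_triplet_loss(separated, ["a", "a", "b", "b"])
print(loss_mixed, loss_separated)
```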

arXiv:2211.17223 [pdf, other]

doi 10.21437/Interspeech.2023-1861

Topological Data Analysis for Speech Processing

Authors: Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva, Daniil Cherniavskii, Serguei Barannikov, Irina Piontkovskaya, Sergey Nikolenko, Evgeny Burnaev

Abstract: We apply topological data analysis (TDA) to speech classification problems and to the introspection of a pretrained speech model, HuBERT. To this end, we introduce a number of topological and algebraic features derived from Transformer attention maps and embeddings. We show that a simple linear classifier built on top of such features outperforms a fine-tuned classification head. In particular, we achieve an improvement of about $9\%$ accuracy and $5\%$ ERR on four common datasets; on CREMA-D, the proposed feature set reaches a new state of the art performance with accuracy $80.155$. We also show that topological features are able to reveal functional roles of speech Transformer heads; e.g., we find the heads capable of distinguishing between pairs of sample sources (natural/synthetic) or voices without any downstream fine-tuning. Our results demonstrate that TDA is a promising new approach for speech analysis, especially for tasks that require structural prediction. Appendices, an introduction to TDA, and other additional materials are available at https://topohubert.github.io/speech-topology-webpages/

Submitted 6 June, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

Comments: Accepted to INTERSPEECH 2023 conference

Journal ref: Proc. INTERSPEECH 2023, pages 311--315

arXiv:2207.12236 [pdf, other]

doi 10.1145/3503161.3548769

Personality-Driven Social Multimedia Content Recommendation

Authors: Qi Yang, Sergey Nikolenko, Alfred Huang, Aleksandr Farseev

Abstract: Social media marketing plays a vital role in promoting brand and product values to wide audiences. In order to boost their advertising revenues, global media buying platforms such as Facebook Ads constantly reduce the reach of branded organic posts, pushing brands to spend more on paid media ads. In order to run organic and paid social media marketing efficiently, it is necessary to understand the audience, tailoring the content to fit their interests and online behaviours, which is impossible to do manually at a large scale. At the same time, various personality type categorization schemes such as the Myers-Briggs Personality Type indicator make it possible to reveal the dependencies between personality traits and user content preferences on a wider scale by categorizing audience behaviours in a unified and structured manner. This problem is yet to be studied in depth by the research community, while the level of impact of different personality traits on content recommendation accuracy has not been widely utilised and comprehensively evaluated so far. Specifically, in this work we investigate the impact of human personality traits on the content recommendation model by applying a novel personality-driven multi-view content recommender system called Personality Content Marketing Recommender Engine, or PersiC. Our experimental results and real-world case study demonstrate not just PersiC's ability to perform efficient human personality-driven multi-view content recommendation, but also allow for actionable digital ad strategy recommendations, which when deployed are able to improve digital advertising efficiency by over 420% as compared to the original human-guided approach.

Submitted 25 July, 2022; originally announced July 2022.

arXiv:2206.12514 [pdf, other]

DetIE: Multilingual Open Information Extraction Inspired by Object Detection

Authors: Michael Vasilkovsky, Anton Alekseev, Valentin Malykh, Ilya Shenbin, Elena Tutubalina, Dmitriy Salikhov, Mikhail Stepnov, Andrey Chertok, Sergey Nikolenko

Abstract: State of the art neural methods for open information extraction (OpenIE) usually extract triplets (or tuples) iteratively in an autoregressive or predicate-based manner in order not to produce duplicates. In this work, we propose a different approach to the problem that can be equally or more successful. Namely, we present a novel single-pass method for OpenIE inspired by object detection algorithms from computer vision. We use an order-agnostic loss based on bipartite matching that forces unique predictions and a Transformer-based encoder-only architecture for sequence labeling. The proposed approach is faster and shows superior or similar performance in comparison with state of the art models on standard benchmarks in terms of both quality metrics and inference time. Our model sets the new state of the art performance of 67.7% F1 on CaRB evaluated as OIE2016 while being 3.35x faster at inference than previous state of the art. We also evaluate the multilingual version of our model in the zero-shot setting for two languages and introduce a strategy for generating synthetic multilingual data to fine-tune the model for each specific language. In this setting, we show performance improvement 15% on multilingual Re-OIE2016, reaching 75% F1 for both Portuguese and Spanish languages. Code and models are available at https://github.com/sberbank-ai/DetIE.

Submitted 24 June, 2022; originally announced June 2022.

Comments: Accepted to the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)
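
The order-agnostic loss mentioned in the DetIE abstract can be illustrated with a toy set-matching computation. The sketch below is not the authors' implementation: the function name `set_matching_loss`, the square cost matrix, and the brute-force search over permutations (standing in for the Hungarian algorithm used in practice) are all assumptions made for illustration.

```python
from itertools import permutations


def set_matching_loss(cost):
    """Return (loss, assignment) minimising total cost over one-to-one matchings.

    cost[i][j] is the (hypothetical) cost of pairing prediction i with
    reference j. The matrix is assumed to be square; brute force is only
    practical for a handful of items.
    """
    n = len(cost)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # perm[i] is the reference matched to prediction i.
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_loss:
            best_loss, best_perm = total, perm
    return best_loss, best_perm


if __name__ == "__main__":
    loss, assignment = set_matching_loss([[4, 1], [2, 3]])
    print(loss, assignment)
```

Because the minimum is taken over all matchings, reordering the predictions leaves the loss unchanged, which is the sense in which such a loss is order-agnostic.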

arXiv:2111.12956 [pdf, other]

doi 10.1007/978-3-031-16500-9_3

Near-Zero-Shot Suggestion Mining with a Little Help from WordNet

Authors: Anton Alekseev, Elena Tutubalina, Sejeong Kwon, Sergey Nikolenko

Abstract: In this work, we explore the constructive side of online reviews: advice, tips, requests, and suggestions that users provide about goods, venues, services, and other items of interest. To reduce training costs and annotation efforts needed to build a classifier for a specific label set, we present and evaluate several entailment-based zero-shot approaches to suggestion classification in a label-fully-unseen fashion. In particular, we introduce the strategy of assigning target class labels to sentences in English language with user intentions, which significantly improves prediction quality. The proposed strategies are evaluated with a comprehensive experimental study that validated our results both quantitatively and qualitatively.

Submitted 25 November, 2021; originally announced November 2021.

Comments: Accepted to the 10th International Conference on Analysis of Images, Social Networks and Texts (AIST 2021)

Journal ref: Analysis of Images, Social Networks and Texts. AIST 2021. Lecture Notes in Computer Science, vol 13217. Springer, Cham

arXiv:2102.06268 [pdf, other]

Paraelectric KH$_2$PO$_4$ Nanocrystals in Monolithic Mesoporous Silica: Structure and Lattice Dynamics

Authors: Yaroslav Shchur, Andriy V. Kityk, Viktor V. Strelchuk, Andrii S. Nikolenko, Nazariy A. Andrushchak, Patrick Huber, Anatolii S. Andrushchak

Abstract: Combining dielectric crystals with mesoporous solids allows a versatile design of functional nanomaterials, where the porous host provides a mechanical rigid scaffold structure and the molecular filling adds the functionalization. Here, we report a study of the complex lattice dynamics of a SiO$_2$:KH$_2$PO$_4$ nanocomposite consisting of a monolithic, mesoporous silica glass host with KH$_2$PO$_4$ nanocrystals embedded in its tubular channels $\sim$12 nm across. A micro-Raman investigation performed in the spectral range of 70-1600 cm$^{-1}$ reveals the complex lattice dynamics of the confined crystals. Their Raman spectrum resembles the one taken from bulk KH$_2$PO$_4$ crystals and thus, along with X-ray diffraction experiments, corroborates the successful solution-based synthesis of KH$_2$PO$_4$ nanocrystals with a structure analogous to the bulk material. We succeeded in observing not only the high-frequency internal modes ($\sim$900-1200 cm$^{-1}$), typical of internal vibrations of the PO$_4$ tetrahedra, but, more importantly, also the lowest frequency modes typical of bulk KH$_2$PO$_4$ crystals. The experimental Raman spectrum was interpreted with a group theory analysis and first-principle lattice dynamics calculations. The analysis of calculated eigen-vectors indicates the involvement of hydrogen atoms in most phonon modes corroborating the substantial significance of the hydrogen subsystem in the lattice dynamics of paraelectric bulk and of KH$_2$PO$_4$ crystals in extreme spatial confinement.
A marginal redistribution of relative Raman intensities of the confined compared to unconfined crystals presumably originates in slightly changed crystal fields and interatomic interactions, in particular for the parts of the nanocrystals in close proximity to the silica pore surfaces.

Submitted 11 February, 2021; originally announced February 2021.

Comments: 10 pages, 4 figures, in press

Journal ref: Journal of Alloys and Compounds (2021)

arXiv:2009.12419 [pdf, other]

Towards General Purpose Geometry-Preserving Single-View Depth Estimation

Authors: Mikhail Romanov, Nikolay Patatkin, Anna Vorontsova, Sergey Nikolenko, Anton Konushin, Dmitry Senyushkin

Abstract: Single-view depth estimation (SVDE) plays a crucial role in scene understanding for AR applications, 3D modeling, and robotics, providing the geometry of a scene based on a single image. Recent works have shown that a successful solution strongly relies on the diversity and volume of training data. This data can be sourced from stereo movies and photos. However, they do not provide geometrically complete depth maps (as disparities contain unknown shift value). Therefore, existing models trained on this data are not able to recover correct 3D representations. Our work shows that a model trained on this data along with conventional datasets can gain accuracy while predicting correct scene geometry. Surprisingly, only a small portion of geometrically correct depth maps are required to train a model that performs equally to a model trained on the full geometrically correct dataset. After that, we train computationally efficient models on a mixture of datasets using the proposed method. Through quantitative comparison on completely unseen datasets and qualitative comparison of 3D point clouds, we show that our model defines the new state of the art in general-purpose SVDE.

Submitted 9 February, 2021; v1 submitted 25 September, 2020; originally announced September 2020.

arXiv:2006.09766 [pdf, other]

doi 10.3233/JIFS-179908

Improving unsupervised neural aspect extraction for online discussions using out-of-domain classification

Authors: Anton Alekseev, Elena Tutubalina, Valentin Malykh, Sergey Nikolenko

Abstract: Deep learning architectures based on self-attention have recently achieved and surpassed state of the art results in the task of unsupervised aspect extraction and topic modeling. While models such as neural attention-based aspect extraction (ABAE) have been successfully applied to user-generated texts, they are less coherent when applied to traditional data sources such as news articles and newsgroup documents. In this work, we introduce a simple approach based on sentence filtering in order to improve topical aspects learned from newsgroups-based content without modifying the basic mechanism of ABAE. We train a probabilistic classifier to distinguish between out-of-domain texts (outer dataset) and in-domain texts (target dataset). Then, during data preparation we filter out sentences that have a low probability of being in-domain and train the neural model on the remaining sentences. The positive effect of sentence filtering on topic coherence is demonstrated in comparison to aspect extraction models trained on unfiltered texts.

Submitted 17 June, 2020; originally announced June 2020.

Comments: Journal of Intelligent & Fuzzy Systems, pre-press, https://content.iospress.com/articles/journal-of-intelligent-and-fuzzy-systems/ifs179908
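
The sentence-filtering step described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: the abstract only says "a probabilistic classifier", so the add-one-smoothed naive Bayes model, the whitespace tokenizer, the 0.5 threshold, and the names `train_nb`, `prob_in_domain`, and `filter_sentences` are all hypothetical choices.

```python
import math
from collections import Counter


def train_nb(in_domain, out_domain):
    """Fit a two-class multinomial naive Bayes model (an assumed classifier)."""
    counts = {"in": Counter(), "out": Counter()}
    for sentence in in_domain:
        counts["in"].update(sentence.lower().split())
    for sentence in out_domain:
        counts["out"].update(sentence.lower().split())
    vocab = set(counts["in"]) | set(counts["out"])
    total = len(in_domain) + len(out_domain)
    priors = {"in": len(in_domain) / total, "out": len(out_domain) / total}
    return counts, vocab, priors


def prob_in_domain(sentence, model):
    """Estimate P(in-domain | sentence); unseen tokens are ignored."""
    counts, vocab, priors = model
    log_scores = {}
    for label in ("in", "out"):
        denom = sum(counts[label].values()) + len(vocab)  # add-one smoothing
        score = math.log(priors[label])
        for token in sentence.lower().split():
            if token in vocab:
                score += math.log((counts[label][token] + 1) / denom)
        log_scores[label] = score
    # Normalise the two log scores into a probability without overflow.
    top = max(log_scores.values())
    exp_in = math.exp(log_scores["in"] - top)
    exp_out = math.exp(log_scores["out"] - top)
    return exp_in / (exp_in + exp_out)


def filter_sentences(sentences, model, threshold=0.5):
    """Keep only sentences whose in-domain probability reaches the threshold."""
    return [s for s in sentences if prob_in_domain(s, model) >= threshold]


if __name__ == "__main__":
    model = train_nb(
        ["the gpu driver crashed again", "install the new graphics driver"],
        ["the senate passed the bill", "the election results were announced"],
    )
    print(filter_sentences(["my driver crashed", "the senate election"], model))
```

The surviving sentences would then be passed to the aspect-extraction model; in this sketch the threshold controls how aggressively borderline sentences are discarded.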

arXiv:2004.03659 [pdf, other]

doi 10.1093/bioinformatics/btaa675

The Russian Drug Reaction Corpus and Neural Models for Drug Reactions and Effectiveness Detection in User Reviews

Authors: Elena Tutubalina, Ilseyar Alimova, Zulfat Miftahutdinov, Andrey Sakhovskiy, Valentin Malykh, Sergey Nikolenko

Abstract: The Russian Drug Reaction Corpus (RuDReC) is a new partially annotated corpus of consumer reviews in Russian about pharmaceutical products for the detection of health-related named entities and the effectiveness of pharmaceutical products. The corpus itself consists of two parts, the raw one and the labelled one. The raw part includes 1.4 million health-related user-generated texts collected from various Internet sources, including social media. The labelled part contains 500 consumer reviews about drug therapy with drug- and disease-related information. Labels for sentences include health-related issues or their absence. The sentences with one are additionally labelled at the expression level for identification of fine-grained subtypes such as drug classes and drug forms, drug indications, and drug reactions. Further, we present a baseline model for named entity recognition (NER) and multi-label sentence classification tasks on this corpus. The macro F1 score of 74.85% in the NER task was achieved by our RuDR-BERT model. For the sentence classification task, our model achieves the macro F1 score of 68.82% gaining 7.47% over the score of BERT model trained on Russian data. We make the RuDReC corpus and pretrained weights of domain-specific BERT models freely available at https://github.com/cimm-kzn/RuDReC

Submitted 7 April, 2020; originally announced April 2020.

Comments: 9 pages, 9 tables, 4 figures

Journal ref: Bioinformatics, 2020

arXiv:2003.08791 [pdf, other]

High-Resolution Daytime Translation Without Domain Labels

Authors: Ivan Anokhin, Pavel Solovev, Denis Korzhenkov, Alexey Kharlamov, Taras Khakhulin, Alexey Silvestrov, Sergey Nikolenko, Victor Lempitsky, Gleb Sterkin

Abstract: Modeling daytime changes in high resolution photographs, e.g., re-rendering the same scene under different illuminations typical for day, night, or dawn, is a challenging image manipulation task. We present the high-resolution daytime translation (HiDT) model for this task. HiDT combines a generative image-to-image model and a new upsampling scheme that allows to apply image translation at high resolution. The model demonstrates competitive results in terms of both commonly used GAN metrics and human evaluation. Importantly, this good performance comes as a result of training on a dataset of still landscape images with no daytime labels available. Our results are available at https://saic-mdal.github.io/HiDT/.

Submitted 23 March, 2020; v1 submitted 19 March, 2020; originally announced March 2020.

Comments: accepted to CVPR 2020

arXiv:2002.08130 [pdf]

doi 10.1109/ICTON.2019.8840345

KH2PO4 + Host Matrix (Alumina / SiO$_2$) Nanocomposite: Raman Scattering Insight

Authors: Ya. Shchur, A. S. Andrushchak, V. V. Strelchuk, A. S. Nikolenko, V. T. Adamiv, N. A. Andrushchak, P. Göring, P. Huber, A. V. Kityk

Abstract: We report on the synthesis and Raman scattering characterization of composite materials based on the host nanoporous matrices filled with nanostructured KH2PO4 (KDP) crystal. Silica (SiO2) and anodized aluminium oxide (AAO) were used as host matrices with various pore diameters, inter-pore spacing and morphology. The structure of the nanocomposites was investigated by X-ray diffraction and scanning electron microscopy. Raman scattering reveals the creation of one-dimensional nanostructured KDP inside the SiO2 matrix. We clearly observed the stretching ν1, ν3 and bending ν2 vibrations of PO4 tetrahedral groups in the Raman spectrum of SiO2 + KDP. In Raman scattering spectra of AAO + KDP nanocomposite, the broad fluorescence background of AAO matrix dominates to a great extent, hindering thus the detecting of the KDP compound spectral response.

Submitted 19 February, 2020; originally announced February 2020.

Comments: 4 pages, 2 figures; 21st International Conference on Transparent Optical Networks (ICTON) 2019

arXiv:1912.11160 [pdf, other]

doi 10.1145/3336191.3371831

RecVAE: a New Variational Autoencoder for Top-N Recommendations with Implicit Feedback

Authors: Ilya Shenbin, Anton Alekseev, Elena Tutubalina, Valentin Malykh, Sergey I. Nikolenko

Abstract: Recent research has shown the advantages of using autoencoders based on deep neural networks for collaborative filtering. In particular, the recently proposed Mult-VAE model, which used the multinomial likelihood variational autoencoders, has shown excellent results for top-N recommendations. In this work, we propose the Recommender VAE (RecVAE) model that originates from our research on regularization techniques for variational autoencoders. RecVAE introduces several novel ideas to improve Mult-VAE, including a novel composite prior distribution for the latent codes, a new approach to setting the $β$ hyperparameter for the $β$-VAE framework, and a new approach to training based on alternating updates. In experimental evaluation, we show that RecVAE significantly outperforms previously proposed autoencoder-based models, including Mult-VAE and RaCT, across classical collaborative filtering datasets, and present a detailed ablation study to assess our new developments. Code and models are available at https://github.com/ilya-shenbin/RecVAE.

Submitted 23 December, 2019; originally announced December 2019.

Comments: In The Thirteenth ACM International Conference on Web Search and Data Mining (WSDM '20), February 3-7, 2020, Houston, TX, USA. ACM, New York, NY, USA, 9 pages

arXiv:1909.11512 [pdf, other]

Synthetic Data for Deep Learning

Authors: Sergey I. Nikolenko

Abstract: Synthetic data is an increasingly popular tool for training deep learning models, especially in computer vision but also in other areas. In this work, we attempt to provide a comprehensive survey of the various directions in the development and application of synthetic data. First, we discuss synthetic datasets for basic computer vision problems, both low-level (e.g., optical flow estimation) and high-level (e.g., semantic segmentation), synthetic environments and datasets for outdoor and urban scenes (autonomous driving), indoor scenes (indoor navigation), aerial navigation, simulation environments for robotics, applications of synthetic data outside computer vision (in neural programming, bioinformatics, NLP, and more); we also survey the work on improving synthetic data development and alternative ways to produce it such as GANs. Second, we discuss in detail the synthetic-to-real domain adaptation problem that inevitably arises in applications of synthetic data, including synthetic-to-real refinement with GAN-based models and domain adaptation at the feature/model level without explicit data transformations. Third, we turn to privacy-related applications of synthetic data and review the work on generating synthetic datasets with differential privacy guarantees. We conclude by highlighting the most promising directions for further work in synthetic data studies.

Submitted 25 September, 2019; originally announced September 2019.

Comments: 156 pages, 24 figures, 719 references

arXiv:1908.07069 [pdf, other]

CommentsRadar: Dive into Unique Data on All Comments on the Web

Authors: Sergey Nikolenko, Elena Tutubalina, Zulfat Miftahutdinov, Eugene Beloded

Abstract: We introduce an entity-centric search engine CommentsRadar that pairs entity queries with articles and user opinions covering a wide range of topics from top commented sites. The engine aggregates articles and comments for these articles, extracts named entities, links them together and with knowledge base entries, performs sentiment analysis, and aggregates the results, aiming to mine for temporal trends and other insights. In this work, we present the general engine, discuss the models used for all steps of this pipeline, and introduce several case studies that discover important insights from online commenting data.

Submitted 16 August, 2019; originally announced August 2019.

arXiv:1908.02511 [pdf, other]

Free-Lunch Saliency via Attention in Atari Agents

Authors: Dmitry Nikulin, Anastasia Ianina, Vladimir Aliev, Sergey Nikolenko

Abstract: We propose a new approach to visualize saliency maps for deep neural network models and apply it to deep reinforcement learning agents trained on Atari environments. Our method adds an attention module that we call FLS (Free Lunch Saliency) to the feature extractor from an established baseline (Mnih et al., 2015). This addition results in a trainable model that can produce saliency maps, i.e., visualizations of the importance of different parts of the input for the agent's current decision making. We show experimentally that a network with an FLS module exhibits performance similar to the baseline (i.e., it is "free", with no performance cost) and can be used as a drop-in replacement for reinforcement learning agents. We also design another feature extractor that scores slightly lower but provides higher-fidelity visualizations. In addition to attained scores, we report saliency metrics evaluated on the Atari-HEAD dataset of human gameplay.

Submitted 30 October, 2019; v1 submitted 7 August, 2019; originally announced August 2019.

Comments: 2019 ICCV Workshop on Interpreting and Explaining Visual Artificial Intelligence Models. 15 pages, 14 figures, 5 tables

arXiv:1907.04399 [pdf, other]

New Competitiveness Bounds for the Shared Memory Switch

Authors: Ivan Bochkov, Alex Davydow, Nikita Gaevoy, Sergey I. Nikolenko

Abstract: We consider one of the simplest and best known buffer management architectures: the shared memory switch with multiple output queues and uniform packets. It was one of the first models studied by competitive analysis, with the Longest Queue Drop (LQD) buffer management policy shown to be at least $\sqrt{2}$- and at most $2$-competitive; a general lower bound of $4/3$ has been proven for all deterministic online algorithms. Closing the gap between $\sqrt{2}$ and $2$ has remained an open problem in competitive analysis for more than a decade, with only marginal success in reducing the upper bound of $2$. In this work, we first present a simplified proof for the $\sqrt{2}$ lower bound for LQD and then, using a reduction to the continuous case, improve the general lower bound for all deterministic online algorithms from $\frac 43$ to $\sqrt{2}$. Then, we proceed to improve the lower bound of $\sqrt{2}$ specifically for LQD, showing that LQD is at least $1.44546086$-competitive. We are able to prove the bound by presenting an explicit construction of the optimal clairvoyant algorithm which then allows for two different ways to prove lower bounds: by direct computer simulations and by proving lower bounds via linear programming. The linear programming approach yields a lower bound for LQD of $1.4427902$ (still larger than $\sqrt{2}$).

Submitted 9 July, 2019; originally announced July 2019.

Comments: 23 pages, 8 figures

MSC Class: 68W27
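
For readers unfamiliar with the Longest Queue Drop policy named in the abstract above, a minimal simulation sketch follows. It assumes the commonly described form of LQD (when the shared buffer is full, a packet is pushed out of the longest queue, counting the arrival in its own queue); the class name `SharedMemorySwitch`, the tie-breaking rule that drops the arrival, and the one-packet-per-queue transmission step are illustrative assumptions rather than details taken from the paper.

```python
from collections import deque


class SharedMemorySwitch:
    """Shared buffer of `capacity` unit-size packets split across output queues."""

    def __init__(self, num_ports, capacity):
        self.capacity = capacity
        self.queues = [deque() for _ in range(num_ports)]

    def occupancy(self):
        return sum(len(q) for q in self.queues)

    def arrive(self, port, packet):
        """Admit `packet` destined for `port`; return the dropped packet or None."""
        if self.occupancy() < self.capacity:
            self.queues[port].append(packet)
            return None
        # Buffer full: compare queue lengths, counting the arrival in its queue.
        lengths = [len(q) for q in self.queues]
        lengths[port] += 1
        if lengths[port] == max(lengths):
            # Assumed tie-break: the arrival's own queue is (one of) the
            # longest, so the arrival itself is dropped.
            return packet
        longest = lengths.index(max(lengths))
        victim = self.queues[longest].pop()  # push out from the tail
        self.queues[port].append(packet)
        return victim

    def transmit(self):
        """One time slot: every non-empty queue sends its head packet."""
        return [q.popleft() if q else None for q in self.queues]


if __name__ == "__main__":
    switch = SharedMemorySwitch(num_ports=2, capacity=3)
    for name in "abc":
        switch.arrive(0, name)
    print(switch.arrive(1, "d"))  # queue 0 is longest, so its tail is pushed out
    print(switch.transmit())
```

A competitive-analysis experiment would compare the number of packets this policy transmits against an optimal clairvoyant schedule on the same arrival sequence; that comparison is outside the scope of this sketch.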

arXiv:1905.01743 [pdf, other]

Breast Tumor Cellularity Assessment using Deep Neural Networks

Authors: Alexander Rakhlin, Aleksei Tiulpin, Alexey A. Shvets, Alexandr A. Kalinin, Vladimir I. Iglovikov, Sergey Nikolenko

Abstract: Breast cancer is one of the main causes of death worldwide. Histopathological cellularity assessment of residual tumors in post-surgical tissues is used to analyze a tumor's response to a therapy. Correct cellularity assessment increases the chances of getting an appropriate treatment and facilitates the patient's survival. In current clinical practice, tumor cellularity is manually estimated by pathologists; this process is tedious and prone to errors or low agreement rates between assessors. In this work, we evaluated three strong novel Deep Learning-based approaches for automatic assessment of tumor cellularity from post-treated breast surgical specimens stained with hematoxylin and eosin. We validated the proposed methods on the BreastPathQ SPIE challenge dataset that consisted of 2395 image patches selected from whole slide images acquired from 64 patients. Compared to expert pathologist scoring, our best performing method yielded the Cohen's kappa coefficient of 0.70 (vs. 0.42 previously known in literature) and the intra-class correlation coefficient of 0.89 (vs. 0.83). Our results suggest that Deep Learning-based methods have a significant potential to alleviate the burden on pathologists, enhance the diagnostic workflow, and, thereby, facilitate better clinical outcomes in breast cancer treatment.

Submitted 3 September, 2019; v1 submitted 5 May, 2019; originally announced May 2019.
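
The agreement statistic quoted in the abstract above, Cohen's kappa, can be computed as in the sketch below. It shows the standard unweighted form for categorical labels, $\kappa = (p_o - p_e)/(1 - p_e)$; the abstract does not say which variant or binning of cellularity scores was used, so the function name `cohens_kappa` and the toy labels are assumptions for illustration only.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    print(cohens_kappa([0, 0, 1, 1], [0, 0, 1, 0]))
```

Values near 1 indicate agreement well beyond chance, while values near 0 indicate agreement no better than chance.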

arXiv:1901.07829 [pdf, other]

AspeRa: Aspect-based Rating Prediction Model

Authors: Sergey I. Nikolenko, Elena Tutubalina, Valentin Malykh, Ilya Shenbin, Anton Alekseev

Abstract: We propose a novel end-to-end Aspect-based Rating Prediction model (AspeRa) that estimates user rating based on review texts for the items and at the same time discovers coherent aspects of reviews that can be used to explain predictions or profile users. The AspeRa model uses max-margin losses for joint item and user embedding learning and a dual-headed architecture; it significantly outperforms recently proposed state-of-the-art models such as DeepCoNN, HFT, NARRE, and TransRev on two real world data sets of user reviews. With qualitative examination of the aspects and quantitative evaluation of rating prediction models based on these aspects, we show how aspect embeddings can be used in a recommender system.

Submitted 23 January, 2019; originally announced January 2019.

Comments: accepted to ECIR 2019

arXiv:1901.06345 [pdf, ps, other]

Adapting Convolutional Neural Networks for Geographical Domain Shift

Authors: Pavel Ostyakov, Sergey I. Nikolenko

Abstract: We present the winning solution for the Inclusive Images Competition organized as part of the Conference on Neural Information Processing Systems (NeurIPS 2018) Competition Track. The competition was organized to study ways to cope with domain shift in image processing, specifically geographical shift: the training and two test sets in the competition had different geographical distributions. Our solution has proven to be relatively straightforward and simple: it is an ensemble of several CNNs where only the last layer is fine-tuned with the help of a small labeled set of tuning labels made available by the organizers. We believe that while domain shift remains a formidable problem, our approach opens up new possibilities for alleviating this problem in practice, where small labeled datasets from the target domain are usually either available or can be obtained and labeled cheaply.

Submitted 18 January, 2019; originally announced January 2019.

arXiv:1811.12823 [pdf, other]

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

Authors: Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, Artur Kadurin, Simon Johansson, Hongming Chen, Sergey Nikolenko, Alan Aspuru-Guzik, Alex Zhavoronkov

Abstract: Generative models are becoming a tool of choice for exploring the molecular space. These models learn on a large training dataset and produce novel molecular structures with similar properties. Generated structures can be utilized for virtual screening or training semi-supervised predictive models in the downstream tasks. While there are plenty of generative models, it is unclear how to compare and rank them. In this work, we introduce a benchmarking platform called Molecular Sets (MOSES) to standardize training and comparison of molecular generative models. MOSES provides a training and testing datasets, and a set of metrics to evaluate the quality and diversity of generated structures. We have implemented and compared several molecular generation models and suggest to use our results as reference points for further advancements in generative chemistry research. The platform and source code are available at https://github.com/molecularsets/moses.

Submitted 28 October, 2020; v1 submitted 29 November, 2018; originally announced November 2018.

arXiv:1811.11523 [pdf, ps, other]

doi 10.1016/j.jbi.2018.06.006

Sequence Learning with RNNs for Medical Concept Normalization in User-Generated Texts

Authors: Elena Tutubalina, Zulfat Miftahutdinov, Sergey Nikolenko, Valentin Malykh

Abstract: In this work, we consider the medical concept normalization problem, i.e., the problem of mapping a disease mention in free-form text to a concept in a controlled vocabulary, usually to the standard thesaurus in the Unified Medical Language System (UMLS). This task is challenging since medical terminology is very different when coming from health care professionals or from the general public in the form of social media texts. We approach it as a sequence learning problem, with recurrent neural networks trained to obtain semantic representations of one- and multi-word expressions. We develop end-to-end neural architectures tailored specifically to medical concept normalization, including bidirectional LSTM and GRU with an attention mechanism and additional semantic similarity features based on UMLS. Our evaluation over a standard benchmark shows that our model improves over a state-of-the-art baseline for classification based on CNNs.

Submitted 29 November, 2018; v1 submitted 28 November, 2018; originally announced November 2018.

Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

Report number: ML4H/2018/117

Journal ref: Journal of Biomedical Informatics, 2018, Vol. 84, pp. 93-102

arXiv:1811.11067 [pdf, other]

Learning State Representations in Complex Systems with Multimodal Data

Authors: Pavel Solovev, Vladimir Aliev, Pavel Ostyakov, Gleb Sterkin, Elizaveta Logacheva, Stepan Troeshestov, Roman Suvorov, Anton Mashikhin, Oleg Khomenko, Sergey I. Nikolenko

Abstract: Representation learning becomes especially important for complex systems with multimodal data sources such as cameras or sensors. Recent advances in reinforcement learning and optimal control make it possible to design control algorithms on these latent representations, but the field still lacks a large-scale standard dataset for unified comparison. In this work, we present a large-scale dataset and evaluation framework for representation learning for the complex task of landing an airplane. We implement and compare several approaches to representation learning on this dataset in terms of the quality of simple supervised learning tasks and disentanglement scores. The resulting representations can be used for further tasks such as anomaly detection, optimal control, model-based reinforcement learning, and other applications.

Submitted 15 January, 2019; v1 submitted 27 November, 2018; originally announced November 2018.

Comments: Fixed references

arXiv:1811.07630 [pdf, other]

SEIGAN: Towards Compositional Image Generation by Simultaneously Learning to Segment, Enhance, and Inpaint

Authors: Pavel Ostyakov, Roman Suvorov, Elizaveta Logacheva, Oleg Khomenko, Sergey I. Nikolenko

Abstract: We present a novel approach to image manipulation and understanding by simultaneously learning to segment object masks, paste objects to another background image, and remove them from original images. For this purpose, we develop a novel generative model for compositional image generation, SEIGAN (Segment-Enhance-Inpaint Generative Adversarial Network), which learns these three operations together in an adversarial architecture with additional cycle consistency losses. To train, SEIGAN needs only bounding box supervision and does not require pairing or ground truth masks. SEIGAN produces better generated images (evaluated by human assessors) than other approaches and produces high-quality segmentation masks, improving over other adversarially trained approaches and getting closer to the results of fully supervised training.

Submitted 15 January, 2019; v1 submitted 19 November, 2018; originally announced November 2018.

arXiv:1809.04403 [pdf, other]

Label Denoising with Large Ensembles of Heterogeneous Neural Networks

Authors: Pavel Ostyakov, Elizaveta Logacheva, Roman Suvorov, Vladimir Aliev, Gleb Sterkin, Oleg Khomenko, Sergey I. Nikolenko

Abstract: Despite recent advances in computer vision based on various convolutional architectures, video understanding remains an important challenge. In this work, we present and discuss a top solution for the large-scale video classification (labeling) problem introduced as a Kaggle competition based on the YouTube-8M dataset. We show and compare different approaches to preprocessing, data augmentation, model architectures, and model combination. Our final model is based on a large ensemble of video- and frame-level models but fits into rather limiting hardware constraints. We apply an approach based on knowledge distillation to deal with noisy labels in the original dataset and the recently developed mixup technique to improve the basic models.

Submitted 15 January, 2019; v1 submitted 12 September, 2018; originally announced September 2018.

arXiv:1510.04235 [pdf, other]

BASEL (Buffering Architecture SpEcification Language)

Authors: Kirill Kogan, Danushka Menikkumbura, Gustavo Petri, Youngtae Noh, Sergey Nikolenko, Patrick Eugster

Abstract: Buffering architectures and policies for their efficient management constitute one of the core ingredients of a network architecture. In this work we introduce a new specification language, BASEL, that allows expressing virtual buffering architectures and management policies representing a variety of economic models. BASEL does not require the user to implement policies in a high-level language; rather, the entire buffering architecture and its policy are reduced to several comparators and simple functions. We show examples of buffering architectures in BASEL and demonstrate empirically the impact of various settings on performance.

Submitted 14 October, 2015; originally announced October 2015.

Comments: 11 pages, 11 figures, 2 tables

arXiv:1211.2756 [pdf, other]

doi 10.1186/1471-2164-14-S1-S7

BayesHammer: Bayesian clustering for error correction in single-cell sequencing

Authors: Sergey I. Nikolenko, Anton I. Korobeynikov, Max A. Alekseyev

Abstract: Error correction of sequenced reads remains a difficult task, especially in single-cell sequencing projects with extremely non-uniform coverage. While existing error correction tools designed for standard (multi-cell) sequencing data usually come up short in single-cell sequencing projects, algorithms actually used for single-cell error correction have been so far very simplistic. We introduce several novel algorithms based on Hamming graphs and Bayesian subclustering in our new error correction tool BayesHammer. While BayesHammer was designed for single-cell sequencing, we demonstrate that it also improves on existing error correction tools for multi-cell sequencing data while working much faster on real-life datasets. We benchmark BayesHammer on both $k$-mer counts and actual assembly results with the SPAdes genome assembler.

Submitted 12 November, 2012; originally announced November 2012.

Journal ref: BMC Genomics 14(Suppl 1) (2013), pp. S7

arXiv:1204.5443 [pdf, other]

FIFO Queueing Policies for Packets with Heterogeneous Processing

Authors: Kirill Kogan, Alejandro López-Ortiz, Sergey I. Nikolenko, Alexander V. Sirotkin, Denis Tugaryov

Abstract: We consider the problem of managing a bounded size First-In-First-Out (FIFO) queue buffer, where each incoming unit-sized packet requires several rounds of processing before it can be transmitted out. Our objective is to maximize the total number of successfully transmitted packets. We consider both push-out (when the policy is permitted to drop already admitted packets) and non-push-out cases. In particular, we provide analytical guarantees for the throughput performance of our algorithms. We further conduct a comprehensive simulation study which experimentally validates the predicted theoretical behaviour.

Submitted 24 April, 2012; originally announced April 2012.

Comments: 15 pages

arXiv:1202.5755 [pdf, other]

Balancing Work and Size with Bounded Buffers

Authors: Kirill Kogan, Alejandro Lopez-Ortiz, Sergey I. Nikolenko, Gabriel Scalosub, Michael Segal

Abstract: We consider the fundamental problem of managing a bounded size queue buffer where traffic consists of packets of varying size, where each packet requires several rounds of processing before it can be transmitted from the queue buffer. The goal in such an environment is to maximize the overall size of packets that are successfully transmitted. This model is motivated by the ever-growing ubiquity of network processor architectures, which must deal with heterogeneously-sized traffic, with heterogeneous processing requirements. Our work addresses the tension between two conflicting algorithmic approaches in such settings: the tendency to favor packets with fewer processing requirements, thus leading to fast contributions to the accumulated throughput, as opposed to preferring packets of larger size, which imply a large increase in throughput at each step. We present a model for studying such systems, and present competitive algorithms whose performance depends on the maximum size a packet may have, and maximum amount of processing a packet may require. We further provide lower bounds on algorithm performance in such settings.

Submitted 5 September, 2013; v1 submitted 26 February, 2012; originally announced February 2012.

Comments: 22 pages, 7 figures

arXiv:0802.2863 [pdf, ps, other]

New Combinatorial Complete One-Way Functions

Authors: Arist Kojevnikov, Sergey I. Nikolenko

Abstract: In 2003, Leonid A. Levin presented the idea of a combinatorial complete one-way function and a sketch of the proof that Tiling represents such a function. In this paper, we present two new one-way functions based on semi-Thue string rewriting systems and a version of the Post Correspondence Problem and prove their completeness. Besides, we present an alternative proof of Levin's result. We also discuss the properties a combinatorial problem should have in order to hold a complete one-way function.

Submitted 20 February, 2008; originally announced February 2008.

Journal ref: In Proceedings of the 25th Annual Symposium on the Theoretical Aspects of Computer Science - STACS 2008, Bordeaux, France (2008)

arXiv:math/0606335 [pdf]

Chow ring structure made simple

Authors: S. Nikolenko, N. Semenov

Abstract: We show how to translate the task of computing the multiplicative structure of a Chow ring of a projective homogeneous variety into an easily understandable combinatorial task of calculating in the corresponding polynomial ring. The algorithms are also presented as a Maple package. Then we proceed to compute the multiplicative structure of the Chow rings for projective homogeneous varieties E6/P1, E7/P7, and E8/P8.

Submitted 14 June, 2006; originally announced June 2006.

Comments: 17 pages

MSC Class: 14M15

arXiv:math/0502382 [pdf, ps, other]

Motivic decomposition of anisotropic varieties of type F_4 into generalized Rost motives

Authors: S. Nikolenko, N. Semenov, K. Zainoulline

Abstract: This is an extended version of the previous preprint dated February 2005. We prove that the Chow motive of an anisotropic projective homogeneous variety of type F4 is isomorphic to the direct sum of twisted copies of a generalized Rost motive. In particular, we provide an explicit construction of a generalized Rost motive for a generically splitting variety for a symbol in K_3^M(k)/3. We also establish a motivic isomorphism between two anisotropic non-isomorphic projective homogeneous varieties of type F4. All our results hold for Chow motives with integral coefficients.

Submitted 22 September, 2005; v1 submitted 17 February, 2005; originally announced February 2005.

Comments: 20 pages, XYPIC

Journal ref: J. of K-theory 3 (2009), no.1, 85-102.

arXiv:cs/0301012 [pdf, ps, other]

Hard satisfiable formulas for DPLL-type algorithms

Authors: Sergey I. Nikolenko

Abstract: We address lower bounds on the time complexity of algorithms solving the propositional satisfiability problem. Namely, we consider two DPLL-type algorithms, enhanced with the unit clause and pure literal heuristics. Exponential lower bounds for solving satisfiability on provably satisfiable formulas are proven.

Submitted 15 January, 2003; originally announced January 2003.

Comments: 9 pages

ACM Class: F.2.2

Showing 1–46 of 46 results for author: Nikolenko, S