Showing 1–8 of 8 results for author: Kuriyozov, E

  1. arXiv:2402.01916  [pdf, ps, other]

    cs.IR cs.CL cs.LG

    CoLe and LYS at BioASQ MESINESP8 Task: similarity based descriptor assignment in Spanish

    Authors: Francisco J. Ribadas-Pena, Shuyuan Cao, Elmurod Kuriyozov

    Abstract: In this paper, we describe our participation in the MESINESP Task of the BioASQ biomedical semantic indexing challenge. The participating system follows an approach based solely on conventional information retrieval tools. We have evaluated various alternatives for extracting index terms from IBECS/LILACS documents in order to be stored in an Apache Lucene index. Those indexed representations are…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted at the 8th BioASQ Workshop at the 11th Conference and Labs of the Evaluation Forum (CLEF) 2020. 11 pages

    MSC Class: 68P20; 68T50 ACM Class: H.3.3; I.2.7

    Journal ref: Working Notes of CLEF 2020. Vol. 2696 of CEUR Workshop Proceedings (CEUR-WS.org)

  2. Design and Implementation of a Tool for Extracting Uzbek Syllables

    Authors: Ulugbek Salaev, Elmurod Kuriyozov, Gayrat Matlatipov

    Abstract: The accurate syllabification of words plays a vital role in various Natural Language Processing applications. Syllabification is a versatile linguistic tool with applications in linguistic research, language technology, education, and various fields where understanding and processing language is essential. In this paper, we present a comprehensive approach to syllabification for the Uzbek language…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: Accepted for publication at The Proceedings of 2023 IEEE XVI International Scientific and Technical Conference Actual Problems of Electronic Instrument Engineering (APEIE), 10-12 Nov. 2023

  3. arXiv:2302.14494  [pdf]

    cs.CL

    Text classification dataset and analysis for Uzbek language

    Authors: Elmurod Kuriyozov, Ulugbek Salaev, Sanatbek Matlatipov, Gayrat Matlatipov

    Abstract: Text classification is an important task in Natural Language Processing (NLP), where the goal is to categorize text data into predefined classes. In this study, we analyse the dataset creation steps and evaluation techniques of multi-label news categorisation task as part of text classification. We first present a newly obtained dataset for Uzbek text classification, which was collected from 10 di…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: Preprint of the paper accepted to The 10th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics. April 21-23, 2023, Poznan, Poland

  4. arXiv:2301.12711  [pdf]

    cs.CL

    UzbekTagger: The rule-based POS tagger for Uzbek language

    Authors: Maksud Sharipov, Elmurod Kuriyozov, Ollabergan Yuldashev, Ogabek Sobirov

    Abstract: This research paper presents a part-of-speech (POS) annotated dataset and tagger tool for the low-resource Uzbek language. The dataset includes 12 tags, which were used to develop a rule-based POS-tagger tool. The corpus text used in the annotation process was made sure to be balanced over 20 different fields in order to ensure its representativeness. Uzbek being an agglutinative language so the m…

    Submitted 1 March, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: Preprint of the accepted paper to The 10th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, April 21-23, 2023, Poznan, Poland

  5. arXiv:2205.15930  [pdf, other]

    cs.CL cs.AI

    Uzbek Sentiment Analysis based on local Restaurant Reviews

    Authors: Sanatbek Matlatipov, Hulkar Rahimboeva, Jaloliddin Rajabov, Elmurod Kuriyozov

    Abstract: Extracting useful information for sentiment analysis and classification problems from a big amount of user-generated feedback, such as restaurant reviews, is a crucial task of natural language processing, which is not only for customer satisfaction where it can give personalized services, but can also influence the further development of a company. In this paper, we present a work done on collecti…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: The International Conference on Agglutinative Language Technologies as a challenge of Natural Language Processing (ALTNLP) 2022, Koper, Slovenia

  6. arXiv:2205.09578  [pdf, other]

    cs.CL

    A machine transliteration tool between Uzbek alphabets

    Authors: Ulugbek Salaev, Elmurod Kuriyozov, Carlos Gómez-Rodríguez

    Abstract: Machine transliteration, as defined in this paper, is a process of automatically transforming written script of words from a source alphabet into words of another target alphabet within the same language, while preserving their meaning, as well as pronunciation. The main goal of this paper is to present a machine transliteration tool between three common scripts used in low-resource Uzbek language…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: Preprint of a conference paper: The International Conference on Agglutinative Language Technologies as a challenge of Natural Language Processing (ALTNLP)

  7. arXiv:2205.06072  [pdf, other]

    cs.CL

    SimRelUz: Similarity and Relatedness scores as a Semantic Evaluation dataset for Uzbek language

    Authors: Ulugbek Salaev, Elmurod Kuriyozov, Carlos Gómez-Rodríguez

    Abstract: Semantic relatedness between words is one of the core concepts in natural language processing, thus making semantic evaluation an important task. In this paper, we present a semantic model evaluation dataset: SimRelUz - a collection of similarity and relatedness scores of word pairs for the low-resource Uzbek language. The dataset consists of more than a thousand pairs of words carefully selected…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: Final version, published in the proceedings of SIGUL workshop of LREC 2022

  8. arXiv:2005.08340  [pdf, other]

    cs.CL

    Cross-Lingual Word Embeddings for Turkic Languages

    Authors: Elmurod Kuriyozov, Yerai Doval, Carlos Gómez-Rodríguez

    Abstract: There has been an increasing interest in learning cross-lingual word embeddings to transfer knowledge obtained from a resource-rich language, such as English, to lower-resource languages for which annotated data is scarce, such as Turkish, Russian, and many others. In this paper, we present the first viability study of established techniques to align monolingual embedding spaces for Turkish, Uzbek…

    Submitted 17 May, 2020; originally announced May 2020.

    Comments: Final version, published in the proceedings of LREC 2020

    MSC Class: 68T50; 91F20 ACM Class: I.2.7

    Journal ref: Proceedings of The 12th Language Resources and Evaluation Conference, Marseille, France, 2020, pp. 4047-4055