Showing 1–7 of 7 results for author: Krizhanovsky, A A

  1. arXiv:1011.1368  [pdf]

    cs.IR

    Transformation of Wiktionary entry structure into tables and relations in a relational database schema

    Authors: A. A. Krizhanovsky

    Abstract: This paper addresses the question of automatic data extraction from Wiktionary, a multilingual and multifunctional dictionary. Wiktionary is a collaborative project that works on the same principles as Wikipedia. From a text-processing point of view, a Wiktionary entry is plain text. Wiktionary guidelines prescribe the entry layout and rules, which should be followed by editors…

    Submitted 5 November, 2010; originally announced November 2010.

    Comments: 10 pages, 7 figures, preprint

    MSC Class: 68W25; 90C35 ACM Class: I.7.2; I.7.3; I.7.5; H.3.1; H.3.3

  2. arXiv:1006.5040  [pdf, other]

    cs.IR

    The comparison of Wiktionary thesauri transformed into the machine-readable format

    Authors: A. A. Krizhanovsky

    Abstract: Wiktionary is a unique, peculiar, valuable and original resource for natural language processing (NLP). The paper describes an open-source Wiktionary parser: its architecture and requirements followed by a description of Wiktionary features to be taken into account, some open problems of Wiktionary and the parser. The current implementation of the parser extracts the definitions, semantic relation…

    Submitted 25 June, 2010; originally announced June 2010.

    Comments: 23 pages, 3 tables, 6 figures, preprint

    MSC Class: 68W25; 90C35 ACM Class: I.7.2; I.7.3; I.7.5; H.3.1; H.3.3

  3. arXiv:0907.2209  [pdf]

    cs.IR

    Related terms search based on WordNet / Wiktionary and its application in Ontology Matching

    Authors: A. A. Krizhanovsky, Feiyu Lin

    Abstract: A set of ontology matching algorithms (for finding correspondences between concepts) is based on a thesaurus that provides the source data for the semantic distance calculations. In this wiki era, new resources may spring up and improve this kind of semantic search. In this paper, a solution to this task based on the Russian Wiktionary is compared to WordNet-based algorithms. Metrics are estimated usi…

    Submitted 12 October, 2009; v1 submitted 13 July, 2009; originally announced July 2009.

    Comments: 7 pages, 2 tables, 3 figures; In: RCDL 2009. September 17-21, Petrozavodsk, Russia. - pp. 363-369

    ACM Class: I.7.2; I.7.3; I.7.5; H.3.1; H.3.3

  4. arXiv:0808.1753  [pdf]

    cs.IR cs.CL

    Index wiki database: design and experiments

    Authors: A. A. Krizhanovsky

    Abstract: With the fantastic growth of Internet usage, searching for information in documents of a special type called a "wiki page", which is written using a simple markup language, has become an important problem. This paper describes the software architectural model for indexing wiki texts in three languages (Russian, English, and German) and the interaction between the software components (GATE, Lemmatizer, an…

    Submitted 23 September, 2008; v1 submitted 12 August, 2008; originally announced August 2008.

    Comments: 18 pages, 4 tables, 4 figures; FLINS'08, Corpus Linguistics'08, AIS/CAD'08; v2: table 3 changed

    ACM Class: I.7.2; I.7.3; I.7.5; H.3.1; H.3.3

  5. arXiv:0804.2354  [pdf]

    cs.IR cs.CL

    Information filtering based on wiki index database

    Authors: A. V. Smirnov, A. A. Krizhanovsky

    Abstract: In this paper we present a profile-based approach to information filtering by an analysis of the content of text documents. The Wikipedia index database is created and used to automatically generate the user profile from the user document collection. The problem-oriented Wikipedia subcorpora are created (using knowledge extracted from the user profile) for each topic of user interests. The index…

    Submitted 8 May, 2008; v1 submitted 15 April, 2008; originally announced April 2008.

    Comments: 9 pages, 1 table, 2 figures, 8th International FLINS Conference on Computational Intelligence in Decision and Control, Madrid, Spain, September 21-24, 2008; v2: typo

    ACM Class: I.7.2; I.7.3; I.7.5; H.3.1; H.3.3

  6. arXiv:0710.0169  [pdf]

    cs.IR cs.CL

    Evaluation experiments on related terms search in Wikipedia: Information Content and Adapted HITS (In Russian)

    Authors: A. A. Krizhanovsky

    Abstract: The classification of metrics and algorithms that search for related terms via WordNet, Roget's Thesaurus, and Wikipedia was extended to include the adapted HITS algorithm. Evaluation experiments on Information Content and the adapted HITS algorithm are described. A test collection of Russian word pairs with human-assigned similarity judgments is proposed. ----- Klassifikacija metrik i algoritmov poisk…

    Submitted 16 January, 2008; v1 submitted 1 October, 2007; originally announced October 2007.

    Comments: 10 pages, 1 figure, 3 tables, in Russian, short version of the paper to be published in Proceedings of the Wiki-Conference 2007, Russia, St. Petersburg, October 27-28. http://tinyurl.com/2czd6e ; v3: +figure; v4: typo in Table 3; v5: +desc (res_hypo formula); v6: typo

    ACM Class: H.3.1; H.3.3; H.4.3; G.2.2

  7. arXiv:cs/0610058  [pdf]

    cs.IR

    Context-sensitive access to e-document corpus

    Authors: A. V. Smirnov, T. V. Levashova, M. P. Pashkin, N. G. Shilov, A. A. Krizhanovsky, A. M. Kashevnik, A. S. Komarova

    Abstract: The methodology of context-sensitive access to e-documents considers context as a problem model based on the knowledge extracted from the application domain and presented in the form of an application ontology. Efficient access to information in text form is needed. Wiki resources, as a modern text format, provide a huge amount of text in a semi-formalized structure. At the first stage of the…

    Submitted 11 October, 2006; originally announced October 2006.

    Comments: 9 pages, 1 figure, short version of this paper was presented at the International Conference Corpus Linguistics 2006. October 10-14, St. Petersburg, Russia

    ACM Class: H.3.1; H.3.3; H.4.3; G.2.2