-
Teaching a Massive Open Online Course on Natural Language Processing
Authors:
Ekaterina Artemova,
Murat Apishev,
Veronika Sarkisyan,
Sergey Aksenov,
Denis Kirjanov,
Oleg Serikov
Abstract:
This paper presents a new Massive Open Online Course on Natural Language Processing, targeted at non-English speaking students. The course lasts 12 weeks; every week consists of lectures, practical sessions, and quiz assignments. Three weeks out of 12 are followed by Kaggle-style coding assignments.
Our course intends to serve multiple purposes: (i) familiarize students with the core concepts and methods in NLP, such as language modeling or word or sentence representations; (ii) show that recent advances, including pre-trained Transformer-based models, are built upon these concepts; (iii) introduce architectures for the most in-demand real-life applications; (iv) develop practical skills to process texts in multiple languages. The course was prepared and recorded during 2020, launched by the end of the year, and received positive feedback in early 2021.
Submitted 4 May, 2021; v1 submitted 26 April, 2021;
originally announced April 2021.
-
RuREBus: a Case Study of Joint Named Entity Recognition and Relation Extraction from e-Government Domain
Authors:
Vitaly Ivanin,
Ekaterina Artemova,
Tatiana Batura,
Vladimir Ivanov,
Veronika Sarkisyan,
Elena Tutubalina,
Ivan Smurov
Abstract:
We showcase an application of information extraction methods, such as named entity recognition (NER) and relation extraction (RE), to a novel corpus consisting of documents issued by a state agency. The main challenges of this corpus are: 1) the annotation scheme differs greatly from the one used for general-domain corpora, and 2) the documents are written in a language other than English. Contrary to expectations, the state-of-the-art transformer-based models show modest performance on both tasks, whether approached sequentially or in an end-to-end fashion. Our experiments demonstrate that fine-tuning on a large unlabeled corpus does not automatically yield significant improvement, and thus we may conclude that more sophisticated strategies for leveraging unlabeled texts are needed. In this paper, we describe the whole developed pipeline, from text annotation and baseline development to the design of a shared task in hopes of improving the baseline. Ultimately, we find that current NER and RE technologies are far from mature and cannot yet overcome challenges like ours.
Submitted 29 October, 2020;
originally announced October 2020.
-
So What's the Plan? Mining Strategic Planning Documents
Authors:
Ekaterina Artemova,
Tatiana Batura,
Anna Golenkovskaya,
Vitaly Ivanin,
Vladimir Ivanov,
Veronika Sarkisyan,
Ivan Smurov,
Elena Tutubalina
Abstract:
In this paper we present a corpus of Russian strategic planning documents, RuREBus. This project is grounded in both language technology and e-government perspectives. Not only are new language sources and tools being developed, but so are their applications to e-government research. We demonstrate the pipeline for creating a text corpus from scratch. First, the annotation schema is designed. Next, texts are marked up using a human-in-the-loop strategy, in which preliminary annotations are derived from a machine learning model and then manually corrected. The amount of annotated text is large enough to showcase what insights can be gained from RuREBus.
Submitted 7 July, 2020; v1 submitted 1 July, 2020;
originally announced July 2020.
-
Coherent Dissociation $^{16}$O~$\rightarrow$~4$α$ in Photoemulsion at an Incident Momentum of 4.5 GeV/$c$ per Nucleon
Authors:
N. P. Andreeva,
Z. V. Anzon,
V. I. Bubnov,
A. Sh. Gaitinov,
G. Zh. Eligbaeva,
L. E. Eremenko,
G. S. Kalyachkina,
E. K. Kanygina,
A. M. Seitimbetov,
I. Ya. Chasnikov,
Ts. I. Shakhova,
M. Haiduk,
S. A. Krasnov,
T. N. Maksimkina,
K. D. Tolstov,
G. M. Chernov,
N. A. Salmanova,
D. A. Salomov,
A. Khushvaktova,
F. A. Avetyan,
N. A. Marutyan,
L. T. Sarkisova,
V. F. Sarkisyan,
M. I. Adamovich,
Yu. A. Bashmakov
, et al. (15 additional authors not shown)
Abstract:
First searches for the coherent dissociation of relativistic oxygen nuclei into four $α$ particles are reported. It is shown that reactions of this type are characterized by a significantly lower decay temperature than the conventional multifragmentation of residual projectile nuclei. The momentum spectra and correlations of $α$ particles are not reproduced by the simple statistical model of direct fragmentation. The possibility that the oxygen nucleus undergoing fragmentation acquires a nonzero angular momentum in the collision process is discussed.
Submitted 14 September, 2011;
originally announced September 2011.
-
Coherent Dissociation of Relativistic C-9 Nuclei
Authors:
D. O. Krivenkov,
D. A. Artemenkov,
V. Bradnova,
S. Voka,
P. I. Zarubin,
I. G. Zarubina,
N. V. Kondratieva,
A. I. Malakhov,
A. A. Moiseenko,
G. I. Orlova,
N. G. Peresadko,
N. G. Polukhina,
P. A. Rukoyatkin,
V. V. Rusakova,
V. R. Sarkisyan,
R. Stanoeva,
M. Haiduc,
S. P. Kharlamov
Abstract:
Results on the coherent dissociation of relativistic $^9$C nuclei in a nuclear track emulsion are described. These results include the charge topology and kinematical features of final states. Events of C-9 to 3He-3 coherent dissociation are identified.
Submitted 13 April, 2011;
originally announced April 2011.
-
Fragmentation of relativistic nuclei in peripheral interactions in nuclear track emulsion
Authors:
D. A. Artemenkov,
V. Bradnova,
M. M. Chernyavsky,
L. A. Goncharova,
M. Haiduc,
N. A. Kachalova,
S. P. Kharlamov,
A. D. Kovalenko,
A. I. Malakhov,
A. A. Moiseenko,
G. I. Orlova,
N. G. Peresadko,
N. G. Polukhina,
P. A. Rukoyatkin,
V. V. Rusakova,
V. R. Sarkisyan,
R. Stanoeva,
T. V. Shchedrina,
S. Vokál,
A. Vokálová,
P. I. Zarubin,
I. G. Zarubina
Abstract:
The technique of nuclear track emulsions is used to explore the fragmentation of light relativistic nuclei down to the most peripheral interactions - nuclear "white" stars. A complete pattern of the relativistic dissociation of a $^8$B nucleus with target fragment accompaniment is presented. Relativistic dissociation $^{9}$Be$\to2α$ is explored using significant statistics, and the relative contribution of $^{8}$Be decays from 0$^+$ and 2$^+$ states is established. Target fragment accompaniments are shown for relativistic fragmentation $^{14}$N$\to$3He+H and $^{22}$Ne$\to$5He. The leading role of electromagnetic dissociation on heavy nuclei with respect to break-ups on target protons is demonstrated in all these cases. It is possible to conclude that the peripheral dissociation of relativistic nuclei in nuclear track emulsion is a unique tool to study many-body systems composed of the lightest nuclei and nucleons at the energy scale relevant for nuclear astrophysics.
Submitted 3 July, 2009;
originally announced July 2009.
-
Interplay of inequivalent atomic positions in resonant x-ray diffraction of Fe3BO6
Authors:
G Beutier,
E Ovchinnikova,
S P Collins,
V E Dmitrienko,
J E Lorenzo,
J-L Hodeau,
A Kirfel,
Y Joly,
A A Antonenko,
V A Sarkisyan,
A Bombardi
Abstract:
'Forbidden' Bragg reflections of iron orthoborate Fe3BO6 were studied theoretically and experimentally in the vicinity of the iron K edge. Their energy spectra are explained as resulting from the interference of x-rays scattered from two inequivalent crystallographic sites occupied by iron ions. This particular structural property gives rise to complex azimuthal dependences of the reflection intensities in the pre-edge region, as they result from the interplay of site-specific dipole-quadrupole and quadrupole-quadrupole resonant scattering. An isotropic character of the absorption spectrum is also evidenced. Self-absorption correction to the diffraction data, as well as possible contributions of thermal vibrations and magnetic order, are discussed. Particular care is given to extracting clean spectra from the data, and it is demonstrated that excellent results can be obtained even from measurements that appear corrupted by several effects such as poor crystal quality and multiple scattering.
Submitted 12 June, 2009;
originally announced June 2009.
-
First results on the interactions of relativistic $^9$C nuclei in nuclear track emulsion
Authors:
D. O. Krivenkov,
D. A. Artemenkov,
V. Bradnova,
M. Haiduc,
S. P. Kharlamov,
V. N. Kondratieva,
A. I. Malakhov,
A. A. Moiseenko,
G. I. Orlova,
N. G. Peresadko,
N. G. Polukhina,
P. A. Rukoyatkin,
V. V. Rusakova,
V. R. Sarkisyan,
R. Stanoeva,
T. V. Shchedrina,
S. Vokál,
P. I. Zarubin,
I. G. Zarubina
Abstract:
First results of the exposure of nuclear track emulsions in a secondary beam enriched by $^9$C nuclei at an energy of 1.2 A GeV are described. The presented statistics correspond to the most peripheral $^9$C interactions. For the first time, a dissociation $^9$C $\to$ 3$^3$He not accompanied by target fragments and mesons is identified.
Submitted 12 November, 2008;
originally announced November 2008.
-
Topology of "white" stars in relativistic fragmentation of light nuclei
Authors:
N. P. Andreeva,
V. Bradnova,
S. Vokal,
A. Vokalova,
A. Sh. Gaitinov,
S. G. Gerasimov,
L. A. Goncharova,
V. A. Dronov,
P. I. Zarubin,
I. G. Zarubina,
G. I. Orlova,
A. D. Kovalenko,
A. Kravchakova,
V. G. Larionova,
F. G. Lepekhin,
O. V. Levitskaya,
A. I. Malakhov,
A. A. Moiseenko,
G. I. Orlova,
N. G. Peresadko,
N. G. Polukhina,
P. A. Rukoyatkin,
V. V. Rusakova,
N. A. Salmanova,
V. R. Sarkisyan
, et al. (8 additional authors not shown)
Abstract:
In the present paper, experimental observations of the multifragmentation processes of light relativistic nuclei carried out by means of emulsions are reviewed. Events of the "white" star type, in which the dissociation of relativistic nuclei is not accompanied by the production of mesons or target-nucleus fragments, are considered.
A distinctive feature of the charge topology in the dissociation of the Ne, Mg, Si, and S nuclei is an almost total suppression of the binary splitting of nuclei to fragments with charges higher than 2. The growth of the nuclear fragmentation degree is revealed in an increase in the multiplicity of singly and doubly charged fragments with decreasing charge of the non-excited part of the fragmenting nucleus.
The processes of dissociation of stable Li, Be, B, C, N, and O isotopes to charged fragments were used to study special features of the formation of systems consisting of the lightest $α$, d, and t nuclei. Clustering in the form of the $^3$He nucleus can be detected in "white" stars via the dissociation of neutron-deficient Be, B, C, and N isotopes.
Submitted 16 May, 2006; v1 submitted 15 May, 2006;
originally announced May 2006.
-
Clustering in light nuclei in fragmentation above 1 A GeV
Authors:
N. P. Andreeva,
D. A. Artemenkov,
V. Bradnova,
M. M. Chernyavsky,
A. Sh. Gaitinov,
N. A. Kachalova,
S. P. Kharlamov,
A. D. Kovalenko,
M. Haiduc,
S. G. Gerasimov,
L. A. Goncharova,
V. G. Larionova,
A. I. Malakhov,
A. A. Moiseenko,
G. I. Orlova,
N. G. Peresadko,
N. G. Polukhina,
P. A. Rukoyatkin,
V. V. Rusakova,
V. R. Sarkisyan,
T. V. Shchedrina,
E. Stan,
R. Stanoeva,
I. Tsakov,
S. Vokal
, et al. (3 additional authors not shown)
Abstract:
The relativistic invariant approach is applied to analyzing the 3.3 A GeV $^{22}$Ne fragmentation in a nuclear track emulsion. New results on few-body dissociations have been obtained from the emulsion exposures to 2.1 A GeV $^{14}$N and 1.2 A GeV $^{9}$Be nuclei. It can be asserted that the use of the invariant approach is an effective means of obtaining conclusions about the behavior of systems involving a few He nuclei at a relative energy close to 1 MeV per nucleon. The first observations of fragmentation of 1.2 A GeV $^{8}$B and $^{9}$C nuclei in emulsion are described. The presented results allow one to justify the development of few-body aspects of nuclear astrophysics.
Submitted 24 April, 2006; v1 submitted 9 April, 2006;
originally announced April 2006.