-
Arabic Diacritic Recovery Using a Feature-Rich biLSTM Model
Authors:
Kareem Darwish,
Ahmed Abdelali,
Hamdy Mubarak,
Mohamed Eldesouki
Abstract:
Diacritics (short vowels) are typically omitted when writing Arabic text, and readers have to reintroduce them to correctly pronounce words. There are two types of Arabic diacritics: the first are core-word diacritics (CW), which specify the lexical selection, and the second are case endings (CE), which typically appear at the end of the word stem and generally specify their syntactic roles. Recovering CEs is relatively harder than recovering core-word diacritics due to inter-word dependencies, which are often distant. In this paper, we use a feature-rich recurrent neural network model that uses a variety of linguistic and surface-level features to recover both core-word diacritics and case endings. Our model surpasses all previous state-of-the-art systems with a CW error rate (CWER) of 2.86% and a CE error rate (CEER) of 3.7% for Modern Standard Arabic (MSA) and CWER of 2.2% and CEER of 2.5% for Classical Arabic (CA). When combining diacritized word cores with case endings, the resultant word error rate is 6.0% and 4.3% for MSA and CA respectively. This highlights the effectiveness of feature engineering for such deep neural models.
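As a rough illustration of how the three error rates in this abstract relate, the sketch below computes CWER, CEER and a combined word error rate over a toy list of words. It assumes, which the abstract does not state explicitly, that a word counts as wrong in the combined rate when either its core diacritization or its case ending is wrong. The `Word` class, the `error_rates` function and the placeholder strings are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical container: gold vs. predicted diacritization, split into
    # the word core and the case ending.
    gold_core: str
    pred_core: str
    gold_ce: str
    pred_ce: str


def error_rates(words):
    """Return (CWER, CEER, WER) as fractions of the number of words.

    Assumption: a word is a combined error if its core OR its case ending
    is mispredicted.
    """
    n = len(words)
    if n == 0:
        raise ValueError("need at least one word")
    cw_errors = sum(w.gold_core != w.pred_core for w in words)
    ce_errors = sum(w.gold_ce != w.pred_ce for w in words)
    word_errors = sum(
        w.gold_core != w.pred_core or w.gold_ce != w.pred_ce for w in words
    )
    return cw_errors / n, ce_errors / n, word_errors / n


# Toy data with placeholder labels (not real diacritized Arabic).
sample = [
    Word("c1", "c1", "e1", "e1"),  # fully correct
    Word("c2", "cX", "e2", "e2"),  # core wrong
    Word("c3", "c3", "e3", "eX"),  # case ending wrong
    Word("c4", "cX", "e4", "eX"),  # both wrong
]
cwer, ceer, wer = error_rates(sample)
```

Under this assumption the combined rate can never exceed CWER + CEER, because a word with both kinds of error is counted only once.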
Submitted 4 February, 2020;
originally announced February 2020.
-
Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM
Authors:
Mohamed Eldesouki,
Younes Samih,
Ahmed Abdelali,
Mohammed Attia,
Hamdy Mubarak,
Kareem Darwish,
Kallmeyer Laura
Abstract:
Arabic word segmentation is essential for a variety of NLP applications such as machine translation and information retrieval. Segmentation entails breaking words into their constituent stems, affixes and clitics. In this paper, we compare two approaches for segmenting four major Arabic dialects using only several thousand training examples for each dialect. The two approaches involve posing the problem as a ranking problem, where an SVM ranker picks the best segmentation, and as a sequence labeling problem, where a bi-LSTM RNN coupled with CRF determines where best to segment words. We are able to achieve solid segmentation results for all dialects using rather limited training data. We also show that employing Modern Standard Arabic data for domain adaptation and assuming context independence improve overall results.
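To make the sequence-labeling formulation concrete, the sketch below converts a segmented word into per-character labels and back. It assumes a simple two-tag scheme (`B` for the first character of a segment, `I` for any other character) and `+` as the segment separator. The tag set actually used in the paper is not given in this abstract, and the function names and the transliterated example word are hypothetical.

```python
def to_labels(segmented, sep="+"):
    """Turn a segmented word such as 'w+b+ktAb' into (chars, labels).

    Assumed scheme: 'B' marks the first character of a segment,
    'I' marks every other character.
    """
    chars, labels = [], []
    for segment in segmented.split(sep):
        for i, ch in enumerate(segment):
            chars.append(ch)
            labels.append("B" if i == 0 else "I")
    return chars, labels


def from_labels(chars, labels, sep="+"):
    """Rebuild the segmented word from characters and B/I labels."""
    out = []
    for ch, label in zip(chars, labels):
        # A 'B' label starts a new segment, except at the very beginning.
        if label == "B" and out:
            out.append(sep)
        out.append(ch)
    return "".join(out)


chars, labels = to_labels("w+b+ktAb")
restored = from_labels(chars, labels)
```

A sequence labeler such as a bi-LSTM with a CRF layer would predict one label per character; the decoding step then only has to place separators before each `B`.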
Submitted 19 August, 2017;
originally announced August 2017.
-
Analysis of a chaotic spiking neural model: The NDS neuron
Authors:
Mohammad Alhawarat,
Waleed Nazih,
Mohammad Eldesouki
Abstract:
Further analysis and experimentation are carried out in this paper for a chaotic dynamic model, viz. the Nonlinear Dynamic State neuron (NDS). The analysis and experiments are performed to further understand the underlying dynamics of the model and to enhance it. Chaos provides many interesting properties that can be exploited to achieve computational tasks, such as sensitivity to initial conditions, space filling, control and synchronization. Chaos might play an important role in information processing tasks in the human brain, as suggested by biologists. If artificial neural networks (ANNs) are equipped with chaos, it will enrich the dynamic behaviours of such networks. The NDS model has some limitations, which can be overcome in different ways. In this paper, different approaches are followed to push the boundaries of the NDS model in order to enhance it. One way is to study the effects of scaling the parameters of the chaotic equations of the NDS model and to study the resulting dynamics. Another way is to study the method used to discretize the original Rössler system on which the NDS model is based. These approaches have revealed some facts about the NDS attractor and suggest why such a model can be stabilized to a large number of unstable periodic orbits (UPOs), which might correspond to memories in phase space.
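For readers unfamiliar with what discretizing the Rössler system involves, the sketch below applies a forward-Euler step to the classical Rössler equations. This is only an illustration: the abstract does not say which discretization or parameter values the NDS model uses, so the Euler scheme, the step size, the classical parameters a = 0.2, b = 0.2, c = 5.7 and the function names are all assumptions.

```python
def rossler_euler_step(x, y, z, dt=0.01, a=0.2, b=0.2, c=5.7):
    """One forward-Euler step of the classical Rossler system:

        dx/dt = -y - z
        dy/dt = x + a*y
        dz/dt = b + z*(x - c)

    Parameter values and step size are illustrative, not the NDS settings.
    """
    dx = -y - z
    dy = x + a * y
    dz = b + z * (x - c)
    return x + dt * dx, y + dt * dy, z + dt * dz


def simulate(steps, state=(1.0, 1.0, 1.0), dt=0.01):
    """Iterate the discrete map and return the list of visited states."""
    trajectory = [state]
    for _ in range(steps):
        state = rossler_euler_step(*state, dt=dt)
        trajectory.append(state)
    return trajectory


trajectory = simulate(steps=100)
first = trajectory[1]
```

The choice of step size and scheme changes the discrete map itself, which is one reason the discretization method is worth studying separately from the continuous equations.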
Submitted 16 August, 2014;
originally announced August 2014.
-
Studying a Chaotic Spiking Neural Model
Authors:
Mohammad Alhawarat,
Waleed Nazih,
Mohammad Eldesouki
Abstract:
The dynamics of a chaotic spiking neuron model are studied mathematically and experimentally. The Nonlinear Dynamic State neuron (NDS) is analysed to further understand the model and improve it. Chaos has many interesting properties such as sensitivity to initial conditions, space filling, control and synchronization. As suggested by biologists, these properties may be exploited and play a vital role in carrying out computational tasks in the human brain. The NDS model has some limitations; in this paper the model is investigated to overcome some of these limitations in order to enhance the model. Therefore, the model's parameters are tuned and the resulting dynamics are studied. Also, the discretization method of the model is considered. Moreover, a mathematical analysis is carried out to reveal the underlying dynamics of the model after tuning its parameters. The results of the aforementioned methods revealed some facts regarding the NDS attractor and suggest the stabilization of a large number of unstable periodic orbits (UPOs), which might correspond to memories in phase space.
Submitted 26 October, 2013;
originally announced October 2013.