
Showing 1–34 of 34 results for author: Darwish, K

  1. arXiv:2402.16380

    eess.AS cs.AI cs.CL cs.LG

    An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation

    Authors: Ahmet Gunduz, Kamer Ali Yuksel, Kareem Darwish, Golara Javadi, Fabio Minazzi, Nicola Sobieski, Sebastien Bratieres

    Abstract: Data availability is crucial for advancing artificial intelligence applications, including voice-based technologies. As content creation, particularly in social media, experiences increasing demand, translation and text-to-speech (TTS) technologies have become essential tools. Notably, the performance of these TTS technologies is highly dependent on the quality of the training data, emphasizing th…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 9 Pages, 6 Figures, 4 Tables, LREC-COLING 2024

  2. arXiv:2206.07373

    cs.CL cs.SD eess.AS

    NatiQ: An End-to-end Text-to-Speech System for Arabic

    Authors: Ahmed Abdelali, Nadir Durrani, Cenk Demiroglu, Fahim Dalvi, Hamdy Mubarak, Kareem Darwish

    Abstract: NatiQ is an end-to-end text-to-speech system for Arabic. Our speech synthesizer uses an encoder-decoder architecture with attention. We used both tacotron-based models (tacotron-1 and tacotron-2) and the faster transformer model for generating mel-spectrograms from characters. We concatenated Tacotron1 with the WaveRNN vocoder, Tacotron2 with the WaveGlow vocoder and ESPnet transformer with the paral…

    Submitted 16 November, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

  3. arXiv:2111.09574

    cs.CL

    Automatic Expansion and Retargeting of Arabic Offensive Language Training

    Authors: Hamdy Mubarak, Ahmed Abdelali, Kareem Darwish, Younes Samih

    Abstract: Rampant use of offensive language on social media led to recent efforts on automatic identification of such language. Though offensive language has general characteristics, attacks on specific entities may exhibit distinct phenomena such as malicious alterations in the spelling of names. In this paper, we present a method for identifying entity-specific offensive language. We employ two key insigh…

    Submitted 18 November, 2021; originally announced November 2021.

  4. arXiv:2109.12844

    cs.SI cs.CY

    News Consumption in Time of Conflict: 2021 Palestinian-Israel War as an Example

    Authors: Kareem Darwish

    Abstract: This paper examines news consumption in response to a major polarizing event, and we use the May 2021 Israeli-Palestinian conflict as an example. We conduct a detailed analysis of the news consumption of more than eight thousand Twitter users who are either pro-Palestinian or pro-Israeli and authored more than 29 million tweets between January 1 and August 17, 2021. We identified the stance of use…

    Submitted 27 September, 2021; originally announced September 2021.

  5. arXiv:2106.06017

    cs.CL

    Cross-lingual Emotion Detection

    Authors: Sabit Hassan, Shaden Shaar, Kareem Darwish

    Abstract: Emotion detection can provide us with a window into understanding human behavior. Due to the complex dynamics of human emotions, however, constructing annotated datasets to train automated models can be expensive. Thus, we explore the efficacy of cross-lingual approaches that would use data from a source language to build models for emotion detection in a target language. We compare three approach…

    Submitted 4 May, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

  6. arXiv:2102.10684

    cs.CL cs.AI

    Pre-Training BERT on Arabic Tweets: Practical Considerations

    Authors: Ahmed Abdelali, Sabit Hassan, Hamdy Mubarak, Kareem Darwish, Younes Samih

    Abstract: Pretraining Bidirectional Encoder Representations from Transformers (BERT) for downstream NLP tasks is a non-trivial task. We pretrained 5 BERT models that differ in the size of their training sets, mixture of formal and informal Arabic, and linguistic preprocessing. All are intended to support Arabic dialects and social media. The experiments highlight the centrality of data diversity and the effi…

    Submitted 21 February, 2021; originally announced February 2021.

    Comments: 6 pages, 5 figures

  7. arXiv:2101.09345

    cs.CL

    BERT Transformer model for Detecting Arabic GPT2 Auto-Generated Tweets

    Authors: Fouzi Harrag, Maria Debbah, Kareem Darwish, Ahmed Abdelali

    Abstract: During the last two decades, we have progressively turned to the Internet and social media to find news, entertain conversations and share opinion. Recently, OpenAI has developed a machine learning system called GPT-2 for Generative Pre-trained Transformer-2, which can produce deepfake texts. It can generate blocks of text based on brief writing prompts that look like they were written by humans…

    Submitted 22 January, 2021; originally announced January 2021.

    Journal ref: Proceedings of the Fifth Arabic Natural Language Processing Workshop (WANLP @ COLING 2020)

  8. arXiv:2011.12631

    cs.CL

    A Panoramic Survey of Natural Language Processing in the Arab World

    Authors: Kareem Darwish, Nizar Habash, Mourad Abbas, Hend Al-Khalifa, Huseein T. Al-Natsheh, Samhaa R. El-Beltagy, Houda Bouamor, Karim Bouzoubaa, Violetta Cavalli-Sforza, Wassim El-Hajj, Mustafa Jarrar, Hamdy Mubarak

    Abstract: The term natural language refers to any system of symbolic communication (spoken, signed or written) without intentional human planning and design. This distinguishes natural languages such as Arabic and Japanese from artificially constructed languages such as Esperanto or Python. Natural language processing (NLP) is the sub-field of artificial intelligence (AI) focused on modeling natural languag…

    Submitted 27 September, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

  9. arXiv:2007.09655

    cs.SI cs.CL

    Political Framing: US COVID19 Blame Game

    Authors: Chereen Shurafa, Kareem Darwish, Wajdi Zaghouani

    Abstract: Through the use of Twitter, framing has become a prominent presidential campaign tool for politically active users. Framing is used to influence thoughts by evoking a particular perspective on an event. In this paper, we show that, rather than the COVID19 pandemic being viewed as a public health issue, political rhetoric surrounding it is mostly shaped through a blame frame (blame Trump, China, or…

    Submitted 19 July, 2020; originally announced July 2020.

    Comments: Social Informatics 2020 (SocInfo2020)

  10. arXiv:2007.07996

    cs.IR cs.CL cs.LG cs.SI

    Fighting the COVID-19 Infodemic in Social Media: A Holistic Perspective and a Call to Arms

    Authors: Firoj Alam, Fahim Dalvi, Shaden Shaar, Nadir Durrani, Hamdy Mubarak, Alex Nikolov, Giovanni Da San Martino, Ahmed Abdelali, Hassan Sajjad, Kareem Darwish, Preslav Nakov

    Abstract: With the outbreak of the COVID-19 pandemic, people turned to social media to read and to share timely information including statistics, warnings, advice, and inspirational stories. Unfortunately, alongside all this useful information, there was also a new blending of medical and political misinformation and disinformation, which gave rise to the first global infodemic. While fighting this infodemi…

    Submitted 9 April, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

    Comments: COVID-19, Infodemic, Disinformation, Misinformation, Fake News, Call to Arms, Crowdsourcing Annotations

    MSC Class: 68T50 ACM Class: I.2.7

  11. arXiv:2005.09649

    cs.SI cs.CL cs.CY

    Embeddings-Based Clustering for Target Specific Stances: The Case of a Polarized Turkey

    Authors: Ammar Rashed, Mucahid Kutlu, Kareem Darwish, Tamer Elsayed, Cansın Bayrak

    Abstract: On June 24, 2018, Turkey conducted a highly consequential election in which the Turkish people elected their president and parliament in the first election under a new presidential system. During the election period, the Turkish people extensively shared their political opinions on Twitter. One aspect of polarization among the electorate was support for or opposition to the reelection of Recep Tay…

    Submitted 24 February, 2022; v1 submitted 19 May, 2020; originally announced May 2020.

    Comments: arXiv admin note: text overlap with arXiv:1909.10213

    Journal ref: ICWSM, vol. 15, no. 1, pp. 537-548, May 2021

  12. arXiv:2005.06557

    cs.CL

    Arabic Dialect Identification in the Wild

    Authors: Ahmed Abdelali, Hamdy Mubarak, Younes Samih, Sabit Hassan, Kareem Darwish

    Abstract: We present QADI, an automatically collected dataset of tweets belonging to a wide range of country-level Arabic dialects, covering 18 different countries in the Middle East and North Africa region. Our method for building this dataset relies on applying multiple filters to identify users who belong to different countries based on their account descriptions and to eliminate tweets that are either w…

    Submitted 15 May, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

    Comments: 13 pages, 7 figures, 4 tables

  13. arXiv:2005.00033

    cs.CL cs.CY cs.IR

    Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society

    Authors: Firoj Alam, Shaden Shaar, Fahim Dalvi, Hassan Sajjad, Alex Nikolov, Hamdy Mubarak, Giovanni Da San Martino, Ahmed Abdelali, Nadir Durrani, Kareem Darwish, Abdulaziz Al-Homaid, Wajdi Zaghouani, Tommaso Caselli, Gijs Danoe, Friso Stolk, Britt Bruntink, Preslav Nakov

    Abstract: With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic. Fighting this infodemic has been declared one of the most important focus areas of the World Health Organization, with dangers ranging from promoting fake cures, rumors, and conspiracy theories to spreadin…

    Submitted 22 September, 2021; v1 submitted 30 April, 2020; originally announced May 2020.

    Comments: disinformation, misinformation, factuality, fact-checking, fact-checkers, check-worthiness, Social Media Platforms, COVID-19, social media

    MSC Class: 68T50 ACM Class: I.2; I.2.7

    Journal ref: EMNLP-2021 (Findings)

  14. arXiv:2004.03485

    cs.SI cs.CL

    A Few Topical Tweets are Enough for Effective User-Level Stance Detection

    Authors: Younes Samih, Kareem Darwish

    Abstract: Stance detection entails ascertaining the position of a user towards a target, such as an entity, topic, or claim. Recent work that employs unsupervised classification has shown that performing stance detection on vocal Twitter users, who have many tweets on a target, can yield very high accuracy (+98%). However, such methods perform poorly or fail completely for less vocal users, who may have aut…

    Submitted 7 April, 2020; originally announced April 2020.

  15. arXiv:2004.02192

    cs.CL

    Arabic Offensive Language on Twitter: Analysis and Experiments

    Authors: Hamdy Mubarak, Ammar Rashed, Kareem Darwish, Younes Samih, Ahmed Abdelali

    Abstract: Detecting offensive language on Twitter has many applications ranging from detecting/predicting bullying to measuring polarization. In this paper, we focus on building a large Arabic offensive tweet dataset. We introduce a method for building a dataset that is not biased by topic, dialect, or target. We produce the largest Arabic dataset to date with special tags for vulgarity and hate speech. We…

    Submitted 9 March, 2021; v1 submitted 5 April, 2020; originally announced April 2020.

    Comments: 10 pages, 6 figures, 3 tables

  16. arXiv:2002.01207

    cs.CL cs.LG

    Arabic Diacritic Recovery Using a Feature-Rich biLSTM Model

    Authors: Kareem Darwish, Ahmed Abdelali, Hamdy Mubarak, Mohamed Eldesouki

    Abstract: Diacritics (short vowels) are typically omitted when writing Arabic text, and readers have to reintroduce them to correctly pronounce words. There are two types of Arabic diacritics: the first are core-word diacritics (CW), which specify the lexical selection, and the second are case endings (CE), which typically appear at the end of the word stem and generally specify their syntactic roles. Recov…

    Submitted 4 February, 2020; originally announced February 2020.

  17. arXiv:2001.02125

    cs.SI

    Quantifying Polarization on Twitter: the Kavanaugh Nomination

    Authors: Kareem Darwish

    Abstract: This paper addresses polarization quantification, particularly as it pertains to the nomination of Brett Kavanaugh to the US Supreme Court and his subsequent confirmation with the narrowest margin since 1881. Republican (GOP) and Democratic (DNC) senators voted overwhelmingly along party lines. In this paper, we examine political polarization concerning the nomination among Twitter users. To do so…

    Submitted 5 January, 2020; originally announced January 2020.

    Comments: 13 pages, 4 figures, 5 tables. International Conference on Social Informatics. Springer, Cham, 2019. arXiv admin note: substantial text overlap with arXiv:1810.06687

    ACM Class: J.4

  18. arXiv:1910.02028

    cs.CL cs.IR

    Tanbih: Get To Know What You Are Reading

    Authors: Yifan Zhang, Giovanni Da San Martino, Alberto Barrón-Cedeño, Salvatore Romeo, Jisun An, Haewoon Kwak, Todor Staykovski, Israa Jaradat, Georgi Karadzhov, Ramy Baly, Kareem Darwish, James Glass, Preslav Nakov

    Abstract: We introduce Tanbih, a news aggregator with intelligent analysis tools to help readers understand what's behind a news story. Our system displays news grouped into events and generates media profiles that show the general factuality of reporting, the degree of propagandistic content, hyper-partisanship, leading political ideology, general frame of reporting, and stance with respect to various c…

    Submitted 4 October, 2019; originally announced October 2019.

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: EMNLP-2019

  19. arXiv:1909.10213

    cs.SI

    Embedding-based Qualitative Analysis of Polarization in Turkey

    Authors: Mucahid Kutlu, Kareem Darwish, Cansin Bayrak, Ammar Rashed, Tamer Elsayed

    Abstract: On June 24, 2018, Turkey conducted a highly consequential election in which the Turkish people elected their president and parliament in the first election under a new presidential system. During the election period, the Turkish people extensively shared their political opinions on Twitter. One axis of polarization among the electorate was support for or opposition to the reelection of Recep Tay…

    Submitted 23 September, 2019; originally announced September 2019.

  20. arXiv:1907.01260

    cs.SI cs.IR

    Predicting the Topical Stance of Media and Popular Twitter Users

    Authors: Peter Stefanov, Kareem Darwish, Atanas Atanasov, Preslav Nakov

    Abstract: Discovering the stances of media outlets and influential people on current, debatable topics is important for social statisticians and policy makers. Many supervised solutions exist for determining viewpoints, but manually annotating training data is costly. In this paper, we propose a cascaded method that uses unsupervised learning to ascertain the stance of Twitter users with respect to a polari…

    Submitted 21 May, 2020; v1 submitted 2 July, 2019; originally announced July 2019.

    MSC Class: 91D30

  21. arXiv:1904.02000

    cs.SI

    Unsupervised User Stance Detection on Twitter

    Authors: Kareem Darwish, Peter Stefanov, Michaël Aupetit, Preslav Nakov

    Abstract: We present a highly effective unsupervised framework for detecting the stance of prolific Twitter users with respect to controversial topics. In particular, we use dimensionality reduction to project users onto a low-dimensional space, followed by clustering, which allows us to find core users that are representative of the different stances. Our framework has three major advantages over pre-exist…

    Submitted 21 May, 2020; v1 submitted 3 April, 2019; originally announced April 2019.

    MSC Class: 62P25; 91D30

  22. arXiv:1810.06687

    cs.SI

    To Kavanaugh or Not to Kavanaugh: That is the Polarizing Question

    Authors: Kareem Darwish

    Abstract: On October 6, 2018, the US Senate confirmed Brett Kavanaugh with the narrowest margin for a successful confirmation since 1881, with senators voting overwhelmingly along party lines. In this paper, we examine whether the political polarization in the Senate is reflected among the general public. To do so, we analyze the views of more than 128 thousand Twitter users. We show that users suppo…

    Submitted 15 October, 2018; originally announced October 2018.

  23. arXiv:1810.06619

    cs.CL

    Diacritization of Maghrebi Arabic Sub-Dialects

    Authors: Ahmed Abdelali, Mohammed Attia, Younes Samih, Kareem Darwish, Hamdy Mubarak

    Abstract: The diacritization process attempts to restore the short vowels in written Arabic text, which are typically omitted. This process is essential for applications such as Text-to-Speech (TTS). While diacritization of Modern Standard Arabic (MSA) still holds the lion's share, research on dialectal Arabic (DA) diacritization is very limited. In this paper, we present our contribution and results on the automa…

    Submitted 30 May, 2019; v1 submitted 15 October, 2018; originally announced October 2018.

    Comments: 6 pages, 3 figures

  24. arXiv:1807.06655

    cs.SI

    Devam vs. Tamam: 2018 Turkish Elections

    Authors: Mucahid Kutlu, Kareem Darwish, Tamer Elsayed

    Abstract: On June 24, 2018, Turkey held a historic election, transforming its parliamentary system to a presidential one. One of the main questions for Turkish voters was whether or not to start this new political era by reelecting its long-time political leader, Recep Tayyip Erdogan. In this paper, we analyzed 108M tweets posted in the two months leading up to the election to understand the groups that sup…

    Submitted 17 July, 2018; originally announced July 2018.

  25. arXiv:1708.05891

    cs.CL

    Arabic Multi-Dialect Segmentation: bi-LSTM-CRF vs. SVM

    Authors: Mohamed Eldesouki, Younes Samih, Ahmed Abdelali, Mohammed Attia, Hamdy Mubarak, Kareem Darwish, Kallmeyer Laura

    Abstract: Arabic word segmentation is essential for a variety of NLP applications such as machine translation and information retrieval. Segmentation entails breaking words into their constituent stems, affixes and clitics. In this paper, we compare two approaches for segmenting four major Arabic dialects using only several thousand training examples for each dialect. The two approaches involve posing the p…

    Submitted 19 August, 2017; originally announced August 2017.

  26. arXiv:1707.07276

    cs.SI

    Seminar Users in the Arabic Twitter Sphere

    Authors: Kareem Darwish, Dimitar Alexandrov, Preslav Nakov, Yelena Mejova

    Abstract: We introduce the notion of "seminar users", who are social media users engaged in propaganda in support of a political entity. We develop a framework that can identify such users with 84.4% precision and 76.1% recall. While our dataset is from the Arab region, omitting language-specific features has only a minor impact on classification performance, and thus, our approach could work for detecting…

    Submitted 23 July, 2017; originally announced July 2017.

    Comments: to appear in SocInfo 2017

  27. arXiv:1707.03375

    cs.SI

    Trump vs. Hillary: What went Viral during the 2016 US Presidential Election

    Authors: Kareem Darwish, Walid Magdy, Tahar Zanouda

    Abstract: In this paper, we present quantitative and qualitative analysis of the top retweeted tweets (viral tweets) pertaining to the US presidential elections from September 1, 2016 to Election Day on November 8, 2016. For every day, we tagged the top 50 most retweeted tweets as supporting or attacking either candidate or as neutral/irrelevant. Then we analyzed the tweets in each class for: general trends…

    Submitted 11 July, 2017; originally announced July 2017.

    Comments: Paper to appear in Springer SocInfo 2017

  28. arXiv:1707.02591

    cs.RO

    Flexible human-robot cooperation models for assisted shop-floor tasks

    Authors: Kourosh Darwish, Francesco Wanderlingh, Barbara Bruno, Enrico Simetti, Fulvio Mastrogiovanni, Giuseppe Casalino

    Abstract: The Industry 4.0 paradigm emphasizes the crucial benefits that collaborative robots, i.e., robots able to work alongside and together with humans, could bring to the whole production process. In this context, an enabling technology yet unreached is the design of flexible robots able to deal at all levels with humans' intrinsic variability, which is not only a necessary element for a comfortable wo…

    Submitted 9 July, 2017; originally announced July 2017.

    Comments: Submitted to Mechatronics (Elsevier)

    MSC Class: 68T40

  29. arXiv:1610.01655

    cs.SI

    Trump vs. Hillary Analyzing Viral Tweets during US Presidential Elections 2016

    Authors: Walid Magdy, Kareem Darwish

    Abstract: In this paper, we provide quantitative and qualitative analyses of the viral tweets related to the US presidential election. In our study, we focus on analyzing the 50 most retweeted tweets for every day during September and October 2016. The resulting set is composed of 3,050 viral tweets, and they were retweeted over 20.5 million times. We manually annotated the tweets as favorable of Trump, Clint…

    Submitted 3 November, 2016; v1 submitted 5 October, 2016; originally announced October 2016.

    Comments: In version 2, analysis of viral tweets of October 2016 is added to the paper

  30. arXiv:1512.04570

    cs.SI

    Quantifying Public Response towards Islam on Twitter after Paris Attacks

    Authors: Walid Magdy, Kareem Darwish, Norah Abokhodair

    Abstract: The Paris terrorist attacks that occurred on November 13, 2015 prompted a massive response on social media including Twitter, with millions of posted tweets in the first few hours after the attacks. Most of the tweets were condemning the attacks and showing support to Parisians. One of the trending debates related to the attacks concerned possible association between terrorism and Islam and Muslims in…

    Submitted 14 December, 2015; originally announced December 2015.

    Comments: 9 pages, 5 figures

    ACM Class: J.4; K.4.2

  31. arXiv:1512.04310

    cs.SI

    Attitudes towards Refugees in Light of the Paris Attacks

    Authors: Kareem Darwish, Walid Magdy

    Abstract: The Paris attacks prompted a massive response on social media including Twitter. This paper explores the immediate response of English speakers on Twitter towards Middle Eastern refugees in Europe. We show that antagonism towards refugees is mostly coming from the United States and is mostly partisan.

    Submitted 15 December, 2015; v1 submitted 14 December, 2015; originally announced December 2015.

    Comments: 3 pages, 1 table, and 2 figures

    ACM Class: J.4; K.4.2

  32. arXiv:1503.02401

    cs.SI physics.soc-ph

    #FailedRevolutions: Using Twitter to Study the Antecedents of ISIS Support

    Authors: Walid Magdy, Kareem Darwish, Ingmar Weber

    Abstract: Within a fairly short amount of time, the Islamic State of Iraq and Syria (ISIS) has managed to put large swaths of land in Syria and Iraq under their control. To many observers, the sheer speed at which this "state" was established was dumbfounding. To better understand the roots of this organization and its supporters we present a study using data from Twitter. We start by collecting large amoun…

    Submitted 9 March, 2015; originally announced March 2015.

    Comments: Submitted to ICWSM 2015

  33. arXiv:1410.3097

    cs.SI physics.soc-ph

    Content and Network Dynamics Behind Egyptian Political Polarization on Twitter

    Authors: Javier Borge-Holthoefer, Walid Magdy, Kareem Darwish, Ingmar Weber

    Abstract: There is little doubt about whether social networks play a role in modern protests. This agreement has triggered an entire research avenue, in which social structure and content analysis have been central --but are typically exploited separately. Here, we combine these two approaches to shed light on the opinion evolution dynamics in Egypt during the summer of 2013 along two axes (Islamist/Secul…

    Submitted 12 October, 2014; originally announced October 2014.

    Comments: To appear in the Proceedings of the 18th Conference on Computer-Supported Cooperative Work and Social Computing CSCW (2015)

  34. arXiv:1306.6755

    cs.CL cs.IR

    Arabizi Detection and Conversion to Arabic

    Authors: Kareem Darwish

    Abstract: Arabizi is Arabic text that is written using Latin characters. Arabizi is used to present both Modern Standard Arabic (MSA) and Arabic dialects. It is commonly used in informal settings such as social networking sites and is often mixed with English. In this paper we address the problems of: identifying Arabizi in text and converting it to Arabic characters. We used word and sequence-level fea…

    Submitted 28 June, 2013; originally announced June 2013.

    ACM Class: I.2.7