Search | arXiv e-print repository

A Large and Diverse Arabic Corpus for Language Modeling

Authors: Abbas Raza Ali, Muhammad Ajmal Siddiqui, Rema Algunaibet, Hasan Raza Ali

Abstract: Language models (LMs) have introduced a major paradigm shift in Natural Language Processing (NLP) modeling where large pre-trained LMs became integral to most of the NLP tasks. The LMs are intelligent enough to find useful and relevant representations of the language without any supervision. Perhaps, these models are used to fine-tune typical NLP tasks with significantly high accuracy as compared to the traditional approaches. Conversely, the training of these models requires a massively large corpus that is a good representation of the language. English LMs generally perform better than their other language counterparts, due to the availability of massive English corpora. This work elaborates on the design and development of a large Arabic corpus. It consists of over 500 GB of Arabic cleaned text targeted at improving cross-domain knowledge and downstream generalization capability of large-scale language models. Moreover, the corpus is utilized in the training of a large Arabic LM. In order to evaluate the effectiveness of the LM, a number of typical NLP tasks are fine-tuned. The tasks demonstrate a significant boost from 4.5 to 8.5% when compared to tasks fine-tuned on multi-lingual BERT (mBERT). To the best of my knowledge, this is currently the largest clean and diverse Arabic corpus ever collected.

Submitted 8 May, 2023; v1 submitted 23 January, 2022; originally announced January 2022.

arXiv:2201.07921 [pdf]

Demand-Driven Asset Reutilization Analytics

Authors: Abbas Raza Ali, Pitipong J. Lin

Abstract: Manufacturers have long benefited from reusing returned products and parts. This benevolent approach can minimize cost and help the manufacturer to play a role in sustaining the environment, something which is of utmost importance these days because of growing environmental concerns. Reuse of returned parts and products aids environmental sustainability because doing so helps reduce the use of raw materials, eliminate energy use to produce new parts, and minimize waste materials. However, handling returns effectively and efficiently can be difficult if the processes do not provide the visibility that is necessary to track, manage, and re-use the returns. This paper applies advanced analytics on procurement data to increase reutilization in new build by optimizing Equal-to-New (ETN) parts return. This will reduce 'the spend' on new buy parts for building new product units. The process involves forecasting and matching returns supply to demand for new build. Complexity in the process is the forecasting and matching while making sure a reutilization engineering process is available. Also, this will identify high demand/value/yield parts for development engineering to focus. Analytics has been applied on different levels to enhance the optimization process including forecast of upgraded parts. Machine Learning algorithms are used to build an automated infrastructure that can support the transformation of ETN parts utilization in the procurement parts planning process.
This system incorporates returns forecast in the planning cycle to reduce suppliers' liability from 9 weeks to 12 months planning cycle, e.g., reduce 5% of 10 million US dollars liability.

Submitted 28 December, 2021; originally announced January 2022.

Comments: 2014 ASE BIGDATA/SOCIALCOM/CYBERSECURITY Conference, Stanford University, May 27-31, 2014. Publisher: ASE, 2014 ISBN: 1625610009, 9781625610003

arXiv:2201.02737 [pdf, other]

doi 10.1109/ICCI-CC.2018.8482078

Cognitive Computing to Optimize IT Services

Authors: Abbas Raza Ali

Abstract: In this paper, the challenges of maintaining a healthy IT operational environment have been addressed by proactively analyzing IT Service Desk tickets, customer satisfaction surveys, and social media data. A Cognitive solution goes beyond the traditional structured data analysis by deep analyses of both structured and unstructured text. The salient features of the proposed platform include language identification, translation, hierarchical extraction of the most frequently occurring topics, entities and their relationships, text summarization, sentiments, and knowledge extraction from the unstructured text using Natural Language Processing techniques. Moreover, the insights from unstructured text combined with structured data allow the development of various classification, segmentation, and time-series forecasting use-cases on the incident, problem, and change datasets. Further, the text and predictive insights together with raw data are used for visualization and exploration of actionable insights on a rich and interactive dashboard. However, it is hard not only to find these insights using traditional structured data analysis but it might also take a very long time to discover them, especially while dealing with a massive amount of unstructured data. By taking action on these insights, organizations can benefit from a significant reduction of ticket volume, reduced operational costs, and increased customer satisfaction. In various experiments, on average, up to 18-25% of yearly ticket volume has been reduced using the proposed approach.

Submitted 28 December, 2021; originally announced January 2022.

Comments: 2018 IEEE 17th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC)

arXiv:2112.14678 [pdf, other]

doi 10.1109/IJCNN48605.2020.9206658

Multi-Dialect Arabic Speech Recognition

Authors: Abbas Raza Ali

Abstract: This paper presents the design and development of multi-dialect automatic speech recognition for Arabic. Deep neural networks are becoming an effective tool to solve sequential data problems, particularly, adopting an end-to-end training of the system. Arabic speech recognition is a complex task because of the existence of multiple dialects, non-availability of large corpora, and missing vocalization. Thus, the first contribution of this work is the development of a large multi-dialectal corpus with either full or at least partially vocalized transcription. Additionally, the open-source corpus has been gathered from multiple sources that bring non-standard Arabic alphabets in transcription which are normalized by defining a common character-set. The second contribution is the development of a framework to train an acoustic model achieving state-of-the-art performance. The network architecture comprises a combination of convolutional and recurrent layers. The spectrogram features of the audio data are extracted in the frequency vs time domain and fed into the network. The output frames, produced by the recurrent model, are further trained to align the audio features with their corresponding transcription sequences. The sequence alignment is performed using a beam search decoder with a tetra-gram language model. The proposed system achieved a 14% error rate which outperforms previous systems.

Submitted 25 December, 2021; originally announced December 2021.

Comments: 2020 International Joint Conference on Neural Networks (IJCNN)

arXiv:2104.14116 [pdf, other]

doi 10.1109/IJCNN52387.2021.9533786

An Automated Approach for Timely Diagnosis and Prognosis of Coronavirus Disease

Authors: Abbas Raza Ali, Marcin Budka

Abstract: Since the outbreak of Coronavirus Disease 2019 (COVID-19), most of the impacted patients have been diagnosed with high fever, dry cough, and sore throat leading to severe pneumonia. Hence, to date, the diagnosis of COVID-19 from lung imaging has proved to be major evidence for early diagnosis of the disease. Although nucleic acid detection using real-time reverse-transcriptase polymerase chain reaction (rRT-PCR) remains a gold standard for the detection of COVID-19, the proposed approach focuses on the automated diagnosis and prognosis of the disease from a non-contrast chest computed tomography (CT) scan for timely diagnosis and triage of the patient. The prognosis covers the quantification and assessment of the disease to help hospitals with the management and planning of crucial resources, such as medical staff, ventilators and intensive care units (ICUs) capacity. The approach utilises deep learning techniques for automated quantification of the severity of COVID-19 disease via measuring the area of multiple rounded ground-glass opacities (GGO) and consolidations in the periphery (CP) of the lungs and accumulating them to form a severity score. The severity of the disease can be correlated with the medicines prescribed during the triage to assess the effectiveness of the treatment. The proposed approach shows promising results where the classification model achieved 93% accuracy on hold-out data.

Submitted 29 April, 2021; originally announced April 2021.

Comments: to be published in IJCNN 2021

Journal ref: 2021 International Joint Conference on Neural Networks (IJCNN)
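
Note on the severity score in the abstract above: the abstract says the areas of GGO and CP regions are measured and accumulated into a severity score, but gives no formula. The following is only a minimal sketch of that idea, not the authors' method. It assumes binary 2D masks stored as lists of rows, and the normalisation by lung area is an added assumption; the names `region_area` and `severity_score` are hypothetical.

```python
def region_area(mask):
    """Count the non-zero pixels in a 2D binary mask given as a list of rows."""
    return sum(1 for row in mask for pixel in row if pixel)


def severity_score(lung_mask, lesion_masks):
    """Accumulate lesion areas and divide by the lung area.

    Assumption (not from the abstract): the score is normalised by lung
    area so that it falls between 0 and 1 when lesions do not overlap.
    Overlapping lesion masks would be counted more than once.
    """
    lung_area = region_area(lung_mask)
    if lung_area == 0:
        return 0.0  # no lung pixels segmented, so nothing to score
    lesion_area = sum(region_area(mask) for mask in lesion_masks)
    return lesion_area / lung_area


# Hypothetical 4x4 slice: the whole slice is lung, with one GGO region
# and one CP region of two pixels each.
lung = [[1, 1, 1, 1] for _ in range(4)]
ggo = [[1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
cp = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1]]
score = severity_score(lung, [ggo, cp])
```

In practice the masks would come from a segmentation model and the score would be aggregated across CT slices; both steps are outside this sketch.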

arXiv:2007.10818 [pdf, other]

A Review of Meta-level Learning in the Context of Multi-component, Multi-level Evolving Prediction Systems

Authors: Abbas Raza Ali, Marcin Budka, Bogdan Gabrys

Abstract: The exponential growth of volume, variety and velocity of data is raising the need for investigations of automated or semi-automated ways to extract useful patterns from the data. It requires deep expert knowledge and extensive computational resources to find the most appropriate mapping of learning methods for a given problem. It becomes a challenge in the presence of numerous configurations of learning algorithms on massive amounts of data. So there is a need for an intelligent recommendation engine that can advise what is the best learning algorithm for a dataset. The techniques that are commonly used by experts are based on a trial and error approach evaluating and comparing a number of possible solutions against each other, using their prior experience on a specific domain, etc. The trial and error approach combined with the expert's prior knowledge, though computationally and time expensive, has often been shown to work for stationary problems where the processing is usually performed off-line. However, this approach would not normally be feasible to apply to non-stationary problems where streams of data are continuously arriving. Furthermore, in a non-stationary environment, the manual analysis of data and testing of various methods whenever there is a change in the underlying data distribution would be very difficult or simply infeasible.
In that scenario and within an on-line predictive system, there are several tasks where Meta-learning can be used to effectively facilitate best recommendations including 1) pre-processing steps, 2) learning algorithms or their combination, 3) adaptivity mechanisms and their parameters, 4) recurring concept extraction, and 5) concept drift detection.

Submitted 17 July, 2020; originally announced July 2020.

arXiv:1904.10032 [pdf, other]

Leveraging Orientation for Weakly Supervised Object Detection with Application to Firearm Localization

Authors: Javed Iqbal, Muhammad Akhtar Munir, Arif Mahmood, Afsheen Rafaqat Ali, Mohsen Ali

Abstract: Automatic detection of firearms is important for enhancing the security and safety of people; however, it is a challenging task owing to the wide variations in shape, size, and appearance of firearms. Also, most of the generic object detectors process axis-aligned rectangular areas, though a thin and long rifle may actually cover only a small percentage of that area and the rest may contain irrelevant details suppressing the required object signatures. To handle these challenges, we propose a weakly supervised Orientation Aware Object Detection (OAOD) algorithm which learns to detect oriented object bounding boxes (OBB) while using Axis-Aligned Bounding Boxes (AABB) for training. The proposed OAOD is different from the existing oriented object detectors which strictly require OBB during training which may not always be present. The goal of training on AABB and detection of OBB is achieved by employing a multistage scheme, with Stage-1 predicting the AABB and Stage-2 predicting OBB. In-between the two stages, the oriented proposal generation module along with the object aligned RoI pooling is designed to extract features based on the predicted orientation and to make these features orientation invariant. A diverse and challenging dataset consisting of eleven thousand images is also proposed for firearm detection which is manually annotated for firearm classification and localization. The proposed ITU Firearm dataset (ITUF) contains a wide range of guns and rifles.
The OAOD algorithm is evaluated on the ITUF dataset and compared with current state-of-the-art object detectors, including fully supervised oriented object detectors. OAOD has outperformed both types of object detectors with a significant margin. The experimental results (mAP: 88.3 on AABB & mAP: 77.5 on OBB) demonstrate the effectiveness of the proposed algorithm for firearm detection.

Submitted 29 January, 2021; v1 submitted 22 April, 2019; originally announced April 2019.

Comments: Accepted for Publication in Neurocomputing
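
Note on the AABB versus OBB claim in the abstract above: the abstract observes that a thin, long rifle may cover only a small percentage of its axis-aligned box. The sketch below illustrates that geometric point only and is not part of the OAOD algorithm. It builds an oriented box, takes the smallest axis-aligned box enclosing it, and reports the fraction of the AABB that the OBB fills. The function names and the 100 x 10 box dimensions are hypothetical.

```python
import math


def obb_corners(cx, cy, length, width, angle_deg):
    """Return the four corners of a box centred at (cx, cy), rotated by angle_deg."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_l, half_w = length / 2, width / 2
    offsets = ((-half_l, -half_w), (half_l, -half_w), (half_l, half_w), (-half_l, half_w))
    # Rotate each corner offset, then translate it to the box centre.
    return [(cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
            for dx, dy in offsets]


def enclosing_aabb(corners):
    """Return (x_min, y_min, x_max, y_max) of the smallest axis-aligned box."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def fill_ratio(length, width, angle_deg):
    """Fraction of the enclosing AABB area that the oriented box occupies."""
    x0, y0, x1, y1 = enclosing_aabb(obb_corners(0.0, 0.0, length, width, angle_deg))
    return (length * width) / ((x1 - x0) * (y1 - y0))


# Hypothetical rifle-like box, 100 units long and 10 units wide.
upright = fill_ratio(100, 10, 0)
diagonal = fill_ratio(100, 10, 45)
```

With these illustrative dimensions the box fills its AABB completely when axis-aligned, but only about 17% of it when rotated by 45 degrees, which is the kind of background clutter the abstract refers to.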

arXiv:1707.08148 [pdf, other]

doi 10.5244/C.31.171

Automatic Image Transformation for Inducing Affect

Authors: Afsheen Rafaqat Ali, Mohsen Ali

Abstract: Current image transformation and recoloring algorithms try to introduce artistic effects in the photographed images, based on user input of target image(s) or selection of pre-designed filters. These manipulations, although intended to enhance the impact of an image on the viewer, do not include the option of image transformation by specifying the affect information. In this paper we present an automatic image-transformation method that transforms the source image such that it can induce an emotional affect on the viewer, as desired by the user. Our proposed novel image emotion transfer algorithm does not require a user-specified target image. The proposed algorithm uses features extracted from top layers of a deep convolutional neural network and the user-specified emotion distribution to select multiple target images from an image database for color transformation, such that the resultant image has the desired emotional impact. Our method can handle a more diverse set of photographs than the previous methods. We conducted a detailed user study showing the effectiveness of our proposed method. A discussion and reasoning of failure cases has also been provided, indicating the inherent limitation of color-transfer based methods in the use of emotion assignment. Project Page: http://im.itu.edu.pk/affective-image-transfer/

Submitted 7 August, 2021; v1 submitted 25 July, 2017; originally announced July 2017.

Comments: Published at British Machine Vision Conference (BMVC) 2017

arXiv:1705.02751 [pdf, other]

High-Level Concepts for Affective Understanding of Images

Authors: Afsheen Rafaqat Ali, Usman Shahid, Mohsen Ali, Jeffrey Ho

Abstract: This paper aims to bridge the affective gap between image content and the emotional response of the viewer it elicits by using High-Level Concepts (HLCs). In contrast to previous work that relied solely on low-level features or used a convolutional neural network (CNN) as a black-box, we use HLCs generated by pretrained CNNs in an explicit way to investigate the relations/associations between these HLCs and a (small) set of Ekman's emotional classes. As a proof-of-concept, we first propose a linear admixture model for modeling these relations, and the resulting computational framework allows us to determine the associations between each emotion class and certain HLCs (objects and places). This linear model is further extended to a nonlinear model using support vector regression (SVR) that aims to predict the viewer's emotional response using both low-level image features and HLCs extracted from images. These class-specific regressors are then assembled into a regressor ensemble that provides a flexible and effective predictor of the viewer's emotional responses from images. Experimental results have demonstrated that our results are comparable to existing methods, with a clear view of the association between HLCs and emotional classes that is ostensibly missing in most existing work.

Submitted 8 May, 2017; originally announced May 2017.

arXiv:1301.0550 [pdf]

Markov Equivalence Classes for Maximal Ancestral Graphs

Authors: Ayesha R. Ali, Thomas S. Richardson

Abstract: Ancestral graphs are a class of graphs that encode conditional independence relations arising in DAG models with latent and selection variables, corresponding to marginalization and conditioning. However, for any ancestral graph, there may be several other graphs to which it is Markov equivalent. We introduce a simple representation of a Markov equivalence class of ancestral graphs, thereby facilitating model search. More specifically, we define a join operation on ancestral graphs which will associate a unique graph with a Markov equivalence class. We also extend the separation criterion for ancestral graphs (which is an extension of d-separation) and provide a proof of the pairwise Markov property for joined ancestral graphs.

Submitted 12 December, 2012; originally announced January 2013.

Comments: Appears in Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI2002)

Report number: UAI-P-2002-PG-1-9

arXiv:1207.1365 [pdf]

Towards Characterizing Markov Equivalence Classes for Directed Acyclic Graphs with Latent Variables

Authors: Ayesha R. Ali, Thomas S. Richardson, Peter L. Spirtes, Jiji Zhang

Abstract: It is well known that there may be many causal explanations that are consistent with a given set of data. Recent work has been done to represent the common aspects of these explanations into one representation. In this paper, we address what is less well known: how do the relationships common to every causal explanation among the observed variables of some DAG process change in the presence of latent variables? Ancestral graphs provide a class of graphs that can encode conditional independence relations that arise in DAG models with latent and selection variables. In this paper we present a set of orientation rules that construct the Markov equivalence class representative for ancestral graphs, given a member of the equivalence class. These rules are sound and complete. We also show that when the equivalence class includes a DAG, the equivalence class representative is the essential graph for the said DAG.

Submitted 4 July, 2012; originally announced July 2012.

Comments: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005)

Report number: UAI-P-2005-PG-10-17

Showing 1–11 of 11 results for author: Ali, A R