-
Exploring Syntactic Patterns in Urdu: A Deep Dive into Dependency Analysis
Authors:
Nudrat Habib
Abstract:
Parsing is the process of breaking a sentence into its grammatical components and identifying the syntactic structure of the sentence. The syntactically correct sentence structure is achieved by assigning grammatical labels to its constituents using a lexicon and syntactic rules. In linguistics, a parser is extremely useful because of its many applications, such as named entity recognition, QA systems, and information extraction. The two most common techniques used for parsing are phrase structure and dependency structure. Because Urdu is a low-resource language, there has been little progress in building an Urdu parser. A comparison of several parsers revealed that the dependency parsing approach is better suited for order-free languages such as Urdu. We have made significant progress in parsing Urdu, a South Asian language with a complex morphology. For Urdu dependency parsing, a basic feature model consisting of word location, word head, and dependency relation is employed as a starting point, followed by more complex feature models. The dependency tagset, which contains 22 tags, was designed after careful consideration of the complex morphological structure of the Urdu language, word order variation, and lexical ambiguity. Our dataset comprises sentences from news articles, and we tried to include sentences of different complexity (which is quite challenging) to get reliable results. All experiments are performed using MaltParser, exploring all 9 algorithms and classifiers. We achieved an overall best labeled accuracy (LA) of 70 percent and an overall best unlabeled attachment score (UAS) of 84 percent using the Nivreeager algorithm. The parser output is then compared with manually parsed treebank test data to carry out error assessment and identify the errors produced by the parser.
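The two scores reported in this abstract can be illustrated with a minimal sketch. It assumes the usual token-level definitions: UAS is the share of tokens whose predicted head matches the gold head, and LA is taken here to mean the share of tokens whose predicted dependency label matches the gold label (the abstract does not define LA, so that reading is an assumption). The function name `attachment_scores`, the toy arcs, and the label names are hypothetical and are not taken from the paper's 22-tag set.

```python
from typing import Dict, List, Tuple

# One token's analysis: (index of its head, dependency label).
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Dict[str, float]:
    """Compare predicted arcs with gold arcs, token by token."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    n = len(gold)
    # Unlabeled attachment score: head is correct, label ignored.
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    # Label accuracy: label is correct, head ignored (assumed meaning of LA).
    la = sum(g[1] == p[1] for g, p in zip(gold, pred)) / n
    # Labeled attachment score: both head and label are correct.
    las = sum(g == p for g, p in zip(gold, pred)) / n
    return {"UAS": uas, "LA": la, "LAS": las}


# Toy four-token sentence with illustrative labels only.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "punct")]
print(attachment_scores(gold, pred))
```

In this toy case one token has a wrong label and another has a wrong head, so the head-only and label-only scores are both higher than the combined score.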
Submitted 13 June, 2024;
originally announced June 2024.
-
On the Limitations of Large Language Models (LLMs): False Attribution
Authors:
Tosin Adewumi,
Nudrat Habib,
Lama Alkhaled,
Elisa Barney
Abstract:
In this work, we provide insight into one important limitation of large language models (LLMs), i.e. false attribution, and introduce a new hallucination metric - Simple Hallucination Index (SHI). The task of automatic author attribution for relatively small chunks of text is an important NLP task but can be challenging. We empirically evaluate the power of 3 open SotA LLMs in a zero-shot setting (LLaMA-2-13B, Mixtral 8x7B, and Gemma-7B), especially as human annotation can be costly. We collected the top 10 most popular books, according to Project Gutenberg, divided each one into equal chunks of 400 words, and asked each LLM to predict the author. We then randomly sampled 162 chunks for human evaluation from each of the annotated books, based on a margin of error of 7% and a confidence level of 95% for the book with the most chunks (Great Expectations by Charles Dickens, having 922 chunks). The average results show that Mixtral 8x7B has the highest prediction accuracy, the lowest SHI, and a Pearson's correlation (r) of 0.737, 0.249, and -0.9996, respectively, followed by LLaMA-2-13B and Gemma-7B. However, Mixtral 8x7B suffers from high hallucinations for 3 books, rising as high as an SHI of 0.87 (in the range 0-1, where 1 is the worst). The strong negative correlation of accuracy and SHI, given by r, demonstrates the fidelity of the new hallucination metric, which is generalizable to other tasks. We publicly release the annotated chunks of data and our code to aid the reproducibility and evaluation of other models.
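The sample of 162 chunks quoted in this abstract is consistent with a standard sample-size calculation. The sketch below assumes Cochran's formula with a finite population correction and the most conservative proportion p = 0.5; the abstract does not say which formula the authors used, so this is an illustration rather than their method. The function name `sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(population: int, margin: float,
                confidence: float = 0.95, p: float = 0.5) -> int:
    """Cochran's sample size with finite population correction (assumed method)."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Shrink it for a finite population of the given size.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# 922 chunks, 7% margin of error, 95% confidence, as quoted in the abstract.
print(sample_size(922, 0.07))  # → 162
```

Without the finite population correction the same inputs give about 196, so the correction for the 922-chunk book is what brings the figure down to 162.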
Submitted 6 April, 2024;
originally announced April 2024.
-
Instruction Makes a Difference
Authors:
Tosin Adewumi,
Nudrat Habib,
Lama Alkhaled,
Elisa Barney
Abstract:
We introduce the Instruction Document Visual Question Answering (iDocVQA) dataset and the Large Language Document (LLaDoc) model, for training Language-Vision (LV) models for document analysis and predictions on document images, respectively. Usually, deep neural networks for the DocVQA task are trained on datasets lacking instructions. We show that using instruction-following datasets improves performance. We compare performance across document-related datasets using the recent state-of-the-art (SotA) Large Language and Vision Assistant (LLaVA) 1.5 as the base model. We also evaluate the performance of the derived models for object hallucination using the Polling-based Object Probing Evaluation (POPE) dataset. The results show that instruction-tuning performance ranges from 11X to 32X of zero-shot performance and from 0.1% to 4.2% over non-instruction (traditional task) finetuning. Despite the gains, these still fall short of human performance (94.36%), implying there's much room for improvement.
Submitted 13 June, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
Deriving Weeklong Activity-Travel Diary from Google Location History: Survey Tool Development and A Field Test in Toronto
Authors:
Melvyn Li,
Kaili Wang,
Yicong Liu,
Khandker Nurul Habib
Abstract:
This paper introduces an innovative travel survey methodology that utilizes Google Location History (GLH) data to generate travel diaries for transportation demand analysis. By leveraging the accuracy of GLH and its omnipresence among smartphone users, the proposed methodology avoids the need for proprietary GPS tracking applications to collect smartphone-based GPS data. This research enhanced an existing travel survey designer, Travel Activity Internet Survey Interface (TRAISI), to make it capable of deriving travel diaries from the respondents' GLH. The feasibility of this data collection approach is showcased through the Google Timeline Travel Survey (GTTS) conducted in the Greater Toronto Area, Canada. The resultant dataset from the GTTS is demographically representative and offers detailed and accurate travel behavioural insights.
Submitted 16 November, 2023;
originally announced November 2023.
-
Zephyr: Direct Distillation of LM Alignment
Authors:
Lewis Tunstall,
Edward Beeching,
Nathan Lambert,
Nazneen Rajani,
Kashif Rasul,
Younes Belkada,
Shengyi Huang,
Leandro von Werra,
Clémentine Fourrier,
Nathan Habib,
Nathan Sarrazin,
Omar Sanseviero,
Alexander M. Rush,
Thomas Wolf
Abstract:
We aim to produce a smaller language model that is aligned to user intent. Previous research has shown that applying distilled supervised fine-tuning (dSFT) on larger models significantly improves task accuracy; however, these models are unaligned, i.e. they do not respond well to natural prompts. To distill this property, we experiment with the use of preference data from AI Feedback (AIF). Starting from a dataset of outputs ranked by a teacher model, we apply distilled direct preference optimization (dDPO) to learn a chat model with significantly improved intent alignment. The approach requires only a few hours of training without any additional sampling during fine-tuning. The final result, Zephyr-7B, sets the state-of-the-art on chat benchmarks for 7B parameter models, and requires no human annotation. In particular, results on MT-Bench show that Zephyr-7B surpasses Llama2-Chat-70B, the best open-access RLHF-based model. Code, models, data, and tutorials for the system are available at https://github.com/huggingface/alignment-handbook.
Submitted 25 October, 2023;
originally announced October 2023.
-
Modelling Non-Condensing Compositional Convection for Applications to Super-Earth and Sub-Neptune Atmospheres
Authors:
Namrah Habib,
Raymond T. Pierrehumbert
Abstract:
Compositional convection is atmospheric mixing driven by density variations caused by compositional gradients. Previous studies have suggested that compositional gradients of atmospheric trace species within planetary atmospheres can impact convection and the final atmospheric temperature profile. In this work, we employ 3D convection-resolving simulations using Cloud Model 1 (CM1) to gain a fundamental understanding of how compositional variation influences convection and the final atmospheric state of exoplanet atmospheres. We perform 3D initial value problem simulations of non-condensing compositional convection for Earth-Air, $\rm H_2$, and $\rm CO_2$ atmospheres. Conventionally, atmospheric convection is assumed to mix the atmosphere to a final, marginally stable state defined by a unique temperature profile. However, when there is compositional variation within an atmosphere, a continuous family of stable end states is possible, differing in the final state composition profile. Our CM1 simulations are used to determine which of the family of possible compositional end states is selected. Leveraging the results from our CM1 simulations, we develop a dry convective adjustment scheme for use in General Circulation Models (GCMs). This scheme relies on an energy analysis to determine the final adjusted atmospheric state. Our convection scheme produces results that agree with our CM1 simulations and can easily be implemented in GCMs to improve modelling of compositional convection in exoplanet atmospheres.
Submitted 12 October, 2023;
originally announced October 2023.
-
Towards exploring adversarial learning for anomaly detection in complex driving scenes
Authors:
Nour Habib,
Yunsu Cho,
Abhishek Buragohain,
Andreas Rausch
Abstract:
Many Autonomous Systems (ASs), such as autonomous driving cars, perform various safety-critical functions. Many of these autonomous systems take advantage of Artificial Intelligence (AI) techniques to perceive their environment. However, these perception components cannot be formally verified, since the accuracy of such AI-based components depends heavily on the quality of the training data. Machine learning (ML) based anomaly detection, a technique for identifying data that does not belong to the training data, could therefore be used as a safety indicator during the development and operation of such AI-based components. Adversarial learning, a sub-field of machine learning, has proven its ability to detect anomalies in images and videos with impressive results on simple data sets. Therefore, in this work, we investigate and provide insight into the performance of such techniques on a highly complex driving scenes dataset called Berkeley DeepDrive.
Submitted 17 June, 2023;
originally announced July 2023.
-
Modelling the Frequency of Home Deliveries: An Induced Travel Demand Contribution of Aggrandized E-shopping in Toronto during COVID-19 Pandemics
Authors:
Yicong Liu,
Kaili Wang,
Patrick Loa,
Khandker Nurul Habib
Abstract:
The COVID-19 pandemic dramatically catalyzed the proliferation of e-shopping. The dramatic growth of e-shopping will undoubtedly cause significant impacts on travel demand. As a result, transportation modellers' ability to model e-shopping demand is becoming increasingly important. This study developed models to predict households' weekly home delivery frequencies. We used both classical econometric and machine learning techniques to obtain the best model. It is found that socioeconomic factors such as having an online grocery membership, household members' average age, the percentage of male household members, the number of workers in the household and various land use factors influence home delivery demand. This study also compared the interpretations and performances of the machine learning models and the classical econometric model. Agreement is found in the variables' effects identified through the machine learning and econometric models. However, with similar recall accuracy, the ordered probit model, a classical econometric model, can accurately predict the aggregate distribution of household delivery demand. In contrast, both machine learning models failed to match the observed distribution.
Submitted 21 September, 2022;
originally announced September 2022.
-
How Large is too Large? A Review of the Issues related to Sample Size Requirements of Regional Household Travel Surveys with a Case Study on the Greater Toronto and Hamilton Area (GTHA)
Authors:
Khandker Nurul Habib,
Wafic El-Assi,
Tian Lin
Abstract:
The paper presents a review of sample size issues related to regional household travel surveys. A review of current practices reveals that different perspectives and, as a result, different practices exist in Canada, the US, and abroad on sample size. The paper uses data from the Transportation Tomorrow Survey (TTS) - a household travel survey conducted every five years in the Greater Toronto and Hamilton Area (GTHA) - for a set of empirical investigations that assess the adequacy of household travel survey samples. The empirical investigations reveal that even with a 5% sample size, a full representation of the population and its corresponding travel behaviour may be difficult (at the 95% confidence level). Therefore, based on the results of the empirical investigations and the literature review, the paper proposes a flexible framework for household travel survey sample size determination, especially for Canadian municipalities.
Submitted 1 May, 2020;
originally announced May 2020.
-
On the Factors Influencing the Choices of Weekly Telecommuting Frequencies of Post-secondary Students in Toronto
Authors:
Khandker Nurul Habib, Ph.D., PEng
Abstract:
The paper presents an empirical investigation of telecommuting frequency choices by post-secondary students in Toronto. It uses a dataset collected through a large-scale travel survey conducted on post-secondary students of four major universities in Toronto and it employs multiple alternative econometric modelling techniques for the empirical investigation. The results contribute on two fronts. Firstly, the paper presents empirical investigations of factors affecting telecommuting frequency choices of post-secondary students, which are rare in the literature. Secondly, it identifies a better-performing econometric modelling technique for modelling telecommuting frequency choices. The empirical investigation clearly reveals that telecommuting for school-related activities is prevalent among post-secondary students in Toronto. Around 80 percent of the region's 0.18 million post-secondary students, who make roughly 36,000 trips per day, also telecommute at least once a week. Considering that large numbers of students need to spend a long time travelling from home to campus, with around 33 percent spending more than two hours a day on travelling, telecommuting has the potential to enhance their quality of life. The empirical investigations reveal that car ownership and living farther from the campus have similar positive effects on the choice of higher frequency of telecommuting. Students who use a bicycle for regular travel are least likely to telecommute, compared to those using transit or a private car.
Submitted 9 April, 2020;
originally announced April 2020.
-
Overcoming the Challenges Associated with Image-based Mapping of Small Bodies in Preparation for the OSIRIS-REx Mission to (101955) Bennu
Authors:
D. N. DellaGiustina,
C. A. Bennett,
K. Becker,
D. R Golish,
L. Le Corre,
D. A. Cook,
K. L. Edmundson,
M. Chojnacki,
S. S. Sutton,
M. P. Milazzo,
B. Carcich,
M. C. Nolan,
N. Habib,
K. N. Burke,
T. Becker,
P. H. Smith,
K. J. Walsh,
K. Getzandanner,
D. R. Wibben,
J. M. Leonard,
M. M. Westermann,
A. T. Polit,
J. N. Kidd Jr.,
C. W. Hergenrother,
W. V. Boynton
, et al. (16 additional authors not shown)
Abstract:
The OSIRIS-REx Asteroid Sample Return Mission is the third mission in NASA's New Frontiers Program and is the first U.S. mission to return samples from an asteroid to Earth. The most important decision ahead of the OSIRIS-REx team is the selection of a prime sample-site on the surface of asteroid (101955) Bennu. Mission success hinges on identifying a site that is safe and has regolith that can readily be ingested by the spacecraft's sampling mechanism. To inform this mission-critical decision, the surface of Bennu is mapped using the OSIRIS-REx Camera Suite and the images are used to develop several foundational data products. Acquiring the necessary inputs to these data products requires observational strategies that are defined specifically to overcome the challenges associated with mapping a small irregular body. We present these strategies in the context of assessing candidate sample-sites at Bennu according to a framework of decisions regarding the relative safety, sampleability, and scientific value across the asteroid's surface. To create data products that aid these assessments, we describe the best practices developed by the OSIRIS-REx team for image-based mapping of irregular small bodies. We emphasize the importance of using 3D shape models and the ability to work in body-fixed rectangular coordinates when dealing with planetary surfaces that cannot be uniquely addressed by body-fixed latitude and longitude.
Submitted 23 October, 2018;
originally announced October 2018.