-
A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations
Authors:
Md Tahmid Rahman Laskar,
Sawsan Alqahtani,
M Saiful Bari,
Mizanur Rahman,
Mohammad Abdullah Matin Khan,
Haidar Khan,
Israt Jahan,
Amran Bhuiyan,
Chee Wei Tan,
Md Rizwan Parvez,
Enamul Hoque,
Shafiq Joty,
Jimmy Huang
Abstract:
Large Language Models (LLMs) have recently gained significant attention due to their remarkable capabilities in performing diverse tasks across various domains. However, a thorough evaluation of these models is crucial before deploying them in real-world applications to ensure they produce reliable performance. Despite the well-established importance of evaluating LLMs in the community, the complexity of the evaluation process has led to varied evaluation setups, causing inconsistencies in findings and interpretations. To address this, we systematically review the primary challenges and limitations causing these inconsistencies and unreliable evaluations in various steps of LLM evaluation. Based on our critical review, we present our perspectives and recommendations to ensure LLM evaluations are reproducible, reliable, and robust.
Submitted 4 July, 2024;
originally announced July 2024.
-
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards
Authors:
Norah Alzahrani,
Hisham Abdullah Alyahya,
Yazeed Alnumay,
Sultan Alrashed,
Shaykhah Alsubaie,
Yusef Almushaykeh,
Faisal Mirza,
Nouf Alotaibi,
Nora Altwairesh,
Areeb Alowisheq,
M Saiful Bari,
Haidar Khan
Abstract:
Large Language Model (LLM) leaderboards based on benchmark rankings are regularly used to guide practitioners in model selection. Often, the published leaderboard rankings are taken at face value - we show this is a (potentially costly) mistake. Under existing leaderboards, the relative performance of LLMs is highly sensitive to (often minute) details. We show that for popular multiple-choice question benchmarks (e.g., MMLU), minor perturbations to the benchmark, such as changing the order of choices or the method of answer selection, result in changes in rankings up to 8 positions. We explain this phenomenon by conducting systematic experiments over three broad categories of benchmark perturbations and identifying the sources of this behavior. Our analysis results in several best-practice recommendations, including the advantage of a hybrid scoring method for answer selection. Our study highlights the dangers of relying on simple benchmark evaluations and charts the path for more robust evaluation schemes on the existing benchmarks. The code for this paper is available at https://github.com/National-Center-for-AI-Saudi-Arabia/lm-evaluation-harness.
Submitted 3 July, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
BenLLMEval: A Comprehensive Evaluation into the Potentials and Pitfalls of Large Language Models on Bengali NLP
Authors:
Mohsinul Kabir,
Mohammed Saidul Islam,
Md Tahmid Rahman Laskar,
Mir Tafseer Nayeem,
M Saiful Bari,
Enamul Hoque
Abstract:
Large Language Models (LLMs) have emerged as one of the most important breakthroughs in NLP for their impressive skills in language generation and other language-specific tasks. Though LLMs have been evaluated in various tasks, mostly in English, they have not yet undergone thorough evaluation in under-resourced languages such as Bengali (Bangla). To this end, this paper introduces BenLLM-Eval, which consists of a comprehensive evaluation of LLMs to benchmark their performance in the Bengali language that has modest resources. In this regard, we select various important and diverse Bengali NLP tasks, such as text summarization, question answering, paraphrasing, natural language inference, transliteration, text classification, and sentiment analysis for zero-shot evaluation of popular LLMs, namely, GPT-3.5, LLaMA-2-13b-chat, and Claude-2. Our experimental results demonstrate that while in some Bengali NLP tasks, zero-shot LLMs could achieve performance on par, or even better than current SOTA fine-tuned models; in most tasks, their performance is quite poor (with the performance of open-source LLMs like LLaMA-2-13b-chat being significantly bad) in comparison to the current SOTA results. Therefore, it calls for further efforts to develop a better understanding of LLMs in modest-resourced languages like Bengali.
Submitted 19 March, 2024; v1 submitted 22 September, 2023;
originally announced September 2023.
-
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets
Authors:
Md Tahmid Rahman Laskar,
M Saiful Bari,
Mizanur Rahman,
Md Amran Hossen Bhuiyan,
Shafiq Joty,
Jimmy Xiangji Huang
Abstract:
The development of large language models (LLMs) such as ChatGPT has brought a lot of attention recently. However, their evaluation in the benchmark academic datasets remains under-explored due to the difficulty of evaluating the generative outputs produced by this model against the ground truth. In this paper, we aim to present a thorough evaluation of ChatGPT's performance on diverse academic datasets, covering tasks like question-answering, text summarization, code generation, commonsense reasoning, mathematical problem-solving, machine translation, bias detection, and ethical considerations. Specifically, we evaluate ChatGPT across 140 tasks and analyze 255K responses it generates in these datasets. This makes our work the largest evaluation of ChatGPT in NLP benchmarks. In short, our study aims to validate the strengths and weaknesses of ChatGPT in various tasks and provide insights for future research using LLMs. We also report a new emergent ability to follow multi-query instructions that we mostly found in ChatGPT and other instruction-tuned models. Our extensive evaluation shows that even though ChatGPT is capable of performing a wide variety of tasks, and may obtain impressive performance in several benchmark datasets, it is still far from achieving the ability to reliably solve many challenging tasks. By providing a thorough assessment of ChatGPT's performance across diverse NLP tasks, this paper sets the stage for a targeted deployment of ChatGPT-like LLMs in real-world applications.
Submitted 5 July, 2023; v1 submitted 29 May, 2023;
originally announced May 2023.
-
xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval
Authors:
Mohammad Abdullah Matin Khan,
M Saiful Bari,
Xuan Long Do,
Weishi Wang,
Md Rizwan Parvez,
Shafiq Joty
Abstract:
Recently, pre-trained large language models (LLMs) have shown impressive abilities in generating codes from natural language descriptions, repairing buggy codes, translating codes between languages, and retrieving relevant code segments. However, the evaluation of these models has often been performed in a scattered way on only one or two specific tasks, in a few languages, at a partial granularity (e.g., function) level, and in many cases without proper training data. Even more concerning is that in most cases the evaluation of generated codes has been done in terms of mere lexical overlap with a reference code rather than actual execution. We introduce xCodeEval, the largest executable multilingual multitask benchmark to date consisting of $25$M document-level coding examples ($16.5$B tokens) from about $7.5$K unique problems covering up to $11$ programming languages with execution-level parallelism. It features a total of $7$ tasks involving code understanding, generation, translation and retrieval. xCodeEval adopts an execution-based evaluation and offers a multilingual code execution engine, ExecEval that supports unit test based execution in all the $11$ languages. To address the challenge of balancing the distributions of text-code samples over multiple attributes in validation/test sets, we propose a novel data splitting and a data selection schema based on the geometric mean and graph-theoretic principle. Our experiments with OpenAI's LLMs (zero-shot) and open-LLMs (zero-shot and fine-tuned) on the tasks and languages demonstrate **xCodeEval** to be quite challenging as per the current advancements in language models.
Submitted 6 November, 2023; v1 submitted 6 March, 2023;
originally announced March 2023.
-
SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning
Authors:
M Saiful Bari,
Aston Zhang,
Shuai Zheng,
Xingjian Shi,
Yi Zhu,
Shafiq Joty,
Mu Li
Abstract:
Pre-trained large language models can efficiently interpolate human-written prompts in a natural way. Multitask prompted learning can help generalization through a diverse set of tasks at once, thus enhancing the potential for more effective downstream fine-tuning. To perform efficient multitask-inference in the same batch, parameter-efficient fine-tuning methods such as prompt tuning have been proposed. However, the existing prompt tuning methods may lack generalization. We propose SPT, a semi-parametric prompt tuning method for multitask prompted learning. The novel component of SPT is a memory bank from where memory prompts are retrieved based on discrete prompts. Extensive experiments, such as (i) fine-tuning a full language model with SPT on 31 different tasks from 8 different domains and evaluating zero-shot generalization on 9 heldout datasets under 5 NLP task categories and (ii) pretraining SPT on the GLUE datasets and evaluating fine-tuning on the SuperGLUE datasets, demonstrate effectiveness of SPT.
Submitted 21 December, 2022;
originally announced December 2022.
-
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting
Authors:
Zheng-Xin Yong,
Hailey Schoelkopf,
Niklas Muennighoff,
Alham Fikri Aji,
David Ifeoluwa Adelani,
Khalid Almubarak,
M Saiful Bari,
Lintang Sutawika,
Jungo Kasai,
Ahmed Baruwa,
Genta Indra Winata,
Stella Biderman,
Edward Raff,
Dragomir Radev,
Vassilina Nikoulina
Abstract:
The BLOOM model is a large publicly available multilingual language model, but its pretraining was limited to 46 languages. To extend the benefits of BLOOM to other languages without incurring prohibitively large costs, it is desirable to adapt BLOOM to new languages not seen during pretraining. In this work, we apply existing language adaptation strategies to BLOOM and benchmark its zero-shot prompting performance on eight new languages in a resource-constrained setting. We find language adaptation to be effective at improving zero-shot performance in new languages. Surprisingly, we find that adapter-based finetuning is more effective than continued pretraining for large models. In addition, we discover that prompting performance is not significantly affected by language specifics, such as the writing system. It is primarily determined by the size of the language adaptation data. We also add new languages to BLOOMZ, which is a multitask finetuned version of BLOOM capable of following task instructions zero-shot. We find including a new language in the multitask fine-tuning mixture to be the most effective method to teach BLOOMZ a new language. We conclude that with sufficient training data language adaptation can generalize well to diverse languages. Our code is available at https://github.com/bigscience-workshop/multilingual-modeling.
Submitted 27 May, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Authors:
BigScience Workshop,
Teven Le Scao,
Angela Fan,
Christopher Akiki,
Ellie Pavlick,
Suzana Ilić,
Daniel Hesslow,
Roman Castagné,
Alexandra Sasha Luccioni,
François Yvon,
Matthias Gallé,
Jonathan Tow,
Alexander M. Rush,
Stella Biderman,
Albert Webson,
Pawan Sasanka Ammanamanchi,
Thomas Wang,
Benoît Sagot,
Niklas Muennighoff,
Albert Villanova del Moral,
Olatunji Ruwase,
Rachel Bawden,
Stas Bekman,
Angelina McMillan-Major
, et al. (369 additional authors not shown)
Abstract:
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Submitted 27 June, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Crosslingual Generalization through Multitask Finetuning
Authors:
Niklas Muennighoff,
Thomas Wang,
Lintang Sutawika,
Adam Roberts,
Stella Biderman,
Teven Le Scao,
M Saiful Bari,
Sheng Shen,
Zheng-Xin Yong,
Hailey Schoelkopf,
Xiangru Tang,
Dragomir Radev,
Alham Fikri Aji,
Khalid Almubarak,
Samuel Albanie,
Zaid Alyafeai,
Albert Webson,
Edward Raff,
Colin Raffel
Abstract:
Multitask prompted finetuning (MTF) has been shown to help large language models generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused on English data and models. We apply MTF to the pretrained multilingual BLOOM and mT5 model families to produce finetuned variants called BLOOMZ and mT0. We find finetuning large multilingual language models on English tasks with English prompts allows for task generalization to non-English languages that appear only in the pretraining corpus. Finetuning on multilingual tasks with English prompts further improves performance on English and non-English tasks leading to various state-of-the-art zero-shot results. We also investigate finetuning on multilingual tasks with prompts that have been machine-translated from English to match the language of each dataset. We find training on these machine-translated prompts leads to better performance on human-written prompts in the respective languages. Surprisingly, we find models are capable of zero-shot generalization to tasks in languages they have never intentionally seen. We conjecture that the models are learning higher-level capabilities that are both task- and language-agnostic. In addition, we introduce xP3, a composite of supervised datasets in 46 languages with English and machine-translated prompts. Our code, datasets and models are freely available at https://github.com/bigscience-workshop/xmtf.
Submitted 29 May, 2023; v1 submitted 3 November, 2022;
originally announced November 2022.
-
What Language Model to Train if You Have One Million GPU Hours?
Authors:
Teven Le Scao,
Thomas Wang,
Daniel Hesslow,
Lucile Saulnier,
Stas Bekman,
M Saiful Bari,
Stella Biderman,
Hady Elsahar,
Niklas Muennighoff,
Jason Phang,
Ofir Press,
Colin Raffel,
Victor Sanh,
Sheng Shen,
Lintang Sutawika,
Jaesung Tae,
Zheng Xin Yong,
Julien Launay,
Iz Beltagy
Abstract:
The crystallization of modeling methods around the Transformer architecture has been a boon for practitioners. Simple, well-motivated architectural variations can transfer across tasks and scale, increasing the impact of modeling research. However, with the emergence of state-of-the-art 100B+ parameters models, large language models are increasingly expensive to accurately design and train. Notably, it can be difficult to evaluate how modeling decisions may impact emergent capabilities, given that these capabilities arise mainly from sheer scale alone. In the process of building BLOOM--the Big Science Large Open-science Open-access Multilingual language model--our goal is to identify an architecture and training setup that makes the best use of our 1,000,000 A100-GPU-hours budget. Specifically, we perform an ablation study at the billion-parameter scale comparing different modeling practices and their impact on zero-shot generalization. In addition, we study the impact of various popular pre-training corpora on zero-shot generalization. We also study the performance of a multilingual model and how it compares to the English-only one. Finally, we consider the scaling behaviour of Transformers to choose the target model size, shape, and training setup. All our models and code are open-sourced at https://huggingface.co/bigscience .
Submitted 7 November, 2022; v1 submitted 27 October, 2022;
originally announced October 2022.
-
Statistical Analysis Based Feature Selection Enhanced RF-PUF with >99.8% Accuracy on Unmodified Commodity Transmitters for IoT Physical Security
Authors:
Md Faizul Bari,
Parv Agrawal,
Baibhab Chatterjee,
Shreyas Sen
Abstract:
Due to the diverse and mobile nature of the deployment environment, smart commodity devices are vulnerable to various attacks which can grant unauthorized access to a rogue device in a large, connected network. Traditional digital signature-based authentication methods are vulnerable to key recovery attacks, CSRF, etc. To circumvent this, RF-PUF had been proposed as a promising alternative that utilizes the inherent nonidealities of the devices as physical signatures. RF-PUF offers a robust authentication method that is resilient to key-hacking methods due to the absence of secret key requirements and does not require any additional circuitry on the transmitter end, eliminating additional power, area, and computational burden. In this work, for the first time, we analyze the effectiveness of RF-PUF on commodity devices, purchased off-the-shelf, without any modifications whatsoever. Data were collected from 30 Xbee S2C modules and released as a public dataset. A new feature has been engineered through statistical property analysis. With a new and robust feature set, it has been shown that 95% accuracy can be achieved using only ~1.8 ms of test data, reaching >99.8% accuracy with more data and a network of higher model capacity, without any assisting digital preamble. The design space has been explored in detail and the effect of the wireless channel has been determined. The performance of some popular ML algorithms has been compared with the NN approach. A thorough investigation on various PUF properties has been done and both intra and inter-PUF distances have been calculated. With extensive testing of 41238000 cases, the detection probability for RF-PUF for our data is found to be 0.9987, which, for the first time, experimentally establishes RF-PUF as a strong authentication method. Finally, the potential attack models and the robustness of RF-PUF against them have been discussed.
Submitted 18 January, 2022;
originally announced February 2022.
-
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Authors:
Stephen H. Bach,
Victor Sanh,
Zheng-Xin Yong,
Albert Webson,
Colin Raffel,
Nihal V. Nayak,
Abheesht Sharma,
Taewoon Kim,
M Saiful Bari,
Thibault Fevry,
Zaid Alyafeai,
Manan Dey,
Andrea Santilli,
Zhiqing Sun,
Srulik Ben-David,
Canwen Xu,
Gunjan Chhablani,
Han Wang,
Jason Alan Fries,
Maged S. Al-shaibani,
Shanya Sharma,
Urmish Thakker,
Khalid Almubarak,
Xiangru Tang,
Dragomir Radev
, et al. (2 additional authors not shown)
Abstract:
PromptSource is a system for creating, sharing, and using natural language prompts. Prompts are functions that map an example from a dataset to a natural language input and target output. Using prompts to train and query language models is an emerging area in NLP that requires new tools that let users develop and refine these prompts collaboratively. PromptSource addresses the emergent challenges in this new setting with (1) a templating language for defining data-linked prompts, (2) an interface that lets users quickly iterate on prompt development by observing outputs of their prompts on many examples, and (3) a community-driven set of guidelines for contributing new prompts to a common pool. Over 2,000 prompts for roughly 170 datasets are already available in PromptSource. PromptSource is available at https://github.com/bigscience-workshop/promptsource.
Submitted 29 March, 2022; v1 submitted 2 February, 2022;
originally announced February 2022.
-
Theoretical Analysis of an XGBoost Framework for Product Cannibalization
Authors:
Gautham Bekal,
Mohammad Bari
Abstract:
This paper is an extension of our work where we presented a three-stage XGBoost algorithm for forecasting sales under product cannibalization scenario. Previously we developed the model based on our intuition and provided empirical evidence on its performance. In this study we would briefly go over the algorithm and then provide mathematical reasoning behind its working.
Submitted 2 December, 2021;
originally announced December 2021.
-
An XGBoost-Based Forecasting Framework for Product Cannibalization
Authors:
Gautham Bekal,
Mohammad Bari
Abstract:
Two major challenges in demand forecasting are product cannibalization and long term forecasting. Product cannibalization is a phenomenon in which high demand of some products leads to reduction in sales of other products. Long term forecasting involves forecasting the sales over longer time frame that is critical for strategic business purposes. Also, conventional methods, for instance, recurrent neural networks may be ineffective where train data size is small as in the case in this study. This work presents XGBoost-based three-stage framework that addresses product cannibalization and associated long term error propagation problems. The performance of the proposed three-stage XGBoost-based framework is compared to and is found superior than that of regular XGBoost algorithm.
Submitted 24 November, 2021;
originally announced November 2021.
-
Multitask Prompted Training Enables Zero-Shot Task Generalization
Authors:
Victor Sanh,
Albert Webson,
Colin Raffel,
Stephen H. Bach,
Lintang Sutawika,
Zaid Alyafeai,
Antoine Chaffin,
Arnaud Stiegler,
Teven Le Scao,
Arun Raja,
Manan Dey,
M Saiful Bari,
Canwen Xu,
Urmish Thakker,
Shanya Sharma Sharma,
Eliza Szczechla,
Taewoon Kim,
Gunjan Chhablani,
Nihal Nayak,
Debajyoti Datta,
Jonathan Chang,
Mike Tian-Jian Jiang,
Han Wang,
Matteo Manica,
Sheng Shen
, et al. (16 additional authors not shown)
Abstract:
Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language models' pretraining (Radford et al., 2019). Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping any natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts with diverse wording. These prompted datasets allow for benchmarking the ability of a model to perform completely held-out tasks. We fine-tune a pretrained encoder-decoder model (Raffel et al., 2020; Lester et al., 2021) on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models up to 16x its size. Further, our approach attains strong performance on a subset of tasks from the BIG-bench benchmark, outperforming models up to 6x its size. All trained models are available at https://github.com/bigscience-workshop/t-zero and all prompts are available at https://github.com/bigscience-workshop/promptsource.
Submitted 17 March, 2022; v1 submitted 15 October, 2021;
originally announced October 2021.
-
Nearest Neighbour Few-Shot Learning for Cross-lingual Classification
Authors:
M Saiful Bari,
Batool Haider,
Saab Mansour
Abstract:
Even though large pre-trained multilingual models (e.g. mBERT, XLM-R) have led to significant performance gains on a wide range of cross-lingual NLP tasks, success on many downstream tasks still relies on the availability of sufficient annotated data. Traditional fine-tuning of pre-trained models using only a few target samples can cause over-fitting. This can be quite limiting as most languages in the world are under-resourced. In this work, we investigate cross-lingual adaptation using a simple nearest neighbor few-shot (<15 samples) inference technique for classification tasks. We experiment using a total of 16 distinct languages across two NLP tasks- XNLI and PAWS-X. Our approach consistently improves traditional fine-tuning using only a handful of labeled samples in target locales. We also demonstrate its generalization capability across tasks.
Submitted 5 September, 2021;
originally announced September 2021.
-
AUGVIC: Exploiting BiText Vicinity for Low-Resource NMT
Authors:
Tasnim Mohiuddin,
M Saiful Bari,
Shafiq Joty
Abstract:
The success of Neural Machine Translation (NMT) largely depends on the availability of large bitext training corpora. Due to the lack of such large corpora in low-resource language pairs, NMT systems often exhibit poor performance. Extra relevant monolingual data often helps, but acquiring it could be quite expensive, especially for low-resource languages. Moreover, domain mismatch between bitext (train/test) and monolingual data might degrade the performance. To alleviate such issues, we propose AUGVIC, a novel data augmentation framework for low-resource NMT which exploits the vicinal samples of the given bitext without using any extra monolingual data explicitly. It can diversify the in-domain bitext data with finer level control. Through extensive experiments on four low-resource language pairs comprising data from different domains, we have shown that our method is comparable to the traditional back-translation that uses extra in-domain monolingual data. When we combine the synthetic parallel data generated from AUGVIC with the ones from the extra monolingual data, we achieve further improvements. We show that AUGVIC helps to attenuate the discrepancies between relevant and distant-domain monolingual data in traditional back-translation. To understand the contributions of different components of AUGVIC, we perform an in-depth framework analysis.
Submitted 9 June, 2021;
originally announced June 2021.
-
LNMap: Departures from Isomorphic Assumption in Bilingual Lexicon Induction Through Non-Linear Mapping in Latent Space
Authors:
Tasnim Mohiuddin,
M Saiful Bari,
Shafiq Joty
Abstract:
Most of the successful and predominant methods for bilingual lexicon induction (BLI) are mapping-based, where a linear mapping function is learned with the assumption that the word embedding spaces of different languages exhibit similar geometric structures (i.e., approximately isomorphic). However, several recent studies have criticized this simplified assumption showing that it does not hold in general even for closely related languages. In this work, we propose a novel semi-supervised method to learn cross-lingual word embeddings for BLI. Our model is independent of the isomorphic assumption and uses nonlinear mapping in the latent space of two independently trained auto-encoders. Through extensive experiments on fifteen (15) different language pairs (in both directions) comprising resource-rich and low-resource languages from two different datasets, we demonstrate that our method outperforms existing models by a good margin. Ablation studies show the importance of different model components and the necessity of non-linear mapping.
Submitted 21 October, 2020; v1 submitted 28 April, 2020;
originally announced April 2020.
-
UXLA: A Robust Unsupervised Data Augmentation Framework for Zero-Resource Cross-Lingual NLP
Authors:
M Saiful Bari,
Tasnim Mohiuddin,
Shafiq Joty
Abstract:
Transfer learning has yielded state-of-the-art (SoTA) results in many supervised NLP tasks. However, annotated data for every target task in every target language is rare, especially for low-resource languages. We propose UXLA, a novel unsupervised data augmentation framework for zero-resource transfer learning scenarios. In particular, UXLA aims to solve cross-lingual adaptation problems from a source language task distribution to an unknown target language task distribution, assuming no training label in the target language. At its core, UXLA performs simultaneous self-training with data augmentation and unsupervised sample selection. To show its effectiveness, we conduct extensive experiments on three diverse zero-resource cross-lingual transfer tasks. UXLA achieves SoTA results in all the tasks, outperforming the baselines by a good margin. With an in-depth framework dissection, we demonstrate the cumulative contributions of different components to its success.
Submitted 26 June, 2021; v1 submitted 27 April, 2020;
originally announced April 2020.
-
Decoupled molecular and inorganic framework dynamics in CH$_3$NH$_3$PbCl$_3$
Authors:
M. Songvilay,
Zitian Wang,
V. Garcia Sakai,
T. Guidi,
M. Bari,
Z. -G. Ye,
Guangyong Xu,
K. L. Brown,
P. M. Gehring,
C. Stock
Abstract:
The organic-inorganic lead halide perovskites are composed of organic molecules embedded in an inorganic framework. The compounds with general formula CH$_{3}$NH$_{3}$PbX$_{3}$ (MAPbX$_{3}$) display large photovoltaic efficiencies for halogens $X$=Cl, Br, and I in a wide variety of sample geometries and preparation methods. The organic cation and inorganic framework are bound by hydrogen bonds that tether the molecules to the halide anions, and this has been suggested to be important to the optoelectronic properties. We have studied the effects of this bonding using time-of-flight neutron spectroscopy to measure the molecular dynamics in CH$_3$NH$_3$PbCl$_3$ (MAPbCl$_3$). Low-energy/high-resolution neutron backscattering reveals thermally-activated molecular dynamics with a characteristic temperature of $\sim$ 95\,K. At this same temperature, higher-energy neutron spectroscopy indicates the presence of an anomalous broadening in energy (reduced lifetime) associated with the molecular vibrations. By contrast, neutron powder diffraction shows that spatially long-range structural phase transitions occur at 178\,K (cubic $\rightarrow$ tetragonal) and 173\,K (tetragonal $\rightarrow$ orthorhombic). The large difference between these two temperature scales suggests that the molecular and inorganic lattice dynamics in MAPbCl$_3$ are actually decoupled. With the assumption that underlying physical mechanisms do not change with differing halogens in the organic-inorganic perovskites, we speculate that the energy scale most relevant to the photovoltaic properties of the lead-halogen perovskites is set by the lead-halide bond, not by the hydrogen bond.
Submitted 26 December, 2019; v1 submitted 17 December, 2019;
originally announced December 2019.
-
Zero-Resource Cross-Lingual Named Entity Recognition
Authors:
M Saiful Bari,
Shafiq Joty,
Prathyusha Jwalapuram
Abstract:
Recently, neural methods have achieved state-of-the-art (SOTA) results in Named Entity Recognition (NER) tasks for many languages without the need for manually crafted features. However, these models still require manually annotated training data, which is not available for many languages. In this paper, we propose an unsupervised cross-lingual NER model that can transfer NER knowledge from one language to another in a completely unsupervised way without relying on any bilingual dictionary or parallel data. Our model achieves this through word-level adversarial learning and augmented fine-tuning with parameter sharing and feature augmentation. Experiments on five different languages demonstrate the effectiveness of our approach, outperforming existing models by a good margin and setting a new SOTA for each language pair.
Submitted 21 November, 2019;
originally announced November 2019.
-
Common acoustic phonon lifetimes in inorganic and hybrid lead halide perovskites
Authors:
M. Songvilay,
N. Giles-Donovan,
M. Bari,
Z. -G. Ye,
J. L. Minns,
M. A. Green,
Guangyong Xu,
P. M. Gehring,
K. Schmalzl,
W. D. Ratcliff,
C. M. Brown,
D. Chernyshov,
W. van Beek,
S. Cochran,
C. Stock
Abstract:
The acoustic phonons in the organic-inorganic lead halide perovskites have been reported to have anomalously short lifetimes over a large part of the Brillouin zone. The resulting shortened mean free paths of the phonons have been implicated as the origin of the low thermal conductivity. We apply neutron spectroscopy to show that the same acoustic phonon energy linewidth broadening (corresponding to shortened lifetimes) occurs in the fully inorganic CsPbBr$_{3}$ by comparing the results on the organic-inorganic CH$_{3}$NH$_{3}$PbCl$_{3}$. We investigate the critical dynamics near the three zone boundaries of the cubic $Pm\overline{3}m$ Brillouin zone of CsPbBr$_{3}$ and find energy and momentum broadened dynamics at momentum points where the Cs-site ($A$-site) motions contribute to the cross section. Neutron diffraction is used to confirm that both the Cs and Br sites have unusually large thermal displacements with an anisotropy that mirrors the low temperature structural distortions. The presence of an organic molecule is not necessary to disrupt the low-energy acoustic phonons at momentum transfers located away from the zone center in the lead halide perovskites and such damping may be driven by the large displacements or possibly disorder on the $A$ site.
Submitted 21 September, 2019; v1 submitted 27 August, 2019;
originally announced August 2019.
-
Find It: A Novel Way to Learn Through Play
Authors:
Md. Tashfiqul Bari,
Tanvir Hassan,
Raisa Tabassum,
Zubaida Ahmed,
Swakkhar Shatabda
Abstract:
Autism Spectrum Disorder (ASD) is an area of ongoing research, including a form of Magnetic Resonance Imaging (MRI) called diffusion tensor imaging and the Early Start Denver Model (ESDM), aimed at providing an easier life for the people diagnosed. After years of combined public and private funding, this research has shown great promise in recent years. In this paper, we show how children with Down Syndrome Autism can learn through game therapy. These game therapies have produced marked improvements in those children's learning of the alphabet, along with developing their motor skills and addressing memory challenges.
Submitted 12 July, 2019;
originally announced July 2019.
-
The high voltage system with pressure and temperature corrections for the novel MPGD-based photon detectors of COMPASS RICH-1
Authors:
J. Agarwala,
M. Bari,
F. Bradamante,
A. Bressan,
C. Chatterjee,
A. Cicuttin,
P. Ciliberti,
M. Crespo,
S. Dalla Torre,
S. Dasgupta,
B. Gobbo,
M. Gregori,
G. Hamar,
S. Levorato,
A. Martin,
G. Menon,
L. B. Rizzuto,
Triloki,
F. Tessarotto,
Y. X. Zhao
Abstract:
The novel MPGD-based photon detectors of COMPASS RICH-1 consist of large-size hybrid MPGDs with multi-layer architecture including two layers of Thick-GEMs and a bulk resistive MicroMegas. The top surface of the first THGEM is coated with a CsI film which also acts as photo-cathode. These detectors have been successfully in operation at COMPASS since 2016. Concerning bias-voltage supply, the Thick-GEMs are segmented in order to reduce the energy released in case of occasional discharges, while the MicroMegas anode is segmented into pads individually biased with positive voltage while the micromesh is grounded. In total, there are about ten different electrode types and more than 20000 electrodes supplied by more than 100 HV channels, where appropriate correlations among the applied voltages are required for the correct operation of the detectors. A robust control system is therefore mandatory; it is implemented as a custom-designed software package, while commercial power supply units are used. This sophisticated control system protects the detectors against operator errors, monitors and logs voltages and currents at a 1 Hz rate, and automatically reacts to detector misbehaviour. In addition, a voltage compensation system has been developed to automatically adjust the biasing voltage according to environmental pressure and temperature variations, to achieve constant gain over time. This development answers a more general need: voltage compensation is always a requirement for the stability of gaseous detectors, and the need is greater in multi-layer ones.
In this paper, the HV system and its performance are described in detail, as well as the stability of the novel MPGD-based photon detectors during the physics data taking at COMPASS.
Submitted 4 July, 2019;
originally announced July 2019.
-
Anonymity Network Tor and Performance Analysis of ARANEA; an IOT Based Privacy-Preserving Router
Authors:
AKM Bahalul Haque,
Sharaban Tahura Nisa,
Md. Amdadul Bari,
Ayvee Nusreen Anika
Abstract:
There was a time when the word security referred only to the physical protection of valuable things that had to be guarded against all odds. Today, in a world where people can do things virtually, the need to protect the virtual world has emerged. Every facet of our lives is controlled by the internet in one way or another. There is no privacy in cyberspace, as the data we browse on the internet is monitored by someone on the other side. Everything we do on the internet is tracked, or the data is leaked without consent. To browse the internet securely, we developed a router named Aranea, which relates to the browser Tor. Tor provides traffic anonymity and security, and the Tor browser can be used for both positive and negative purposes. Tor encrypts data and hides the user's location and identity, the device's IP address, the network traffic, and more. By using the Tor browser, each user can browse the internet safely. Our goal is to create an additional security bridge through the Aranea router so that every user can simply browse the internet anonymously.
Submitted 4 June, 2019;
originally announced June 2019.
-
XDoser, A Benchmarking Tool for System Load Measurement Using Denial of Service Features
Authors:
AKM Bahalul Haque,
Rabeya Sultana,
Mohammad Sajid Fahad,
MD Nasif Latif,
Md. Amdadul Bari
Abstract:
Technology has developed so fast that we feel both safe and unsafe. Systems used today are always prone to attack by malicious users. In most cases, services are hindered because these systems cannot handle the excess load the attacker generates, so proper service load measurement is necessary. The tool described in this paper is based on Denial of Service methodologies. This tool, XDoser, puts a synthetic load on servers for testing purposes. It uses the HTTP Flood method with HTTP POST requests, which force the website to commit the maximum resources possible in response to every single request. The tool focuses on overloading the backend with multiple requests, so it can be applied to new or old servers for synthetic endurance testing.
Submitted 30 May, 2019;
originally announced May 2019.
-
A Unified Linear-Time Framework for Sentence-Level Discourse Parsing
Authors:
Xiang Lin,
Shafiq Joty,
Prathyusha Jwalapuram,
M Saiful Bari
Abstract:
We propose an efficient neural framework for sentence-level discourse analysis in accordance with Rhetorical Structure Theory (RST). Our framework comprises a discourse segmenter to identify the elementary discourse units (EDU) in a text, and a discourse parser that constructs a discourse tree in a top-down fashion. Both the segmenter and the parser are based on Pointer Networks and operate in linear time. Our segmenter yields an $F_1$ score of 95.4, and our parser achieves an $F_1$ score of 81.7 on the aggregated labeled (relation) metric, surpassing previous approaches by a good margin and approaching human agreement on both tasks (98.3 and 83.0 $F_1$).
Submitted 12 June, 2019; v1 submitted 14 May, 2019;
originally announced May 2019.
-
A Secure Communication Scheme for Corporate and Defense Community
Authors:
Akm. B. Haque,
Md. A. Bari,
S. S. Arman,
FT. Progga
Abstract:
Security is one of the major concerns of modern communication systems. Users demand a secure communication environment that provides privacy while they share messages with anyone. Privacy is a prime concern nowadays. This paper aims to provide an optimal platform for communication between the sender and the receiver. The prototype designed in the paper will provide a better anonymous path for routing messages. It will ensure one's full privacy while he or she is using this system for communication. As the proposed system provides a secure communication environment, it is expected to be useful for secret communication inside different governmental and non-governmental organizations. Law enforcement agencies can use this system for any of their operations, as it encrypts and decrypts messages to give a secure platform for communication.
Submitted 6 March, 2019;
originally announced March 2019.
-
Lifetime-shortened acoustic phonons and static order at the Brillouin zone boundary in the organic-inorganic perovskite CH$_3$NH$_3$PbCl$_3$
Authors:
M. Songvilay,
M. Bari,
Z. -G. Ye,
Guangyong Xu,
P. M. Gehring,
W. D. Ratcliff,
K. Schmalzl,
F. Bourdarot,
B. Roessli,
C. Stock
Abstract:
Lead halide hybrid perovskites consist of an inorganic framework hosting a molecular cation located in the interstitial space. These compounds have been extensively studied as they have been identified as promising materials for photovoltaic applications with the interaction between the molecular cation and the inorganic framework implicated as influential for the electronic properties. CH3NH3PbCl3 undergoes two structural transitions from a high temperature cubic unit cell to a tetragonal phase at 177 K and an orthorhombic transition at 170 K. We have measured the low-frequency lattice dynamics using neutron spectroscopy and observe an energy broadening in the acoustic phonon linewidth towards the symmetry point QX =(2,1/2,0) when approaching the transitions. Concomitant with these zone boundary anomalies is a hardening of the entire acoustic phonon branch measured near the (2, 0, 0) Bragg position with decreasing temperature. Measurements of the elastic scattering at the Brillouin zone edges QX = (2,1/2,0), QM = (3/2,1/2,0), and QR = (3/2,3/2,5/2) show Bragg peaks appearing below these structural transitions. Based on selection rules of neutron scattering, we suggest that the higher 177 K transition is displacive with a distortion of the local octahedral environment and the lower transition is a rigid tilt transition of the octahedra. We do not observe any critical broadening in energy or momentum, beyond resolution, of these peaks near the transitions. We compare these results to the critical properties reported near the structural transitions in other perovskites. We suggest that the simultaneous onset of static resolution-limited Bragg peaks at the zone boundaries and the changes in acoustic phonon energies near the zone center is evidence of a coupling between the inorganic framework and the molecular cation.
Submitted 22 January, 2019; v1 submitted 26 September, 2018;
originally announced September 2018.
-
RHIP, a Radio-controlled High-Voltage Insulated Picoammeter and its usage in studying ion backflow in MPGD-based photon detectors
Authors:
M. Bari,
B. Gobbo,
S. Dalla Torre,
M. Gregori,
S. Levorato,
G. Menon,
F. Tessarotto
Abstract:
A picoammeter system has been developed and engineered. It consists of a current-voltage converter, based on an operational amplifier with very low input current, a high precision ADC, a radio controlled data acquisition unit and the computer-based control, visualization and storage. The precision is of the order of a tenth of a picoampere and it can measure currents between electrodes at potentials up to 8 kV. The system is battery powered and a number of strategies have been implemented to limit the power consumption. The system is designed for multichannel applications, up to 256 parallel channels. The overall implementation is cost-effective to make the availability of multichannel setups easily affordable. The design, implementation and performance of the picoammeter system are described in detail as well as an application: the measurement of ion backflow in MPGD-based photon detectors.
Submitted 6 March, 2018;
originally announced March 2018.
-
Supervised Machine Learning for Signals Having RRC Shaped Pulses
Authors:
Mohammad Bari,
Hussain Taher,
Syed Saad Sherazi,
Milos Doroslovacki
Abstract:
Classification performances of the supervised machine learning techniques such as support vector machines, neural networks and logistic regression are compared for modulation recognition purposes. The simple and robust features are used to distinguish continuous-phase FSK from QAM-PSK signals. Signals having root-raised-cosine shaped pulses are simulated in extreme noisy conditions having joint impurities of block fading, lack of symbol and sampling synchronization, carrier offset, and additive white Gaussian noise. The features are based on sample mean and sample variance of the imaginary part of the product of two consecutive complex signal values.
Submitted 17 May, 2017;
originally announced May 2017.
-
Separation of Signals Consisting of Amplitude and Instantaneous Frequency RRC Pulses Using SNR Uniform Training
Authors:
Mohammad Bari,
Milos Doroslovacki
Abstract:
This work presents sample mean and sample variance based features that distinguish continuous phase FSK from QAM and PSK modulations. Root raised cosine pulses are used for signal generation. Support vector machines are employed for signals separation. They are trained for only one value of SNR and used to classify the signals from a wide range of SNR. A priori information about carrier amplitude,…
▽ More
This work presents sample mean and sample variance based features that distinguish continuous phase FSK from QAM and PSK modulations. Root raised cosine pulses are used for signal generation. Support vector machines are employed for signals separation. They are trained for only one value of SNR and used to classify the signals from a wide range of SNR. A priori information about carrier amplitude, carrier phase, carrier offset, roll-off factor and initial symbol phase is relaxed. Effectiveness of the method is tested by observing the joint effects of AWGN, carrier offset, lack of symbol and sampling synchronization, and fast fading.
△ Less
Submitted 29 January, 2016;
originally announced February 2016.
-
On Orchestrating Virtual Network Functions in NFV
Authors:
Md. Faizul Bari,
Shihabur Rahman Chowdhury,
Reaz Ahmed,
Raouf Boutaba
Abstract:
Middleboxes or network appliances like firewalls, proxies and WAN optimizers have become an integral part of today's ISP and enterprise networks. Middlebox functionalities are usually deployed on expensive and proprietary hardware that require trained personnel for deployment and maintenance. Middleboxes contribute significantly to a network's capital and operational costs. In addition, organizati…
▽ More
Middleboxes or network appliances like firewalls, proxies and WAN optimizers have become an integral part of today's ISP and enterprise networks. Middlebox functionalities are usually deployed on expensive and proprietary hardware that require trained personnel for deployment and maintenance. Middleboxes contribute significantly to a network's capital and operational costs. In addition, organizations often require their traffic to pass through a specific sequence of middleboxes for compliance with security and performance policies. This makes the middlebox deployment and maintenance tasks even more complicated. Network Function Virtualization (NFV) is an emerging and promising technology that is envisioned to overcome these challenges. It proposes to move packet processing from dedicated hardware middleboxes to software running on commodity servers. In NFV terminology, software middleboxes are referred to as Virtualized Network Functions (VNFs). It is a challenging problem to determine the required number and placement of VNFs that optimizes network operational costs and utilization, without violating service level agreements. We call this the VNF Orchestration Problem (VNF-OP) and provide an Integer Linear Programming (ILP) formulation with implementation in CPLEX. We also provide a dynamic programming based heuristic to solve larger instances of VNF-OP. Trace driven simulations on real-world network topologies demonstrate that the heuristic can provide solutions that are within 1.3 times of the optimal solution. Our experiments suggest that a VNF based approach can provide more than 4x reduction in the operational cost of a network.
△ Less
Submitted 25 March, 2015; v1 submitted 21 March, 2015;
originally announced March 2015.
-
FPGA Implementation of LS Code Generator for CDM Based MIMO Channel Sounder
Authors:
M. Habib Ullah,
Md. Niamul Bari,
A. Unggul Priantoro
Abstract:
The MIMO (Multi Input Multi Output) wireless communication system is an innovative solution to improve bandwidth efficiency by exploiting the multipath-richness of the propagation environment. The degree of multipath-richness of the channel determines the capacity gain attainable by MIMO deployment. Therefore, it is very important to have accurate knowledge of the propagation environment/radio channel before MIMO implementation. The radio channel behavior can be estimated by channel measurement or channel sounding. CDM (Code Division Multiplexing) is one of the channel sounding techniques that allow accurate measurement at the cost of hardware complexity. A CDM based channel sounder requires codes with excellent autocorrelation and cross-correlation properties, which are generally difficult to achieve simultaneously. Theoretical analysis and computer simulation results demonstrate that the Loosely Synchronous (LS) code sequence, which has excellent correlation properties, performs efficiently. Finally, an efficient LS code generator serving as a data source for the transmitter is implemented in a Xilinx FPGA and can be integrated into a complete CDM based 2x2 MIMO channel sounder.
Submitted 21 February, 2010;
originally announced February 2010.
-
The CDF-II Online Silicon Vertex Tracker
Authors:
A. Bardi,
A. Belloni,
R. Carosi,
A. Cerri,
G. Chlachidze,
M. Dell'Orso,
S. Donati,
S. Galeotti,
P. Giannetti,
V. Glagolev,
E. Meschi,
F. Morsani,
D. Passuello,
G. Punzi,
L. Ristori,
A. Semenov,
F. Spinella,
A. Barchiesi,
M. Rescigno,
S. Sarkar,
L. Zanello,
M. Bari,
S. Belforte,
A. M. Zanetti,
I. Fiori,
et al. (14 additional authors not shown)
Abstract:
The Online Silicon Vertex Tracker (SVT) is the new CDF-II level 2 trigger processor designed to reconstruct 2-D tracks within the Silicon Vertex Detector with high speed and accuracy. By performing a precise measurement of impact parameters, the SVT allows online tagging of B events, which typically show displaced secondary vertices. Physics simulations show that this will greatly enhance the CDF-II B-physics capability. The SVT has been fully assembled and operational since the beginning of Tevatron Run II in April 2001. In this paper we briefly review the SVT design and physics motivation and then describe its performance during the early phase (April-October 2001) of Run II.
Submitted 10 December, 2001;
originally announced December 2001.
-
Fast instability indicator in few dimensional dynamical systems
Authors:
Piero Cipriani,
Maria Teresa Di Bari
Abstract:
Using the tools of Differential Geometry, we define a new "fast" chaoticity indicator, able to detect dynamical instability of trajectories much more effectively (i.e. "quickly") than the usual tools, like Lyapunov Characteristic Numbers (LCNs) or Poincaré Surface of Section. Moreover, at variance with other "fast" indicators proposed in the literature, it gives information about the asymptotic behaviour of trajectories, despite being local in phase space. Furthermore, it detects the chaotic or regular nature of geodesics without any reference to a given perturbation, and it also allows one to discriminate between different regimes (and possibly sources) of chaos in distinct regions of phase space.
Submitted 7 August, 2001;
originally announced August 2001.
-
The (In)stability of Bianchi IX Dynamics: Geodesic Deviation Equations in Finsler Spaces
Authors:
Maria Di Bari,
Piero Cipriani
Abstract:
We explore the dynamical stability of the minisuperspace Hamiltonian of the Bianchi IX cosmological models, giving a gauge-invariant and unapproximated description of the full continuous dynamics, achieved through a geometrical description of the equations of motion in the framework of the theory of Finsler Spaces. The numerical integrations of the geodesics and geodesic deviation equations show clearly the absence of any "traditional" signature of Chaos, while suggesting a chaotic scattering dynamics scenario.
Submitted 9 July, 1998;
originally announced July 1998.
-
Finsler Geometric Local Indicator of Chaos for single orbits in the Henon-Heiles hamiltonian
Authors:
Piero Cipriani,
Maria Di Bari
Abstract:
Translating the dynamics of the Henon-Heiles hamiltonian as a geodesic flow on a Finsler manifold, we obtain a local and synthetic Geometric Indicator of Chaos (GIC) for continuous dynamical systems with two degrees of freedom. It represents a link between local quantities and asymptotic behaviour of orbits, giving a strikingly evident, one-to-one correspondence between geometry and instability.
Submitted 9 July, 1998;
originally announced July 1998.