-
DAEDALUS: Defense Against Firmware ROP Exploits Using Stochastic Software Diversity
Authors:
Islam Obaidat,
Meera Sridhar,
Fatemeh Tavakoli
Abstract:
This paper presents DAEDALUS, a software diversity-based framework designed to resist ROP attacks on Linux-based IoT devices. DAEDALUS generates unique, semantically equivalent but syntactically different rewrites of IoT firmware, disrupting large-scale replication of ROP attacks. DAEDALUS employs STOKE, a stochastic optimizer for x86 binaries, as its core diversity engine but introduces significant extensions to address unique IoT firmware challenges. DAEDALUS's effectiveness is evaluated using DDoSim, a published botnet DDoS attack simulation testbed. Results demonstrate that DAEDALUS successfully neutralizes ROP payloads by diversifying critical basic blocks in the firmware, preventing attackers from compromising multiple devices for DDoS attacks via memory error vulnerabilities. The findings indicate that DAEDALUS not only mitigates the impact of ROP attacks on individual IoT devices through probabilistic protection but also thwarts large-scale ROP attacks across multiple devices.
Submitted 29 January, 2024;
originally announced January 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee,
et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Low-Resource Compositional Semantic Parsing with Concept Pretraining
Authors:
Subendhu Rongali,
Mukund Sridhar,
Haidar Khan,
Konstantine Arkoudas,
Wael Hamza,
Andrew McCallum
Abstract:
Semantic parsing plays a key role in digital voice assistants such as Alexa, Siri, and Google Assistant by mapping natural language to structured meaning representations. When we want to improve the capabilities of a voice assistant by adding a new domain, the underlying semantic parsing model needs to be retrained using thousands of annotated examples from the new domain, which is time-consuming and expensive. In this work, we present an architecture to perform such domain adaptation automatically, with only a small amount of metadata about the new domain and without any new training data (zero-shot) or with very few examples (few-shot). We use a base seq2seq (sequence-to-sequence) architecture and augment it with a concept encoder that encodes intent and slot tags from the new domain. We also introduce a novel decoder-focused approach to pretrain seq2seq models to be concept aware using Wikidata and use it to help our model learn important concepts and perform well in low-resource settings. We report few-shot and zero-shot results for compositional semantic parsing on the TOPv2 dataset and show that our model outperforms prior approaches in few-shot settings for the TOPv2 and SNIPS datasets.
Submitted 30 January, 2023; v1 submitted 23 January, 2023;
originally announced January 2023.
-
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
Authors:
Saleh Soltan,
Shankar Ananthakrishnan,
Jack FitzGerald,
Rahul Gupta,
Wael Hamza,
Haidar Khan,
Charith Peris,
Stephen Rawls,
Andy Rosenbaum,
Anna Rumshisky,
Chandana Satya Prakash,
Mukund Sridhar,
Fabian Triefenbach,
Apurv Verma,
Gokhan Tur,
Prem Natarajan
Abstract:
In this work, we demonstrate that multilingual large-scale sequence-to-sequence (seq2seq) models, pre-trained on a mixture of denoising and Causal Language Modeling (CLM) tasks, are more efficient few-shot learners than decoder-only models on various tasks. In particular, we train a 20 billion parameter multilingual seq2seq model called Alexa Teacher Model (AlexaTM 20B) and show that it achieves state-of-the-art (SOTA) performance on 1-shot summarization tasks, outperforming a much larger 540B PaLM decoder model. AlexaTM 20B also achieves SOTA in 1-shot machine translation, especially for low-resource languages, across almost all language pairs supported by the model (Arabic, English, French, German, Hindi, Italian, Japanese, Marathi, Portuguese, Spanish, Tamil, and Telugu) on Flores-101 dataset. We also show in zero-shot setting, AlexaTM 20B outperforms GPT3 (175B) on SuperGLUE and SQuADv2 datasets and provides SOTA performance on multilingual tasks such as XNLI, XCOPA, Paws-X, and XWinograd. Overall, our results present a compelling case for seq2seq models as a powerful alternative to decoder-only models for Large-scale Language Model (LLM) training.
Submitted 3 August, 2022; v1 submitted 2 August, 2022;
originally announced August 2022.
-
Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems
Authors:
Jack FitzGerald,
Shankar Ananthakrishnan,
Konstantine Arkoudas,
Davide Bernardi,
Abhishek Bhagia,
Claudio Delli Bovi,
Jin Cao,
Rakesh Chada,
Amit Chauhan,
Luoxin Chen,
Anurag Dwarakanath,
Satyam Dwivedi,
Turan Gojayev,
Karthik Gopalakrishnan,
Thomas Gueudre,
Dilek Hakkani-Tur,
Wael Hamza,
Jonathan Hueser,
Kevin Martin Jose,
Haidar Khan,
Beiye Liu,
Jianhua Lu,
Alessandro Manzotti,
Pradeep Natarajan,
Karolina Owczarzak,
et al. (16 additional authors not shown)
Abstract:
We present results from a large-scale experiment on pretraining encoders with non-embedding parameter counts ranging from 700M to 9.3B, their subsequent distillation into smaller models ranging from 17M-170M parameters, and their application to the Natural Language Understanding (NLU) component of a virtual assistant system. Though we train using 70% spoken-form data, our teacher models perform comparably to XLM-R and mT5 when evaluated on the written-form Cross-lingual Natural Language Inference (XNLI) corpus. We perform a second stage of pretraining on our teacher models using in-domain data from our system, improving error rates by 3.86% relative for intent classification and 7.01% relative for slot filling. We find that even a 170M-parameter model distilled from our Stage 2 teacher model has 2.88% better intent classification and 7.69% better slot filling error rates when compared to the 2.3B-parameter teacher trained only on public data (Stage 1), emphasizing the importance of in-domain data for pretraining. When evaluated offline using labeled NLU data, our 17M-parameter Stage 2 distilled model outperforms both XLM-R Base (85M params) and DistillBERT (42M params) by 4.23% to 6.14%, respectively. Finally, we present results from a full virtual assistant experimentation platform, where we find that models trained using our pretraining and distillation pipeline outperform models distilled from 85M-parameter teachers by 3.74%-4.91% on an automatic measurement of full-system user dissatisfaction.
Submitted 15 June, 2022;
originally announced June 2022.
-
Instilling Type Knowledge in Language Models via Multi-Task QA
Authors:
Shuyang Li,
Mukund Sridhar,
Chandana Satya Prakash,
Jin Cao,
Wael Hamza,
Julian McAuley
Abstract:
Understanding human language often necessitates understanding entities and their place in a taxonomy of knowledge -- their types. Previous methods to learn entity types rely on training classifiers on datasets with coarse, noisy, and incomplete labels. We introduce a method to instill fine-grained type knowledge in language models with text-to-text pre-training on type-centric questions leveraging knowledge base documents and knowledge graphs. We create the WikiWiki dataset: entities and passages from 10M Wikipedia articles linked to the Wikidata knowledge graph with 41K types. Models trained on WikiWiki achieve state-of-the-art performance in zero-shot dialog state tracking benchmarks, accurately infer entity types in Wikipedia articles, and can discover new types deemed useful by human judges.
Submitted 28 April, 2022;
originally announced April 2022.
-
Towards Realistic Single-Task Continuous Learning Research for NER
Authors:
Justin Payan,
Yuval Merhav,
He Xie,
Satyapriya Krishna,
Anil Ramakrishna,
Mukund Sridhar,
Rahul Gupta
Abstract:
There is an increasing interest in continuous learning (CL), as data privacy is becoming a priority for real-world machine learning applications. Meanwhile, there is still a lack of academic NLP benchmarks that are applicable for realistic CL settings, which is a major challenge for the advancement of the field. In this paper we discuss some of the unrealistic data characteristics of public datasets, study the challenges of realistic single-task continuous learning as well as the effectiveness of data rehearsal as a way to mitigate accuracy loss. We construct a CL NER dataset from an existing publicly available dataset and release it along with the code to the research community.
Submitted 27 October, 2021;
originally announced October 2021.
-
Zero-shot Generalization in Dialog State Tracking through Generative Question Answering
Authors:
Shuyang Li,
Jin Cao,
Mukund Sridhar,
Henghui Zhu,
Shang-Wen Li,
Wael Hamza,
Julian McAuley
Abstract:
Dialog State Tracking (DST), an integral part of modern dialog systems, aims to track user preferences and constraints (slots) in task-oriented dialogs. In real-world settings with constantly changing services, DST systems must generalize to new domains and unseen slot types. Existing methods for DST do not generalize well to new slot names and many require known ontologies of slot types and values for inference. We introduce a novel ontology-free framework that supports natural language queries for unseen constraints and slots in multi-domain task-oriented dialogs. Our approach is based on generative question-answering using a conditional language model pre-trained on substantive English sentences. Our model improves joint goal accuracy in zero-shot domain adaptation settings by up to 9% (absolute) over the previous state-of-the-art on the MultiWOZ 2.1 dataset.
Submitted 20 January, 2021;
originally announced January 2021.
-
Automatic Discovery of Novel Intents & Domains from Text Utterances
Authors:
Nikhita Vedula,
Rahul Gupta,
Aman Alok,
Mukund Sridhar
Abstract:
One of the primary tasks in Natural Language Understanding (NLU) is to recognize the intents as well as domains of users' spoken and written language utterances. Most existing research formulates this as a supervised classification problem with a closed-world assumption, i.e. the domains or intents to be identified are pre-defined or known beforehand. Real-world applications however increasingly encounter dynamic, rapidly evolving environments with newly emerging intents and domains, about which no information is known during model training. We propose a novel framework, ADVIN, to automatically discover novel domains and intents from large volumes of unlabeled data. We first employ an open classification model to identify all utterances potentially consisting of a novel intent. Next, we build a knowledge transfer component with a pairwise margin loss function. It learns discriminative deep features to group together utterances and discover multiple latent intent categories within them in an unsupervised manner. We finally hierarchically link mutually related intents into domains, forming an intent-domain taxonomy. ADVIN significantly outperforms baselines on three benchmark datasets, and real user utterances from a commercial voice-powered agent.
Submitted 21 May, 2020;
originally announced June 2020.
-
$k$-Cut: A Simple Approximately-Uniform Method for Sampling Ballots in Post-Election Audits
Authors:
Mayuri Sridhar,
Ronald L. Rivest
Abstract:
We present an approximate sampling framework and discuss how risk-limiting audits can compensate for these approximations, while maintaining their "risk-limiting" properties. Our framework is general and can compensate for counting mistakes made during audits.
Moreover, we present and analyze a simple approximate sampling method, "$k$-cut", for picking a ballot randomly from a stack, without counting. Our method involves doing $k$ "cuts", each involving moving a random portion of ballots from the top to the bottom of the stack, and then picking the ballot on top. Unlike conventional methods of picking a ballot at random, $k$-cut does not require identification numbers on the ballots or counting many ballots per draw. We analyze how close the distribution of chosen ballots is to the uniform distribution, and design different mitigation procedures. We show that $k=6$ cuts is enough for a risk-limiting election audit, based on empirical data, which would provide a significant increase in efficiency.
Submitted 2 January, 2019; v1 submitted 21 November, 2018;
originally announced November 2018.
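The $k$-cut procedure in the abstract above can be illustrated with a minimal Python sketch. The function name `k_cut` and the uniformly random cut size are assumptions made for illustration only; in a real audit the cuts are made by hand and are not uniform, which is the approximation the abstract says must be compensated for.

```python
import random


def k_cut(stack, k, rng=None):
    """Return the ballot on top of the stack after k random cuts.

    Hypothetical helper: each cut moves a random-sized top portion of the
    stack to the bottom. The uniform cut size is an idealized assumption.
    """
    if rng is None:
        rng = random.Random()
    ballots = list(stack)  # copy so the caller's stack is left untouched
    n = len(ballots)
    if n == 0:
        raise ValueError("cannot sample from an empty stack")
    for _ in range(k):
        cut = rng.randrange(n)  # number of ballots taken off the top
        ballots = ballots[cut:] + ballots[:cut]  # move them to the bottom
    return ballots[0]  # pick the ballot now on top


if __name__ == "__main__":
    stack = [f"ballot-{i}" for i in range(500)]
    print(k_cut(stack, k=6, rng=random.Random(2018)))
```

No ballot is counted or numbered at any point; only the relative order of the stack changes, which is what distinguishes the method from drawing by identification number.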
-
Software Reuse in Cardiology Related Medical Database Using K-Means Clustering Technique
Authors:
M. Bhanu Sridhar,
Y. Srinivas,
M. H. M. Krishna Prasad
Abstract:
Software technology based on reuse is identified as a process of designing software for the reuse purpose. The software reuse is a process in which the existing software is used to build new software. A metric is a quantitative indicator of an attribute of an item or thing. Reusability is the likelihood for a segment of source code that can be used again to add new functionalities with slight or no modification. A lot of research has been projected using reusability in reducing code, domain, requirements, design etc., but very little work is reported using software reuse in medical domain. An attempt is made to bridge the gap in this direction, using the concepts of clustering and classifying the data based on the distance measures. In this paper cardiologic database is considered for study. The developed model will be useful for Doctors or Paramedics to find out the patients level in the cardiologic disease, deduce the medicines required in seconds and propose them to the patient. In order to measure the reusability K means clustering algorithm is used.
Submitted 5 November, 2013;
originally announced November 2013.
-
Software Reuse in Medical Database for Cardiac Patients using Pearson Family Equations
Authors:
M. Bhanu Sridhar,
Y. Srinivas,
M. H. M. Krishna Prasad
Abstract:
Software reuse is a subfield of software engineering that is used to adopt the existing software for similar purposes. Reuse Metrics determine the extent to which an existing software component is reused in new software with an objective to minimize the errors and cost of the new project. In this paper, medical database related to cardiology is considered. The Pearson Type I Distribution is used to calculate the probability density function (pdf) and thereby utilizing it for clustering the data. Further, coupling methodology is used to bring out the similarity of the new patient data by comparing it with the existing data. By this, the concerned treatment to be followed for the new patient is deduced by comparing with that of the previous patients case history. The metrics proposed by Chidamber and Kemerer are utilized for this purpose. This model will be useful for the medical field through software, particularly in remote areas.
Submitted 3 December, 2012;
originally announced December 2012.
-
On the Sum Rate of a 2 x 2 Interference Network
Authors:
Murali Sridhar,
Srikrishna Bhashyam
Abstract:
In an M x N interference network, there are M transmitters and N receivers with each transmitter having independent messages for each of the 2^N -1 possible non-empty subsets of the receivers. We consider the 2 x 2 interference network with 6 possible messages, of which the 2 x 2 interference channel and X channel are special cases obtained by using only 2 and 4 messages respectively. Starting from an achievable rate region similar to the Han-Kobayashi region, we obtain an achievable sum rate. For the Gaussian interference network, we determine which of the 6 messages are sufficient for maximizing the sum rate within this rate region for the low, mixed, and strong interference conditions. It is observed that 2 messages are sufficient in several cases.
Submitted 23 December, 2016; v1 submitted 10 August, 2012;
originally announced August 2012.
-
E-Speed Governors For Public Transport Vehicles
Authors:
C. S. Sridhar,
R. ShashiKumar,
S. Madhava Kumar,
Manjula Sridhar,
Varun. D
Abstract:
An accident is an unexpected, unusual, unintended and identifiable external event which occurs at any place and at any time. The major concern faced by the government and traffic officials is overspeeding in limited speed zones like hospitals, schools or residential places, leading to casualties and more deaths on the roads. Hence the speed of the vehicles is to be regulated and confined to the limits as prescribed by the traffic regulations. In this paper we propose a solution in the form of providing E-speed governor fitted with a wireless communication system consisting of a Rx which receives the information regarding the speed regulation for their zones. The TX will be made highly intelligent and decide when receiver should be made active to regulate the speed and unwarranted honking from the vehicles which can be deactivated in the silent zones.
Submitted 24 May, 2010;
originally announced May 2010.
-
Grid enabled virtual screening against malaria
Authors:
N. Jacq,
J. Salzemann,
F. Jacq,
Y. Legré,
E. Medernach,
J. Montagnat,
A. Maass,
M. Reichstadt,
H. Schwichtenberg,
M. Sridhar,
V. Kasam,
M. Zimmermann,
M. Hofmann,
V. Breton
Abstract:
WISDOM is an international initiative to enable a virtual screening pipeline on a grid infrastructure. Its first attempt was to deploy large scale in silico docking on a public grid infrastructure. Protein-ligand docking is about computing the binding energy of a protein target to a library of potential drugs using a scoring algorithm. Previous deployments were either limited to one cluster, to grids of clusters in the tightly protected environment of a pharmaceutical laboratory or to pervasive grids. The first large scale docking experiment ran on the EGEE grid production service from 11 July 2005 to 19 August 2005 against targets relevant to research on malaria and saw over 41 million compounds docked for the equivalent of 80 years of CPU time. Up to 1,700 computers were simultaneously used in 15 countries around the world. Issues related to the deployment and the monitoring of the in silico docking experiment as well as experience with grid operation and services are reported in the paper. The main problem encountered for such a large scale deployment was the grid infrastructure stability. Although the overall success rate was above 80%, a lot of monitoring and supervision was still required at the application level to resubmit the jobs that failed. But the experiment demonstrated how grid infrastructures have a tremendous capacity to mobilize very large CPU resources for well targeted goals during a significant period of time. This success leads to a second computing challenge targeting Avian Flu neuraminidase N1.
Submitted 17 November, 2006;
originally announced November 2006.
-
Large Scale In Silico Screening on Grid Infrastructures
Authors:
N. Jacq,
V. Breton,
H. -Y. Chen,
L. -Y. Ho,
M. Hofmann,
H. -C. Lee,
Y. Legré,
S. -C. Lin,
A. Maass,
E. Medernach,
I. Merelli,
L. Milanesi,
G. Rastelli,
M. Reichstadt,
J. Salzemann,
H. Schwichtenberg,
M. Sridhar,
V. Kasam,
Y. -T. Wu,
M. Zimmermann
Abstract:
Large-scale grid infrastructures for in silico drug discovery open opportunities of particular interest to neglected and emerging diseases. In 2005 and 2006, we have been able to deploy large scale in silico docking within the framework of the WISDOM initiative against Malaria and Avian Flu requiring about 105 years of CPU on the EGEE, Auvergrid and TWGrid infrastructures. These achievements demonstrated the relevance of large-scale grid infrastructures for the virtual screening by molecular docking. This also allowed evaluating the performances of the grid infrastructures and to identify specific issues raised by large-scale deployment.
Submitted 17 November, 2006;
originally announced November 2006.