Search | arXiv e-print repository

Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models

Authors: Neha Sengupta, Sunil Kumar Sahu, Bokang Jia, Satheesh Katipomu, Haonan Li, Fajri Koto, William Marshall, Gurpreet Gosal, Cynthia Liu, Zhiming Chen, Osama Mohammed Afzal, Samta Kamboj, Onkar Pandit, Rahul Pal, Lalit Pradhan, Zain Muhammad Mujahid, Massa Baali, Xudong Han, Sondos Mahmoud Bsharat, Alham Fikri Aji, Zhiqiang Shen, Zhengzhong Liu, Natalia Vassilieva, Joel Hestness, Andy Hock, et al. (7 additional authors not shown)

Abstract: We introduce Jais and Jais-chat, new state-of-the-art Arabic-centric foundation and instruction-tuned open generative large language models (LLMs). The models are based on the GPT-3 decoder-only architecture and are pretrained on a mixture of Arabic and English texts, including source code in various programming languages. With 13 billion parameters, they demonstrate better knowledge and reasoning capabilities in Arabic than any existing open Arabic and multilingual models by a sizable margin, based on extensive evaluation. Moreover, the models are competitive in English compared to English-centric open models of similar size, despite being trained on much less English data. We provide a detailed description of the training, the tuning, the safety alignment, and the evaluation of the models. We release two open versions of the model -- the foundation Jais model, and an instruction-tuned Jais-chat variant -- with the aim of promoting research on Arabic LLMs. Available at https://huggingface.co/inception-mbzuai/jais-13b-chat

Submitted 29 September, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

Comments: Arabic-centric, foundation model, large-language model, LLM, generative model, instruction-tuned, Jais, Jais-chat

MSC Class: 68T50

ACM Class: F.2.2; I.2.7

arXiv:2105.02189 [pdf]

doi 10.1145/3411764.3445541

The Kids Are / Not / Sort of All Right

Authors: Caroline Pitt, Ari Hock, Leila Zelnick, Katie Davis

Abstract: We investigated changes in and factors affecting American adolescents' subjective wellbeing during the early months (April - August 2020) of the coronavirus pandemic in the United States. Twenty-one teens (14 - 19 years) participated in interviews at the start and end of the study and completed ecological momentary assessments three times per week between the interviews. There was an aggregate trend toward increased wellbeing, with considerable variation within and across participants. Teens reported greater reliance on networked technologies as their unstructured time increased during lockdown. Using multilevel growth modeling, we found that how much total time teens spent with technology had less bearing on daily fluctuations in wellbeing than the satisfaction and meaning they derived from their technology use. Ultimately, teens felt online communication could not replace face-to-face interactions. We conducted two follow-up participatory design sessions with nine teens to explore these insights in greater depth and reflect on general implications for design to support teens' meaningful technology experiences and wellbeing during disruptive life events.

Submitted 5 May, 2021; originally announced May 2021.

Comments: 22 pages, 2 figures, 2 tables, Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI '21)

ACM Class: K.4.2; H.5.m

arXiv:1910.01500 [pdf, other]

MLPerf Training Benchmark

Authors: Peter Mattson, Christine Cheng, Cody Coleman, Greg Diamos, Paulius Micikevicius, David Patterson, Hanlin Tang, Gu-Yeon Wei, Peter Bailis, Victor Bittorf, David Brooks, Dehao Chen, Debojyoti Dutta, Udit Gupta, Kim Hazelwood, Andrew Hock, Xinyuan Huang, Atsushi Ike, Bill Jia, Daniel Kang, David Kanter, Naveen Kumar, Jeffery Liao, Guokai Ma, Deepak Narayanan, et al. (12 additional authors not shown)

Abstract: Machine learning (ML) needs industry-standard performance benchmarks to support design and competitive evaluation of the many emerging software and hardware solutions for ML. But ML training presents three unique benchmarking challenges absent from other domains: optimizations that improve training throughput can increase the time to solution, training is stochastic and time to solution exhibits high variance, and software and hardware systems are so diverse that fair benchmarking with the same binary, code, and even hyperparameters is difficult. We therefore present MLPerf, an ML benchmark that overcomes these challenges. Our analysis quantitatively evaluates MLPerf's efficacy at driving performance and scalability improvements across two rounds of results from multiple vendors.

Submitted 2 March, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

Comments: MLSys 2020

arXiv:1603.05933 [pdf, other]

Distributed Iterative Learning Control for a Team of Quadrotors

Authors: Andreas Hock, Angela P. Schoellig

Abstract: The goal of this work is to enable a team of quadrotors to learn how to accurately track a desired trajectory while holding a given formation. We solve this problem in a distributed manner, where each vehicle has only access to the information of its neighbors. The desired trajectory is only available to one (or few) vehicles. We present a distributed iterative learning control (ILC) approach where each vehicle learns from the experience of its own and its neighbors' previous task repetitions, and adapts its feedforward input to improve performance. Existing algorithms are extended in theory to make them more applicable to real-world experiments. In particular, we prove stability for any causal learning function with gains chosen according to a simple scalar condition. Previous proofs were restricted to a specific learning function that only depends on the tracking error derivative (D-type ILC). Our extension provides more degrees of freedom in the ILC design and, as a result, better performance can be achieved. We also show that stability is not affected by a linear dynamic coupling between neighbors. This allows us to use an additional consensus feedback controller to compensate for non-repetitive disturbances. Experiments with two quadrotors attest the effectiveness of the proposed distributed multi-agent ILC approach. This is the first work to show distributed ILC in experiment.

Submitted 26 September, 2016; v1 submitted 18 March, 2016; originally announced March 2016.

Comments: To be presented at CDC 2016! Video can be found at https://www.youtube.com/watch?v=Qw598DRw6-Q

Showing 1–4 of 4 results for author: Hock, A