Equal contribution.

MaLA-500: Massive Language Adaptation of Large Language Models

Peiqin Lin1,2, Shaoxiong Ji3, Jörg Tiedemann3, André F. T. Martins4,5,6, Hinrich Schütze1,2
1Center for Information and Language Processing, LMU Munich
2Munich Center for Machine Learning  3University of Helsinki
4Instituto Superior Técnico (Lisbon ELLIS Unit)
5Instituto de Telecomunicações  6Unbabel
[email protected], [email protected]
Abstract

Large language models (LLMs) have advanced the state of the art in natural language processing. However, their predominant design for English or a limited set of languages creates a substantial gap in their effectiveness for low-resource languages. To bridge this gap, we introduce MaLA-500, a novel large language model designed to cover an extensive range of 534 languages. To train MaLA-500, we employ vocabulary extension and continued pretraining on LLaMA 2 with Glot500-c. Our intrinsic evaluation demonstrates that MaLA-500 is better at predicting the given texts of low-resource languages than existing multilingual LLMs. Moreover, the extrinsic evaluation of in-context learning shows that MaLA-500 outperforms previous LLMs on SIB200 and Taxi1500 by a significant margin, i.e., 11.68% and 4.82% macro-average accuracy across languages. We release MaLA-500 at https://huggingface.co/MaLA-LM.

1 Introduction

Large Language Models (LLMs), e.g., LLaMA (Touvron et al., 2023a; b), Mistral (Jiang et al., 2023; 2024), and ChatGPT (https://openai.com/blog/chatgpt), have shown remarkable performance in natural language understanding and generation. Follow-up studies (Bang et al., 2023; Lai et al., 2023; Ahuja et al., 2023a; b) observe that these English-centric LLMs, such as LLaMA with mainly English as the training data, are capable of handling some high-resource non-English languages, benefiting from the inclusion of non-English language data during pretraining. However, their applicability to low-resource languages is still limited due to data scarcity.

Previous studies have released pretrained multilingual models with mostly encoder-only transformer architectures, e.g., multilingual BERT (Devlin et al., 2019) and XLM-R (Conneau et al., 2020), for around 100 languages. The paradigm shift from encoder-only to decoder-only achieves scalability for large language models with billions of model parameters, leading to the development of open multilingual models. Recently, several generative multilingual LLMs, such as XGLM (Lin et al., 2021), mGPT (Shliazhko et al., 2022), and BLOOM (Scao et al., 2022), have emerged. Notably, the current language coverage for these generative LLMs is limited to up to 60 languages, highlighting the remaining need for further work on massively multilingual LLMs for many natural languages.

ImaniGooghari et al. (2023) have achieved a significant milestone in massive language adaptation by extending the language coverage of a small-scale multilingual language model, XLM-R (Conneau et al., 2020), an auto-encoding model with 278M parameters, from 100 languages to 534 languages, yielding the extended model Glot500-m with 395M parameters. They introduce the Glot500-c corpus, spanning 534 languages from 47 language families, and then apply vocabulary extension and continued pretraining to create Glot500-m. The introduction of Glot500-c mitigates the challenge of data scarcity for low-resource languages. Moreover, the adaptation method is more favorable than training from scratch, as it requires fewer computational resources and leaves a smaller carbon footprint. This success serves as a strong motivation for our exploration into the massive language adaptation of LLMs.

This work aims to extend the capabilities of LLMs to encompass a wider range of languages. Existing works on language adaptation of pretrained models, like ImaniGooghari et al. (2023), provide extended coverage across a wide linguistic spectrum but are limited to relatively small model sizes, mostly at the scale of hundreds of millions of parameters, while other works, like Yong et al. (2022), extend generative LLMs but are limited to a small number of languages. Our study pushes the boundaries by exploring language adaptation techniques for LLMs with model parameters scaling up to 10 billion for 534 languages. Our investigation delves into generative LLMs with a substantial increase in model parameters and their in-context learning capabilities in diverse languages, especially low-resource languages. This augmentation enables us to enhance contextual and linguistic relevance across a diverse range of languages.

We address the challenges of adapting LLMs to low-resource languages, such as data sparsity, domain-specific vocabulary, and linguistic diversity. Specifically, we study continued pretraining of an open LLM, i.e., LLaMA 2 (Touvron et al., 2023b), vocabulary extension, and adaptation techniques, i.e., LoRA low-rank reparameterization (Hu et al., 2022). We deploy distributed training and release MaLA-500, which covers more than 500 languages in various domains. We evaluate MaLA-500 using intrinsic measures on the held-out Glot500-c test set and parallel data, and extrinsic metrics on two downstream benchmarks: SIB200 and Taxi1500. The results show that MaLA-500 outperforms existing open LLMs of close or slightly larger model size. This work broadens the accessibility of LLMs, making them valuable for a more diverse set of language-specific use cases, especially low-resource ones, and addresses the equality issue by removing language barriers for speakers of many languages, especially languages underrepresented in existing LLMs.

2 Massive Language Adaptation

Massive language adaptation of large language models combines a massively multilingual corpus (Section 2.1), a strong base LLM (Section 2.2), and two techniques for effective language adaptation: vocabulary extension (Section 2.3) and continued pretraining (Section 2.4).

2.1 Data

We use Glot500-c (ImaniGooghari et al., 2023), covering 534 languages, as the training data of MaLA-500. We define languages using the ISO 639-3 code combined with the corresponding written script; for example, “eng_Latn” represents English written in the Latin script. See §A for the list of languages with their data amounts. The original number of sentences ranges from 10 thousand to 63 million. Note that Glot500-c does not put full effort into collecting data for high-resource languages but focuses on low-resource languages. We sample languages from the imbalanced dataset according to a multinomial distribution, with α = 0.3 for vocabulary extension and continued pretraining. We use different scales for sampling data to be used in model training and vocabulary construction. After sampling, the number of sentences for training ranges from 600 thousand to 8 million per language, leading to 1 billion sentences in total. The number of sentences for vocabulary construction ranges from 30 thousand to 400 thousand, making a total of 50 million sentences.
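The sampling step can be illustrated with a minimal sketch. The text only states that languages are sampled from a multinomial distribution with α = 0.3; the exponent-smoothing formula used below (sampling probability proportional to the empirical share raised to the power α) is the common convention in multilingual pretraining and is an assumption here, as are the function name and the toy language labels.

```python
def smoothed_sampling_probs(sentence_counts, alpha=0.3):
    """Return per-language sampling probabilities q_l proportional to p_l ** alpha.

    p_l is the empirical share of language l. An alpha below 1 flattens the
    distribution, which up-weights low-resource languages.
    (Assumed formula; the paper only names the distribution and alpha.)
    """
    total = sum(sentence_counts.values())
    weights = {lang: (n / total) ** alpha for lang, n in sentence_counts.items()}
    norm = sum(weights.values())
    return {lang: w / norm for lang, w in weights.items()}


# Toy counts at the two extremes of the reported range (10 thousand to 63 million).
# The language labels are hypothetical placeholders.
counts = {"high_resource": 63_000_000, "low_resource": 10_000}
probs = smoothed_sampling_probs(counts, alpha=0.3)
empirical = {lang: n / sum(counts.values()) for lang, n in counts.items()}

for lang in counts:
    print(f"{lang}: empirical={empirical[lang]:.5f} smoothed={probs[lang]:.5f}")
```

With α below 1, the low-resource language is sampled far more often than its raw share would suggest, while the high-resource language still dominates.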

2.2 Model

We choose LLaMA 2 (Touvron et al., 2023b) as the starting point for continual training. LLaMA series models (Touvron et al., 2023a), with model weights released publicly, have gained popularity in the research community. Despite being English-centric compared to their multilingual counterparts, they have shown remarkable capacity for multiple languages (Ahuja et al., 2023b). We choose the latest LLaMA 2, trained on 2 trillion tokens, as our base model to benefit from its outstanding language capacity. Our study uses the 7B model with 32 transformer layers and leaves the extension of larger LLMs as future work.

2.3 Vocabulary Extension

The original LLaMA 2 tokenizer, with a vocabulary of 32,000 tokens, covers English and a small fraction of other European languages using Latin or Cyrillic scripts. To enhance its capability and encoding efficiency for a broader range of languages, we extend the vocabulary with Glot500-c. Specifically, we first train a multilingual tokenizer with SentencePiece (Kudo & Richardson, 2018) on the sampled Glot500-c with a vocabulary of 250,000. Subsequently, we merge the trained tokenizer with the original LLaMA 2 tokenizer by taking the union of their vocabularies. As a result, we obtain the MaLA-500 tokenizer with a vocabulary size of 260,164. After vocabulary extension and resizing the embedding layer, the model size becomes 8.6B.
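The merge and the resulting sizes can be sketched as follows. This is an illustration under stated assumptions, not the released merging script: the function name and toy tokens are hypothetical, appending new tokens after the base vocabulary is an assumed design choice, and the hidden size (4096), untied input/output embeddings, and the roughly 6.74B base parameter count are assumptions about the base model that are not given in the text.

```python
def merge_vocabularies(base_vocab, new_vocab):
    """Union of two token lists that keeps every base token at its original ID.

    Tokens from new_vocab that the base vocabulary lacks are appended in order.
    """
    seen = set(base_vocab)
    merged = list(base_vocab)
    for token in new_vocab:
        if token not in seen:
            seen.add(token)
            merged.append(token)
    return merged


# Toy example with made-up tokens.
base = ["<s>", "</s>", "the", "ing"]
new = ["the", "nde", "ing", "kpa"]
merged = merge_vocabularies(base, new)
print(merged)

# Sizes reported in the text; the overlap is implied by the size of the union.
base_size, new_size, merged_size = 32_000, 250_000, 260_164
implied_overlap = base_size + new_size - merged_size  # tokens present in both

# Embedding and LM-head rows grow with the vocabulary.
# Assumed (not from the text): hidden size 4096, untied embeddings, ~6.74B base.
hidden = 4096
base_params = 6.74e9
extended_params = base_params + 2 * hidden * (merged_size - base_size)
print(implied_overlap, f"{extended_params / 1e9:.2f}B")
```

Under these assumptions the extended model comes out at roughly 8.6B parameters, in line with the size stated above, and almost all of the growth sits in the embedding layer and the language modeling head.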

We measure the impact of vocabulary extension on the development set of Glot500-c by analyzing the reduction in segmentation length for each language. The results indicate that the effect of vocabulary extension varies, ranging from 8% (English, eng_Latn) to 88% (Oriya, ori_Orya). Unsurprisingly, vocabulary extension has a larger effect on languages written in non-Latin scripts than on those in the Latin script. However, for some low-resource languages written in the Latin script, e.g., Kabiyè (kbp_Latn) and Vietnamese (vie_Latn), the segmentation length is shortened by around 50%.
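One way to compute such a figure is sketched below. It assumes that "reduction in segmentation length" means the relative drop in token count when the same text is segmented by the extended tokenizer instead of the original one; the two stand-in tokenizers are hypothetical and only mimic a byte- or character-level fallback versus a tokenizer that knows whole words.

```python
def segmentation_reduction(text, old_tokenize, new_tokenize):
    """Relative reduction in token count (in %) when switching tokenizers."""
    old_len = len(old_tokenize(text))
    new_len = len(new_tokenize(text))
    return 100.0 * (old_len - new_len) / old_len


# Hypothetical stand-ins for the original and the extended tokenizer.
old_tokenize = list        # one token per character
new_tokenize = str.split   # one token per whitespace-separated word

text = "abcd efgh"
reduction = segmentation_reduction(text, old_tokenize, new_tokenize)
print(f"{reduction:.1f}% fewer tokens")
```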

2.4 Continued Pretraining

We employ continued pretraining for language adaptation with low-rank adaptation (LoRA, Hu et al., 2022) to enable parameter-efficient training, given the limitation of our computing resources. LoRA injects trainable rank decomposition matrices, which approximate the large weight matrices with a lower rank, into the pretrained model weights. It reduces the computational complexity and thus saves training cost while retaining high model quality (Hu et al., 2022). We continually train the causal language model to update the rank-decomposition matrices, embedding layer, and language modeling head while freezing the transformer weights of the pretrained model, allowing the continually trained language model to learn from new data in new languages without completely losing its previous language capacity. Continual training of large language models requires substantial computational resources. We adopt efficient distributed training setups on supercomputers to make the training process feasible.
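The LoRA update can be sketched in a few lines of plain Python. This is a toy illustration of the general technique from Hu et al. (2022), where a frozen weight W is augmented with a scaled low-rank product B·A, and not the training code used here; the class name, matrix sizes, and initialization scale are hypothetical.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update: W x + (alpha / r) B A x."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen pretrained weight, never updated
        # A starts with small random values and B with zeros, so the layer
        # initially reproduces the pretrained output exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def num_trainable(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


weight = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 0.0, 0.5],
]
layer = LoRALinear(weight, rank=2, alpha=8)
out = layer.forward([1.0, 2.0, 3.0, 4.0])
print(out, layer.num_trainable())
```

Only A and B receive gradients during training, so the number of trainable parameters per adapted matrix grows with the rank rather than with the full matrix size.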

2.5 Training

Hardware and Software

We train our model on a computing cluster with a theoretical peak performance of 2 petaflops on GPU nodes. We deploy distributed training on 24 Nvidia Ampere A100 GPUs. As for software, we use Hugging Face Transformers (Wolf et al., 2020), PEFT (Parameter-Efficient Fine-Tuning, https://huggingface.co/docs/peft/index), and DeepSpeed (Rasley et al., 2020). We use the ZeRO redundancy optimizer (Rajbhandari et al., 2020) and maximize the batch size that fits the memory of each GPU. We employ mixed-precision training using the bfloat16 format.

Hyperparameters

The learning rate is set at 3e-4. A weight decay of 0.01 is applied to penalize large weights and mitigate overfitting. The trainable LoRA module targets the query and value matrices. The language model head is not decomposed by a LoRA module but is trained in a full-parameter manner. In our setting, the final model has 10B parameters in total, in which 2B parameters are trainable. The LoRA module is incorporated with a rank of 8, an alpha value of 32, and a dropout rate of 0.05, contributing to the model’s adaptability and regularization during training. The context window is 4k. We maximize the batch size to fit the memory, making a global batch size of 384. The model undergoes three training epochs. Checkpoints are saved every 500 steps, and we employ early stopping to select the checkpoint that exhibits the most favorable average performance on downstream tasks.
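A back-of-the-envelope count shows where the roughly 2B trainable parameters come from. The rank, the number of layers, the targeted matrices, and the extended vocabulary size are taken from the text; the hidden size of 4096 and untied input and output embeddings are assumptions about the base model.

```python
hidden = 4096      # assumed hidden size of the 7B base model
layers = 32        # number of transformer layers, stated in the text
vocab = 260_164    # extended vocabulary size, stated in the text
rank = 8           # LoRA rank, stated in the text

# Each adapted (hidden x hidden) projection gets A (rank x hidden) and B (hidden x rank).
lora_per_matrix = 2 * rank * hidden
lora_total = lora_per_matrix * 2 * layers  # query and value matrices in every layer

embedding = vocab * hidden  # input embedding, trained in full
lm_head = vocab * hidden    # language modeling head, trained in full (assumed untied)

trainable = lora_total + embedding + lm_head
print(f"LoRA: {lora_total:,}  embeddings + head: {embedding + lm_head:,}")
print(f"trainable: {trainable / 1e9:.2f}B")
```

Under these assumptions the LoRA matrices contribute only about 4 million parameters; nearly all trainable parameters belong to the resized embedding layer and the language modeling head.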

Environmental Impacts

We train our model in a carbon-neutral data center (https://www.csc.fi/sustainable-development), where all electricity is generated with renewable hydropower and the waste heat is used for district heating to further reduce the CO2 footprint.

3 Evaluation

3.1 Benchmarks and Setup

We consider both intrinsic and extrinsic measures for evaluation. Evaluation dataset statistics are shown in Table 1.

| | Datasets | Metric | \|Data\| | \|Lang\| | Domain |
| --- | --- | --- | --- | --- | --- |
| Intrinsic | Glot500-c test (ImaniGooghari et al., 2023) | NLL | 1000 | 534 | Misc |
| Intrinsic | PBC (Mayer & Cysouw, 2014) | NLL | 500 | 370 | Bible |
| Extrinsic | SIB200 (Adelani et al., 2023) | ACC | 204 | 177 | Misc |
| Extrinsic | Taxi1500 (Ma et al., 2023) | ACC | 111 | 351 | Bible |

Table 1: Evaluation dataset statistics. |Data|: test set size per language. |Lang|: number of evaluated languages. NLL: negative log-likelihood. ACC: accuracy.

For intrinsic evaluation, perplexity is not comparable across models and languages due to different text segmentations. Inspired by Xue et al. (2022) and Yu et al. (2023), we instead measure the negative log-likelihood (NLL) of the text using the given LLMs.

We concatenate the dataset as the input text and adopt the sliding-window strategy (https://huggingface.co/docs/transformers/en/perplexity). The evaluation of different LLMs uses the same data with the concatenation of sentences per language, thus making NLL model-comparable. In addition, we consider language-comparable NLL by measuring NLL on parallel data, in which every sample in different languages contains the same semantic information. We report the model-comparable NLL on the Glot500-c test set covering all 534 considered languages (§3.2), and language-comparable NLL on the Parallel Bible Corpus (PBC, Mayer & Cysouw, 2014), covering 370 languages (§3.3).
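The sliding-window computation can be sketched as follows. This is a simplified illustration, not the evaluation code: `log_prob` is a hypothetical stand-in for a language model that returns log p(token | context), the window and stride values are arbitrary, and whether the reported NLL is summed or normalized in some way is not stated in the text, so the sketch simply sums over all tokens.

```python
import math


def sliding_window_nll(tokens, log_prob, max_length, stride):
    """Sum of -log p(token | context) over a token sequence.

    Each token is scored exactly once, conditioned on the preceding tokens
    of the window in which it is first covered.
    """
    total = 0.0
    prev_end = 0
    for begin in range(0, len(tokens), stride):
        end = min(begin + max_length, len(tokens))
        for i in range(prev_end, end):  # skip tokens scored in earlier windows
            context = tokens[begin:i]
            total -= log_prob(context, tokens[i])
        prev_end = end
        if end == len(tokens):
            break
    return total


def uniform_log_prob(context, token, vocab_size=4):
    """Toy model that assigns equal probability to every token."""
    return -math.log(vocab_size)


tokens = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]
nll = sliding_window_nll(tokens, uniform_log_prob, max_length=4, stride=2)
print(nll)
```

Because the sum runs over the whole text rather than being divided by a tokenizer-dependent token count, two models with different vocabularies are scored on the same quantity, which is the motivation for preferring NLL over perplexity here.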

For extrinsic evaluation, we evaluate the few-shot learning capability of MaLA-500 and compare it with other LLMs on SIB200 (Adelani et al., 2023) and Taxi1500 (Ma et al., 2023).

SIB200 is a topic classification dataset. The classification task involves seven classes, namely science/technology, travel, politics, sports, health, entertainment, and geography. Our evaluation spans a diverse set of 177 languages, obtained by intersecting the language sets of SIB200 and Glot500-c. Note that the flores200-based SIB200 evaluation set is included in the training data since Glot500-c includes flores200, but the classification labels are not provided.

Taxi1500 is another text classification dataset spanning 351 languages. It involves six classes, namely, Recommendation, Faith, Description, Sin, Grace, and Violence. Our evaluation efforts aim to cover as many languages as possible. However, the evaluation of massively multilingual language models is a challenging task. Due to the lack of real-world multilingual evaluation benchmarks, we use this benchmark that contains religious content.

For in-context learning evaluation, the evaluated LLM receives a structured prompt, which is the concatenation of few-shot examples and the sample intended for prediction. The format for both a few-shot example and the sample to predict is defined as follows:

Template for SIB200:

The topic of the news [sent] is [label]

Template for Taxi1500:

The topic of the verse [sent] is [label]

where [sent] is the sentence for classification, and [label] is the ground truth. [label] is included when the sample serves as a few-shot example but is omitted when predicting the sample. The constructed prompt is then used as input to the LLM. Subsequently, the evaluated LLM is prompted to estimate the probability of the label over the label set based on the provided prompt.
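The prompt construction and label selection described above can be sketched as follows. The template string comes from the text; the newline separator between examples, the function names, and the `score_continuation` callback (a stand-in for a model that scores a candidate label given the prompt) are assumptions made for illustration.

```python
SIB200_TEMPLATE = "The topic of the news {sent} is {label}"


def build_prompt(few_shot_examples, query_sentence, template=SIB200_TEMPLATE):
    """Concatenate labeled few-shot examples and the unlabeled query."""
    lines = [template.format(sent=s, label=l) for s, l in few_shot_examples]
    # The label slot is left empty for the sample to predict.
    lines.append(template.format(sent=query_sentence, label="").rstrip())
    return "\n".join(lines)  # separator is an assumption


def predict_label(prompt, label_set, score_continuation):
    """Pick the label whose continuation the model scores highest."""
    return max(label_set, key=lambda label: score_continuation(prompt, " " + label))


def toy_scorer(prompt, continuation):
    """Hypothetical scorer standing in for LLM log-probabilities."""
    return 1.0 if continuation.strip() == "sports" else 0.0


shots = [
    ("The team won the final.", "sports"),
    ("A new vaccine was approved.", "health"),
]
prompt = build_prompt(shots, "The striker scored twice.")
prediction = predict_label(prompt, ["sports", "health", "travel"], toy_scorer)
print(prompt)
print(prediction)
```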

For SIB200, few-shot examples are randomly sampled from the in-language training sets. Since randomly selecting few-shot examples for in-context learning yields random results for both MaLA-500 and previous LLMs on Taxi1500, we use retriever-based in-context learning (Liu et al., 2022) for this benchmark. Specifically, we use average word embeddings from layer 8 of Glot500 (ImaniGooghari et al., 2023) to retrieve semantically similar samples, as suggested in previous work (Sabet et al., 2020), for all the compared models. The evaluation process is implemented using the lm-evaluation-harness (https://github.com/EleutherAI/lm-evaluation-harness), and we use accuracy (ACC) to measure classification performance.
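Retrieval of few-shot examples by embedding similarity can be sketched as below. The text specifies averaged word embeddings as sentence representations; the use of cosine similarity, the function names, and the two-dimensional toy vectors are assumptions for illustration, and real embeddings would come from the encoder rather than being written by hand.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(word_vectors):
    """Average word vectors into a single sentence vector."""
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]


def retrieve_shots(query_vec, candidates, k):
    """Return the k (sentence, label) pairs most similar to the query.

    candidates: list of (sentence, label, sentence_vector) triples.
    """
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c[2]), reverse=True)
    return [(sentence, label) for sentence, label, _ in ranked[:k]]


# Toy training pool with hand-written vectors (hypothetical data).
pool = [
    ("verse a", "Faith", [1.0, 0.0]),
    ("verse b", "Sin", [0.0, 1.0]),
    ("verse c", "Grace", [0.9, 0.1]),
]
query_vec = mean_vector([[1.0, 0.0], [0.8, 0.2]])
shots = retrieve_shots(query_vec, pool, k=2)
print(shots)
```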

3.2 Comparison across LLMs

We compare MaLA-500 with LLaMA 2-7B, mGPT-13B, BLOOM-7B1, and XGLM-7.5B on the Glot500-c test set, SIB200, and Taxi1500 by computing the averaged performance across languages. The results are given in Table 2. Among the evaluated LLMs, LLaMA 2-7B performs second-best, indicating that LLaMA 2-7B has a strong multilingual capacity and that it is reasonable to select it as the base model. MaLA-500 outperforms all compared LLMs with a close or slightly larger model size across all the evaluated tasks. Notably, compared to LLaMA 2-7B, MaLA-500 lowers NLL on the Glot500-c test set by 39.33 and improves accuracy by 14.94% and 4.82% on SIB200 and Taxi1500, respectively. This highlights MaLA-500’s substantial contribution to enhancing the multilingual capacity of LLMs.

| Model | Glot500-c test (NLL ↓) | SIB200 (ACC ↑) | Taxi1500 (ACC ↑) |
| --- | --- | --- | --- |
| LLaMA 2-7B | 190.58 | 42.08 | 44.07 |
| mGPT-13B | 282.46 | 45.34 | 40.98 |
| BLOOM-7B1 | 202.95 | 44.63 | 43.98 |
| XGLM-7.5B | 205.07 | 34.36 | 43.24 |
| MaLA-500 | 151.25 | 57.02 | 48.89 |

Table 2: Averaged results across languages on Glot500-c test (measured by NLL), SIB200, and Taxi1500 (measured by accuracy (%)) of different LLMs. mGPT has no model with around 7B parameters, so we choose a larger one with 13B parameters. ↓ indicates the lower, the better. ↑ indicates the higher, the better. MaLA-500 obtains the best result in every column.

Figures 1, 2 and 3 provide detailed performance analysis across languages on Glot500-c test, SIB200, and Taxi1500. In those figures, we group scores into different performance bins and display them in different colors. For Glot500-c test, MaLA-500 has more languages achieving better NLL, i.e., 61 languages with NLL less than 100 and 171 languages with NLL between 100 and 150. Besides, MaLA-500 has 54 (10%) languages with NLL larger than 200, which may indicate that these languages are not well covered by the measured LLM. Nevertheless, this number is much smaller than for other LLMs. For example, the second-best LLM, LLaMA 2-7B, has 231 (43%) languages with NLL larger than 200. For both SIB200 and Taxi1500, MaLA-500 surpasses previous LLMs in the sense that it obtains random results in fewer languages and achieves impressive performance in more languages than its counterparts.

Figure 1: NLL (lower is better) on Glot500-c test with the scores grouped into four bins displayed in different colors. X-axis: the number of languages in performance ranges.
Figure 2: Accuracy (higher is better) on SIB200 with the scores grouped into four bins displayed in different colors. X-axis: the number of languages in performance ranges (%).
Figure 3: Accuracy (higher is better) on Taxi1500 with the scores grouped into four bins displayed in different colors. X-axis: the number of languages in performance ranges (%).

3.3 Comparison across Languages

To examine in detail how MaLA-500 performs across languages, we report the performance across language families in Table 3. We assign languages to families based on Glottolog (https://glottolog.org/glottolog/family). We observe that higher-resource language families, e.g., Indo-European (indo1319) and Dravidian (drav1251), achieve slightly better performance than low-resource language families, e.g., Sino-Tibetan (sino1245).

| family | \|Sent\| | PBC (NLL ↓) | SIB200 (ACC ↑) | Taxi1500 (ACC ↑) |
| --- | --- | --- | --- | --- |
| indo1319 | 988M | 145.35 | 63.53 | 53.03 |
| drav1251 | 135M | 131.29 | 56.25 | 54.65 |
| aust1307 | 113M | 147.37 | 62.83 | 49.69 |
| turk1311 | 109M | 161.71 | 57.08 | 52.55 |
| afro1255 | 100M | 165.46 | 52.00 | 43.74 |
| atla1278 | 57M | 141.92 | 42.90 | 45.52 |
| ural1272 | 50M | 137.52 | 66.67 | 48.58 |
| sino1245 | 29M | 155.64 | 49.30 | 49.31 |
| other | 60M | 167.69 | 55.74 | 46.67 |

Table 3: Performance comparison across language families on PBC, SIB200, and Taxi1500. |Sent|: total number of sentences used for continued pretraining. ↓ indicates the lower, the better. ↑ indicates the higher, the better.

In Table 4, we present the top 5 performance improvements and declines across languages on SIB200 for MaLA-500 compared to LLaMA 2-7B. We observe that MaLA-500 improves substantially on languages written in low-resource scripts, e.g., Kannada (kan_Knda), while it performs worse on high-resource languages, e.g., Swedish (swe_Latn), which are already well covered by LLaMA 2-7B.

| Language (high end) | LLaMA 2-7B | MaLA-500 | Δ | Language (low end) | LLaMA 2-7B | MaLA-500 | Δ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| kan_Knda | 17.16 | 57.35 | 40.19 | swe_Latn | 71.08 | 60.29 | -10.79 |
| ckb_Arab | 19.61 | 60.29 | 40.68 | rus_Cyrl | 71.57 | 65.20 | -6.37 |
| asm_Beng | 17.16 | 58.82 | 41.66 | dan_Latn | 69.12 | 63.24 | -5.88 |
| pan_Guru | 14.22 | 58.82 | 44.60 | pol_Latn | 74.51 | 68.63 | -5.88 |
| sin_Sinh | 15.20 | 60.29 | 45.09 | ukr_Cyrl | 71.57 | 65.69 | -5.88 |

Table 4: Results for the five languages with the largest (high end) and the five with the smallest (low end) gains from MaLA-500 vs. LLaMA 2-7B on SIB200. Δ indicates the difference between the scores of MaLA-500 and LLaMA 2-7B. See §B for detailed results for each task.

In our comprehensive analysis of contributing factors on SIB200, we note that the corpus size of a language exhibits a weak correlation of 0.13 with its performance gain. In contrast, the corpus size of the language family to which a language belongs demonstrates a moderate correlation of 0.40. A moderately high Pearson correlation of 0.53 is observed between the effect of vocabulary extension, i.e., the reduction in segmentation length, and the performance gain. This observation holds true for languages with both non-Latin scripts, such as Kannada (kan_Knda), Malayalam (mal_Mlym), and Tigrinya (tir_Ethi), as well as Latin scripts, such as Igbo (ibo_Latn) and Yoruba (yor_Latn). It demonstrates the effectiveness of vocabulary extension.
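For reference, the Pearson correlation used in this analysis can be computed as in the following sketch. The function is the textbook definition; the paired values are invented purely to exercise it and are not the per-language figures behind the reported coefficients.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical pairs: segmentation-length reduction (%) and accuracy gain (points).
reduction = [8.0, 35.0, 50.0, 70.0, 88.0]
gain = [-5.0, 12.0, 3.0, 30.0, 41.0]
r = pearson(reduction, gain)
print(f"r = {r:.2f}")
```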

3.4 Effect of Number of Shots

Figure 4 illustrates the relationship between accuracy and the number of in-context examples (i.e., shots) on SIB200. As the number of in-context shots increases, there is a corresponding rise in accuracy. Notably, with just one shot, accuracy exhibits randomness at 30.88%, indicating that a single shot provides limited information for task learning. Moving from 1 shot to 2 and 3 shots results in a notable improvement, with performance boosted by 19.83% and 26.14%, respectively. This highlights the effectiveness of increasing the number of shots. MaLA-500 achieves its peak performance at approximately 65% accuracy with 6-10 in-context shots. This may be attributed to the multi-class nature of the SIB200 dataset, which necessitates more shots for learning intricate input-label mappings.

Figure 4: In-context learning macro-average accuracy (%) on SIB200 with different numbers of shots using MaLA-500.

Figure 5 gives a more detailed breakdown that agrees with the observations from Figure 4. With 1-shot in-context learning, approximately 50 languages exhibit erratic results. As the number of shots increases, the number of languages achieving low accuracy (25-50%) shrinks, while the number achieving high accuracy (75-100%) grows.

Figure 5: Detailed results of in-context learning on SIB200 using MaLA-500. X-axis: the number of languages in different accuracy ranges (%). Y-axis: number of shots.

Further examination into individual language trends reveals that some low-resource languages require more shots to achieve better performance (e.g., pes_Arab for Persian) or even exhibit poor performance with 10 shots (e.g., dzo_Tibt for Dzongkha and ayr_Latn for Central Aymara). In contrast, high-resource languages, such as fra_Latn for French, demonstrate impressive performance even with fewer shots, and increasing the number of shots results in only marginal improvement.

4 Related Work

4.1 Multilingual Language Models

Language model development has endeavored to broaden the scope of pretraining languages to address multilingual scenarios. Pretrained multilingual models have been able to accommodate up to a hundred or more languages. Noteworthy examples include mBERT (Devlin et al., 2019), which supports 104 languages, XLM-R (Conneau et al., 2020) covering 100 languages, mBART (Liu et al., 2020) designed for 25 languages, mT5 (Xue et al., 2021) spanning 101 languages, XGLM (Lin et al., 2021) across 30 languages, GPT-3 covering 118 languages (93% English), mGPT (Shliazhko et al., 2022) accommodating 60 languages, and BLOOM (Scao et al., 2022) supporting 46 languages and 13 programming languages.

Surprisingly, two recent multilingual language models have surpassed the conventional limit by supporting more than 400 languages. Glot500-m (ImaniGooghari et al., 2023) spans 511 languages through vocabulary extension and continued training based on XLM-R. SERENGETI (Adebara et al., 2022) goes even further by supporting 517 African languages and language varieties, written in five different scripts, employing models inspired by both ELECTRA (Clark et al., 2020) and XLM-R. MADLAD (Kudugunta et al., 2023) covers 419 languages and trains an 8B language model from scratch with an adapted UL2 objective (Tay et al., 2022). Our work is concurrent with the MADLAD-400 language model. We distinguish it by: 1) language coverage. Our work covers more than 500 languages, a number comparable to that of encoder-only models and surpassing MADLAD-400 by an additional 100 languages. 2) training methods. We consider continual training to benefit from the learned knowledge of the original models. 3) model architecture. We adopt an open model architecture, i.e., LLaMA, while MADLAD uses a decoder-only T5 architecture, which was not supported by the HuggingFace ecosystem at the time of writing, thus leading to additional difficulty in usage.

4.2 Language Adaptation

Before the advent of LLMs, diverse approaches were employed to adapt small-scale multilingual language models to new languages. These methods include using adapters (Pfeiffer et al., 2020; Üstün et al., 2020; Pfeiffer et al., 2020; Nguyen et al., 2021; Faisal & Anastasopoulos, 2022; Yong et al., 2022), vocabulary extension and substitution (Chau et al., 2020; Wang et al., 2020; Müller et al., 2020; 2021; Pfeiffer et al., 2021; Chen et al., 2023; Downey et al., 2023), leveraging monolingual corpora (Ebrahimi & Kann, 2021; Alabi et al., 2022), and utilizing bilingual lexicons (Wang et al., 2022).

While language models have been scaled up notably, their coverage is limited to a specific set of languages. To address this constraint, various methods have been proposed to expand the applicability of these large language models across a broader range of languages, catering to both general-purpose tasks and specific applications like machine translation. These methods also involve vocabulary extension (Cui et al., 2023), continued pretraining and instruction-tuning (Yong et al., 2022; Cui et al., 2023; Chen et al., 2024; Zhao et al., 2024), and parallel corpora exploitation (Cahyawijaya et al., 2023; Yang et al., 2023; Zhu et al., 2023; Xu et al., 2023). Despite these efforts, massive language adaptation of LLMs for general-purpose tasks across diverse languages, e.g., covering many language families and more than one hundred languages, remains an area yet to be thoroughly explored.

5 Conclusion and Future Work

We present a pioneering effort in massive language adaptation of LLMs, focusing on extending LLaMA 2-7B to our model, MaLA-500. This adaptation involves vocabulary extension and continued pretraining with LoRA. Our approach leads to MaLA-500 achieving state-of-the-art in-context learning capabilities, as demonstrated on the SIB200 and Taxi1500 benchmarks. We release the training scripts and model weights publicly to facilitate future research. This work marks a substantial advancement in applying LLMs to a diverse range of languages.

Our future work will focus on further improving the model capacity, for example, on machine translation across many language pairs. Alves et al. (2023) showed that LLMs (LLaMA-7B and LLaMA-13B) exhibited poor performance even on English-centric high-resource language pairs in some cases. Translation with LLMs on low-resource languages is more challenging. The LLaMA-7B model performed poorly in our preliminary experiments. Besides, our pretraining corpus does not intentionally include bilingual texts, and our MaLA-500 model is not instruction-tuned with translation data. We leave the inclusion of bilingual text during continual pretraining, instruction fine-tuning with translation data, and the evaluation on machine translation as future work.

Ethical Statement

LLMs have been known to exhibit biases present in their training data. When extending LLMs to low-resource languages, there is a risk of propagating biases from high-resource languages to underrepresented ones. Careful attention must be paid to mitigate bias and ensure fairness in data collection and model training. The paper aims to make LLMs more accessible for underrepresented languages. Still, there is a risk of creating a digital language divide if certain communities are left out due to limited technological access. Future work would address biases by conducting bias audits on the training data, debiasing the models during generation, and continuously monitoring model outputs.

Reproducibility Statement

We make the following efforts to ensure reproducible research. We release the model weights (https://huggingface.co/MaLA-LM) and the code for training and evaluation (https://github.com/MaLA-LM/mala-500). We use publicly available evaluation benchmarks, which can be obtained freely or by request. The results are reproducible with our released model weights and evaluation scripts.

Acknowledgements

We thank José Pombal for constructive suggestions on training. This work is funded by The European Research Council (grants #740516, #771113 and #758969), EU’s Horizon Europe Research and Innovation Actions (UTTER, contract 101070631), and the European Union’s Horizon Europe research and innovation programme under grant agreement No 101070350 and from UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee [grant #10052546]. The authors wish to acknowledge CSC – IT Center for Science, Finland, for generous computational resources on the Mahti supercomputer and LUMI supercomputer through the LUMI extreme scale access (MOOMIN and LumiNMT). Shaoxiong Ji and Peiqin Lin acknowledge travel support from ELISE (GA no 951847).

References

  • Adebara et al. (2022) Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed, and Alcides Alcoba Inciarte. SERENGETI: Massively multilingual language models for africa. arXiv preprint arXiv:2212.10785, 2022.
  • Adelani et al. (2023) David Ifeoluwa Adelani, Hannah Liu, Xiaoyu Shen, Nikita Vassilyev, Jesujoba O. Alabi, Yanke Mao, Haonan Gao, and En-Shiun Annie Lee. SIB-200: A simple, inclusive, and big evaluation dataset for topic classification in 200+ languages and dialects. CoRR, abs/2309.07445, 2023. doi: 10.48550/arXiv.2309.07445. URL https://doi.org/10.48550/arXiv.2309.07445.
  • Ahuja et al. (2023a) Kabir Ahuja, Rishav Hada, Millicent Ochieng, Prachi Jain, Harshita Diddee, Samuel Maina, Tanuja Ganu, Sameer Segal, Maxamed Axmed, Kalika Bali, and Sunayana Sitaram. MEGA: multilingual evaluation of generative AI. CoRR, abs/2303.12528, 2023a. doi: 10.48550/arXiv.2303.12528. URL https://doi.org/10.48550/arXiv.2303.12528.
  • Ahuja et al. (2023b) Sanchit Ahuja, Divyanshu Aggarwal, Varun Gumma, Ishaan Watts, Ashutosh Sathe, Millicent Ochieng, Rishav Hada, Prachi Jain, Maxamed Axmed, Kalika Bali, and Sunayana Sitaram. MEGAVERSE: benchmarking large language models across languages, modalities, models and tasks. CoRR, abs/2311.07463, 2023b. doi: 10.48550/ARXIV.2311.07463. URL https://doi.org/10.48550/arXiv.2311.07463.
  • Alabi et al. (2022) Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, and Dietrich Klakow. Adapting pre-trained language models to african languages via multilingual adaptive fine-tuning. In Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, and Seung-Hoon Na (eds.), Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Republic of Korea, October 12-17, 2022, pp.  4336–4349. International Committee on Computational Linguistics, 2022. URL https://aclanthology.org/2022.coling-1.382.
  • Alves et al. (2023) Duarte Alves, Nuno Guerreiro, João Alves, José Pombal, Ricardo Rei, José de Souza, Pierre Colombo, and André FT Martins. Steering large language models for machine translation with finetuning and in-context learning. In Findings of the Association for Computational Linguistics: EMNLP 2023, pp.  11127–11148, 2023.
  • Bang et al. (2023) Yejin Bang, Samuel Cahyawijaya, Nayeon Lee, Wenliang Dai, Dan Su, Bryan Wilie, Holy Lovenia, Ziwei Ji, Tiezheng Yu, Willy Chung, Quyet V. Do, Yan Xu, and Pascale Fung. A multitask, multilingual, multimodal evaluation of chatgpt on reasoning, hallucination, and interactivity. CoRR, abs/2302.04023, 2023. doi: 10.48550/arXiv.2302.04023. URL https://doi.org/10.48550/arXiv.2302.04023.
  • Cahyawijaya et al. (2023) Samuel Cahyawijaya, Holy Lovenia, Tiezheng Yu, Willy Chung, and Pascale Fung. Instruct-align: Teaching novel languages with to LLMs through alignment-based cross-lingual instruction. CoRR, abs/2305.13627, 2023. doi: 10.48550/arXiv.2305.13627. URL https://doi.org/10.48550/arXiv.2305.13627.
  • Chau et al. (2020) Ethan C. Chau, Lucy H. Lin, and Noah A. Smith. Parsing with multilingual bert, a small treebank, and a small corpus. In Trevor Cohn, Yulan He, and Yang Liu (eds.), Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16-20 November 2020, volume EMNLP 2020 of Findings of ACL, pp. 1324–1334. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.findings-emnlp.118. URL https://doi.org/10.18653/v1/2020.findings-emnlp.118.
  • Chen et al. (2024) Pinzhen Chen, Shaoxiong Ji, Nikolay Bogoychev, Barry Haddow, and Kenneth Heafield. Monolingual or multilingual instruction tuning: Which makes a better Alpaca. In Findings of the Association for Computational Linguistics: EACL, 2024. URL https://doi.org/10.48550/arXiv.2309.08958.
  • Chen et al. (2023) Yihong Chen, Kelly Marchisio, Roberta Raileanu, David Ifeoluwa Adelani, Pontus Stenetorp, Sebastian Riedel, and Mikel Artetxe. Improving language plasticity via pretraining with active forgetting. CoRR, abs/2307.01163, 2023. doi: 10.48550/arXiv.2307.01163. URL https://doi.org/10.48550/arXiv.2307.01163.
  • Clark et al. (2020) Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. ELECTRA: Pre-training text encoders as discriminators rather than generators. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=r1xMH1BtvB.
  • Conneau et al. (2020) Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Édouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. Unsupervised cross-lingual representation learning at scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp.  8440–8451, 2020.
  • Cui et al. (2023) Yiming Cui, Ziqing Yang, and Xin Yao. Efficient and effective text encoding for Chinese LLaMA and Alpaca. CoRR, abs/2304.08177, 2023. doi: 10.48550/ARXIV.2304.08177. URL https://doi.org/10.48550/arXiv.2304.08177.
  • Devlin et al. (2019) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019.
  • Downey et al. (2023) C. M. Downey, Terra Blevins, Nora Goldfine, and Shane Steinert-Threlkeld. Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages. CoRR, abs/2309.04679, 2023. doi: 10.48550/arXiv.2309.04679. URL https://doi.org/10.48550/arXiv.2309.04679.
  • Ebrahimi & Kann (2021) Abteen Ebrahimi and Katharina Kann. How to adapt your pretrained multilingual model to 1600 languages. In Chengqing Zong, Fei Xia, Wenjie Li, and Roberto Navigli (eds.), Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, pp.  4555–4567. Association for Computational Linguistics, 2021. doi: 10.18653/v1/2021.acl-long.351. URL https://doi.org/10.18653/v1/2021.acl-long.351.
  • Faisal & Anastasopoulos (2022) Fahim Faisal and Antonios Anastasopoulos. Phylogeny-inspired adaptation of multilingual models to new languages. CoRR, abs/2205.09634, 2022. doi: 10.48550/arXiv.2205.09634. URL https://doi.org/10.48550/arXiv.2205.09634.
  • Hu et al. (2022) Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=nZeVKeeFYf9.
  • ImaniGooghari et al. (2023) Ayyoob ImaniGooghari, Peiqin Lin, Amir Hossein Kargaran, Silvia Severini, Masoud Jalili Sabet, Nora Kassner, Chunlan Ma, Helmut Schmid, André Martins, François Yvon, and Hinrich Schütze. Glot500: Scaling multilingual corpora and language models to 500 languages. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki (eds.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp.  1082–1117, Toronto, Canada, July 2023. Association for Computational Linguistics. doi: 10.18653/v1/2023.acl-long.61. URL https://aclanthology.org/2023.acl-long.61.
  • Jiang et al. (2023) Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mistral 7b. CoRR, abs/2310.06825, 2023. doi: 10.48550/ARXIV.2310.06825. URL https://doi.org/10.48550/arXiv.2310.06825.
  • Jiang et al. (2024) Albert Q Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, et al. Mixtral of experts. arXiv preprint arXiv:2401.04088, 2024.
  • Kudo & Richardson (2018) Taku Kudo and John Richardson. SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. In Eduardo Blanco and Wei Lu (eds.), Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018: System Demonstrations, Brussels, Belgium, October 31 - November 4, 2018, pp.  66–71. Association for Computational Linguistics, 2018. doi: 10.18653/v1/d18-2012. URL https://doi.org/10.18653/v1/d18-2012.
  • Kudugunta et al. (2023) Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, et al. MADLAD-400: A multilingual and document-level large audited dataset. arXiv preprint arXiv:2309.04662, 2023.
  • Lai et al. (2023) Viet Dac Lai, Nghia Trung Ngo, Amir Pouran Ben Veyseh, Hieu Man, Franck Dernoncourt, Trung Bui, and Thien Huu Nguyen. ChatGPT beyond English: Towards a comprehensive evaluation of large language models in multilingual learning. CoRR, abs/2304.05613, 2023. doi: 10.48550/arXiv.2304.05613. URL https://doi.org/10.48550/arXiv.2304.05613.
  • Lin et al. (2021) Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O’Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona T. Diab, Veselin Stoyanov, and Xian Li. Few-shot learning with multilingual language models. CoRR, abs/2112.10668, 2021. URL https://arxiv.org/abs/2112.10668.
  • Liu et al. (2022) Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen. What makes good in-context examples for gpt-3? In Eneko Agirre, Marianna Apidianaki, and Ivan Vulic (eds.), Proceedings of Deep Learning Inside Out: The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, DeeLIO@ACL 2022, Dublin, Ireland and Online, May 27, 2022, pp.  100–114. Association for Computational Linguistics, 2022. doi: 10.18653/V1/2022.DEELIO-1.10. URL https://doi.org/10.18653/v1/2022.deelio-1.10.
  • Liu et al. (2020) Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, and Luke Zettlemoyer. Multilingual denoising pre-training for neural machine translation. Transactions of the Association for Computational Linguistics, 8:726–742, 2020.
  • Ma et al. (2023) Chunlan Ma, Ayyoob ImaniGooghari, Haotian Ye, Ehsaneddin Asgari, and Hinrich Schütze. Taxi1500: A multilingual dataset for text classification in 1500 languages, 2023.
  • Mayer & Cysouw (2014) Thomas Mayer and Michael Cysouw. Creating a massively parallel bible corpus. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asunción Moreno, Jan Odijk, and Stelios Piperidis (eds.), Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014, Reykjavik, Iceland, May 26-31, 2014, pp.  3158–3163. European Language Resources Association (ELRA), 2014. URL http://www.lrec-conf.org/proceedings/lrec2014/summaries/220.html.
  • Müller et al. (2020) Benjamin Müller, Benoît Sagot, and Djamé Seddah. Can multilingual language models transfer to an unseen dialect? A case study on north african arabizi. CoRR, abs/2005.00318, 2020. URL https://arxiv.org/abs/2005.00318.
  • Müller et al. (2021) Benjamin Müller, Antonios Anastasopoulos, Benoît Sagot, and Djamé Seddah. When being unseen from mbert is just the beginning: Handling new languages with multilingual language models. In Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tür, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, and Yichao Zhou (eds.), Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, Online, June 6-11, 2021, pp.  448–462. Association for Computational Linguistics, 2021. doi: 10.18653/v1/2021.naacl-main.38. URL https://doi.org/10.18653/v1/2021.naacl-main.38.
  • Nguyen et al. (2021) Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, and Thien Huu Nguyen. Trankit: A light-weight transformer-based toolkit for multilingual natural language processing. In Dimitra Gkatzia and Djamé Seddah (eds.), Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, EACL 2021, Online, April 19-23, 2021, pp.  80–90. Association for Computational Linguistics, 2021. doi: 10.18653/v1/2021.eacl-demos.10. URL https://doi.org/10.18653/v1/2021.eacl-demos.10.
  • Pfeiffer et al. (2020) Jonas Pfeiffer, Ivan Vulic, Iryna Gurevych, and Sebastian Ruder. MAD-X: an adapter-based framework for multi-task cross-lingual transfer. In Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu (eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, pp. 7654–7673. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.emnlp-main.617. URL https://doi.org/10.18653/v1/2020.emnlp-main.617.
  • Pfeiffer et al. (2021) Jonas Pfeiffer, Ivan Vulic, Iryna Gurevych, and Sebastian Ruder. Unks everywhere: Adapting multilingual language models to new scripts. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (eds.), Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021, pp. 10186–10203. Association for Computational Linguistics, 2021. doi: 10.18653/v1/2021.emnlp-main.800. URL https://doi.org/10.18653/v1/2021.emnlp-main.800.
  • Rajbhandari et al. (2020) Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, and Yuxiong He. ZeRO: Memory optimizations toward training trillion parameter models. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, pp.  1–16. IEEE, 2020.
  • Rasley et al. (2020) Jeff Rasley, Samyam Rajbhandari, Olatunji Ruwase, and Yuxiong He. DeepSpeed: System optimizations enable training deep learning models with over 100 billion parameters. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp.  3505–3506, 2020.
  • Sabet et al. (2020) Masoud Jalili Sabet, Philipp Dufter, François Yvon, and Hinrich Schütze. Simalign: High quality word alignments without parallel training data using static and contextualized embeddings. In Trevor Cohn, Yulan He, and Yang Liu (eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, EMNLP 2020, Online Event, 16-20 November 2020, volume EMNLP 2020 of Findings of ACL, pp.  1627–1643. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.findings-emnlp.147. URL https://doi.org/10.18653/v1/2020.findings-emnlp.147.
  • Scao et al. (2022) Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, et al. BLOOM: A 176b-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100, 2022.
  • Shliazhko et al. (2022) Oleh Shliazhko, Alena Fenogenova, Maria Tikhonova, Vladislav Mikhailov, Anastasia Kozlova, and Tatiana Shavrina. mGPT: Few-shot learners go multilingual. CoRR, abs/2204.07580, 2022. doi: 10.48550/arXiv.2204.07580. URL https://doi.org/10.48550/arXiv.2204.07580.
  • Tay et al. (2022) Yi Tay, Mostafa Dehghani, Vinh Q Tran, Xavier Garcia, Jason Wei, Xuezhi Wang, Hyung Won Chung, Dara Bahri, Tal Schuster, Steven Zheng, et al. UL2: Unifying language learning paradigms. In The Eleventh International Conference on Learning Representations, 2022.
  • Touvron et al. (2023a) Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurélien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. LLaMA: Open and efficient foundation language models. CoRR, abs/2302.13971, 2023a. doi: 10.48550/arXiv.2302.13971. URL https://doi.org/10.48550/arXiv.2302.13971.
  • Touvron et al. (2023b) Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton-Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurélien Rodriguez, Robert Stojnic, Sergey Edunov, and Thomas Scialom. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023b. doi: 10.48550/arXiv.2307.09288. URL https://doi.org/10.48550/arXiv.2307.09288.
  • Üstün et al. (2020) Ahmet Üstün, Arianna Bisazza, Gosse Bouma, and Gertjan van Noord. Udapter: Language adaptation for truly universal dependency parsing. In Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu (eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, pp. 2302–2315. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.emnlp-main.180. URL https://doi.org/10.18653/v1/2020.emnlp-main.180.
  • Wang et al. (2022) Xinyi Wang, Sebastian Ruder, and Graham Neubig. Expanding pretrained models to thousands more languages via lexicon-based adaptation. In Smaranda Muresan, Preslav Nakov, and Aline Villavicencio (eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, pp.  863–877. Association for Computational Linguistics, 2022. URL https://aclanthology.org/2022.acl-long.61.
  • Wang et al. (2020) Zihan Wang, Karthikeyan K, Stephen Mayhew, and Dan Roth. Extending multilingual BERT to low-resource languages. In Trevor Cohn, Yulan He, and Yang Liu (eds.), Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16-20 November 2020, volume EMNLP 2020 of Findings of ACL, pp. 2649–2656. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.findings-emnlp.240. URL https://doi.org/10.18653/v1/2020.findings-emnlp.240.
  • Wolf et al. (2020) Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, et al. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations, pp.  38–45, 2020.
  • Xu et al. (2023) Haoran Xu, Young Jin Kim, Amr Sharaf, and Hany Hassan Awadalla. A paradigm shift in machine translation: Boosting translation performance of large language models. CoRR, abs/2309.11674, 2023. doi: 10.48550/ARXIV.2309.11674. URL https://doi.org/10.48550/arXiv.2309.11674.
  • Xue et al. (2021) Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, and Colin Raffel. mT5: A massively multilingual pre-trained text-to-text transformer. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.  483–498, 2021.
  • Xue et al. (2022) Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, and Colin Raffel. Byt5: Towards a token-free future with pre-trained byte-to-byte models. Trans. Assoc. Comput. Linguistics, 10:291–306, 2022. doi: 10.1162/tacl_a_00461. URL https://doi.org/10.1162/tacl_a_00461.
  • Yang et al. (2023) Wen Yang, Chong Li, Jiajun Zhang, and Chengqing Zong. Bigtrans: Augmenting large language models with multilingual translation capability over 100 languages. CoRR, abs/2305.18098, 2023. doi: 10.48550/arXiv.2305.18098. URL https://doi.org/10.48550/arXiv.2305.18098.
  • Yong et al. (2022) Zheng Xin Yong, Hailey Schoelkopf, Niklas Muennighoff, Alham Fikri Aji, David Ifeoluwa Adelani, Khalid Almubarak, M. Saiful Bari, Lintang Sutawika, Jungo Kasai, Ahmed Baruwa, Genta Indra Winata, Stella Biderman, Dragomir Radev, and Vassilina Nikoulina. BLOOM+1: adding language support to BLOOM for zero-shot prompting. CoRR, abs/2212.09535, 2022. doi: 10.48550/arXiv.2212.09535. URL https://doi.org/10.48550/arXiv.2212.09535.
  • Yu et al. (2023) Lili Yu, Daniel Simig, Colin Flaherty, Armen Aghajanyan, Luke Zettlemoyer, and Mike Lewis. MEGABYTE: predicting million-byte sequences with multiscale transformers. CoRR, abs/2305.07185, 2023. doi: 10.48550/arXiv.2305.07185. URL https://doi.org/10.48550/arXiv.2305.07185.
  • Zhao et al. (2024) Jun Zhao, Zhihao Zhang, Qi Zhang, Tao Gui, and Xuanjing Huang. LLaMA beyond English: An empirical study on language capability transfer. arXiv preprint arXiv:2401.01055, 2024.
  • Zhu et al. (2023) Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, Shujian Huang, Lingpeng Kong, Jiajun Chen, and Lei Li. Extrapolating large language models to non-english by aligning languages. CoRR, abs/2308.04948, 2023. doi: 10.48550/arXiv.2308.04948. URL https://doi.org/10.48550/arXiv.2308.04948.

Appendix A Languages

Tables 5, 6, and 7 list the languages of Glot500-c used to train MaLA-500, together with the number of available sentences and the language family for each language.
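Each row of these tables holds up to three entries of the form language_Script, sentence count, and family code, and the family code is missing for some languages (for example, aze_Latn). As a minimal sketch for readers who want to process the list programmatically, the following Python snippet parses one such row. The names `LangEntry` and `parse_row` are hypothetical, and the pattern assumes, based only on the rows shown here, that a family code is always four lowercase letters followed by four digits.

```python
import re
from typing import List, NamedTuple, Optional


class LangEntry(NamedTuple):
    code: str               # three-letter language code, e.g. "hbs"
    script: str             # four-letter script code, e.g. "Latn"
    sentences: int          # number of available sentences
    family: Optional[str]   # family code, or None when the cell is empty


# language_Script, count, then an optional family code.
# Assumption: family codes look like "indo1319" (4 letters + 4 digits).
ENTRY = re.compile(
    r"([a-z]{3})_([A-Z][a-z]{3})\s+(\d+)(?:\s+([a-z]{4}\d{4}))?(?=\s|$)"
)


def parse_row(row: str) -> List[LangEntry]:
    """Split one flattened table row into its language entries."""
    return [
        LangEntry(m.group(1), m.group(2), int(m.group(3)), m.group(4))
        for m in ENTRY.finditer(row)
    ]


if __name__ == "__main__":
    row = "aze_Latn 46300705 ory_Orya 6266475 indo1319 lat_Latn 1179913 indo1319"
    for entry in parse_row(row):
        print(entry)
```

Because the family group is optional, an entry with an empty family cell yields `family=None` instead of swallowing the next language tag as its family.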

Lang |Sent| Family Lang |Sent| Family Lang |Sent| Family
hbs_Latn 63411156 indo1319 hin_Deva 7046700 indo1319 ton_Latn 1216118 aust1307
mal_Mlym 48098273 drav1251 kor_Hang 6468444 kore1284 tah_Latn 1190747 aust1307
aze_Latn 46300705 ory_Orya 6266475 indo1319 lat_Latn 1179913 indo1319
guj_Gujr 45738685 indo1319 urd_Arab 6009594 indo1319 srn_Latn 1172349 indo1319
ben_Beng 43514870 indo1319 swa_Latn 5989369 ewe_Latn 1161605 atla1278
kan_Knda 41836495 drav1251 sqi_Latn 5526836 indo1319 bem_Latn 1111969 atla1278
tel_Telu 41580525 drav1251 bel_Cyrl 5319675 indo1319 efi_Latn 1082621 atla1278
mlt_Latn 40654838 afro1255 afr_Latn 5157787 indo1319 bis_Latn 1070170 indo1319
fra_Latn 39197581 indo1319 nno_Latn 4899103 indo1319 orm_Latn 1067699
spa_Latn 37286756 indo1319 tat_Cyrl 4708088 turk1311 haw_Latn 1062491 aust1307
eng_Latn 36122761 indo1319 ast_Latn 4683554 indo1319 hmo_Latn 1033636 pidg1258
fil_Latn 33493255 aust1307 mon_Cyrl 4616960 mong1349 kat_Geor 1004297 kart1248
nob_Latn 32869205 indo1319 hbs_Cyrl 4598073 indo1319 pag_Latn 983637 aust1307
rus_Cyrl 31787973 indo1319 hau_Latn 4368483 afro1255 loz_Latn 964418 atla1278
deu_Latn 31015993 indo1319 sna_Latn 4019596 atla1278 fry_Latn 957422 indo1319
tur_Latn 29184662 turk1311 msa_Latn 3929084 mya_Mymr 945180 sino1245
pan_Guru 29052537 indo1319 som_Latn 3916769 afro1255 nds_Latn 944715 indo1319
mar_Deva 28748897 indo1319 srp_Cyrl 3864091 indo1319 run_Latn 943828 atla1278
por_Latn 27824391 indo1319 mlg_Latn 3715802 pnb_Arab 899895 indo1319
nld_Latn 25061426 indo1319 zul_Latn 3580113 atla1278 rar_Latn 894515 aust1307
ara_Arab 24524122 arz_Arab 3488224 afro1255 fij_Latn 887134 aust1307
zho_Hani 24143786 nya_Latn 3409030 atla1278 wls_Latn 882167 aust1307
ita_Latn 23539857 indo1319 tam_Taml 3388255 drav1251 ckb_Arab 874441 indo1319
ind_Latn 23018106 aust1307 hat_Latn 3226932 indo1319 ven_Latn 860249 atla1278
ell_Grek 22033282 indo1319 uzb_Latn 3223485 turk1311 zsm_Latn 859947 aust1307
bul_Cyrl 21823004 indo1319 sot_Latn 3205510 atla1278 chv_Cyrl 859863 turk1311
swe_Latn 20725883 indo1319 uzb_Cyrl 3029947 turk1311 lua_Latn 854359 atla1278
ces_Latn 20376340 indo1319 cos_Latn 3015055 indo1319 que_Latn 838486
isl_Latn 19547941 indo1319 als_Latn 2954874 indo1319 sag_Latn 771048 atla1278
pol_Latn 19339945 indo1319 amh_Ethi 2862985 afro1255 guw_Latn 767918 atla1278
ron_Latn 19190217 indo1319 sun_Latn 2586011 aust1307 bre_Latn 748954 indo1319
dan_Latn 19174573 indo1319 war_Latn 2584810 aust1307 toi_Latn 745385 atla1278
hun_Latn 18800025 ural1272 div_Thaa 2418687 indo1319 pus_Arab 731992 indo1319
tgk_Cyrl 18659517 indo1319 yor_Latn 2392359 atla1278 che_Cyrl 728201 nakh1245
srp_Latn 18371769 indo1319 fao_Latn 2365271 indo1319 pis_Latn 714783 indo1319
fas_Arab 18277593 uzn_Cyrl 2293672 turk1311 kon_Latn 685194
ceb_Latn 18149215 aust1307 smo_Latn 2290439 aust1307 oss_Cyrl 683517 indo1319
heb_Hebr 18128962 afro1255 bak_Cyrl 2264196 turk1311 hyw_Armn 679819 indo1319
hrv_Latn 17882932 indo1319 ilo_Latn 2106531 aust1307 iso_Latn 658789 atla1278
glg_Latn 17852274 indo1319 tso_Latn 2100708 atla1278 nan_Latn 656389 sino1245
fin_Latn 16730388 ural1272 mri_Latn 2046850 aust1307 lub_Latn 654390 atla1278
slv_Latn 15719210 indo1319 hmn_Latn 1903898 lim_Latn 652078 indo1319
vie_Latn 15697827 aust1305 asm_Beng 1882353 indo1319 tuk_Latn 649411 turk1311
mkd_Cyrl 14717004 indo1319 hil_Latn 1798875 aust1307 tir_Ethi 649117 afro1255
slk_Latn 14633631 indo1319 nso_Latn 1619354 atla1278 tgk_Latn 636541 indo1319
nor_Latn 14576191 indo1319 ibo_Latn 1543820 atla1278 yua_Latn 610052 maya1287
est_Latn 13600579 kin_Latn 1521612 atla1278 min_Latn 609065 aust1307
ltz_Latn 12997242 indo1319 hye_Armn 1463123 indo1319 lue_Latn 599429 atla1278
eus_Latn 12775959 oci_Latn 1449128 indo1319 khm_Khmr 590429 aust1305
lit_Latn 12479626 indo1319 lin_Latn 1408460 atla1278 tum_Latn 589857 atla1278
kaz_Cyrl 12378727 turk1311 tpi_Latn 1401844 indo1319 tll_Latn 586530 atla1278
lav_Latn 12143980 indo1319 twi_Latn 1400979 atla1278 ekk_Latn 582595 ural1272
bos_Latn 11014744 indo1319 kir_Cyrl 1397566 turk1311 lug_Latn 566948 atla1278
epo_Latn 8737198 arti1236 pap_Latn 1360138 indo1319 niu_Latn 566715 aust1307
cat_Latn 8648271 indo1319 nep_Deva 1317291 indo1319 tzo_Latn 540262 maya1287
tha_Thai 7735209 taik1256 azj_Latn 1315834 turk1311 mah_Latn 534614 aust1307
ukr_Cyrl 7462046 indo1319 bcl_Latn 1284493 aust1307 tvl_Latn 521556 aust1307
tgl_Latn 7411064 aust1307 xho_Latn 1262364 atla1278 jav_Latn 516833 aust1307
sin_Sinh 7293178 indo1319 cym_Latn 1244783 indo1319 vec_Latn 514240 indo1319
gle_Latn 7225513 indo1319 gaa_Latn 1222307 atla1278 jpn_Jpan 510722 japo1237
Table 5: List of languages of Glot500-c (Part I).
Lang |Sent| Family Lang |Sent| Family Lang |Sent| Family
lus_Latn 509250 sino1245 kmb_Latn 296269 atla1278 ncx_Latn 162558 utoa1244
crs_Latn 508755 indo1319 zai_Latn 277632 otom1299 qug_Latn 162500 quec1387
kqn_Latn 507913 atla1278 gym_Latn 274512 chib1249 rmn_Latn 162069 indo1319
ndo_Latn 496613 atla1278 bod_Tibt 273489 sino1245 cjk_Latn 160645 atla1278
snd_Arab 488730 indo1319 nde_Latn 269931 atla1278 arb_Arab 159884 afro1255
yue_Hani 484700 sino1245 fon_Latn 268566 atla1278 kea_Latn 158047 indo1319
tiv_Latn 483064 atla1278 ber_Latn 264426 mck_Latn 157521 atla1278
kua_Latn 473535 atla1278 nbl_Latn 259158 atla1278 arn_Latn 155882 arau1255
kwy_Latn 473274 atla1278 kmr_Latn 256677 indo1319 pdt_Latn 155485 indo1319
hin_Latn 466175 indo1319 guc_Latn 249044 araw1281 her_Latn 154827 atla1278
iku_Cans 465011 mam_Latn 248348 maya1287 gla_Latn 152563 indo1319
kal_Latn 462430 eski1264 nia_Latn 247406 aust1307 kmr_Cyrl 151728 indo1319
tdt_Latn 459818 aust1307 nyn_Latn 241992 atla1278 mwl_Latn 150054 indo1319
gsw_Latn 449240 indo1319 cab_Latn 240101 araw1281 nav_Latn 147702 atha1245
mfe_Latn 447435 indo1319 top_Latn 239232 toto1251 ksw_Mymr 147674 sino1245
swc_Latn 446378 atla1278 tog_Latn 231969 atla1278 mxv_Latn 147591 otom1299
mon_Latn 437950 mong1349 mco_Latn 231209 mixe1284 hif_Latn 147261 indo1319
mos_Latn 437666 atla1278 tzh_Latn 230706 maya1287 wol_Latn 146992 atla1278
kik_Latn 437228 atla1278 pms_Latn 227748 indo1319 sme_Latn 146803 ural1272
cnh_Latn 436667 sino1245 wuu_Hani 224088 sino1245 gom_Latn 143937 indo1319
gil_Latn 434529 aust1307 plt_Latn 220413 aust1307 bum_Latn 141673 atla1278
pon_Latn 434522 aust1307 yid_Hebr 220214 indo1319 mgr_Latn 138953 atla1278
umb_Latn 431589 atla1278 ada_Latn 219427 atla1278 ahk_Latn 135068 sino1245
lvs_Latn 422952 indo1319 iba_Latn 213615 aust1307 kur_Arab 134160 indo1319
sco_Latn 411591 indo1319 kek_Latn 209932 maya1287 bas_Latn 133436 atla1278
ori_Orya 410827 koo_Latn 209375 atla1278 bin_Latn 133256 atla1278
arg_Latn 410683 indo1319 sop_Latn 206501 atla1278 tsz_Latn 133251 tara1323
kur_Latn 407169 indo1319 kac_Latn 205542 sino1245 sid_Latn 130406 afro1255
dhv_Latn 405711 aust1307 qvi_Latn 205447 quec1387 diq_Latn 128908 indo1319
luo_Latn 398974 nilo1247 cak_Latn 204472 maya1287 srd_Latn 127064
lun_Latn 395764 atla1278 kbp_Latn 202877 atla1278 tcf_Latn 126050 otom1299
nzi_Latn 394247 atla1278 ctu_Latn 201662 maya1287 bzj_Latn 124958 indo1319
gug_Latn 392227 tupi1275 kri_Latn 201087 indo1319 udm_Cyrl 121705 ural1272
bar_Latn 387070 indo1319 mau_Latn 199134 otom1299 cce_Latn 120636 atla1278
bci_Latn 384059 atla1278 scn_Latn 199068 indo1319 meu_Latn 120273 aust1307
chk_Latn 380596 aust1307 tyv_Cyrl 198649 turk1311 chw_Latn 119751 atla1278
roh_Latn 377067 indo1319 ina_Latn 197315 arti1236 cbk_Latn 118789 indo1319
aym_Latn 373329 ayma1253 btx_Latn 193701 aust1307 ibg_Latn 118733 aust1307
yap_Latn 358929 aust1307 nch_Latn 193129 utoa1244 bhw_Latn 117381 aust1307
ssw_Latn 356561 atla1278 ncj_Latn 192962 utoa1244 ngu_Latn 116851 utoa1244
quz_Latn 354781 quec1387 pau_Latn 190529 aust1307 nyy_Latn 115914 atla1278
sah_Cyrl 352697 turk1311 toj_Latn 189651 maya1287 szl_Latn 112496 indo1319
tsn_Latn 350954 atla1278 pcm_Latn 187594 indo1319 ish_Latn 111814 atla1278
lmo_Latn 348135 indo1319 dyu_Latn 186367 mand1469 naq_Latn 109747 khoe1240
ido_Latn 331239 arti1236 kss_Latn 185868 atla1278 toh_Latn 107583 atla1278
abk_Cyrl 321578 abkh1242 afb_Arab 183694 afro1255 ttj_Latn 106925 atla1278
zne_Latn 318871 atla1278 urh_Latn 182214 atla1278 nse_Latn 105189 atla1278
quy_Latn 311040 quec1387 quc_Latn 181559 maya1287 hsb_Latn 104802 indo1319
kam_Latn 310659 atla1278 new_Deva 181427 sino1245 ami_Latn 104559 aust1307
bbc_Latn 310420 aust1307 yao_Latn 179965 atla1278 alz_Latn 104392 nilo1247
vol_Latn 310399 arti1236 ngl_Latn 178498 atla1278 apc_Arab 102392 afro1255
wal_Latn 309873 gong1255 nyu_Latn 177483 atla1278 vls_Latn 101900 indo1319
uig_Arab 307302 turk1311 kab_Latn 176015 afro1255 mhr_Cyrl 100474 ural1272
vmw_Latn 306899 atla1278 tuk_Cyrl 175769 turk1311 djk_Latn 99234 indo1319
kwn_Latn 305362 atla1278 xmf_Geor 174994 kart1248 wes_Latn 98492 indo1319
pam_Latn 303737 aust1307 ndc_Latn 174305 atla1278 gkn_Latn 97041 atla1278
seh_Latn 300243 atla1278 san_Deva 165616 indo1319 grc_Grek 96986 indo1319
tsc_Latn 298442 atla1278 nba_Latn 163485 atla1278 hbo_Hebr 96484 afro1255
nyk_Latn 297976 atla1278 bpy_Beng 162838 indo1319 swh_Latn 95776 atla1278
Table 6: List of languages of Glot500-c (Part II).
Lang |Sent| Family Lang |Sent| Family Lang |Sent| Family
alt_Cyrl 95148 turk1311 mny_Latn 50581 atla1278 csy_Latn 34126 sino1245
rmn_Grek 94533 indo1319 gkp_Latn 50549 mand1469 azb_Arab 33758 turk1311
miq_Latn 94343 misu1242 kat_Latn 50424 kart1248 csb_Latn 33743 indo1319
kaa_Cyrl 88815 turk1311 bjn_Latn 49068 aust1307 tpm_Latn 33517 atla1278
kos_Latn 88603 aust1307 acr_Latn 48886 maya1287 quw_Latn 33449 quec1387
grn_Latn 87568 dtp_Latn 48468 aust1307 rmy_Cyrl 33351 indo1319
lhu_Latn 87255 sino1245 lam_Latn 46853 atla1278 ixl_Latn 33289 maya1287
lzh_Hani 86035 sino1245 bik_Latn 46561 mbb_Latn 33240 aust1307
ajp_Arab 83297 afro1255 poh_Latn 46454 maya1287 pfl_Latn 33148 indo1319
cmn_Hani 80745 sino1245 phm_Latn 45862 atla1278 pcd_Latn 32867 indo1319
gcf_Latn 80737 indo1319 hrx_Latn 45716 indo1319 tlh_Latn 32863 arti1236
rmn_Cyrl 79925 indo1319 quh_Latn 45566 quec1387 suz_Deva 32811 sino1245
kjh_Cyrl 79262 turk1311 hyw_Cyrl 45379 indo1319 gcr_Latn 32676 indo1319
rng_Latn 78177 atla1278 rue_Cyrl 45369 indo1319 jbo_Latn 32619 arti1236
mgh_Latn 78117 atla1278 eml_Latn 44630 indo1319 tbz_Latn 32264 atla1278
xmv_Latn 77896 aust1307 acm_Arab 44505 afro1255 bam_Latn 32150 mand1469
ige_Latn 77114 atla1278 tob_Latn 44473 guai1249 prk_Latn 32085 aust1305
rmy_Latn 76991 indo1319 ach_Latn 43974 nilo1247 jam_Latn 32048 indo1319
srm_Latn 76884 indo1319 vep_Latn 43076 ural1272 twx_Latn 32028 atla1278
bak_Latn 76809 turk1311 npi_Deva 43072 indo1319 nmf_Latn 31997 sino1245
gur_Latn 76151 atla1278 tok_Latn 42820 arti1236 caq_Latn 31903 aust1305
idu_Latn 75106 atla1278 sgs_Latn 42467 indo1319 rop_Latn 31889 indo1319
yom_Latn 74818 atla1278 lij_Latn 42447 indo1319 tca_Latn 31852 ticu1244
tdx_Latn 74430 aust1307 myv_Cyrl 42147 ural1272 yan_Latn 31775 misu1242
mzn_Arab 73719 indo1319 tih_Latn 41873 aust1307 xav_Latn 31765 nucl1710
cfm_Latn 70227 sino1245 tat_Latn 41640 turk1311 bih_Deva 31658
zpa_Latn 69237 otom1299 lfn_Latn 41632 arti1236 cuk_Latn 31612 chib1249
kbd_Cyrl 67914 abkh1242 cgg_Latn 41196 atla1278 kjb_Latn 31471 maya1287
lao_Laoo 66966 taik1256 ful_Latn 41188 atla1278 hne_Deva 31465 indo1319
nap_Latn 65826 indo1319 gor_Latn 41174 aust1307 wbm_Latn 31394 aust1305
qub_Latn 64973 quec1387 ile_Latn 40984 arti1236 zlm_Latn 31345 aust1307
oke_Latn 64508 atla1278 ium_Latn 40683 hmon1336 tui_Latn 31161 atla1278
ote_Latn 64224 otom1299 teo_Latn 40203 nilo1247 ifb_Latn 30980 aust1307
bsb_Latn 63634 aust1307 kia_Latn 40035 atla1278 izz_Latn 30894 atla1278
ogo_Latn 61901 atla1278 crh_Cyrl 39985 turk1311 rug_Latn 30857 aust1307
abn_Latn 61830 atla1278 crh_Latn 39896 turk1311 aka_Latn 30704 atla1278
ldi_Latn 61827 atla1278 enm_Latn 39809 indo1319 pxm_Latn 30698 book1242
ayr_Latn 61570 ayma1253 sat_Olck 39614 aust1305 kmm_Latn 30671 sino1245
gom_Deva 61140 indo1319 mad_Latn 38993 aust1307 mcn_Latn 30666 afro1255
bba_Latn 61123 atla1278 cac_Latn 38812 maya1287 ifa_Latn 30621 aust1307
aln_Latn 60989 indo1319 hnj_Latn 38611 hmon1336 dln_Latn 30620 sino1245
leh_Latn 59944 atla1278 ksh_Latn 38130 indo1319 ext_Latn 30605 indo1319
ban_Latn 59805 aust1307 ikk_Latn 38071 atla1278 ksd_Latn 30550 aust1307
ace_Latn 59333 aust1307 sba_Latn 38040 cent2225 mzh_Latn 30517 mata1289
pes_Arab 57511 indo1319 zom_Latn 37013 sino1245 llb_Latn 30480 atla1278
skg_Latn 57228 aust1307 bqc_Latn 36881 mand1469 hra_Latn 30472 sino1245
ary_Arab 56933 afro1255 bim_Latn 36835 atla1278 mwm_Latn 30432 cent2225
hus_Latn 56176 maya1287 mdy_Ethi 36370 gong1255 krc_Cyrl 30353 turk1311
glv_Latn 55641 indo1319 bts_Latn 36216 aust1307 tuc_Latn 30349 aust1307
fat_Latn 55609 atla1278 gya_Latn 35902 atla1278 mrw_Latn 30304 aust1307
frr_Latn 55254 indo1319 ajg_Latn 35631 atla1278 pls_Latn 30136 otom1299
mwn_Latn 54805 atla1278 agw_Latn 35585 aust1307 rap_Latn 30102 aust1307
mai_Deva 54687 indo1319 kom_Cyrl 35249 ural1272 fur_Latn 30052 indo1319
dua_Latn 53392 atla1278 knv_Latn 35196 kaa_Latn 30031 turk1311
dzo_Tibt 52732 sino1245 giz_Latn 35040 afro1255 prs_Arab 26823 indo1319
ctd_Latn 52135 sino1245 hui_Latn 34926 nucl1709 san_Latn 25742 indo1319
nnb_Latn 52041 atla1278 kpg_Latn 34900 aust1307 som_Arab 14199 afro1255
sxn_Latn 51749 aust1307 zea_Latn 34426 indo1319 uig_Latn 9637 turk1311
mps_Latn 50645 tebe1251 aoj_Latn 34349 nucl1708 hau_Arab 9593 afro1255
Table 7: List of languages of Glot500-c (Part III).

Appendix B Detailed Results

Detailed results of evaluation are shown in Tables 8-15 (NLL on Glot500-c), Tables 16-21 (NLL on PBC), Tables 22-23 (ACC on SIB200), and Tables 24-29 (ACC on Taxi1500).
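
The per-language rows in the NLL tables can be summarized with a few lines of code. The following is a minimal sketch, assuming that the `all` row is an unweighted (macro) average over languages and that rows are whitespace-separated as printed here. The three sample rows are copied from Table 8; the function names are illustrative and are not taken from the released code.

```python
from statistics import mean

# Column order as printed in the table headers.
MODELS = ["LLaMA 2-7B", "mGPT-13B", "BLOOM-7B1", "XGLM-7.5B", "MaLA-500"]

# Three rows copied from Table 8 (NLL on Glot500-c), for illustration only.
ROWS = """\
abk_Cyrl 234.09 249.16 258.26 231.44 164.61
abn_Latn 140.01 197.81 153.58 152.90 111.86
ace_Latn 235.15 332.18 244.00 259.64 168.79
"""


def parse_rows(text):
    """Map each language tag to a {model: NLL} dictionary."""
    table = {}
    for line in text.strip().splitlines():
        lang, *values = line.split()
        table[lang] = dict(zip(MODELS, map(float, values)))
    return table


def macro_average(table):
    """Unweighted mean over languages, computed per model."""
    return {m: mean(row[m] for row in table.values()) for m in MODELS}


def best_model_per_language(table):
    """Model with the lowest NLL (lower is better) for each language."""
    return {lang: min(row, key=row.get) for lang, row in table.items()}


table = parse_rows(ROWS)
averages = macro_average(table)
winners = best_model_per_language(table)

for model, value in averages.items():
    print(f"{model}: {value:.2f}")
```

Applied to all rows of Tables 8-15 rather than this three-row sample, the same functions would give a per-model average comparable to the `all` row, under the stated assumption about how that row was computed.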

Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MALA-500
abk_Cyrl 234.09 249.16 258.26 231.44 164.61
abn_Latn 140.01 197.81 153.58 152.90 111.86
ace_Latn 235.15 332.18 244.00 259.64 168.79
ach_Latn 179.03 227.84 194.55 197.05 161.01
acm_Arab 119.15 153.09 106.29 101.35 135.82
acr_Latn 301.73 399.80 321.79 316.49 194.71
ada_Latn 132.76 168.56 150.19 137.99 103.17
afb_Arab 134.03 169.73 112.55 110.59 152.58
afr_Latn 52.43 84.47 73.24 75.60 64.25
agw_Latn 228.22 318.95 246.48 242.04 152.59
ahk_Latn 229.45 377.60 245.81 241.21 163.96
ajg_Latn 146.48 185.41 170.89 155.21 113.83
ajp_Arab 153.34 199.79 129.62 124.24 164.80
aka_Latn 163.59 223.13 166.49 185.41 131.50
aln_Latn 191.62 259.76 218.75 267.34 143.64
als_Latn 191.60 271.51 219.17 260.14 155.23
alt_Cyrl 199.25 220.77 200.70 215.71 139.18
alz_Latn 167.89 214.64 185.35 171.34 155.03
amh_Ethi 328.25 834.56 407.68 550.50 268.11
ami_Latn 122.67 168.42 131.77 132.36 109.13
aoj_Latn 318.62 495.44 340.07 316.36 196.64
apc_Arab 131.19 153.97 106.78 109.24 145.81
ara_Arab 111.05 155.64 80.72 84.86 140.73
arb_Arab 166.93 318.76 135.76 137.80 173.03
arg_Latn 173.62 306.23 171.32 178.40 160.08
arn_Latn 202.09 292.40 204.32 216.04 163.87
ary_Arab 198.80 309.90 184.82 176.58 173.37
arz_Arab 122.74 248.72 95.61 100.43 131.75
asm_Beng 264.49 409.59 172.35 311.81 184.77
ast_Latn 208.41 325.35 184.93 192.86 178.77
aym_Latn 143.36 183.42 149.06 154.45 117.28
ayr_Latn 274.31 342.40 288.57 293.48 185.87
azb_Arab 254.60 293.24 273.20 285.61 162.94
aze_Latn 156.58 230.45 195.32 189.59 110.56
azj_Latn 168.12 228.08 212.31 199.86 126.98
bak_Cyrl 274.50 348.47 288.93 307.95 169.00
bak_Latn 191.06 259.97 196.98 213.41 152.50
bam_Latn 195.29 251.28 203.50 215.62 171.51
ban_Latn 205.77 297.97 213.20 213.89 186.89
bar_Latn 210.97 287.33 234.73 208.66 188.90
bas_Latn 137.53 172.78 143.37 147.13 110.71
bba_Latn 233.68 286.30 258.58 238.94 164.18
bbc_Latn 172.78 216.78 181.59 170.06 148.89
bci_Latn 176.81 223.93 190.52 189.46 171.00
bcl_Latn 149.22 209.44 162.25 174.40 132.55
bel_Cyrl 110.77 174.19 142.62 147.27 85.11
bem_Latn 182.62 222.50 198.45 150.51 158.31
ben_Beng 92.79 162.83 50.33 55.42 73.86
ber_Latn 88.37 120.03 87.79 101.52 71.90
bhw_Latn 186.42 245.14 194.41 188.81 155.12
bih_Deva 248.12 422.46 176.37 204.17 180.31
bik_Latn 151.63 218.03 173.42 187.11 137.28
bim_Latn 229.29 284.29 244.21 245.34 166.16
bin_Latn 137.28 175.41 152.32 152.02 109.51
bis_Latn 165.83 250.17 179.61 190.13 130.32
bjn_Latn 200.57 302.58 202.67 199.15 182.65
bod_Tibt 437.54 1690.09 461.35 80.21 286.05
bos_Latn 87.13 175.82 131.95 149.85 110.92
bpy_Beng 251.20 471.67 154.31 172.17 155.64
bqc_Latn 208.00 266.53 226.49 205.65 153.58
bre_Latn 222.93 276.71 208.07 260.44 184.35
bsb_Latn 236.62 358.90 275.10 306.64 204.50
bts_Latn 214.80 292.93 232.31 217.74 156.31
btx_Latn 169.13 227.44 181.86 174.25 148.25
bul_Cyrl 47.01 90.81 77.70 42.90 57.12
bum_Latn 183.88 237.35 194.64 195.91 156.33
bzj_Latn 167.62 244.15 188.25 194.46 137.81
Table 8: Detailed results of NLL on Glot500-c (Part I).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MALA-500
cab_Latn 222.05 292.04 234.53 237.57 168.63
cac_Latn 293.47 395.52 310.33 301.30 192.22
cak_Latn 295.24 394.87 317.52 309.03 200.69
caq_Latn 240.00 323.71 264.17 257.49 164.95
cat_Latn 94.68 212.17 83.26 86.26 130.00
cbk_Latn 143.05 221.60 145.69 159.41 137.96
cce_Latn 178.45 226.07 190.01 192.54 152.70
ceb_Latn 136.44 278.02 164.94 183.55 123.31
ces_Latn 44.83 98.77 68.48 76.15 58.42
cfm_Latn 240.20 305.25 252.92 256.79 185.94
cgg_Latn 121.16 160.92 127.35 129.19 107.91
che_Cyrl 199.15 272.63 203.57 197.17 158.57
chk_Latn 189.52 258.69 201.19 200.61 145.98
chv_Cyrl 246.19 292.36 252.81 229.56 157.91
chw_Latn 139.07 174.73 142.88 121.98 121.16
cjk_Latn 125.30 158.06 134.03 128.75 106.21
ckb_Arab 372.24 437.95 370.20 521.30 243.30
cmn_Hani 52.17 92.04 40.75 49.81 62.30
cnh_Latn 185.01 242.39 198.20 198.57 147.90
cos_Latn 192.02 323.30 210.38 211.96 185.03
crh_Cyrl 236.43 282.79 239.67 260.03 141.08
crh_Latn 149.67 240.28 168.79 157.01 131.91
crs_Latn 153.11 202.53 153.34 87.81 129.39
csb_Latn 238.86 336.99 261.46 294.41 166.29
csy_Latn 226.53 299.52 249.53 245.14 172.03
ctd_Latn 210.45 276.87 227.39 224.34 158.35
ctu_Latn 216.90 310.89 226.68 220.32 157.27
cuk_Latn 233.42 325.97 252.00 247.83 190.81
cym_Latn 233.91 369.64 306.05 332.89 217.29
dan_Latn 43.75 84.32 69.51 66.96 54.56
deu_Latn 37.46 68.68 49.65 33.88 53.45
dhv_Latn 121.21 170.85 126.68 128.57 95.81
diq_Latn 174.75 265.78 180.00 190.78 147.56
div_Thaa 314.55 565.83 314.34 17.32 153.76
djk_Latn 188.44 249.39 201.50 207.16 163.39
dln_Latn 217.51 288.73 231.93 238.10 165.40
dtp_Latn 267.22 373.92 279.80 287.18 184.75
dua_Latn 131.20 169.64 136.03 129.20 109.86
dyu_Latn 186.37 237.65 193.19 205.47 157.89
dzo_Tibt 238.61 842.40 244.70 47.40 154.48
efi_Latn 178.91 251.07 205.96 203.93 134.40
ekk_Latn 155.86 223.64 194.37 89.18 141.19
ell_Grek 52.85 86.68 67.98 36.04 54.45
eml_Latn 213.57 278.33 224.10 225.17 163.91
eng_Latn 30.45 62.73 31.32 34.36 48.60
enm_Latn 79.08 193.74 108.20 119.78 87.78
epo_Latn 68.89 99.75 79.80 87.72 70.22
est_Latn 70.18 100.28 88.33 40.53 67.38
eus_Latn 79.07 87.15 48.33 45.59 70.49
ewe_Latn 208.53 269.62 218.53 195.99 148.78
ext_Latn 216.92 338.22 211.26 231.30 177.17
fao_Latn 202.04 284.61 227.56 263.89 165.45
fas_Arab 138.13 193.21 163.46 166.76 133.69
fat_Latn 134.67 180.66 144.54 144.50 106.86
fij_Latn 159.86 219.85 191.04 137.83 147.71
fil_Latn 120.89 206.21 162.04 161.84 120.27
fin_Latn 46.88 86.18 79.58 35.79 58.35
fon_Latn 237.19 295.74 256.54 262.29 160.24
fra_Latn 32.26 63.71 31.08 32.74 49.22
frr_Latn 192.91 299.41 206.26 211.00 144.13
fry_Latn 191.87 247.81 205.02 221.64 168.86
ful_Latn 447.47 550.03 457.25 511.87 339.38
fur_Latn 231.23 313.99 234.38 250.02 183.57
gaa_Latn 188.66 232.67 222.71 158.83 146.37
gcf_Latn 132.36 173.10 130.03 91.07 103.54
gcr_Latn 113.22 157.83 115.02 79.46 94.40
gil_Latn 175.92 237.54 187.79 181.71 154.60
giz_Latn 244.47 332.32 268.61 266.29 168.09
Table 9: Detailed results of NLL on Glot500-c (Part II).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MALA-500
gkn_Latn 223.58 304.46 253.54 245.24 167.81
gkp_Latn 261.56 358.97 280.80 270.48 186.41
gla_Latn 220.92 382.20 293.89 315.23 210.51
gle_Latn 203.10 345.45 276.11 299.80 206.52
glg_Latn 120.88 204.76 108.43 122.45 132.58
glv_Latn 232.86 326.69 247.79 265.04 182.93
gom_Deva 328.82 462.17 324.77 358.50 233.15
gom_Latn 244.57 318.36 259.71 257.90 209.13
gor_Latn 217.70 326.26 232.98 239.37 168.23
grc_Grek 126.86 277.73 181.00 127.62 141.80
grn_Latn 293.70 382.11 298.10 316.62 204.94
gsw_Latn 180.67 226.37 199.03 171.72 157.34
guc_Latn 241.99 340.92 257.19 234.87 183.29
gug_Latn 197.04 258.55 201.92 214.05 158.39
guj_Gujr 118.82 291.38 74.12 194.71 90.02
gur_Latn 222.48 311.22 243.52 233.99 173.11
guw_Latn 210.29 215.37 235.91 246.28 146.55
gya_Latn 242.48 350.56 274.82 258.26 170.00
gym_Latn 231.32 324.92 249.32 191.13 178.06
hat_Latn 237.00 341.48 251.07 150.39 201.88
hau_Arab 173.08 330.75 130.96 129.69 230.02
hau_Latn 228.21 300.72 257.22 265.65 191.68
haw_Latn 190.18 300.25 217.54 213.20 174.30
hbo_Hebr 140.73 315.06 194.19 200.98 155.08
hbs_Cyrl 206.87 503.80 370.83 417.41 225.22
hbs_Latn 209.02 451.95 333.95 375.92 223.11
heb_Hebr 48.34 63.09 58.19 63.73 56.05
her_Latn 140.31 172.32 146.72 136.86 109.29
hif_Latn 396.80 613.23 471.65 465.81 371.92
hil_Latn 145.89 207.79 161.39 182.01 126.09
hin_Deva 142.07 289.53 105.86 106.38 166.12
hin_Latn 150.11 247.31 166.34 164.94 176.00
hmn_Latn 241.00 375.11 282.60 284.95 182.91
hmo_Latn 165.38 236.46 178.12 142.53 133.19
hne_Deva 201.66 298.38 171.37 184.06 161.30
hnj_Latn 231.56 324.18 263.80 278.94 141.56
hra_Latn 215.87 271.46 228.66 229.46 169.57
hrv_Latn 43.03 82.09 63.02 69.82 54.08
hrx_Latn 131.34 182.33 140.90 135.17 105.20
hsb_Latn 182.90 293.15 211.79 235.34 127.71
hui_Latn 297.34 388.25 319.23 318.32 197.57
hun_Latn 45.03 79.27 75.21 79.06 59.30
hus_Latn 247.96 352.19 260.90 258.32 180.85
hye_Armn 286.18 602.02 372.75 454.38 202.92
hyw_Armn 145.46 263.04 186.63 213.52 110.19
hyw_Cyrl 162.17 231.84 171.73 165.61 117.61
iba_Latn 150.03 192.62 157.75 151.54 133.67
ibg_Latn 115.37 152.94 119.10 122.19 106.08
ibo_Latn 232.57 333.59 223.37 296.17 184.97
ido_Latn 140.94 273.88 153.94 164.61 121.37
idu_Latn 153.00 209.20 162.53 157.46 106.21
ifa_Latn 252.33 328.31 270.66 266.03 172.20
ifb_Latn 257.92 340.56 278.23 272.79 183.83
ige_Latn 148.85 199.02 173.80 176.02 111.50
ikk_Latn 249.37 330.44 284.76 310.26 166.74
iku_Cans 261.21 877.71 343.18 496.50 174.80
ile_Latn 100.28 199.76 105.32 115.20 100.35
ilo_Latn 172.24 227.41 186.11 208.36 146.96
ina_Latn 209.38 408.99 230.14 236.01 201.92
ind_Latn 42.59 69.80 35.50 36.82 56.03
ish_Latn 126.54 178.71 144.92 146.15 101.29
isl_Latn 103.40 156.83 127.49 139.76 83.51
iso_Latn 148.38 175.75 168.85 167.42 104.67
ita_Latn 39.35 79.02 49.94 40.47 53.36
ium_Latn 247.28 361.48 264.84 266.46 167.10
ixl_Latn 327.09 506.74 353.08 348.23 222.05
izz_Latn 301.73 400.14 346.61 361.39 193.24
jam_Latn 204.69 291.31 223.99 231.17 157.87
Table 10: Detailed results of NLL on Glot500-c (Part III).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MALA-500
jav_Latn 208.92 275.29 212.60 220.31 180.00
jbo_Latn 103.91 200.82 109.86 112.61 117.25
jpn_Jpan 136.26 301.32 197.23 149.70 150.43
kaa_Cyrl 281.21 363.07 300.20 317.13 146.98
kaa_Latn 284.60 354.51 292.04 309.67 192.43
kab_Latn 192.58 264.51 185.56 216.46 161.31
kac_Latn 210.47 267.38 223.77 249.95 166.44
kal_Latn 240.15 262.90 259.85 155.45 182.71
kam_Latn 153.84 194.60 156.10 186.00 115.78
kan_Knda 216.22 556.40 146.43 355.75 175.17
kat_Geor 302.53 413.90 435.47 483.85 239.51
kat_Latn 184.94 308.06 217.07 208.25 184.20
kaz_Cyrl 257.67 341.78 280.01 297.13 187.85
kbd_Cyrl 212.12 229.63 198.20 202.85 146.86
kbp_Latn 232.17 306.53 257.45 246.16 161.08
kea_Latn 118.17 159.93 121.92 122.29 105.69
kek_Latn 234.79 332.19 244.69 228.87 164.18
khm_Khmr 257.14 815.56 317.46 437.56 167.88
kia_Latn 222.01 298.21 245.77 236.71 164.39
kik_Latn 208.26 277.92 213.92 237.26 159.49
kin_Latn 206.40 237.66 174.18 234.91 168.37
kir_Cyrl 265.65 308.15 277.34 313.50 175.71
kjb_Latn 263.79 353.35 280.16 278.13 179.76
kjh_Cyrl 200.11 251.59 211.84 217.34 147.81
kmb_Latn 132.84 166.09 137.48 118.00 112.99
kmm_Latn 246.57 330.77 263.79 266.44 180.90
kmr_Cyrl 224.23 284.40 226.51 221.22 154.70
kmr_Latn 183.95 220.51 194.67 215.02 142.36
knv_Latn 430.56 581.45 456.13 427.27 232.18
kom_Cyrl 224.18 302.71 249.08 213.41 134.88
kon_Latn 112.77 131.61 116.89 119.41 96.00
koo_Latn 132.73 167.13 144.33 134.74 111.26
kor_Hang 129.20 224.06 180.21 95.71 151.37
kos_Latn 146.15 191.23 153.05 154.26 123.85
kpg_Latn 221.52 321.94 246.33 245.73 148.93
kqn_Latn 125.33 149.57 128.12 109.60 106.08
krc_Cyrl 247.13 292.86 248.83 267.39 167.05
kri_Latn 166.50 240.92 193.15 192.19 140.20
ksd_Latn 198.81 269.96 210.59 212.57 138.81
ksh_Latn 204.72 261.51 220.93 218.50 161.62
kss_Latn 310.35 477.02 335.25 300.31 226.38
ksw_Mymr 210.34 266.24 226.59 154.55 124.78
kua_Latn 179.05 206.09 187.92 151.87 140.72
kur_Arab 402.78 464.44 400.97 550.57 253.61
kur_Latn 633.22 779.47 678.30 748.20 424.98
kwn_Latn 136.80 170.23 141.88 111.31 107.21
kwy_Latn 131.93 160.78 137.77 134.01 110.55
lam_Latn 209.07 276.89 228.12 203.17 176.61
lao_Laoo 405.48 978.35 435.37 583.11 225.06
lat_Latn 167.49 274.19 186.97 210.22 183.32
lav_Latn 193.22 257.06 227.60 252.31 162.80
ldi_Latn 178.84 230.26 185.58 191.19 160.61
leh_Latn 216.80 273.56 230.25 201.57 172.92
lfn_Latn 232.59 368.62 246.45 258.76 187.82
lhu_Latn 209.10 365.95 220.56 219.50 142.74
lij_Latn 328.66 483.81 345.62 348.28 249.64
lim_Latn 199.01 290.80 236.94 239.44 180.52
lin_Latn 161.88 173.63 158.33 180.17 135.66
lit_Latn 163.71 220.62 195.08 225.98 147.53
llb_Latn 135.01 180.06 146.51 135.39 120.02
lmo_Latn 222.22 378.21 247.54 242.80 182.01
loz_Latn 179.54 194.46 185.77 142.19 147.86
ltz_Latn 190.70 303.65 202.02 174.00 169.36
lua_Latn 126.47 147.86 131.94 102.71 102.36
lub_Latn 136.45 143.64 140.96 99.41 111.01
lue_Latn 128.48 158.40 135.27 129.94 103.72
lug_Latn 225.72 318.09 221.56 272.90 196.21
lun_Latn 135.96 170.81 142.71 136.26 113.31
Table 11: Detailed results of NLL on Glot500-c (Part IV).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MALA-500
luo_Latn 177.43 224.04 194.07 187.72 156.23
lus_Latn 192.97 251.35 203.37 212.72 163.95
lvs_Latn 154.85 211.40 185.87 198.99 138.63
lzh_Hani 149.57 215.19 130.32 153.38 151.15
mad_Latn 232.71 325.38 245.39 249.29 176.81
mah_Latn 178.50 246.26 188.98 183.17 145.35
mai_Deva 245.94 389.84 189.93 223.00 185.84
mal_Mlym 96.92 171.55 57.45 129.61 72.46
mam_Latn 232.38 315.16 247.28 244.43 189.28
mar_Deva 85.13 143.31 55.38 103.23 70.08
mau_Latn 186.46 333.61 204.91 193.09 161.45
mbb_Latn 282.70 410.99 309.56 307.47 175.50
mck_Latn 191.94 244.49 202.17 191.28 152.16
mcn_Latn 207.28 276.32 220.35 230.56 158.99
mco_Latn 271.45 368.23 281.55 260.70 206.54
mdy_Ethi 306.26 529.46 293.68 369.22 166.26
meu_Latn 177.74 235.19 188.09 168.43 143.62
mfe_Latn 147.50 194.41 143.47 92.23 129.23
mgh_Latn 193.72 257.45 207.05 200.68 166.17
mgr_Latn 183.96 226.09 194.25 149.77 160.18
mhr_Cyrl 230.20 298.73 235.59 236.71 167.55
min_Latn 161.40 266.18 164.13 170.30 166.91
miq_Latn 207.63 276.27 228.42 223.78 160.37
mkd_Cyrl 81.62 144.52 112.99 98.33 74.40
mlg_Latn 185.23 250.78 189.32 226.85 148.82
mlt_Latn 109.60 184.08 139.75 146.69 85.14
mny_Latn 133.04 170.16 135.30 126.14 112.38
mon_Cyrl 397.63 535.59 446.51 555.16 249.95
mon_Latn 354.75 411.54 383.60 383.02 282.85
mos_Latn 197.23 229.14 206.05 212.55 159.69
mps_Latn 347.99 496.26 378.75 366.78 213.10
mri_Latn 154.38 247.38 181.49 179.85 134.55
mrw_Latn 235.11 306.78 250.41 253.18 169.69
msa_Latn 164.05 261.28 155.14 151.77 190.44
mwl_Latn 275.26 410.83 270.47 280.98 202.89
mwm_Latn 293.40 430.46 315.11 294.17 162.95
mwn_Latn 131.84 162.91 138.48 111.20 123.37
mxv_Latn 206.13 324.92 222.48 222.86 171.82
mya_Mymr 383.74 576.49 472.04 277.91 252.84
myv_Cyrl 267.24 357.29 263.68 276.10 188.74
mzh_Latn 257.70 370.86 285.03 276.60 169.96
mzn_Arab 192.75 263.60 200.51 204.50 136.03
nan_Latn 172.36 311.98 186.78 200.62 153.96
nap_Latn 159.24 246.36 179.36 167.94 151.29
naq_Latn 195.43 261.60 207.68 207.27 150.47
nav_Latn 258.40 380.88 284.18 286.04 181.13
nba_Latn 123.68 154.25 130.25 126.08 99.29
nbl_Latn 175.10 238.64 194.74 211.98 154.90
nch_Latn 206.55 287.53 220.86 221.43 183.56
ncj_Latn 185.32 260.91 201.13 196.79 173.80
ncx_Latn 115.71 168.08 121.23 122.70 98.71
ndc_Latn 167.38 222.72 176.18 184.45 158.24
nde_Latn 169.75 235.54 185.98 211.96 151.45
ndo_Latn 192.10 227.02 204.45 150.28 149.69
nds_Latn 195.44 272.44 213.17 204.47 184.93
nep_Deva 232.93 425.83 167.54 291.83 210.52
new_Deva 169.64 330.40 128.26 135.07 103.54
ngl_Latn 134.87 177.05 140.92 115.46 104.59
ngu_Latn 205.16 282.39 215.65 213.78 167.56
nia_Latn 202.30 269.59 214.87 196.19 167.95
niu_Latn 105.04 142.53 111.71 113.36 88.11
nld_Latn 37.77 65.47 55.52 51.54 51.45
nmf_Latn 222.98 290.53 242.04 246.36 167.31
nnb_Latn 200.64 248.60 210.13 212.23 161.20
nno_Latn 138.72 234.11 192.13 199.51 146.16
nob_Latn 50.27 96.43 78.24 73.64 59.05
nor_Latn 78.04 146.26 126.19 123.99 99.50
npi_Deva 212.50 399.24 143.71 290.95 166.12
Table 12: Detailed results of NLL on Glot500-c (Part V).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MALA-500
nse_Latn 176.20 234.62 184.77 174.57 161.52
nso_Latn 170.49 227.97 170.96 201.29 142.72
nya_Latn 203.45 299.12 222.51 224.89 175.69
nyk_Latn 131.13 166.47 142.89 138.50 105.26
nyn_Latn 174.51 229.51 189.05 194.14 149.17
nyu_Latn 126.29 172.40 132.49 127.67 99.26
nyy_Latn 215.07 271.23 234.37 220.80 168.74
nzi_Latn 191.21 256.55 219.47 209.42 152.30
oci_Latn 202.93 343.11 207.95 210.24 185.26
ogo_Latn 134.14 185.86 149.15 143.22 118.08
oke_Latn 131.90 166.07 146.98 149.55 102.72
ori_Orya 323.51 839.80 179.33 665.17 203.94
orm_Latn 225.00 334.29 288.51 313.08 201.60
ory_Orya 232.83 572.34 134.20 474.35 164.65
oss_Cyrl 229.49 279.89 229.34 227.24 151.79
ote_Latn 237.06 362.46 254.31 241.61 176.73
pag_Latn 173.32 223.39 184.05 184.76 157.30
pam_Latn 259.01 373.98 274.16 280.47 237.10
pan_Guru 242.88 510.70 153.50 395.85 180.54
pap_Latn 162.79 213.88 174.63 173.16 138.20
pau_Latn 176.42 243.03 188.84 187.10 150.24
pcd_Latn 144.96 228.79 143.18 150.08 140.39
pcm_Latn 159.00 346.35 182.00 179.53 147.50
pdt_Latn 192.69 252.34 199.07 199.80 144.40
pes_Arab 153.46 199.83 175.97 179.97 139.01
pfl_Latn 220.11 315.47 241.84 225.74 176.25
phm_Latn 117.81 162.32 128.28 125.57 100.73
pis_Latn 153.04 237.95 173.91 179.21 130.98
pls_Latn 237.55 350.88 251.43 251.35 175.28
plt_Latn 159.36 220.84 158.06 193.44 131.96
pms_Latn 132.94 257.06 137.39 146.18 106.52
pnb_Arab 345.25 418.35 279.85 240.35 237.22
poh_Latn 389.80 589.86 417.71 416.42 230.35
pol_Latn 44.19 82.29 66.66 71.91 60.02
pon_Latn 177.92 236.38 190.47 189.62 149.49
por_Latn 37.00 66.01 35.14 33.91 48.72
prk_Latn 220.51 301.85 230.42 238.15 148.46
prs_Arab 163.01 218.38 191.40 195.64 141.99
pus_Arab 259.45 327.43 277.81 340.38 203.38
pxm_Latn 299.37 391.48 317.99 307.01 180.85
qub_Latn 210.38 265.76 222.82 172.89 152.70
quc_Latn 248.16 320.50 271.51 258.06 187.13
que_Latn 144.31 170.69 154.62 96.53 121.19
qug_Latn 176.78 225.11 187.16 136.85 143.62
quh_Latn 257.89 293.32 275.35 187.44 175.55
quw_Latn 154.10 205.67 162.83 142.63 142.35
quy_Latn 177.21 202.67 190.48 125.92 139.15
quz_Latn 180.20 211.40 192.52 123.67 142.21
qvi_Latn 178.08 234.53 188.58 156.22 145.79
rap_Latn 204.53 354.21 219.29 226.89 158.90
rar_Latn 169.22 249.96 191.91 189.56 168.88
rmn_Cyrl 129.46 181.44 143.84 137.02 102.76
rmn_Grek 135.82 190.47 141.78 125.21 92.56
rmn_Latn 133.75 175.58 146.05 143.75 112.55
rmy_Cyrl 135.65 184.00 147.87 137.05 109.18
rmy_Latn 189.65 244.12 198.92 205.44 168.77
rng_Latn 122.59 150.36 125.06 129.81 104.16
roh_Latn 235.38 312.78 242.57 253.77 161.16
ron_Latn 44.70 84.55 68.14 74.76 54.82
rop_Latn 233.05 351.35 257.34 275.70 155.36
rue_Cyrl 223.89 402.99 299.90 265.32 179.38
rug_Latn 257.50 348.10 277.13 275.47 169.94
run_Latn 184.59 218.12 161.96 207.06 157.49
rus_Cyrl 65.34 155.39 116.17 67.59 84.56
sag_Latn 162.87 194.78 175.45 155.14 149.65
sah_Cyrl 383.55 455.30 382.36 423.03 218.84
san_Deva 182.35 287.49 189.83 201.00 186.46
san_Latn 242.46 324.45 278.75 282.93 199.18
Table 13: Detailed results of NLL on Glot500-c (Part VI).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MALA-500
sat_Olck 654.37 3377.97 667.66 40.17 311.96
sba_Latn 272.45 372.47 303.48 293.62 167.13
scn_Latn 236.20 355.24 263.10 270.02 191.69
sco_Latn 147.94 341.79 193.24 193.20 170.39
seh_Latn 173.46 231.41 177.38 174.40 138.70
sgs_Latn 248.33 313.78 251.35 277.73 182.16
sid_Latn 135.53 180.29 147.72 139.17 114.24
sin_Sinh 82.29 173.16 114.00 137.98 70.77
skg_Latn 128.02 172.67 131.34 145.32 116.16
slk_Latn 62.89 116.82 86.67 103.39 63.93
slv_Latn 42.18 85.28 64.75 73.49 55.26
sme_Latn 288.98 357.31 301.64 295.23 205.46
smo_Latn 220.26 338.16 250.74 252.76 190.00
sna_Latn 221.02 311.60 221.92 258.38 189.74
snd_Arab 209.83 264.61 217.96 260.53 163.07
som_Arab 230.91 410.59 192.88 175.01 265.88
som_Latn 235.21 346.36 286.69 312.99 212.51
sop_Latn 176.17 207.78 188.41 157.21 167.90
sot_Latn 200.82 271.71 205.51 235.65 157.18
spa_Latn 37.28 70.48 34.26 38.65 53.39
sqi_Latn 207.58 295.58 241.22 296.90 172.78
srd_Latn 228.12 341.00 242.74 251.01 179.87
srm_Latn 229.46 318.77 250.79 246.75 173.83
srn_Latn 161.18 183.34 171.30 179.59 132.77
srp_Cyrl 45.22 100.88 77.59 81.95 57.85
srp_Latn 33.66 57.89 43.91 46.74 42.31
ssw_Latn 194.10 264.22 212.99 230.20 165.70
sun_Latn 220.72 314.99 228.18 237.07 203.32
suz_Deva 255.00 400.13 262.34 257.16 157.30
swa_Latn 156.02 208.21 125.78 94.55 151.68
swc_Latn 103.75 133.69 98.32 71.66 102.14
swe_Latn 42.72 82.20 68.89 60.92 56.18
swh_Latn 178.28 223.65 151.05 97.98 161.49
sxn_Latn 243.81 346.98 263.76 260.44 183.47
szl_Latn 132.77 348.45 156.37 177.33 111.32
tah_Latn 114.41 158.18 124.60 121.22 101.40
tam_Taml 231.12 444.83 152.34 146.94 205.53
tat_Cyrl 251.96 301.03 256.66 276.29 159.44
tat_Latn 248.71 338.00 261.10 278.92 186.84
tbz_Latn 273.90 352.17 299.25 281.11 164.62
tca_Latn 306.13 452.15 328.77 316.81 174.51
tcf_Latn 133.72 193.63 138.67 133.58 102.94
tdt_Latn 158.16 217.96 172.56 182.04 130.27
tdx_Latn 125.88 167.70 130.54 135.72 113.29
tel_Telu 94.93 152.33 54.92 47.52 72.02
teo_Latn 193.42 250.17 206.10 193.68 159.90
tgk_Cyrl 313.76 369.08 333.83 342.42 196.57
tgk_Latn 296.86 412.46 342.18 352.59 248.69
tgl_Latn 56.44 94.00 76.30 77.15 64.98
tha_Thai 192.70 331.25 242.28 116.12 175.60
tih_Latn 233.30 329.24 255.15 254.70 158.13
tir_Ethi 267.84 579.39 319.29 424.77 189.73
tiv_Latn 133.38 168.19 140.43 126.42 116.08
tlh_Latn 163.23 258.64 183.94 184.72 111.43
tll_Latn 138.57 167.75 152.44 126.10 105.23
tob_Latn 299.95 450.25 316.77 324.19 182.95
tog_Latn 127.47 165.93 133.37 115.35 102.88
toh_Latn 181.85 238.80 196.33 194.76 146.50
toi_Latn 185.04 233.23 194.93 164.94 165.33
toj_Latn 232.66 311.24 239.53 236.11 198.17
tok_Latn 46.19 61.55 50.56 43.88 47.57
ton_Latn 172.88 243.40 178.94 190.23 141.17
top_Latn 221.27 303.90 232.90 223.12 212.88
tpi_Latn 139.90 209.92 155.89 170.67 120.65
tpm_Latn 214.33 280.83 241.97 231.70 154.99
tsc_Latn 131.42 150.62 130.29 132.42 104.44
tsn_Latn 209.69 291.69 203.77 245.18 169.85
tso_Latn 182.87 208.89 176.69 194.90 142.27
Table 14: Detailed results of NLL on Glot500-c (Part VII).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MALA-500
tsz_Latn 183.82 253.97 200.16 176.35 153.98
ttj_Latn 133.35 174.08 142.10 146.60 112.98
tuc_Latn 325.34 444.23 346.52 291.84 180.84
tui_Latn 247.40 330.20 266.54 265.71 181.39
tuk_Cyrl 196.40 248.01 210.45 219.39 143.19
tuk_Latn 217.31 235.95 217.78 238.66 155.71
tum_Latn 184.51 236.91 190.41 153.36 153.44
tur_Latn 48.52 66.76 60.61 34.71 63.33
tvl_Latn 114.81 156.00 123.30 121.62 97.96
twi_Latn 169.99 229.39 171.42 190.50 139.81
twx_Latn 123.52 172.96 130.82 135.08 106.56
tyv_Cyrl 270.89 314.09 275.97 304.11 174.60
tzh_Latn 195.49 274.47 208.05 202.10 162.63
tzo_Latn 223.14 324.35 237.54 228.92 173.78
udm_Cyrl 222.45 277.14 231.98 219.71 160.30
uig_Arab 336.01 432.84 320.43 463.38 207.25
uig_Latn 254.59 292.36 270.85 285.12 203.29
ukr_Cyrl 101.99 240.89 173.79 160.03 136.57
umb_Latn 129.60 165.59 135.06 139.75 100.07
urd_Arab 77.96 105.77 53.61 51.92 81.62
urh_Latn 145.52 153.19 164.21 161.54 108.55
uzb_Cyrl 307.70 353.00 332.86 314.77 178.07
uzb_Latn 307.44 363.61 357.04 383.26 220.74
uzn_Cyrl 233.89 270.06 254.96 247.92 145.01
vec_Latn 163.22 261.93 181.25 168.76 170.03
ven_Latn 190.45 233.75 198.94 198.18 151.65
vep_Latn 316.12 456.77 326.76 243.40 192.08
vie_Latn 108.65 169.92 86.74 91.41 138.89
vls_Latn 200.17 292.89 242.66 253.44 171.13
vmw_Latn 141.25 176.25 143.12 107.10 102.92
vol_Latn 94.00 260.01 85.47 87.18 83.77
wal_Latn 190.62 261.79 201.98 177.73 158.07
war_Latn 127.41 249.86 146.46 166.29 153.84
wbm_Latn 222.06 311.86 234.78 240.27 150.33
wes_Latn 64.78 106.54 73.37 73.73 86.61
wls_Latn 114.80 157.93 125.63 124.58 99.38
wol_Latn 197.17 251.63 171.70 208.78 173.01
wuu_Hani 152.90 283.11 127.83 152.82 145.05
xav_Latn 350.22 619.11 379.76 371.80 201.63
xho_Latn 224.10 315.12 219.57 265.57 187.35
xmf_Geor 260.61 315.58 316.49 376.33 170.15
xmv_Latn 125.37 168.73 129.48 139.97 111.94
yan_Latn 228.46 314.62 248.18 243.68 165.66
yao_Latn 196.25 253.72 209.77 198.91 166.06
yap_Latn 197.98 274.54 212.39 209.00 169.09
yid_Hebr 437.75 571.08 480.37 590.32 295.70
yom_Latn 176.11 220.86 184.62 189.29 150.95
yor_Latn 233.75 283.33 193.55 286.20 185.60
yua_Latn 195.86 284.05 208.08 205.70 161.16
yue_Hani 74.79 131.83 62.91 83.80 74.28
zai_Latn 170.49 223.03 179.18 188.38 148.03
zea_Latn 174.18 271.42 212.95 222.74 155.52
zho_Hani 57.89 99.40 48.19 55.24 70.80
zlm_Latn 106.37 176.09 92.63 93.81 118.56
zne_Latn 127.57 167.13 134.43 115.53 104.95
zom_Latn 214.60 277.57 233.64 228.48 170.06
zpa_Latn 127.29 180.39 129.07 132.30 107.04
zsm_Latn 102.42 171.64 92.39 94.59 123.31
zul_Latn 208.94 340.58 235.91 257.18 192.84
all 190.58 282.46 202.95 205.07 151.25
Table 15: Detailed results of NLL on Glot500-c (Part VIII).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MALA-500
ace_Latn 137.43 196.93 144.50 152.49 97.89
ach_Latn 113.66 152.08 123.29 125.31 102.45
acr_Latn 177.86 233.22 188.27 182.33 114.27
afr_Latn 80.43 132.25 116.34 129.21 95.33
agw_Latn 130.32 186.58 136.17 138.27 95.93
ahk_Latn 175.75 291.31 187.63 179.54 116.76
aka_Latn 98.41 135.74 99.46 108.01 78.20
aln_Latn 101.54 147.77 115.11 139.73 82.71
als_Latn 93.47 134.99 106.68 127.57 78.53
alt_Cyrl 122.23 146.47 125.04 134.88 90.21
alz_Latn 107.41 139.39 116.48 109.62 102.48
amh_Ethi 100.60 255.43 121.36 161.34 98.24
aoj_Latn 175.25 270.87 185.66 171.29 114.19
arb_Arab 94.03 186.47 77.67 78.12 104.22
arn_Latn 141.08 205.53 143.63 154.61 113.59
ary_Arab 128.97 212.49 125.35 118.25 104.58
arz_Arab 80.59 185.56 64.91 66.52 92.22
asm_Beng 123.16 196.03 79.80 147.49 101.89
ayr_Latn 149.96 188.09 154.45 157.13 106.66
azb_Arab 134.00 160.29 139.49 144.68 93.09
aze_Latn 97.68 131.80 113.03 106.38 90.96
bak_Cyrl 134.49 169.96 133.79 150.46 93.49
bam_Latn 109.68 147.72 110.24 118.13 91.88
ban_Latn 138.98 195.92 147.93 149.04 111.51
bar_Latn 114.49 154.37 121.74 113.26 108.65
bba_Latn 132.00 166.51 146.31 131.24 96.87
bbc_Latn 110.66 143.87 117.12 107.02 100.17
bci_Latn 117.42 156.47 125.70 124.26 126.80
bcl_Latn 101.39 146.46 109.03 116.78 88.46
bel_Cyrl 92.30 137.26 110.12 118.30 88.89
bem_Latn 125.52 158.97 135.60 104.16 107.57
ben_Beng 111.68 194.50 68.00 77.83 105.61
bhw_Latn 124.94 169.40 130.40 123.48 101.65
bim_Latn 124.64 162.78 132.77 130.01 96.33
bis_Latn 126.46 196.19 136.72 148.29 95.85
bod_Tibt 138.16 525.70 144.33 30.40 105.99
bqc_Latn 113.18 149.13 122.07 112.76 91.11
bre_Latn 120.49 151.33 111.97 139.13 105.99
bts_Latn 111.90 154.57 120.61 110.17 89.16
btx_Latn 118.13 163.25 128.43 125.76 103.19
bul_Cyrl 66.25 124.78 104.01 42.33 85.30
bum_Latn 116.16 153.66 121.83 121.74 101.82
bzj_Latn 115.75 175.63 128.70 135.59 93.15
cab_Latn 164.07 215.31 172.22 174.35 123.20
cac_Latn 169.42 231.73 176.29 175.63 116.03
cak_Latn 185.42 246.76 193.62 191.54 123.65
caq_Latn 128.13 174.12 141.21 138.17 95.54
cat_Latn 54.93 118.69 44.29 45.98 76.47
cbk_Latn 103.50 154.23 105.08 108.15 91.19
cce_Latn 124.20 159.68 133.40 132.89 106.02
ceb_Latn 99.37 146.70 113.69 132.72 94.43
ces_Latn 62.40 133.26 101.82 114.59 86.91
cfm_Latn 138.07 179.26 142.58 143.43 107.20
che_Cyrl 152.68 188.52 146.76 148.42 126.87
chk_Latn 128.34 180.14 133.81 134.01 97.76
chv_Cyrl 132.89 166.12 138.37 128.58 91.96
ckb_Arab 126.47 155.90 125.59 164.22 100.65
cmn_Hani 63.67 121.22 51.49 60.91 76.95
cnh_Latn 129.26 175.83 134.65 139.53 104.21
crh_Cyrl 128.56 166.14 128.91 139.13 82.61
crs_Latn 100.72 139.95 101.88 57.70 80.86
csy_Latn 125.81 172.44 138.22 132.16 100.90
ctd_Latn 120.85 163.07 128.99 125.52 92.79
ctu_Latn 156.04 220.78 162.45 157.63 112.41
cuk_Latn 151.95 213.08 159.59 156.10 119.01
cym_Latn 110.34 165.10 135.91 147.72 103.89
Table 16: Detailed results of NLL on PBC (Part I).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MALA-500
dan_Latn 63.65 114.43 97.95 101.13 86.68
deu_Latn 57.09 109.69 84.08 54.00 80.90
djk_Latn 143.19 192.80 147.13 153.74 120.66
dln_Latn 113.43 155.19 118.82 125.73 92.37
dtp_Latn 158.44 222.01 165.46 169.63 111.77
dyu_Latn 122.24 161.61 126.53 132.73 103.04
dzo_Tibt 157.37 550.44 162.42 36.35 99.22
efi_Latn 121.73 173.61 139.58 136.78 90.15
ell_Grek 80.65 169.16 109.07 57.11 105.74
eng_Latn 28.40 93.81 40.01 42.56 46.91
enm_Latn 45.43 113.74 62.99 66.87 55.22
epo_Latn 79.83 125.27 88.81 100.79 85.24
est_Latn 93.49 128.66 109.45 45.04 99.10
eus_Latn 133.89 145.19 101.06 78.92 150.43
ewe_Latn 140.69 190.49 147.85 133.36 103.15
fao_Latn 101.92 150.02 113.16 134.84 93.21
fas_Arab 87.19 121.18 99.32 104.15 77.85
fij_Latn 110.29 158.90 130.18 97.89 97.65
fil_Latn 74.66 130.09 106.51 109.03 84.32
fin_Latn 68.42 125.52 116.35 38.52 91.75
fon_Latn 160.80 210.76 176.23 178.40 107.16
fra_Latn 46.01 105.73 38.57 44.16 73.33
fry_Latn 111.69 146.88 111.45 123.28 100.32
gaa_Latn 128.54 165.53 145.88 107.90 100.68
gil_Latn 125.22 171.28 131.71 130.68 106.24
giz_Latn 131.75 183.07 145.21 143.35 97.84
gkn_Latn 151.99 210.57 167.40 166.78 116.75
gkp_Latn 159.33 219.00 168.31 166.05 110.30
gla_Latn 102.90 174.06 129.10 138.42 100.51
gle_Latn 102.09 161.80 132.57 146.14 116.86
glv_Latn 122.94 172.06 126.34 134.37 98.35
gom_Latn 149.35 199.59 155.54 159.44 129.81
gor_Latn 156.67 215.11 170.13 167.89 115.02
grc_Grek 64.91 153.70 93.39 68.67 81.49
guc_Latn 193.75 271.60 202.31 190.66 138.75
gug_Latn 139.06 183.84 146.45 151.28 114.14
guj_Gujr 121.18 329.05 86.23 202.19 107.88
gur_Latn 143.42 208.51 152.80 148.18 106.41
guw_Latn 142.60 155.00 158.22 166.16 98.92
gya_Latn 130.25 197.61 146.23 137.31 99.85
gym_Latn 180.93 262.58 196.74 161.03 135.73
hat_Latn 112.20 159.68 116.00 48.45 90.71
hau_Latn 105.95 146.45 117.21 127.18 96.63
haw_Latn 91.42 140.04 102.87 102.50 91.03
heb_Hebr 86.85 197.96 113.81 125.21 143.56
hif_Latn 104.78 161.10 114.69 116.63 107.93
hil_Latn 103.93 151.84 112.82 130.28 90.13
hin_Deva 87.35 175.19 62.49 63.21 103.09
hin_Latn 102.01 144.04 112.84 112.96 109.68
hmo_Latn 119.64 179.32 128.46 103.09 91.86
hne_Deva 124.72 183.69 106.59 120.10 94.27
hnj_Latn 126.88 186.09 144.08 149.87 89.64
hra_Latn 116.66 151.27 122.49 122.14 96.72
hrv_Latn 62.52 125.68 96.82 107.18 73.96
hui_Latn 151.46 203.46 161.05 161.36 108.54
hun_Latn 69.17 118.92 117.60 125.55 94.04
hus_Latn 170.91 241.76 179.70 177.42 120.81
hye_Armn 111.94 219.94 141.97 171.24 89.75
iba_Latn 102.40 135.32 109.00 102.90 87.43
ibo_Latn 131.16 189.15 130.12 172.79 112.01
ifa_Latn 140.53 194.86 151.53 148.37 102.38
ifb_Latn 149.93 198.42 157.12 156.49 107.60
ikk_Latn 132.84 186.95 150.31 163.16 95.14
ilo_Latn 119.72 162.55 127.85 146.58 102.18
ind_Latn 66.39 121.78 58.14 58.36 80.77
isl_Latn 92.39 137.42 113.54 123.83 94.12
ita_Latn 54.57 116.50 73.53 52.57 78.23
Table 17: Detailed results of NLL on PBC (Part II).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MaLA-500
ium_Latn 150.62 222.39 155.20 157.52 99.54
ixl_Latn 190.07 299.20 206.08 202.92 127.52
izz_Latn 167.28 228.45 195.19 198.57 118.78
jam_Latn 119.85 181.93 134.52 139.07 96.42
jav_Latn 134.11 171.16 136.06 140.01 109.34
jpn_Jpan 67.67 114.11 84.64 61.57 88.53
kaa_Cyrl 136.14 179.48 138.63 153.33 84.79
kaa_Latn 134.02 172.76 135.14 145.18 99.15
kab_Latn 137.81 193.54 129.87 159.45 117.96
kac_Latn 141.33 187.59 150.24 163.68 110.99
kal_Latn 120.90 143.71 134.44 90.38 109.58
kan_Knda 128.60 336.06 93.77 210.99 110.09
kat_Geor 103.81 132.04 144.43 155.32 93.39
kaz_Cyrl 129.49 166.60 137.43 150.12 108.56
kbp_Latn 151.83 205.24 166.76 156.21 105.09
kek_Latn 161.79 230.77 168.62 155.46 110.43
khm_Khmr 141.48 453.97 161.21 233.38 100.53
kia_Latn 122.81 171.17 136.22 131.73 95.76
kik_Latn 141.34 189.92 143.91 155.69 106.53
kin_Latn 110.75 137.92 101.14 123.88 99.96
kir_Cyrl 125.74 148.16 127.29 148.79 94.02
kjb_Latn 152.31 205.47 156.49 160.88 109.02
kjh_Cyrl 133.84 168.82 142.31 145.53 97.43
kmm_Latn 137.88 185.46 149.16 145.91 107.81
kmr_Cyrl 139.23 182.66 137.99 142.19 103.56
kmr_Latn 120.54 149.31 124.74 136.78 96.93
knv_Latn 249.77 346.55 266.68 245.87 135.66
kor_Hang 66.58 119.14 92.53 42.28 82.45
kpg_Latn 128.18 190.92 139.68 135.05 90.65
krc_Cyrl 123.42 149.60 119.82 130.97 89.22
kri_Latn 118.15 172.62 134.67 131.33 96.69
ksd_Latn 108.75 155.22 117.82 116.91 84.44
kss_Latn 248.46 385.70 269.58 224.81 174.70
ksw_Mymr 145.34 187.44 155.38 107.97 94.71
kua_Latn 118.00 142.31 125.97 104.16 99.83
lam_Latn 145.51 199.78 154.91 139.07 115.48
lao_Laoo 163.17 414.25 172.28 234.46 116.39
lat_Latn 56.98 102.85 65.60 73.77 73.15
lav_Latn 90.61 119.37 103.75 114.03 94.92
ldi_Latn 118.61 161.07 122.03 124.26 112.27
leh_Latn 131.72 169.51 140.67 124.57 104.47
lhu_Latn 147.32 262.73 152.94 153.96 100.83
lin_Latn 113.81 128.30 110.20 123.56 92.52
lit_Latn 92.16 120.15 107.63 123.52 97.69
loz_Latn 119.93 140.69 125.00 98.71 99.46
ltz_Latn 114.62 156.92 114.49 104.79 96.89
lug_Latn 117.59 174.83 115.12 143.99 107.58
luo_Latn 118.37 158.54 129.88 126.61 108.96
lus_Latn 122.17 159.02 125.37 133.65 103.21
lzh_Hani 62.06 88.07 54.92 60.19 66.36
mad_Latn 136.26 192.90 146.43 145.94 103.63
mah_Latn 113.96 159.42 120.91 110.27 97.45
mai_Deva 136.92 209.91 108.91 126.23 100.39
mal_Mlym 111.12 210.81 72.27 126.62 105.00
mam_Latn 173.35 227.62 181.33 179.63 138.57
mar_Deva 105.80 184.52 83.30 141.12 106.37
mau_Latn 139.06 259.48 153.06 140.49 148.96
mbb_Latn 160.77 237.84 174.36 171.35 101.96
mck_Latn 124.95 161.95 131.37 123.87 99.72
mcn_Latn 110.95 153.55 120.44 123.48 96.39
mco_Latn 203.59 285.23 205.92 192.68 159.16
mdy_Ethi 164.72 284.41 157.66 188.38 92.89
meu_Latn 111.26 152.92 120.09 103.47 91.50
mfe_Latn 99.68 136.00 98.60 55.39 80.86
mgh_Latn 131.75 181.11 140.22 136.00 118.72
mgr_Latn 126.60 154.99 129.40 106.55 108.42
Table 18: Detailed results of NLL on PBC (Part III).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MaLA-500
mhr_Cyrl 122.42 160.48 119.36 127.42 100.09
min_Latn 139.41 194.79 136.22 138.87 133.30
miq_Latn 129.28 182.98 144.36 141.12 104.92
mkd_Cyrl 85.29 151.22 112.67 89.21 89.46
mlg_Latn 107.66 135.73 106.88 128.75 86.60
mlt_Latn 108.58 168.92 134.24 137.26 107.12
mos_Latn 129.97 161.07 135.30 138.61 112.98
mps_Latn 196.29 283.21 212.92 204.15 126.56
mri_Latn 87.56 138.21 103.82 111.33 88.68
mrw_Latn 127.39 174.88 134.59 133.06 99.21
msa_Latn 104.71 152.69 97.60 93.04 113.32
mwm_Latn 159.30 238.34 171.46 159.27 99.80
mxv_Latn 146.98 235.76 162.84 164.53 126.52
mya_Mymr 162.62 248.51 185.69 84.78 107.92
myv_Cyrl 148.95 192.16 140.65 152.76 110.76
mzh_Latn 146.28 217.81 160.03 153.09 101.97
nan_Latn 130.85 204.44 144.08 138.29 118.30
naq_Latn 126.47 179.33 139.25 135.80 100.90
nav_Latn 167.01 233.91 176.25 183.89 119.97
nbl_Latn 109.14 148.07 114.06 127.75 96.55
nch_Latn 155.09 212.74 165.40 171.74 144.74
ncj_Latn 131.14 184.55 137.38 140.86 129.63
ndc_Latn 106.50 151.16 111.70 117.94 107.15
nde_Latn 106.83 152.97 114.79 133.97 100.83
ndo_Latn 132.12 162.83 138.17 107.11 105.82
nds_Latn 123.29 166.55 124.21 125.62 123.87
nep_Deva 109.47 199.05 81.70 141.10 103.11
ngu_Latn 148.78 204.17 156.73 156.92 120.15
nia_Latn 135.60 192.37 143.95 130.11 111.86
nld_Latn 58.81 114.31 96.47 97.28 82.78
nmf_Latn 122.39 165.17 130.43 134.30 98.07
nnb_Latn 122.26 163.28 127.40 131.83 98.36
nno_Latn 80.33 133.43 102.92 112.53 86.17
nob_Latn 61.45 126.38 100.25 98.89 80.02
nor_Latn 56.27 104.11 87.94 86.18 71.86
npi_Deva 115.63 219.43 78.62 159.24 96.97
nse_Latn 116.86 157.47 124.34 116.64 109.55
nso_Latn 116.55 160.63 114.34 132.40 97.49
nya_Latn 112.30 160.76 116.85 124.27 101.20
nyn_Latn 120.67 159.71 127.46 131.05 106.34
nyy_Latn 153.10 189.04 164.69 160.66 121.06
nzi_Latn 130.01 179.62 150.60 141.28 101.29
ori_Orya 148.25 392.96 91.33 296.43 98.06
ory_Orya 143.02 352.28 95.95 282.70 106.99
oss_Cyrl 140.22 182.75 141.83 139.80 97.09
ote_Latn 160.20 247.13 175.42 168.28 119.40
pag_Latn 123.49 163.26 131.05 133.56 109.90
pam_Latn 117.54 163.02 121.69 130.95 103.78
pan_Guru 130.07 286.36 90.80 208.67 106.44
pap_Latn 110.09 149.82 118.80 114.73 92.08
pau_Latn 125.22 178.67 132.62 131.85 104.80
pcm_Latn 76.80 127.05 89.92 91.28 79.44
pdt_Latn 124.83 175.03 129.63 126.61 97.29
pes_Arab 91.68 129.42 105.39 105.84 84.63
pis_Latn 118.50 180.76 130.26 133.14 95.70
pls_Latn 147.97 217.00 152.42 153.90 104.29
plt_Latn 113.52 139.96 112.22 139.93 89.18
poh_Latn 240.95 363.61 257.65 256.21 140.88
pol_Latn 61.88 111.24 97.90 107.87 85.46
pon_Latn 123.40 164.68 131.48 125.12 105.87
por_Latn 53.69 106.83 42.29 45.85 75.88
prk_Latn 118.66 167.68 121.37 128.55 94.48
prs_Arab 88.26 123.81 99.63 105.09 80.28
pxm_Latn 154.30 207.27 160.81 160.27 102.81
qub_Latn 133.85 172.77 139.98 107.69 93.49
quc_Latn 176.66 222.18 191.60 178.35 124.30
Table 19: Detailed results of NLL on PBC (Part IV).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MaLA-500
qug_Latn 124.11 158.66 131.62 95.23 95.07
quh_Latn 148.83 174.81 154.34 107.04 106.24
quw_Latn 104.78 139.63 109.91 95.69 92.58
quy_Latn 119.84 140.49 127.93 85.16 94.14
quz_Latn 126.18 149.08 134.60 85.68 96.32
qvi_Latn 134.03 177.51 139.73 114.64 100.81
rap_Latn 139.27 239.23 152.39 157.18 100.81
rar_Latn 136.30 205.48 152.87 149.88 123.36
rmy_Latn 124.05 164.59 129.35 132.97 108.84
ron_Latn 71.75 145.55 113.22 136.42 92.16
rop_Latn 141.24 218.11 152.37 163.59 93.46
rug_Latn 144.21 200.64 155.21 151.73 99.72
run_Latn 111.11 140.11 101.95 120.78 99.61
rus_Cyrl 57.09 115.06 85.44 48.93 78.66
sag_Latn 118.57 144.91 123.32 113.66 101.74
sah_Cyrl 140.83 175.34 139.86 155.36 99.78
san_Deva 120.11 183.77 128.71 131.38 123.45
san_Latn 133.78 188.35 151.17 152.84 112.82
sba_Latn 147.44 205.90 167.36 154.66 98.05
seh_Latn 116.71 159.73 123.08 121.65 100.51
sin_Sinh 133.79 283.43 166.13 228.72 113.72
slk_Latn 75.89 141.01 105.13 123.74 89.45
slv_Latn 75.67 140.40 111.88 127.31 95.15
sme_Latn 134.17 166.51 131.28 132.85 103.75
smo_Latn 113.64 165.04 126.68 127.57 96.65
sna_Latn 107.03 157.69 112.48 124.30 99.14
snd_Arab 141.08 183.48 144.96 173.65 107.47
som_Latn 114.80 163.60 131.06 149.83 110.34
sop_Latn 120.92 148.69 129.81 113.62 117.37
sot_Latn 112.14 155.55 113.59 127.04 95.35
spa_Latn 49.64 107.41 43.22 48.95 69.30
sqi_Latn 106.17 145.44 116.56 140.41 92.13
srm_Latn 172.30 242.13 187.65 185.42 124.54
srn_Latn 112.06 137.43 113.65 121.74 91.24
srp_Cyrl 57.16 129.53 97.17 99.06 71.36
srp_Latn 61.53 124.70 95.02 105.00 71.54
ssw_Latn 120.48 172.73 132.69 140.25 104.17
sun_Latn 123.92 165.15 124.81 129.90 111.93
suz_Deva 141.06 222.01 143.66 139.17 93.18
swe_Latn 60.78 124.59 105.53 99.90 86.99
swh_Latn 97.92 131.52 87.87 54.27 90.87
sxn_Latn 173.04 249.27 188.33 183.25 124.96
tam_Taml 109.64 213.05 70.91 64.81 100.45
tat_Cyrl 136.52 167.63 136.15 147.39 94.42
tbz_Latn 135.12 176.55 145.64 137.50 88.42
tca_Latn 202.44 294.68 215.39 207.66 112.33
tdt_Latn 114.70 164.66 123.50 129.26 93.86
tel_Telu 122.18 196.41 91.03 65.40 115.17
teo_Latn 115.64 157.71 122.24 117.99 99.95
tgk_Cyrl 128.86 144.47 127.10 140.25 101.04
tgl_Latn 74.71 130.27 109.14 110.56 85.29
tha_Thai 107.69 187.16 134.02 58.96 101.09
tih_Latn 129.95 188.97 139.91 137.24 89.82
tir_Ethi 122.75 258.00 143.76 190.93 99.03
tlh_Latn 87.59 142.90 97.02 97.38 62.55
tob_Latn 179.90 269.36 189.99 191.60 107.12
toh_Latn 127.60 171.62 136.60 136.06 104.79
toi_Latn 124.75 166.04 133.32 114.25 114.54
toj_Latn 175.52 237.75 181.56 177.72 148.72
ton_Latn 120.61 179.89 125.92 137.67 98.37
top_Latn 165.19 223.46 174.82 173.24 164.19
tpi_Latn 105.47 161.38 117.02 128.35 84.22
tpm_Latn 120.52 166.93 131.95 129.07 89.91
tsn_Latn 112.13 163.36 113.76 129.32 96.63
tso_Latn 125.25 155.16 120.37 134.66 103.51
tsz_Latn 129.96 184.88 142.29 126.39 110.66
tuc_Latn 187.91 261.93 196.20 166.43 106.46
Table 20: Detailed results of NLL on PBC (Part V).
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MaLA-500
tui_Latn 135.41 187.71 146.21 146.43 107.57
tuk_Cyrl 127.20 168.02 136.69 145.86 94.72
tuk_Latn 123.72 144.22 124.67 135.21 97.42
tum_Latn 127.49 165.45 130.05 109.19 102.35
tur_Latn 76.97 118.11 102.96 57.95 99.22
twi_Latn 110.21 159.12 110.54 122.43 93.81
tyv_Cyrl 165.82 197.37 164.25 181.87 107.33
tzh_Latn 147.06 205.37 157.00 148.16 118.46
tzo_Latn 166.45 248.81 178.03 173.52 122.42
udm_Cyrl 138.00 176.90 140.39 137.21 102.56
uig_Arab 166.57 226.61 157.03 229.43 114.04
uig_Latn 145.11 165.68 156.02 157.78 121.77
ukr_Cyrl 68.45 134.95 101.40 93.65 92.95
urd_Arab 99.74 141.13 74.20 63.53 110.49
uzb_Cyrl 128.48 149.94 136.85 135.60 88.90
uzb_Latn 118.83 138.99 132.96 145.00 95.19
uzn_Cyrl 136.04 160.00 145.95 142.34 94.12
ven_Latn 131.18 172.05 138.68 137.88 104.82
vie_Latn 74.42 115.85 56.37 59.30 91.10
wal_Latn 129.99 180.12 134.43 122.12 105.68
war_Latn 111.26 159.23 118.85 131.06 113.74
wbm_Latn 120.96 174.48 126.11 128.99 94.78
wol_Latn 115.67 154.99 101.97 127.09 102.93
xav_Latn 243.35 430.83 263.49 257.10 137.76
xho_Latn 112.96 155.63 109.11 135.39 107.43
yan_Latn 125.81 179.37 136.34 131.31 98.47
yao_Latn 143.68 187.96 148.92 143.16 114.54
yap_Latn 150.28 207.86 157.67 157.91 123.07
yom_Latn 118.61 155.58 121.64 126.20 100.18
yor_Latn 129.77 166.68 100.87 155.88 105.79
yua_Latn 148.34 218.12 155.65 156.40 118.30
yue_Hani 64.57 122.47 54.42 62.71 87.78
zai_Latn 121.90 161.31 121.61 129.12 108.99
zho_Hani 64.02 115.19 51.79 63.22 69.53
zlm_Latn 57.83 101.11 48.96 51.87 64.76
zom_Latn 119.86 159.31 128.06 125.99 98.96
zsm_Latn 60.40 110.60 51.75 52.51 70.43
zul_Latn 103.20 157.55 113.26 130.35 98.06
all 122.10 180.54 129.55 131.31 101.67
Table 21: Detailed results of NLL on PBC (Part VI).
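As a reading aid for the NLL tables, the following minimal sketch parses rows in the whitespace-separated layout used above, computes an unweighted mean per model, and picks the model with the lowest NLL for each language. It assumes that the `all` row is an unweighted (macro) average over languages, which the tables do not state explicitly. The three sample rows are copied from Part VI, so the printed means cover only that subset and are not expected to match the `all` row. The function names are illustrative and do not come from the authors' code.

```python
# Column order follows the table header.
MODELS = ["LLaMA 2-7B", "mGPT-13B", "BLOOM-7B1", "XGLM-7.5B", "MaLA-500"]

# Three rows copied from Table 21 (Part VI); a subset, not the full table.
ROWS = """\
zom_Latn 119.86 159.31 128.06 125.99 98.96
zsm_Latn 60.40 110.60 51.75 52.51 70.43
zul_Latn 103.20 157.55 113.26 130.35 98.06
"""


def parse_rows(text):
    """Map each language code to a {model: NLL} dictionary."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip captions, headers, and anything else that is not a data row.
        if len(parts) != len(MODELS) + 1:
            continue
        try:
            values = [float(v) for v in parts[1:]]
        except ValueError:
            continue
        table[parts[0]] = dict(zip(MODELS, values))
    return table


def macro_average(table):
    """Unweighted mean NLL per model over the parsed languages (assumed definition)."""
    return {m: sum(row[m] for row in table.values()) / len(table) for m in MODELS}


def best_model(table):
    """Model with the lowest NLL for each language (lower is better)."""
    return {lang: min(row, key=row.get) for lang, row in table.items()}


table = parse_rows(ROWS)
averages = macro_average(table)
winners = best_model(table)

for model in MODELS:
    print(f"{model}: {averages[model]:.2f}")
print(winners)
```

Running the same functions over all six parts of the table would give one way to check the `all` row under the stated assumption.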
Lang LLaMA 2 7B mGPT 13B BLOOM 7B1 XGLM 7.5B MaLA-500: 1-shot 2-shot 3-shot 4-shot 5-shot 6-shot 7-shot 8-shot 9-shot 10-shot
ace_Latn 44.12 47.55 50.00 36.76 34.31 52.94 60.29 60.78 65.69 67.65 64.22 65.20 68.63 71.57
acm_Arab 52.45 65.69 69.12 58.33 32.35 53.43 59.31 63.73 63.73 67.16 66.67 69.12 66.67 66.67
afr_Latn 68.14 55.39 53.92 40.20 41.18 62.25 65.69 69.12 71.08 74.02 73.53 74.51 76.96 78.92
ajp_Arab 47.55 64.22 68.63 53.43 33.33 56.86 59.80 59.80 65.20 63.73 63.24 69.12 68.14 66.67
als_Latn 41.67 46.57 45.59 28.43 27.94 51.96 63.73 62.25 69.12 71.08 69.61 72.06 75.98 77.45
amh_Ethi 15.69 18.63 16.67 13.24 25.00 36.76 45.59 51.47 51.96 53.92 51.96 53.43 54.90 53.92
apc_Arab 46.57 65.69 68.14 53.43 31.37 55.88 60.29 65.69 65.69 67.16 65.20 68.63 67.65 72.06
arb_Arab 53.43 63.24 68.14 57.35 32.35 54.90 60.29 63.73 65.20 68.14 67.16 69.12 68.63 70.59
ary_Arab 45.10 57.84 69.12 50.49 26.47 52.45 55.39 56.37 60.29 59.80 64.22 63.73 59.31 64.71
arz_Arab 50.98 64.22 68.14 56.86 30.88 52.45 59.31 60.78 64.22 66.18 68.14 69.12 66.67 69.12
asm_Beng 17.16 49.02 61.27 37.25 31.37 53.43 58.82 65.20 67.65 67.65 68.14 67.65 67.65 67.65
ast_Latn 69.12 60.78 69.12 55.39 34.31 65.69 70.10 70.59 74.02 75.00 75.98 77.94 79.90 79.90
ayr_Latn 25.00 26.96 32.35 19.61 20.10 29.41 38.24 38.73 38.24 43.14 40.20 43.14 44.61 42.16
azb_Arab 25.49 41.67 32.84 24.51 25.98 41.18 45.59 45.10 46.57 49.51 50.00 49.02 50.49 49.51
azj_Latn 34.80 64.22 37.25 32.84 30.88 57.84 64.71 68.63 64.22 72.55 69.61 70.59 74.02 72.55
bak_Cyrl 38.73 61.27 32.35 32.35 34.80 51.47 60.29 61.27 69.12 68.63 68.14 68.63 73.53 70.10
bam_Latn 25.49 24.51 29.41 20.10 22.55 25.98 34.80 42.16 43.14 44.12 45.10 42.16 46.08 44.12
ban_Latn 58.82 51.47 58.82 43.14 28.92 55.39 63.24 65.69 66.67 72.06 72.06 72.55 72.06 71.57
bel_Cyrl 47.55 59.80 28.92 30.39 40.69 60.29 63.24 66.18 67.65 70.10 72.55 72.06 72.55 73.04
bem_Latn 31.37 28.92 38.24 25.49 21.08 34.80 43.14 48.04 50.49 50.49 53.43 53.43 52.45 53.92
ben_Beng 25.49 61.27 64.22 52.45 31.37 54.90 63.24 62.25 67.65 70.10 70.10 69.12 66.18 68.63
bjn_Latn 48.53 51.96 61.76 42.65 32.35 62.75 66.18 68.14 71.57 75.98 73.04 72.55 75.00 77.45
bod_Tibt 15.20 12.75 15.69 15.69 22.06 34.80 37.75 37.75 38.73 39.71 39.71 39.71 41.67 44.12
bos_Latn 65.20 64.71 45.59 33.82 37.75 65.20 72.06 70.10 71.57 75.98 75.00 76.47 76.96 77.45
bul_Cyrl 66.18 63.24 38.73 52.94 45.10 62.25 67.65 67.16 68.14 75.00 71.57 72.06 73.53 75.00
cat_Latn 71.08 60.78 66.67 60.29 34.31 59.31 68.14 68.63 71.57 72.55 69.12 73.04 76.96 76.47
ceb_Latn 50.49 50.98 49.02 39.71 39.22 60.29 66.67 66.67 68.63 73.04 72.06 71.57 74.51 74.02
ces_Latn 69.12 62.75 47.55 40.69 39.22 62.25 69.61 70.59 72.55 76.47 72.55 74.02 80.88 76.96
cjk_Latn 27.94 30.39 34.31 26.47 22.55 30.88 31.86 32.84 38.24 38.24 38.24 35.78 39.22 42.65
ckb_Arab 19.61 22.55 23.04 12.25 28.92 53.43 60.29 57.35 65.20 65.20 62.75 65.69 65.69 70.59
cmn_Hani 73.04 65.20 67.65 54.90 39.71 69.12 74.02 72.06 76.47 77.45 76.47 75.98 79.41 76.47
crh_Latn 38.24 56.37 40.20 36.76 29.41 51.96 62.25 61.76 64.22 69.12 64.22 70.10 69.12 71.08
cym_Latn 39.22 28.43 34.80 21.57 28.43 55.39 62.75 63.73 66.18 72.06 74.02 72.06 75.00 77.45
dan_Latn 69.12 64.22 55.39 44.12 38.24 54.41 63.24 65.20 70.10 71.08 72.06 71.57 73.53 74.51
deu_Latn 74.02 60.29 61.27 55.39 41.18 63.73 68.63 71.57 69.12 75.00 75.49 76.47 77.45 77.45
dyu_Latn 28.43 28.92 32.35 20.10 21.08 29.90 38.73 39.71 46.57 44.12 41.67 47.06 44.61 43.63
dzo_Tibt 14.71 10.29 13.73 12.75 21.57 30.39 36.76 37.25 39.71 36.76 39.22 37.75 43.14 39.22
ell_Grek 47.55 63.73 28.43 60.29 43.63 62.75 69.61 66.67 69.12 69.61 70.59 73.04 72.06 72.06
eng_Latn 71.57 59.80 71.08 67.65 48.04 63.24 70.59 69.12 69.12 74.02 73.04 74.51 76.96 75.98
epo_Latn 52.94 50.49 52.94 43.63 27.94 49.51 66.18 66.67 68.63 72.55 73.53 71.57 74.51 75.98
est_Latn 48.04 54.90 41.18 57.35 29.41 55.88 62.75 66.67 70.10 72.06 70.10 69.12 71.57 73.04
eus_Latn 36.27 59.80 64.22 55.88 27.94 52.45 61.76 66.18 66.18 73.53 73.53 72.55 73.53 75.49
ewe_Latn 23.53 23.53 29.90 17.16 23.53 28.43 38.24 35.29 41.18 43.63 38.73 44.61 39.71 43.63
fao_Latn 41.18 43.63 38.73 29.90 35.29 52.94 56.86 57.84 62.25 61.27 62.25 61.76 64.22 69.12
fij_Latn 27.45 27.45 36.27 24.02 24.51 37.75 48.53 47.55 53.92 48.53 50.00 51.47 51.47 53.92
fin_Latn 67.65 63.24 38.73 56.86 36.76 61.76 70.10 70.10 72.55 74.51 72.06 73.53 75.00 75.49
fon_Latn 25.49 22.06 31.37 19.12 23.53 29.90 30.88 37.25 35.78 38.24 39.71 38.24 37.25 46.57
fra_Latn 72.06 64.71 66.18 59.80 36.76 58.82 71.08 67.65 71.08 74.51 71.57 74.02 77.45 77.45
ful_Latn 27.45 31.37 32.84 24.02 21.57 31.86 36.76 40.69 43.14 45.10 44.12 47.06 47.06 46.08
fur_Latn 58.82 50.00 55.88 38.73 35.78 53.92 55.88 62.25 66.67 68.63 68.14 67.16 75.00 72.55
gla_Latn 37.75 24.51 27.45 17.16 24.51 50.98 55.88 57.84 57.35 59.80 63.24 61.27 62.75 65.20
gle_Latn 39.71 25.49 25.00 17.16 27.94 53.43 57.84 63.24 64.22 67.65 64.71 63.24 62.75 73.53
glg_Latn 68.63 62.25 64.22 55.39 29.41 66.67 70.10 71.08 74.02 76.96 73.53 76.47 77.45 80.39
grn_Latn 42.16 47.06 48.53 33.82 25.98 52.45 62.75 64.71 62.25 67.16 64.22 65.69 61.76 69.61
guj_Gujr 15.20 9.31 62.25 12.75 28.43 50.98 54.90 60.29 63.73 63.73 64.22 65.69 62.75 67.16
hat_Latn 41.67 38.73 42.16 45.10 34.80 57.35 63.73 62.25 65.20 72.06 69.61 70.10 73.53 73.04
hau_Latn 25.49 28.43 29.90 20.59 28.43 47.06 55.39 57.84 61.27 65.69 62.75 65.69 66.18 65.20
heb_Hebr 37.75 63.24 20.59 11.76 26.47 41.67 47.06 51.47 54.90 51.96 50.98 54.90 53.92 54.41
hin_Deva 44.61 62.75 62.75 51.96 33.82 55.39 60.78 66.67 65.20 69.61 70.59 74.02 73.04 72.06
hne_Deva 37.75 58.82 59.80 49.02 28.43 55.39 55.88 65.20 62.75 68.63 66.18 65.69 68.14 71.08
hrv_Latn 66.18 65.20 44.12 36.27 40.69 63.73 73.04 71.08 73.53 76.47 72.06 74.51 78.43 78.43
hun_Latn 71.08 63.24 41.67 27.94 30.88 60.78 67.16 70.59 68.63 75.49 73.53 74.02 73.04 76.47
hye_Armn 20.59 17.16 13.73 12.75 32.84 58.82 59.80 67.16 65.20 69.61 68.14 69.12 69.12 72.55
ibo_Latn 24.02 26.47 38.24 19.12 30.39 51.47 57.35 63.73 67.16 69.12 68.14 69.12 68.63 72.06
ilo_Latn 45.10 45.59 48.04 32.35 27.45 54.90 61.76 61.76 68.14 68.14 70.10 73.04 73.53 70.10
ind_Latn 74.02 62.75 70.10 54.90 40.69 62.75 68.63 71.57 70.59 75.49 75.49 76.96 80.39 77.94
isl_Latn 35.29 36.76 28.92 24.51 38.73 55.88 60.78 58.33 60.29 64.71 63.73 63.24 62.75 65.20
ita_Latn 69.61 62.25 62.75 57.84 40.20 64.22 70.59 70.59 74.51 77.94 75.98 76.96 80.39 76.96
jav_Latn 50.49 52.94 55.39 38.24 31.86 53.43 60.78 64.22 65.20 72.55 69.12 73.04 68.63 73.04
jpn_Jpan 73.53 60.29 63.24 55.88 38.73 67.16 72.06 75.49 78.92 79.41 80.39 78.92 81.86 81.37
kab_Latn 16.18 16.67 20.10 12.25 20.59 24.02 22.55 30.39 31.86 34.80 28.43 33.33 32.35 34.31
kac_Latn 25.98 24.51 28.43 20.59 20.10 24.02 29.90 35.78 35.78 43.14 37.75 37.25 43.63 39.71
kam_Latn 26.96 34.31 34.80 26.47 22.06 36.76 38.73 37.75 40.69 41.67 46.57 41.67 42.16 42.16
kan_Knda 17.16 11.27 61.27 11.27 25.49 50.98 57.35 60.29 61.27 65.20 63.24 64.22 65.69 67.16
kat_Geor 29.41 61.27 18.14 14.71 32.84 56.86 60.78 62.25 65.20 67.65 70.59 70.59 70.10 74.51
kaz_Cyrl 37.75 62.75 29.90 28.43 34.31 53.43 57.35 62.25 65.69 67.16 65.20 65.69 69.12 67.65
kbp_Latn 24.51 22.06 30.39 16.18 21.08 28.43 36.76 38.73 40.69 39.22 40.69 39.71 38.73 40.20
kea_Latn 53.43 51.96 56.86 39.71 32.84 56.86 63.73 65.20 67.16 69.61 69.12 71.57 71.57 72.06
khm_Khmr 27.45 11.27 25.49 15.20 39.22 61.76 67.16 67.65 68.63 72.06 73.53 75.00 76.47 76.47
kik_Latn 29.41 32.84 38.73 26.96 21.57 37.75 49.51 50.49 50.49 56.86 56.37 56.37 52.94 56.86
kin_Latn 26.47 32.35 50.49 24.51 27.45 40.69 49.02 52.94 57.84 58.33 60.78 56.86 59.31 59.80
kir_Cyrl 35.78 60.78 34.80 27.45 29.90 45.59 58.33 60.78 60.29 65.20 60.29 64.71 66.18 66.18
kmb_Latn 26.47 28.43 33.82 25.00 21.08 31.86 35.29 39.71 38.24 41.18 41.67 37.25 41.67 44.61
kmr_Latn 29.41 33.82 33.33 21.57 25.98 37.75 47.06 47.55 52.45 52.45 54.90 54.41 58.82 61.76
kon_Latn 33.33 33.82 40.69 32.35 22.06 39.71 46.57 51.96 53.92 64.71 64.22 60.78 64.22 65.20
kor_Hang 67.65 63.24 43.14 56.37 45.10 63.24 67.65 69.12 71.57 73.04 70.59 75.98 76.96 76.47
lao_Laoo 24.02 14.22 26.47 16.67 39.71 55.39 62.25 63.73 68.63 70.59 70.10 68.63 70.59 70.59
lij_Latn 55.88 53.43 56.37 44.61 37.25 58.82 67.65 66.67 69.61 71.57 71.08 74.02 76.47 74.02
Table 22: Detailed results on SIB200 (Part I). For previous LLMs, 3-shot results are presented.
Lang LLaMA 2 7B mGPT 13B BLOOM 7B1 XGLM 7.5B MaLA-500: 1-shot 2-shot 3-shot 4-shot 5-shot 6-shot 7-shot 8-shot 9-shot 10-shot
lim_Latn 60.78 50.00 50.00 33.82 32.84 56.37 60.78 63.24 66.67 67.16 69.12 72.55 70.59 72.06
lin_Latn 36.76 40.20 43.14 34.31 23.04 38.24 47.06 53.92 57.84 61.27 56.86 60.29 62.75 65.69
lit_Latn 40.20 60.29 41.18 30.39 32.35 55.39 62.75 65.20 64.71 70.59 68.14 66.67 69.61 73.04
lmo_Latn 57.84 50.98 55.88 41.67 34.80 59.31 65.69 66.18 70.10 71.57 70.59 70.59 75.00 75.49
ltz_Latn 55.88 47.06 52.94 39.22 39.22 56.37 65.20 61.27 70.59 68.14 71.08 70.59 71.08 74.02
lua_Latn 32.35 33.33 39.22 28.43 20.10 33.82 40.20 42.65 49.02 51.96 50.00 50.00 49.51 50.00
lug_Latn 27.94 25.00 33.82 19.61 22.06 35.78 40.20 43.63 48.04 51.47 47.06 43.63 49.02 49.02
luo_Latn 28.43 28.43 32.84 25.49 21.57 31.37 37.25 42.65 48.04 49.51 46.57 51.47 49.51 51.47
lus_Latn 43.63 42.16 49.02 31.37 25.49 45.59 51.96 53.92 52.45 54.90 57.84 60.29 58.33 60.29
lvs_Latn 43.14 67.16 43.63 29.41 31.37 57.84 65.20 63.24 68.14 72.55 67.65 69.61 71.08 72.55
mai_Deva 40.69 59.31 60.29 51.47 33.33 57.84 61.76 66.67 67.65 69.12 69.12 70.10 71.57 69.12
mal_Mlym 20.10 60.29 64.71 13.24 25.98 52.45 59.31 60.29 62.75 62.25 65.69 63.24 63.73 68.14
mar_Deva 29.90 56.86 63.73 37.75 36.27 51.96 57.35 64.22 63.73 63.73 63.73 66.67 66.18 68.14
min_Latn 48.04 55.39 59.80 39.71 31.37 57.35 69.12 68.14 68.63 77.94 72.06 75.98 75.49 77.45
mkd_Cyrl 60.78 52.45 32.84 44.12 44.12 66.18 68.63 69.12 68.63 73.04 73.04 72.55 73.04 76.96
mlt_Latn 49.51 45.10 46.08 29.90 35.78 64.71 67.16 68.14 67.16 77.45 75.49 76.96 77.45 76.96
mon_Cyrl 23.53 54.90 20.10 18.63 38.24 50.00 56.86 55.88 63.24 64.22 63.24 63.24 65.20 67.16
mos_Latn 25.49 23.53 29.90 20.59 20.59 27.94 36.76 37.75 37.75 40.69 41.67 45.10 41.67 45.10
mri_Latn 30.39 24.02 30.88 17.65 28.43 44.12 49.02 51.47 51.47 57.84 55.88 58.82 56.37 58.33
mya_Mymr 19.12 60.29 19.61 60.29 23.53 38.73 43.14 53.43 53.43 50.98 52.45 54.90 51.96 54.90
nld_Latn 70.10 59.80 55.88 46.08 45.59 64.71 69.12 68.63 73.04 73.53 75.49 74.02 79.41 80.88
nno_Latn 64.71 61.76 52.45 45.59 35.29 52.94 64.22 62.75 66.18 68.63 68.63 70.10 69.12 73.04
npi_Deva 39.22 51.96 64.71 40.69 33.82 57.84 61.76 68.14 67.65 67.65 68.63 70.10 68.63 75.49
nso_Latn 27.94 30.88 33.33 22.55 21.08 33.82 43.14 46.08 49.02 52.94 51.96 53.92 54.90 53.92
nya_Latn 32.35 34.31 40.69 27.94 23.04 35.29 45.59 49.02 50.98 51.47 52.94 53.92 52.94 58.33
oci_Latn 68.63 56.37 65.69 48.53 34.31 60.29 69.12 65.20 67.65 73.04 73.53 71.57 75.49 76.47
orm_Latn 17.16 18.14 22.06 16.67 20.10 30.39 35.29 41.18 41.67 47.55 41.67 44.12 43.14 51.47
ory_Orya 13.24 13.73 64.22 11.76 24.51 45.10 52.45 57.84 53.92 61.27 57.84 56.86 60.78 60.78
pag_Latn 52.45 49.51 53.92 40.20 31.86 54.90 62.75 60.78 67.65 64.71 70.10 70.10 69.12 69.61
pan_Guru 14.22 11.27 62.25 11.76 33.82 54.90 58.82 63.73 64.22 67.65 67.16 66.67 68.63 67.16
pap_Latn 55.39 50.00 52.94 38.24 30.39 56.86 64.71 66.67 69.61 74.51 69.12 73.53 70.59 75.49
pes_Arab 47.06 58.82 52.94 32.84 39.22 61.27 71.08 63.73 70.59 72.55 72.55 73.53 76.47 76.47
plt_Latn 28.43 32.84 37.25 21.57 29.41 51.96 58.82 57.84 59.31 60.78 60.29 60.29 64.22 60.29
pol_Latn 74.51 60.78 47.06 32.84 36.76 61.76 68.63 69.12 71.08 75.00 74.02 74.02 77.45 75.98
por_Latn 70.10 61.76 65.20 59.31 36.76 64.71 72.06 70.10 74.51 75.00 76.96 75.49 78.43 82.84
prs_Arab 50.49 55.39 49.51 33.33 37.25 60.78 64.22 67.16 69.12 72.55 72.55 73.53 72.55 75.49
pus_Arab 30.39 34.80 38.73 21.08 30.39 47.06 50.98 52.45 54.41 53.92 53.92 55.88 55.88 57.84
quy_Latn 32.84 35.29 40.69 35.29 22.06 36.27 44.12 45.59 49.02 52.45 49.51 49.02 50.98 50.98
ron_Latn 69.12 61.76 57.84 42.65 41.18 61.27 70.10 65.20 70.10 74.51 73.53 75.00 78.92 78.43
run_Latn 25.49 27.94 44.12 25.49 23.53 37.25 46.57 50.49 51.96 59.31 51.96 56.37 57.84 60.29
rus_Cyrl 71.57 63.24 53.43 60.29 38.73 64.22 65.20 69.12 72.06 75.98 75.00 76.47 75.49 78.92
sag_Latn 29.90 27.94 31.37 21.08 20.59 30.88 43.63 47.06 48.53 55.88 52.45 54.41 55.88 58.82
san_Deva 27.94 47.55 54.90 42.65 24.51 48.04 60.29 57.84 62.25 66.67 65.20 61.76 66.18 65.20
scn_Latn 51.96 50.00 53.43 40.69 37.25 63.73 73.04 70.59 74.02 77.45 75.49 75.00 80.39 76.47
sin_Sinh 15.20 10.78 20.10 12.75 29.90 56.37 60.29 65.20 66.18 68.14 64.71 66.67 63.73 67.16
slk_Latn 68.14 60.29 47.55 39.71 34.31 58.33 68.63 66.67 70.59 75.00 70.59 71.57 74.51 75.00
slv_Latn 68.14 60.78 44.12 32.84 38.73 63.24 68.14 68.14 70.59 73.53 73.53 74.51 78.43 76.47
smo_Latn 30.39 25.00 31.86 18.14 29.41 52.45 60.29 62.25 62.25 65.69 67.16 65.20 66.18 69.61
sna_Latn 28.43 29.41 36.27 23.53 24.51 39.71 44.61 45.59 44.61 49.51 45.59 47.55 47.06 50.00
snd_Arab 27.94 37.25 39.22 23.53 27.94 42.65 47.06 50.49 52.45 54.41 54.41 52.94 55.88 56.86
som_Latn 23.53 25.49 27.94 17.16 22.06 36.27 44.61 47.55 51.47 52.94 52.94 53.92 54.41 55.39
sot_Latn 29.41 28.43 33.82 18.63 22.55 36.76 43.14 47.06 50.49 51.96 52.45 55.39 54.41 56.86
spa_Latn 72.55 58.33 67.65 56.37 35.29 64.22 69.61 72.06 74.51 74.02 72.06 76.47 78.43 78.43
srd_Latn 53.92 52.45 50.98 37.25 31.37 60.29 66.18 68.63 75.98 74.02 77.94 77.45 79.41 79.41
srp_Cyrl 63.73 55.39 33.33 39.22 45.59 65.20 70.59 69.61 73.04 76.47 74.02 75.00 77.94 79.41
ssw_Latn 29.41 25.00 31.37 21.57 24.02 44.12 46.57 50.00 52.94 51.96 53.92 56.86 53.92 60.78
sun_Latn 55.39 59.31 63.73 44.61 37.25 60.29 68.63 70.10 71.08 73.53 72.55 73.53 75.00 75.98
swe_Latn 71.08 61.27 52.94 48.04 33.82 53.43 60.29 64.71 64.71 69.12 70.10 69.61 72.55 70.59
swh_Latn 32.35 63.24 61.27 56.86 29.41 50.49 59.31 58.82 62.75 60.29 62.25 68.63 66.67 66.67
szl_Latn 56.86 50.49 45.59 29.41 30.88 51.47 59.80 63.73 64.71 67.16 69.61 68.63 71.08 69.12
tam_Taml 20.59 63.24 67.16 58.82 30.88 50.49 55.88 62.75 63.24 63.73 65.20 69.61 66.67 68.63
tat_Cyrl 37.75 60.29 35.29 28.92 33.33 54.90 64.22 64.71 65.69 74.51 70.10 71.57 71.57 73.53
tel_Telu 18.14 60.78 61.27 59.80 25.98 50.00 52.45 59.80 58.82 63.73 60.78 60.29 63.73 61.76
tgk_Cyrl 26.96 57.84 23.53 17.16 36.76 54.90 60.29 60.78 61.27 69.12 64.71 66.18 68.14 70.59
tgl_Latn 55.88 58.33 49.02 40.20 43.14 64.22 69.61 64.71 70.10 75.49 74.51 77.45 78.43 77.45
tha_Thai 44.61 60.78 23.53 57.35 41.18 63.24 67.16 68.63 70.10 72.06 70.59 70.10 72.06 75.49
tir_Ethi 13.24 16.18 16.18 13.73 21.57 34.80 39.22 41.67 47.06 46.08 45.59 47.06 45.59 47.55
tpi_Latn 63.24 46.57 56.86 33.33 31.86 58.82 65.69 68.14 70.10 72.06 74.51 73.53 75.98 74.51
tsn_Latn 28.92 29.90 32.35 24.51 24.51 40.20 44.12 47.06 47.06 53.43 51.96 50.00 50.49 54.41
tso_Latn 30.88 31.37 36.76 28.92 22.55 35.29 41.18 45.10 46.08 48.04 43.14 43.14 45.59 49.02
tuk_Latn 34.31 46.08 39.22 27.45 24.02 45.59 53.43 58.82 57.84 63.73 63.73 66.18 65.69 66.18
tum_Latn 26.96 34.31 33.82 27.94 21.57 39.22 43.14 45.59 46.08 47.55 44.61 49.02 49.51 49.51
tur_Latn 52.94 62.75 40.20 52.94 36.76 60.78 68.63 70.10 72.06 74.02 75.00 76.47 75.98 76.96
uig_Arab 18.63 18.14 20.10 11.27 21.08 33.33 36.76 39.71 43.14 44.61 43.14 48.53 47.55 48.04
ukr_Cyrl 71.57 63.73 41.18 43.63 39.71 60.29 65.69 66.18 69.12 71.08 75.00 73.53 72.55 75.00
umb_Latn 25.00 26.47 29.90 23.04 21.57 30.88 32.84 36.76 35.78 40.20 38.24 34.80 36.76 35.29
urd_Arab 38.73 53.43 63.24 54.41 36.27 55.39 62.75 64.22 64.22 68.63 65.20 68.63 67.16 67.16
uzb_Latn 30.39 62.75 35.78 23.53 22.06 50.49 56.37 57.84 63.24 72.06 63.73 69.12 72.55 71.57
vec_Latn 65.69 59.80 56.86 52.45 39.22 62.25 66.18 69.61 70.10 69.12 74.51 75.98 75.00 76.47
vie_Latn 67.65 63.24 67.16 60.78 39.71 60.78 67.16 68.63 74.51 75.98 76.47 75.00 78.43 79.90
war_Latn 51.47 51.47 51.47 37.25 37.75 61.27 65.69 65.20 69.61 73.04 71.57 71.57 74.02 74.02
wol_Latn 32.35 34.80 43.14 25.49 23.53 36.76 42.16 45.59 48.53 53.43 47.55 52.94 54.90 53.43
xho_Latn 30.39 29.90 38.24 22.55 25.98 46.57 51.96 56.37 58.33 60.78 61.76 60.29 62.75 64.22
yid_Hebr 23.04 22.06 16.18 12.25 24.02 34.80 39.22 39.71 40.20 46.57 41.18 40.69 44.61 45.10
yor_Latn 21.57 29.41 47.55 21.57 26.47 32.35 41.18 39.22 41.67 48.04 42.16 43.14 44.61 43.14
yue_Hani 75.00 64.71 67.16 55.88 40.69 69.12 71.57 76.47 76.96 81.37 77.94 79.41 81.86 79.41
zsm_Latn 65.69 61.27 64.71 50.00 36.76 60.29 68.14 69.12 67.65 73.53 73.53 76.96 77.45 75.00
zul_Latn 25.00 25.49 35.29 15.20 25.49 51.47 49.02 54.41 56.37 57.84 60.29 61.76 59.31 62.75
all 42.08 45.34 44.63 34.36 30.88 50.71 57.02 58.95 61.20 64.04 63.15 64.13 65.19 66.32
Table 23: Detailed results on SIB200 (Part II). For previous LLMs, 3-shot results are presented.
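Each SIB200 row holds fourteen accuracies: four 3-shot baseline scores followed by ten MaLA-500 scores for 1 to 10 shots. The sketch below, with hypothetical helper names, splits a row accordingly and compares MaLA-500 at 3 shots against the strongest baseline. Applied to the `all` row, the gap is 57.02 - 45.34 = 11.68 points, which agrees with the SIB200 margin quoted in the abstract.

```python
# Baselines are reported at 3 shots only; names follow the table header.
BASELINES = ["LLaMA 2 7B", "mGPT 13B", "BLOOM 7B1", "XGLM 7.5B"]
NUM_SHOTS = 10  # MaLA-500 columns: 1-shot through 10-shot


def split_row(line):
    """Split one SIB200 row into (language, baseline scores, MaLA-500 scores by shot)."""
    parts = line.split()
    lang = parts[0]
    values = [float(v) for v in parts[1:]]
    if len(values) != len(BASELINES) + NUM_SHOTS:
        raise ValueError(f"expected {len(BASELINES) + NUM_SHOTS} values, got {len(values)}")
    baselines = dict(zip(BASELINES, values[:len(BASELINES)]))
    mala = dict(zip(range(1, NUM_SHOTS + 1), values[len(BASELINES):]))
    return lang, baselines, mala


# The `all` row copied from Table 23 (Part II).
row = ("all 42.08 45.34 44.63 34.36 30.88 50.71 57.02 58.95 "
       "61.20 64.04 63.15 64.13 65.19 66.32")

lang, baselines, mala = split_row(row)
best_baseline = max(baselines, key=baselines.get)             # strongest previous LLM
gain_3shot = round(mala[3] - baselines[best_baseline], 2)     # like-for-like 3-shot gap
best_shot = max(mala, key=mala.get)                           # shot count with top accuracy

print(lang, best_baseline, gain_3shot, best_shot)
```

The same split applies to every per-language row, so the comparison can be repeated for any language of interest.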
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MaLA-500
ace_Latn 46.85 47.75 49.55 41.44 48.65
ach_Latn 45.05 37.84 41.44 40.54 36.04
acr_Latn 47.75 51.35 50.45 47.75 50.45
afr_Latn 54.05 38.74 51.35 49.55 55.86
agw_Latn 48.65 45.05 41.44 42.34 49.55
ahk_Latn 43.24 36.04 36.04 35.14 45.05
aka_Latn 42.34 32.43 38.74 42.34 54.95
aln_Latn 34.23 35.14 36.94 35.14 44.14
als_Latn 38.74 36.94 42.34 42.34 47.75
alt_Cyrl 44.14 44.14 45.05 51.35 51.35
alz_Latn 36.94 35.14 31.53 28.83 37.84
aoj_Latn 50.93 37.96 45.37 46.30 49.07
arb_Arab 43.24 45.05 49.55 44.14 50.45
arn_Latn 38.74 42.34 34.23 36.04 43.24
ary_Arab 32.43 33.33 38.74 32.43 44.14
arz_Arab 31.53 39.64 45.05 36.94 46.85
asm_Beng 45.95 42.34 54.95 40.54 54.05
ayr_Latn 47.75 37.84 44.14 44.14 54.05
azb_Arab 39.64 40.54 47.75 45.05 47.75
aze_Latn 45.05 46.85 45.05 43.24 49.55
bak_Cyrl 45.05 52.25 49.55 56.76 56.76
bam_Latn 42.34 37.84 49.55 39.64 47.75
ban_Latn 36.04 41.44 34.23 34.23 42.34
bar_Latn 49.55 46.85 44.14 48.65 53.15
bba_Latn 45.05 32.43 45.95 46.85 46.85
bci_Latn 36.94 35.14 36.94 33.33 44.14
bcl_Latn 42.34 48.65 39.64 39.64 54.95
bel_Cyrl 47.75 45.95 48.65 43.24 57.66
bem_Latn 47.75 37.84 42.34 41.44 51.35
ben_Beng 40.54 41.44 52.25 51.35 47.75
bhw_Latn 37.84 43.24 41.44 46.85 47.75
bim_Latn 38.74 39.64 33.33 36.94 45.05
bis_Latn 44.14 49.55 44.14 39.64 48.65
bqc_Latn 39.64 36.04 34.23 33.33 40.54
bre_Latn 39.64 36.04 35.14 36.04 40.54
btx_Latn 49.55 36.94 42.34 41.44 43.24
bul_Cyrl 45.05 42.34 48.65 45.05 54.95
bum_Latn 42.34 39.64 37.84 37.84 44.14
bzj_Latn 53.15 46.85 47.75 50.45 52.25
cab_Latn 39.64 38.74 37.84 36.94 36.04
cac_Latn 43.24 37.84 40.54 38.74 45.05
cak_Latn 45.95 35.14 44.14 40.54 50.45
caq_Latn 39.64 38.74 38.74 44.14 37.84
cat_Latn 52.25 45.05 46.85 48.65 52.25
cbk_Latn 54.05 40.54 56.76 54.05 55.86
cce_Latn 49.55 45.05 50.45 48.65 48.65
ceb_Latn 44.14 42.34 48.65 45.05 51.35
ces_Latn 44.14 43.24 45.05 46.85 51.35
cfm_Latn 49.55 41.44 49.55 53.15 48.65
che_Cyrl 37.84 33.33 36.94 37.84 38.74
chk_Latn 45.05 41.44 41.44 36.04 45.95
chv_Cyrl 43.24 45.05 45.05 49.55 58.56
ckb_Arab 44.14 36.94 45.05 42.34 51.35
cmn_Hani 48.65 45.05 53.15 48.65 53.15
cnh_Latn 46.85 46.85 46.85 49.55 46.85
crh_Cyrl 49.55 40.54 47.75 54.95 54.05
crs_Latn 52.25 44.14 49.55 55.86 59.46
csy_Latn 47.75 41.44 54.95 53.15 45.95
ctd_Latn 50.45 48.65 56.76 53.15 56.76
ctu_Latn 41.44 35.14 38.74 40.54 43.24
cuk_Latn 42.34 42.34 38.74 39.64 37.84
cym_Latn 39.64 38.74 39.64 43.24 41.44
dan_Latn 53.15 41.44 39.64 38.74 54.95
deu_Latn 45.05 36.04 37.84 38.74 43.24
djk_Latn 42.34 35.14 42.34 46.85 40.54
dln_Latn 48.65 40.54 51.35 54.05 47.75
Table 24: Detailed results on Taxi1500 (Part I). 3-shot results are presented.
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MaLA-500
dtp_Latn 39.64 35.14 42.34 46.85 50.45
dyu_Latn 41.44 39.64 42.34 38.74 46.85
dzo_Tibt 45.05 40.54 41.44 45.05 45.05
efi_Latn 39.64 36.04 38.74 41.44 45.95
ell_Grek 49.55 45.95 49.55 48.65 51.35
eng_Latn 55.86 42.34 58.56 54.05 59.46
enm_Latn 50.45 41.44 56.76 50.45 54.95
epo_Latn 49.55 40.54 47.75 42.34 49.55
est_Latn 46.85 42.34 38.74 52.25 46.85
eus_Latn 38.74 36.04 36.94 39.64 39.64
ewe_Latn 51.35 43.24 50.45 46.85 45.05
fao_Latn 53.15 44.14 52.25 53.15 58.56
fas_Arab 49.55 50.45 57.66 51.35 55.86
fij_Latn 48.65 43.24 41.44 43.24 53.15
fil_Latn 48.65 41.44 46.85 51.35 51.35
fin_Latn 47.75 45.05 41.44 45.95 54.95
fon_Latn 38.74 35.14 37.84 40.54 45.05
fra_Latn 60.36 51.35 62.16 52.25 59.46
fry_Latn 37.84 33.33 36.04 27.03 46.85
gaa_Latn 41.44 33.33 37.84 35.14 40.54
gil_Latn 36.70 31.19 41.28 32.11 41.28
giz_Latn 46.85 44.14 43.24 38.74 45.05
gkn_Latn 38.74 34.23 34.23 36.94 41.44
gkp_Latn 30.63 33.33 41.44 29.73 48.65
gla_Latn 33.33 39.64 44.14 45.05 49.55
gle_Latn 33.33 35.14 36.04 34.23 39.64
glv_Latn 43.24 41.44 37.84 38.74 42.34
gom_Latn 34.23 31.53 33.33 40.54 42.34
gor_Latn 43.24 34.23 43.24 40.54 46.85
guc_Latn 44.14 36.04 37.84 41.44 45.05
gug_Latn 45.05 44.14 42.34 41.44 50.45
guj_Gujr 45.95 37.84 52.25 44.14 56.76
gur_Latn 45.95 45.95 44.14 47.75 48.65
guw_Latn 45.05 37.84 47.75 46.85 48.65
gya_Latn 37.84 37.84 41.44 34.23 42.34
gym_Latn 41.44 39.64 39.64 43.24 50.45
hat_Latn 50.45 43.24 44.14 41.44 56.76
hau_Latn 44.14 37.84 41.44 44.14 48.65
haw_Latn 45.95 39.64 38.74 34.23 49.55
heb_Hebr 38.74 35.14 34.23 36.94 44.14
hif_Latn 42.34 43.24 49.55 47.75 48.65
hil_Latn 49.55 41.44 40.54 36.94 54.95
hin_Deva 51.35 50.45 49.55 46.85 56.76
hmo_Latn 46.85 45.05 46.85 45.05 53.15
hne_Deva 55.86 54.05 54.05 58.56 58.56
hnj_Latn 48.65 45.05 53.15 51.35 60.36
hra_Latn 49.55 41.44 43.24 46.85 45.95
hrv_Latn 55.86 51.35 52.25 54.95 61.26
hui_Latn 51.35 40.54 41.44 45.05 46.85
hun_Latn 46.85 44.14 41.44 43.24 48.65
hus_Latn 32.43 32.43 34.23 37.84 42.34
hye_Armn 45.95 40.54 45.95 49.55 61.26
iba_Latn 49.55 46.85 51.35 48.65 55.86
ibo_Latn 38.74 33.33 43.24 38.74 44.14
ifa_Latn 36.04 30.63 35.14 38.74 43.24
ifb_Latn 34.23 35.14 39.64 34.23 53.15
ikk_Latn 43.24 36.94 39.64 39.64 43.24
ilo_Latn 39.64 36.04 41.44 37.84 43.24
ind_Latn 49.55 50.45 53.15 53.15 54.95
isl_Latn 48.65 44.14 43.24 48.65 54.05
ita_Latn 50.45 49.55 56.76 58.56 54.95
ium_Latn 45.95 44.14 49.55 50.45 45.05
ixl_Latn 42.34 39.64 41.44 43.24 40.54
izz_Latn 38.74 47.75 39.64 42.34 54.95
jam_Latn 41.44 43.24 53.15 50.45 61.26
jav_Latn 41.44 47.75 44.14 37.84 45.95
Table 25: Detailed results on Taxi1500 (Part II). 3-shot results are presented.
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MaLA-500
jpn_Jpan 46.85 46.85 47.75 50.45 51.35
kaa_Latn 43.24 53.15 47.75 51.35 54.05
kab_Latn 27.93 36.04 30.63 34.23 35.14
kac_Latn 44.14 34.23 43.24 42.34 52.25
kal_Latn 41.44 37.84 36.04 35.14 40.54
kan_Knda 48.65 37.84 52.25 45.95 54.05
kat_Geor 41.44 41.44 42.34 46.85 48.65
kaz_Cyrl 49.55 45.05 51.35 53.15 55.86
kbp_Latn 40.54 35.14 36.94 31.53 47.75
kek_Latn 45.95 42.34 45.05 44.14 51.35
khm_Khmr 52.25 38.74 48.65 49.55 64.86
kia_Latn 36.94 36.04 40.54 41.44 48.65
kik_Latn 45.05 43.24 45.05 44.14 50.45
kin_Latn 42.34 37.84 41.44 38.74 50.45
kir_Cyrl 51.35 46.85 47.75 63.06 64.86
kjb_Latn 48.65 46.85 44.14 44.14 48.65
kjh_Cyrl 44.14 41.44 45.05 41.44 45.95
kmm_Latn 45.95 45.05 47.75 51.35 45.95
kmr_Cyrl 39.64 35.14 45.05 42.34 43.24
knv_Latn 44.55 44.55 45.45 42.73 44.55
kor_Hang 48.65 48.65 49.55 51.35 62.16
kpg_Latn 44.14 52.25 51.35 42.34 54.95
krc_Cyrl 45.95 36.04 48.65 48.65 53.15
kri_Latn 49.55 48.65 49.55 51.35 54.95
ksd_Latn 36.94 33.33 40.54 33.33 49.55
kss_Latn 32.43 28.83 34.23 29.73 47.75
ksw_Mymr 44.14 45.95 42.34 37.84 52.25
kua_Latn 41.44 42.34 36.94 35.14 40.54
lam_Latn 43.24 36.94 45.95 43.24 40.54
lao_Laoo 45.05 39.64 46.85 50.45 50.45
lat_Latn 53.15 41.44 53.15 56.76 57.66
lav_Latn 39.64 33.33 36.04 39.64 45.05
ldi_Latn 35.14 32.43 36.94 34.23 36.04
leh_Latn 47.75 37.84 33.33 32.43 41.44
lhu_Latn 27.93 34.23 34.23 37.84 42.34
lin_Latn 47.75 37.84 39.64 39.64 48.65
lit_Latn 42.34 40.54 44.14 48.65 49.55
loz_Latn 45.95 42.34 36.04 44.14 40.54
ltz_Latn 46.85 45.95 47.75 41.44 49.55
lug_Latn 40.54 32.43 39.64 38.74 45.95
luo_Latn 40.54 36.94 34.23 38.74 40.54
lus_Latn 39.64 40.54 42.34 41.44 50.45
lzh_Hani 54.95 48.65 54.05 43.24 56.76
mad_Latn 47.75 52.25 47.75 47.75 53.15
mah_Latn 43.24 36.04 42.34 45.95 45.05
mai_Deva 45.05 41.44 49.55 54.05 51.35
mam_Latn 43.24 33.33 41.44 45.05 45.95
mar_Deva 49.55 44.14 53.15 45.95 56.76
mau_Latn 29.73 29.73 36.94 37.84 32.43
mbb_Latn 44.14 42.34 38.74 39.64 49.55
mck_Latn 40.54 34.23 36.04 39.64 49.55
mcn_Latn 35.14 27.93 33.33 33.33 38.74
mco_Latn 41.44 33.33 43.24 33.33 43.24
mdy_Ethi 39.64 46.85 43.24 43.24 51.35
meu_Latn 53.15 38.74 45.05 48.65 52.25
mfe_Latn 51.35 48.65 52.25 50.45 56.76
mgh_Latn 42.34 33.33 41.44 35.14 38.74
mgr_Latn 39.64 34.23 33.33 41.44 38.74
mhr_Cyrl 47.27 42.73 45.45 42.73 48.18
min_Latn 37.84 45.95 53.15 45.05 53.15
miq_Latn 51.35 46.85 43.24 54.95 49.55
mkd_Cyrl 52.25 48.65 56.76 57.66 66.67
mlg_Latn 35.14 36.04 36.94 37.84 45.95
mlt_Latn 37.84 33.33 42.34 43.24 46.85
mos_Latn 39.64 42.34 39.64 36.04 36.04
mps_Latn 47.75 45.05 42.34 45.05 51.35
mri_Latn 45.05 42.34 38.74 42.34 44.14
Table 26: Detailed results on Taxi1500 (Part III). 3-shot results are presented.
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MaLA-500
mrw_Latn 40.54 39.64 41.44 37.84 49.55
msa_Latn 44.14 41.44 45.95 37.84 46.85
mwm_Latn 36.94 31.53 39.64 38.74 47.75
mxv_Latn 33.33 35.14 39.64 38.74 40.54
mya_Mymr 45.05 48.65 44.14 44.14 46.85
myv_Cyrl 39.64 43.24 40.54 41.44 45.05
mzh_Latn 45.05 45.95 42.34 40.54 44.14
nan_Latn 32.43 35.14 48.65 49.55 44.14
naq_Latn 36.94 36.94 37.84 39.64 41.44
nav_Latn 27.03 28.83 30.63 33.33 38.74
nbl_Latn 21.62 18.02 21.62 25.23 27.93
nch_Latn 37.84 34.23 33.33 40.54 40.54
ncj_Latn 46.85 45.95 42.34 41.44 42.34
ndc_Latn 44.14 36.04 43.24 36.94 49.55
nde_Latn 33.33 29.73 33.33 36.04 41.44
ndo_Latn 41.28 34.86 37.61 33.94 46.79
nds_Latn 41.44 38.74 37.84 34.23 43.24
nep_Deva 45.05 49.55 63.06 51.35 60.36
ngu_Latn 47.75 39.64 43.24 42.34 49.55
nld_Latn 47.75 39.64 47.75 43.24 56.76
nmf_Latn 44.14 40.54 42.34 41.44 44.14
nnb_Latn 45.05 42.34 36.94 44.14 40.54
nno_Latn 56.76 46.85 45.95 52.25 54.95
nob_Latn 52.25 41.44 44.14 45.95 56.76
nor_Latn 50.45 35.14 46.85 47.75 53.15
npi_Deva 51.35 54.95 55.86 45.95 54.95
nse_Latn 38.74 28.83 39.64 38.74 42.34
nso_Latn 45.05 43.24 45.05 45.05 50.45
nya_Latn 48.65 39.64 44.14 42.34 54.95
nyn_Latn 39.64 33.33 37.84 36.94 45.05
nyy_Latn 43.24 42.34 43.24 40.54 47.75
nzi_Latn 36.94 32.43 33.33 32.43 35.14
ori_Orya 43.24 34.23 51.35 46.85 45.95
ory_Orya 44.14 44.14 49.55 46.85 55.86
oss_Cyrl 49.55 49.55 49.55 44.14 54.05
ote_Latn 34.23 31.53 34.23 36.04 49.55
pag_Latn 44.14 48.65 48.65 42.34 50.45
pam_Latn 45.95 36.04 44.14 47.75 45.05
pan_Guru 41.44 33.33 46.85 40.54 47.75
pap_Latn 50.45 44.14 52.25 49.55 53.15
pau_Latn 38.74 45.05 37.84 36.94 46.85
pcm_Latn 58.56 47.75 56.76 53.15 57.66
pdt_Latn 53.15 45.95 45.95 48.65 54.05
pes_Arab 50.91 46.36 59.09 48.18 53.64
pis_Latn 57.66 47.75 50.45 45.95 55.86
pls_Latn 43.24 43.24 43.24 39.64 45.95
plt_Latn 36.94 35.14 37.84 43.24 47.75
poh_Latn 42.34 42.34 45.05 39.64 48.65
pol_Latn 41.44 43.24 46.85 55.86 56.76
pon_Latn 45.95 39.64 43.24 39.64 42.34
por_Latn 56.76 54.95 56.76 54.05 58.56
prk_Latn 44.14 43.24 49.55 40.54 46.85
prs_Arab 50.45 51.35 55.86 56.76 57.66
pxm_Latn 48.65 44.14 41.44 41.44 47.75
qub_Latn 46.85 44.14 43.24 48.65 45.05
quc_Latn 45.05 41.44 43.24 38.74 50.45
qug_Latn 45.95 46.85 50.45 45.05 56.76
quh_Latn 49.55 49.55 46.85 42.34 51.35
quw_Latn 43.24 36.94 45.05 44.14 53.15
quy_Latn 58.56 48.65 54.95 50.45 57.66
quz_Latn 51.35 38.74 60.36 54.95 59.46
qvi_Latn 46.79 46.79 49.54 45.87 47.71
rap_Latn 43.24 35.14 41.44 39.64 46.85
rar_Latn 40.54 32.43 31.53 29.73 45.95
rmy_Latn 37.84 37.84 38.74 40.54 43.24
ron_Latn 45.05 51.35 44.14 47.75 57.66
Table 27: Detailed results on Taxi1500 (Part IV). 3-shot results are presented.
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MaLA-500
rop_Latn 45.95 45.05 42.34 42.34 55.86
rug_Latn 43.24 38.74 46.85 44.14 45.05
run_Latn 46.85 40.54 45.05 40.54 52.25
rus_Cyrl 49.55 41.44 50.45 47.75 53.15
sag_Latn 43.24 43.24 41.44 40.54 47.75
sah_Cyrl 40.54 35.14 44.14 44.14 54.95
sba_Latn 42.34 43.24 45.05 40.54 49.55
seh_Latn 45.05 35.14 40.54 42.34 45.95
sin_Sinh 39.64 38.74 39.64 42.34 45.95
slk_Latn 53.15 50.45 44.14 47.75 53.15
slv_Latn 47.75 45.05 55.86 51.35 49.55
sme_Latn 45.95 45.05 42.34 41.44 48.65
smo_Latn 38.74 40.54 43.24 44.14 53.15
sna_Latn 50.45 30.63 43.24 45.95 60.36
snd_Arab 44.14 45.05 56.76 51.35 56.76
som_Latn 33.33 36.94 35.14 34.23 39.64
sop_Latn 40.54 34.23 40.54 35.14 35.14
sot_Latn 47.75 41.44 40.54 43.24 49.55
spa_Latn 51.35 49.55 51.35 51.35 56.76
sqi_Latn 42.34 43.24 52.25 52.25 57.66
srm_Latn 35.14 41.44 39.64 37.84 45.05
srn_Latn 45.95 53.15 54.05 48.65 51.35
srp_Latn 59.46 48.65 58.56 54.05 58.56
ssw_Latn 38.74 45.05 36.94 40.54 48.65
sun_Latn 43.24 40.54 45.05 44.14 48.65
suz_Deva 46.85 42.34 42.34 43.24 49.55
swe_Latn 58.56 48.65 53.15 54.95 61.26
swh_Latn 46.85 49.55 49.55 48.65 56.76
sxn_Latn 42.34 36.94 44.14 44.14 46.85
tam_Taml 44.14 53.15 59.46 48.65 60.36
tat_Cyrl 47.75 47.75 45.95 48.65 54.05
tbz_Latn 36.04 35.14 34.23 35.14 42.34
tca_Latn 39.64 40.54 43.24 41.44 45.05
tdt_Latn 40.54 38.74 48.65 45.05 52.25
tel_Telu 33.33 45.95 50.45 45.95 49.55
teo_Latn 33.33 37.84 26.13 31.53 41.44
tgk_Cyrl 42.34 44.14 48.65 49.55 57.66
tgl_Latn 48.65 41.44 46.85 51.35 51.35
tha_Thai 43.24 42.34 43.24 37.84 47.75
tih_Latn 43.24 37.84 40.54 36.04 54.05
tir_Ethi 29.73 36.94 27.93 34.23 41.44
tlh_Latn 51.35 45.95 45.95 41.44 53.15
tob_Latn 44.55 43.64 41.82 38.18 50.00
toh_Latn 42.34 39.64 40.54 40.54 42.34
toi_Latn 44.14 45.05 34.23 36.04 45.05
toj_Latn 43.24 40.54 36.94 43.24 42.34
ton_Latn 42.34 42.34 42.34 44.14 52.25
top_Latn 46.85 34.23 37.84 38.74 36.94
tpi_Latn 48.65 44.14 52.25 48.65 49.55
tpm_Latn 37.84 41.44 38.74 32.43 42.34
tsn_Latn 40.54 36.04 38.74 34.23 37.84
tsz_Latn 37.84 32.43 37.84 38.74 46.85
tuc_Latn 45.95 44.14 47.75 44.14 48.65
tui_Latn 42.34 38.74 38.74 37.84 50.45
tuk_Latn 36.04 42.34 45.05 43.24 50.45
tum_Latn 47.75 39.64 46.85 52.25 50.45
tur_Latn 46.79 44.04 40.37 43.12 45.87
twi_Latn 41.44 43.24 41.44 37.84 46.85
tyv_Cyrl 38.74 38.74 43.24 44.14 45.05
tzh_Latn 41.82 36.36 41.82 41.82 38.18
tzo_Latn 39.64 43.24 34.23 29.73 41.44
udm_Cyrl 36.94 38.74 42.34 44.14 47.75
ukr_Cyrl 52.25 48.65 51.35 55.86 53.15
Table 28: Detailed results on Taxi1500 (Part V). 3-shot results are presented.
Lang LLaMA 2-7B mGPT-13B BLOOM-7B1 XGLM-7.5B MaLA-500
uzb_Latn 45.05 49.55 37.84 46.85 54.05
uzn_Cyrl 45.95 40.54 45.05 45.05 49.55
ven_Latn 45.05 44.14 42.34 41.44 54.05
vie_Latn 53.15 45.95 62.16 45.95 54.95
wal_Latn 35.14 33.33 35.14 35.14 39.64
war_Latn 48.65 39.64 37.84 45.05 54.95
wbm_Latn 48.65 39.64 46.85 46.85 48.65
wol_Latn 36.04 34.23 32.43 34.23 36.94
xav_Latn 50.45 33.33 46.85 44.14 45.95
xho_Latn 43.24 37.84 40.54 39.64 46.85
yan_Latn 45.05 46.85 52.25 41.44 53.15
yao_Latn 42.34 41.44 43.24 44.14 48.65
yap_Latn 38.74 40.54 35.14 32.43 41.44
yom_Latn 35.14 31.53 33.33 25.23 36.94
yor_Latn 41.44 38.74 39.64 44.14 47.75
yua_Latn 41.44 32.43 43.24 41.44 36.04
yue_Hani 43.24 48.65 53.15 38.74 57.66
zai_Latn 45.05 35.14 40.54 43.24 44.14
zho_Hani 47.75 51.35 51.35 44.14 58.56
zlm_Latn 54.05 49.55 57.66 56.76 64.86
zom_Latn 50.45 42.34 44.14 43.24 48.65
zsm_Latn 58.56 59.46 63.96 55.86 66.67
zul_Latn 46.85 42.34 46.85 46.85 51.35
all 44.07 40.98 43.98 43.24 48.89
Table 29: Detailed results on Taxi1500 (Part VI). 3-shot results are presented.
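The "all" row summarizes the per-language rows of Tables 24 to 29 for each model. The following is a minimal sketch of how such a summary could be computed, assuming it is an unweighted (macro) average over languages, as the abstract's wording suggests. The helper names `parse_row` and `macro_average` are illustrative and not taken from the authors' code, and a language code that appears more than once is counted only once. The sample uses four rows from Part VI, so it does not reproduce the reported "all" values.

```python
from statistics import mean

# Column order as given in the table header.
MODELS = ["LLaMA 2-7B", "mGPT-13B", "BLOOM-7B1", "XGLM-7.5B", "MaLA-500"]


def parse_row(line):
    """Split a row such as 'zul_Latn 46.85 42.34 46.85 46.85 51.35'."""
    lang, *scores = line.split()
    if len(scores) != len(MODELS):
        raise ValueError(f"expected {len(MODELS)} scores: {line!r}")
    return lang, [float(s) for s in scores]


def macro_average(rows):
    """Unweighted mean over languages of each model's accuracy (assumed definition)."""
    per_lang = {}
    for line in rows:
        lang, scores = parse_row(line)
        if lang == "all":  # skip the summary row itself
            continue
        per_lang[lang] = scores  # keyed by language, so repeated rows count once
    return {
        model: mean(scores[i] for scores in per_lang.values())
        for i, model in enumerate(MODELS)
    }


# Four rows copied from Part VI, for illustration only.
sample = [
    "zlm_Latn 54.05 49.55 57.66 56.76 64.86",
    "zom_Latn 50.45 42.34 44.14 43.24 48.65",
    "zsm_Latn 58.56 59.46 63.96 55.86 66.67",
    "zul_Latn 46.85 42.34 46.85 46.85 51.35",
]
averages = macro_average(sample)
for model, value in averages.items():
    print(f"{model}: {value:.2f}")
```

Feeding all rows of Parts I to VI to `macro_average` would give one figure per model that can be compared against the "all" row, under the stated assumption about how that row was computed.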