∗Equal contribution.

MaLA-500: Massive Language Adaptation of Large Language Models

Peiqin Lin∗¹,², Shaoxiong Ji∗³, Jörg Tiedemann³, André F. T. Martins⁴,⁵,⁶, Hinrich Schütze¹,²
¹Center for Information and Language Processing, LMU Munich
²Munich Center for Machine Learning ³University of Helsinki
⁴Instituto Superior Técnico (Lisbon ELLIS Unit)
⁵Instituto de Telecomunicações ⁶Unbabel
[email protected], [email protected]

Abstract

Large language models (LLMs) have advanced the state of the art in natural language processing. However, their predominant design for English or a limited set of languages creates a substantial gap in their effectiveness for low-resource languages. To bridge this gap, we introduce MaLA-500, a novel large language model designed to cover an extensive range of 534 languages. To train MaLA-500, we employ vocabulary extension and continued pretraining on LLaMA 2 with Glot500-c. Our intrinsic evaluation demonstrates that MaLA-500 is better at predicting the given texts of low-resource languages than existing multilingual LLMs. Moreover, the extrinsic evaluation of in-context learning shows that MaLA-500 outperforms previous LLMs on SIB200 and Taxi1500 by a significant margin, i.e., 11.68% and 4.82% macro-average accuracy across languages. We release MaLA-500 at https://huggingface.co/MaLA-LM.

1 Introduction

Large Language Models (LLMs), e.g., LLaMA (Touvron et al., 2023a; b), Mistral (Jiang et al., 2023; 2024), and ChatGPT (https://openai.com/blog/chatgpt), have shown remarkable performance in natural language understanding and generation. Follow-up studies (Bang et al., 2023; Lai et al., 2023; Ahuja et al., 2023a; b) observe that these English-centric LLMs, such as LLaMA with mainly English as the training data, are capable of handling some high-resource non-English languages, benefiting from the inclusion of non-English language data during pretraining. However, their applicability to low-resource languages is still limited due to data scarcity.

Previous studies have released pretrained multilingual models with mostly encoder-only transformer architectures, e.g., multilingual BERT (Devlin et al., 2019) and XLM-R (Conneau et al., 2020), for around 100 languages. The paradigm shift from encoder-only to decoder-only achieves scalability for large language models with billions of model parameters, leading to the development of open multilingual models. Recently, several generative multilingual LLMs, such as XGLM (Lin et al., 2021), mGPT (Shliazhko et al., 2022), and BLOOM (Scao et al., 2022), have emerged. Notably, the current language coverage for these generative LLMs is limited to up to 60 languages, highlighting the remaining need for further work on massively multilingual LLMs for many natural languages.

ImaniGooghari et al. (2023) achieved a significant milestone in massive language adaptation by extending the language coverage of a small-scale multilingual language model, XLM-R (Conneau et al., 2020), an auto-encoding model with 278M parameters, from 100 languages to 534 languages, yielding an extended model, Glot500-m, with 395M parameters. They introduce the Glot500-c corpus spanning 534 languages from 47 language families, and then apply vocabulary extension and continued pretraining to create Glot500-m. The introduction of Glot500-c mitigates the challenge of data scarcity for low-resource languages. Moreover, the adaptation method is more favorable than training from scratch, as it requires fewer computational resources and has a smaller carbon footprint. This success serves as a strong motivation for our exploration into the massive language adaptation of LLMs.

This work aims to extend the capabilities of LLMs to encompass a wider range of languages. Existing works on language adaptation of pretrained models, like ImaniGooghari et al. (2023), provide extended coverage across a wide linguistic spectrum but are limited to relatively small models, mostly at the scale of hundreds of millions of parameters, while other works, like Yong et al. (2022), extend generative LLMs but are limited to a small number of languages. Our study pushes the boundaries by exploring language adaptation techniques for LLMs with model parameters scaling up to 10 billion for 534 languages. Our investigation delves into generative LLMs with a substantial increase in model parameters and their in-context learning capabilities in diverse languages, especially low-resource languages. This augmentation enables us to enhance contextual and linguistic relevance across a diverse range of languages.

We address the challenges of adapting LLMs to low-resource languages, such as data sparsity, domain-specific vocabulary, and linguistic diversity. Specifically, we study continued pretraining of an open LLM, i.e., LLaMA 2 (Touvron et al., 2023b), vocabulary extension, and adaptation techniques, i.e., LoRA low-rank reparameterization (Hu et al., 2022). We deploy distributed training and release MaLA-500, which covers more than 500 languages in various domains. We evaluate MaLA-500 using intrinsic measures on the held-out Glot500-c test set and parallel data, and extrinsic metrics on downstream benchmarks: SIB200 and Taxi1500. The results show that MaLA-500 outperforms existing open LLMs of close or slightly larger model size. This work broadens the accessibility of LLMs, making them valuable for a more diverse set of language-specific use cases, especially for low-resource ones, and addressing the equality issue by removing language barriers for speakers of many languages, especially languages that are underrepresented in existing LLMs.

2 Massive Language Adaptation

Massive language adaptation of large language models combines a massively multilingual corpus (Section 2.1), a strong base LLM (Section 2.2), and techniques for effective language adaptation: vocabulary extension (Section 2.3) and continued pretraining (Section 2.4).

2.1 Data

We use Glot500-c (ImaniGooghari et al., 2023) covering 534 languages as the training data of MaLA-500. We define languages using the ISO 639-3 code combined with the corresponding written script; for example, “eng_Latn” represents English written in the Latin script. See §A for the list of languages with their data amounts. The original number of sentences ranges from 10 thousand to 63 million. Note that Glot500-c does not put full effort into collecting data for high-resource languages but focuses on low-resource languages. We sample languages from the imbalanced dataset according to a multinomial distribution, with $\alpha=0.3$ for vocabulary extension and continued pretraining. We use different scales for sampling data to be used in model training and vocabulary construction. After sampling, the number of sentences for training ranges from 600 thousand to 8 million per language, leading to 1 billion sentences in total. The number of sentences for vocabulary construction ranges from 30 thousand to 400 thousand, making a total of 50 million sentences.
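To illustrate how the exponent $\alpha=0.3$ flattens an imbalanced corpus, the following is a minimal sketch. It assumes the common formulation in which the sampling probability of language $i$ is proportional to $p_i^{\alpha}$, where $p_i$ is the language's share of sentences; the text above states only that a multinomial distribution with $\alpha=0.3$ is used. The language keys are hypothetical, and the two counts are simply the endpoints of the range quoted above.

```python
def sampling_probs(sentence_counts, alpha=0.3):
    """Return per-language probabilities q_i proportional to p_i ** alpha."""
    total = sum(sentence_counts.values())
    # p_i ** alpha with alpha < 1 shrinks the gap between large and small languages.
    weights = {lang: (n / total) ** alpha for lang, n in sentence_counts.items()}
    norm = sum(weights.values())
    return {lang: w / norm for lang, w in weights.items()}


# Hypothetical two-language corpus (keys are placeholders, not real languages).
counts = {"high_resource": 63_000_000, "low_resource": 10_000}
raw_share = {lang: n / sum(counts.values()) for lang, n in counts.items()}
probs = sampling_probs(counts, alpha=0.3)

for lang in counts:
    print(f"{lang}: raw share {raw_share[lang]:.5f} -> sampling prob {probs[lang]:.5f}")
```

With $\alpha<1$, the low-resource language is sampled far more often than its raw share would suggest, while the high-resource language is down-weighted.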

2.2 Model

We choose LLaMA 2 (Touvron et al., 2023b) as the starting point for continual training. LLaMA series models (Touvron et al., 2023a), with model weights released publicly, have gained popularity in the research community. Despite being English-centric compared to their multilingual counterparts, they have shown remarkable capacity for multiple languages (Ahuja et al., 2023b). We choose the latest LLaMA 2, trained on 2 trillion tokens, as our base model to benefit from its outstanding language capacity. Our study chooses the 7B model with 32 transformer layers, and leaves the extension of LLMs with larger sizes as future work.

2.3 Vocabulary Extension

The original LLaMA 2 tokenizer, with a vocabulary of 32,000 tokens, covers English and a small fraction of other European languages using Latin or Cyrillic scripts. To enhance its capability and encoding efficiency for a broader range of languages, we extend the vocabulary with Glot500-c. Specifically, we first train a multilingual tokenizer with SentencePiece (Kudo & Richardson, 2018) on the sampled Glot500-c with a vocabulary of 250,000. Subsequently, we merge the trained tokenizer with the original LLaMA 2 tokenizer by taking the union of their vocabularies. As a result, we obtain the MaLA-500 tokenizer with a vocabulary size of 260,164. After vocabulary extension and resizing the embedding layer, the model size becomes 8.6B.
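The merge step and the resulting model size can be checked with a short sketch. The helper `merge_vocabularies` and the toy token lists are hypothetical and operate on plain Python lists rather than real SentencePiece models. The size estimate further assumes a hidden size of 4096, untied input embeddings and output head, and 6,738,415,616 parameters for the base LLaMA 2-7B; these are properties of the base model that the text above does not state.

```python
def merge_vocabularies(base_vocab, new_vocab):
    """Union of two vocabularies that keeps the base token ids unchanged."""
    merged = list(base_vocab)
    seen = set(base_vocab)
    for token in new_vocab:
        if token not in seen:  # only tokens missing from the base are appended
            merged.append(token)
            seen.add(token)
    return merged


# Toy example with hypothetical tokens.
base = ["<unk>", "▁the", "▁a"]
new = ["▁the", "▁ನ", "▁a", "▁ک"]
merged = merge_vocabularies(base, new)

# Sizes reported in the text.
BASE_VOCAB, NEW_VOCAB, MERGED_VOCAB = 32_000, 250_000, 260_164
# If the merge is a plain union, this many tokens occur in both vocabularies.
implied_overlap = BASE_VOCAB + NEW_VOCAB - MERGED_VOCAB

# Assumed properties of LLaMA 2-7B (not given in the text).
HIDDEN_SIZE = 4096
BASE_PARAMS = 6_738_415_616
added_tokens = MERGED_VOCAB - BASE_VOCAB
# Each added token needs one row in the input embedding and one in the LM head.
extended_params = BASE_PARAMS + 2 * added_tokens * HIDDEN_SIZE

print(merged)
print(implied_overlap, f"{extended_params / 1e9:.2f}B")
```

Under these assumptions the estimate lands at roughly 8.6B parameters, in line with the figure reported above.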

We measure the impact of vocabulary extension on the development set of Glot500-c by analyzing the reduction in segmentation length for each language. The results indicate that the effect of vocabulary extension varies, ranging from 8% (English, eng_Latn) to 88% (Oriya, ori_Orya). Unsurprisingly, vocabulary extension has a larger effect on languages written in non-Latin scripts than on those in the Latin script. However, for some low-resource languages written in the Latin script, e.g., Kabiyè (kbp_Latn) and Vietnamese (vie_Latn), the segmentation length is shortened by around 50%.

2.4 Continued Pretraining

We employ continued pretraining for language adaptation with low-rank adaptation (LoRA, Hu et al., 2022) to enable parameter-efficient training, given the limitation of our computing resources. LoRA injects trainable rank decomposition matrices, which approximate the large weight matrices with a lower rank, into the pretrained model weights. It reduces the computational complexity and thus saves the training cost while retaining high model quality (Hu et al., 2022). We continually train the causal language model to update the rank-decomposition matrices, embedding layer, and language modeling head while freezing the transformer weights of pretrained models, allowing the continually trained language model to learn from new data in new languages without completely losing its previous language capacity. Continual training of large language models requires substantial computational resources. We adopt efficient distributed training setups on supercomputers to make the training process feasible.
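The low-rank update can be sketched on a toy layer. This follows the formulation of Hu et al. (2022), in which a frozen weight $W$ is augmented as $W + \frac{\alpha}{r} B A$, with $A$ initialized randomly and $B$ initialized to zero. It is a pure-Python illustration with hypothetical dimensions and values, not the training code used for MaLA-500.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute W x + (alpha / rank) * B (A x); only A and B would be trained."""
    frozen = matvec(W, x)                # frozen pretrained path
    low_rank = matvec(B, matvec(A, x))   # trainable low-rank path
    scale = alpha / rank
    return [f + scale * u for f, u in zip(frozen, low_rank)]


random.seed(0)
d, rank, alpha = 4, 2, 32  # toy sizes, chosen only for illustration
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]        # frozen
A = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(rank)]  # random init
B = [[0.0] * rank for _ in range(d)]                                  # zero init
x = [1.0, -2.0, 0.5, 3.0]

before = lora_forward(W, A, B, x, alpha, rank)  # equals W x because B is zero
B[0][0] = 0.1  # stand-in for a gradient update to B
after = lora_forward(W, A, B, x, alpha, rank)
print(before)
print(after)
```

Because $B$ starts at zero, the adapted layer initially reproduces the pretrained layer exactly, which is one reason continued training can begin without disturbing the base model's behavior.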

2.5 Training

Hardware and Software

We train our model on a computing cluster with a theoretical peak performance of 2 petaflops on GPU nodes. We deploy distributed training on 24 Nvidia Ampere A100 GPUs. As for software, we utilize Huggingface Transformers (Wolf et al., 2020), PEFT (Parameter-Efficient Fine-Tuning; https://huggingface.co/docs/peft/index), and DeepSpeed (Rasley et al., 2020). We use the ZeRO redundancy optimizer (Rajbhandari et al., 2020) and maximize the batch size that fits the memory of each GPU. We employ mixed-precision training using the bfloat16 format.

Hyperparameters

The learning rate is set at 3e-4. A weight decay of 0.01 is applied to penalize large weights and mitigate overfitting. The trainable LoRA module targets the query and value matrices. The language model head is not decomposed by a LoRA module but is trained in a full-parameter manner. In our setting, the final model has 10B parameters in total, in which 2B parameters are trainable. The LoRA module is incorporated with a rank of 8, an alpha value of 32, and a dropout rate of 0.05, contributing to the model’s adaptability and regularization during training. The context window is 4k. We maximize the batch size to fit the memory, making a global batch size of 384. The model undergoes three training epochs. Checkpoints are saved every 500 steps, and we employ early stopping to select the checkpoint that exhibits the most favorable average performance on downstream tasks.
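As a rough sanity check on the trainable parameter count, the sketch below adds up the LoRA matrices and the fully trained embedding layer and language model head. It assumes a hidden size of 4096, square query and value projections, and untied embedding and output matrices, which are properties of LLaMA 2-7B rather than figures given in this section. The breakdown of the 10B total is not given in the text, so only the trainable part is estimated.

```python
# Values stated in the text.
RANK = 8
NUM_LAYERS = 32
VOCAB_SIZE = 260_164
TARGETS_PER_LAYER = 2  # query and value projections

# Assumed property of LLaMA 2-7B (not stated in this section).
HIDDEN_SIZE = 4096

# Each adapted matrix gets A (rank x hidden) and B (hidden x rank).
lora_per_matrix = RANK * HIDDEN_SIZE + HIDDEN_SIZE * RANK
lora_total = lora_per_matrix * TARGETS_PER_LAYER * NUM_LAYERS

# The embedding layer and the LM head are trained in full.
embedding_and_head = 2 * VOCAB_SIZE * HIDDEN_SIZE

trainable = lora_total + embedding_and_head
print(f"LoRA: {lora_total:,}")
print(f"Embedding + head: {embedding_and_head:,}")
print(f"Trainable: {trainable / 1e9:.2f}B")
```

Under these assumptions the embedding layer and head dominate the trainable budget, and the total is on the order of the 2B trainable parameters reported above; the LoRA matrices themselves contribute only a few million.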

Environmental Impacts

We train our model in a carbon-neutral data center (https://www.csc.fi/sustainable-development), with all electricity generated from renewable hydropower; the waste heat is used for district heating to further reduce the CO2 footprint.

3 Evaluation

3.1 Benchmarks and Setup

We consider both intrinsic and extrinsic measures for evaluation. Evaluation dataset statistics are shown in Table 1.

Type	Datasets	Metric	$|$Data$|$	$|$Lang$|$	Domain
Intrinsic	Glot500-c test (ImaniGooghari et al., 2023)	$NLL$	1000	534	Misc
Intrinsic	PBC (Mayer & Cysouw, 2014)	$NLL$	500	370	Bible
Extrinsic	SIB200 (Adelani et al., 2023)	ACC	204	177	Misc
Extrinsic	Taxi1500 (Ma et al., 2023)	ACC	111	351	Bible

Table 1: Evaluation dataset statistics. $|$Data$|$: test set size per language. $|$Lang$|$: number of evaluated languages. $NLL$: negative log-likelihood. ACC: accuracy.

For intrinsic evaluation, perplexity is not comparable across models and languages due to different text segmentations. Inspired by Xue et al. (2022) and Yu et al. (2023), we instead measure the negative log-likelihood ($NLL$) of the text using the given LLMs.
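The following minimal sketch illustrates why segmentation affects perplexity but not $NLL$. It assumes that $NLL$ is the summed negative log-probability of the whole text and that perplexity is the exponential of the per-token average; the text above does not spell out either definition. The token log-probabilities are hypothetical.

```python
import math


def nll(token_logprobs):
    """Summed negative log-likelihood of a text, independent of token count."""
    return -sum(token_logprobs)


def perplexity(token_logprobs):
    """Per-token perplexity, which depends on how many tokens the text has."""
    return math.exp(nll(token_logprobs) / len(token_logprobs))


# The same text scored by two hypothetical models that assign it the same
# total probability but segment it into different numbers of tokens.
coarse_segmentation = [-2.0, -4.0]            # 2 tokens
fine_segmentation = [-1.0, -1.0, -2.0, -2.0]  # 4 tokens

for name, logprobs in [("coarse", coarse_segmentation), ("fine", fine_segmentation)]:
    print(name, "NLL =", nll(logprobs), "perplexity =", round(perplexity(logprobs), 2))
```

Both segmentations give the same $NLL$, while the perplexity differs only because the normalizing token count differs.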

We concatenate the dataset as the input text and adopt the sliding-window strategy (https://huggingface.co/docs/transformers/en/perplexity). The evaluation of different LLMs uses the same data with the concatenation of sentences per language, thus making $NLL$ model-comparable. In addition, we consider language-comparable $NLL$ by measuring $NLL$ on parallel data, in which every sample in different languages contains the same semantic information. We report the model-comparable $NLL$ on the Glot500-c test set covering all 534 considered languages (§3.2), and language-comparable $NLL$ on the Parallel Bible Corpus (PBC, Mayer & Cysouw, 2014), covering 370 languages (§3.3).
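The sliding-window strategy can be sketched as an index computation, following the strided scheme described in the Hugging Face perplexity guide linked above. The function `sliding_windows` and the values of `max_length` and `stride` are hypothetical; the window and stride sizes used in the evaluation are not stated here.

```python
def sliding_windows(seq_len, max_length, stride):
    """Return (begin, end, num_scored) triples covering a long token sequence.

    Each window feeds tokens[begin:end] to the model, but only the last
    num_scored tokens contribute to the NLL; earlier tokens serve as context.
    """
    if not 0 < stride <= max_length:
        raise ValueError("stride must be positive and at most max_length")
    windows = []
    prev_end = 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_length, seq_len)
        windows.append((begin, end, end - prev_end))
        prev_end = end
        if end == seq_len:
            break
    return windows


windows = sliding_windows(seq_len=10, max_length=4, stride=2)
for begin, end, num_scored in windows:
    print(f"tokens[{begin}:{end}] -> score last {num_scored}")
```

Every token is scored exactly once, so the per-window contributions add up to the $NLL$ of the full concatenated text.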

For extrinsic evaluation, we evaluate the few-shot learning capability of MaLA-500 and compare it with other LLMs on SIB200 (Adelani et al., 2023) and Taxi1500 (Ma et al., 2023).

SIB200 is a topic classification dataset. The classification task involves seven classes, namely science/technology, travel, politics, sports, health, entertainment, and geography. Our evaluation spans a diverse set of 177 languages, obtained by intersecting the language sets of SIB200 and Glot500-c. Note that the flores200-based SIB200 evaluation set is included in the training data since Glot500-c includes flores200, but the classification labels are not provided.

Taxi1500 is another text classification dataset spanning 351 languages. It involves six classes, namely, Recommendation, Faith, Description, Sin, Grace, and Violence. Our evaluation efforts aim to cover as many languages as possible. However, the evaluation of massively multilingual language models is a challenging task. Due to the lack of real-world multilingual evaluation benchmarks, we use this benchmark that contains religious content.

For in-context learning evaluation, the evaluated LLM receives a structured prompt, which is the concatenation of few-shot examples and the sample intended for prediction. The format for both a few-shot example and the sample to predict is defined as follows:

Template for SIB200:

The topic of the news [sent] is [label]

Template for Taxi1500:

The topic of the verse [sent] is [label]

where [sent] is the sentence for classification, and [label] is the ground truth. [label] is included when the sample serves as a few-shot example but is omitted when predicting the sample. The constructed prompt is then used as input to the LLM. Subsequently, the evaluated LLM is prompted to estimate the probability of the label over the label set based on the provided prompt.
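The prompt construction and label selection described above can be sketched as follows. The newline separator between examples and the helper names are assumptions. `score_fn` is a hypothetical stand-in for the LLM call that returns the log-probability of a label given the prompt; the example sentences are invented.

```python
TEMPLATES = {
    "sib200": "The topic of the news {sent} is",
    "taxi1500": "The topic of the verse {sent} is",
}


def format_example(task, sent, label=None):
    """Fill the template; the label is appended only for few-shot examples."""
    text = TEMPLATES[task].format(sent=sent)
    return text if label is None else f"{text} {label}"


def build_prompt(task, shots, query):
    """Concatenate labeled few-shot examples and the unlabeled query."""
    lines = [format_example(task, sent, label) for sent, label in shots]
    lines.append(format_example(task, query))
    return "\n".join(lines)  # the separator is an assumption


def predict(prompt, label_set, score_fn):
    """Pick the label with the highest score under the (hypothetical) scorer."""
    return max(label_set, key=lambda label: score_fn(prompt, label))


shots = [("The team won the final.", "sports")]
prompt = build_prompt("sib200", shots, "A new vaccine was approved.")


def dummy_score(prompt_text, label):
    # Placeholder for an LLM log-probability; here it simply prefers "health".
    return 0.0 if label == "health" else -1.0


print(prompt)
print(predict(prompt, ["sports", "health"], dummy_score))
```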

For SIB200, few-shot examples are randomly sampled from the in-language training sets. Since randomly selecting few-shot examples for in-context learning yields random results for both MaLA-500 and previous LLMs on Taxi1500, we consider retriever-based in-context learning (Liu et al., 2022). Specifically, we use average word embeddings in layer 8 of Glot500 (ImaniGooghari et al., 2023) to retrieve semantically similar samples, as suggested in previous work (Sabet et al., 2020), for all the compared models. The evaluation process is implemented using the lm-evaluation-harness (https://github.com/EleutherAI/lm-evaluation-harness), and we use accuracy (ACC) to measure the performance of classification.
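A minimal sketch of retrieval-based example selection is given below. It assumes cosine similarity between averaged word vectors as the ranking criterion, which the text does not specify, and it uses toy two-dimensional vectors in place of real layer-8 Glot500 embeddings. All names and data are hypothetical.

```python
import math


def mean_vector(word_vectors):
    """Average a list of word vectors into one sentence vector."""
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_vec, pool, k):
    """Return the k (sentence, label) pairs most similar to the query."""
    ranked = sorted(pool, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [(sent, label) for _, sent, label in ranked[:k]]


# Toy training pool: (sentence vector, sentence, label).
pool = [
    ([1.0, 0.0], "verse a", "Faith"),
    ([0.0, 1.0], "verse b", "Sin"),
    ([0.9, 0.1], "verse c", "Grace"),
]
query_vec = mean_vector([[1.0, 0.0], [0.8, 0.2]])  # two toy word vectors
selected = retrieve(query_vec, pool, k=2)
print(selected)
```

The selected pairs would then serve as the few-shot examples placed before the sample to predict.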

3.2 Comparison across LLMs

We compare MaLA-500 with LLaMA 2-7B, mGPT-13B, BLOOM-7B1, and XGLM-7.5B on the Glot500-c test set, SIB200, and Taxi1500 by computing the averaged performance across languages, and the results are given in Table 2. Among the evaluated LLMs, LLaMA 2-7B performs second-best, indicating that LLaMA 2-7B has a strong multilingual capacity and that it is reasonable to select it as the base model. MaLA-500 outperforms all compared LLMs with a close or slightly larger model size across all the evaluated tasks. Notably, compared to LLaMA 2-7B, MaLA-500 achieves a lower $NLL$ on the Glot500-c test set by 39.33, and has 14.94% and 4.82% improvements on SIB200 and Taxi1500, respectively. This highlights MaLA-500’s substantial contribution to enhancing the multilingual capacity of LLMs.

Model	Glot500-c test ( $NLL$ $\downarrow$ )	SIB200 (ACC $\uparrow$ )	Taxi1500 (ACC $\uparrow$ )
LLaMA 2-7B	190.58	42.08	44.07
mGPT-13B	282.46	45.34	40.98
BLOOM-7B1	202.95	44.63	43.98
XGLM-7.5B	205.07	34.36	43.24
MaLA-500	151.25	57.02	48.89

Table 2: Averaged results across languages on Glot500-c test (measured by $NLL$), SIB200, and Taxi1500 (measured by accuracy (%)) of different LLMs. mGPT has no model with around 7B parameters, so we choose a larger one with 13B parameters. $\downarrow$ indicates that lower is better. $\uparrow$ indicates that higher is better. The best results are bold.

Figures 1, 2 and 3 provide detailed performance analysis across languages on Glot500-c test, SIB200, and Taxi1500. In those figures, we group scores into different performance bins and display them in different colors. For Glot500-c test, MaLA-500 has more languages achieving better $NLL$, i.e., 61 languages with $NLL$ less than 100 and 171 languages with $NLL$ between 100 and 150. Besides, MaLA-500 has 54 (10%) languages with $NLL$ larger than 200, which may indicate that these languages are not well covered by the measured LLM. Nevertheless, this number is much smaller than for other LLMs. For example, the second-best LLM, LLaMA 2-7B, has 231 (43%) languages with $NLL$ larger than 200. For both SIB200 and Taxi1500, MaLA-500 surpasses previous LLMs in the sense that it obtains random results in fewer languages and achieves impressive performance in more languages than its counterparts.

Figure 1: $NLL$ (lower is better) on Glot500-c test with the scores grouped into four bins displayed in different colors. X-axis: the number of languages in performance ranges.

3.3 Comparison across Languages

To check in detail how MaLA-500 performs across languages, we examine the performance across language families (assigned based on Glottolog: https://glottolog.org/glottolog/family), shown in Table 3. We observe that more high-resource language families, e.g., Indo-European (indo1319) and Dravidian (drav1251), achieve slightly better performance than low-resource language families, e.g., Sino-Tibetan (sino1245).

Family	$|$Sent$|$	PBC ($NLL$ $\downarrow$)	SIB200 (ACC $\uparrow$)	Taxi1500 (ACC $\uparrow$)
indo1319	988M	145.35	63.53	53.03
drav1251	135M	131.29	56.25	54.65
aust1307	113M	147.37	62.83	49.69
turk1311	109M	161.71	57.08	52.55
afro1255	100M	165.46	52.00	43.74
atla1278	57M	141.92	42.90	45.52
ural1272	50M	137.52	66.67	48.58
sino1245	29M	155.64	49.30	49.31
other	60M	167.69	55.74	46.67

Table 3: Performance comparison across language families on PBC, SIB200, and Taxi1500. $|$Sent$|$: total number of sentences used for continued pretraining. $\downarrow$ indicates that lower is better. $\uparrow$ indicates that higher is better.

In Table 4, we present the top 5 performance improvements and declines across languages on SIB200 for MaLA-500 compared to LLaMA 2-7B. We observe that MaLA-500 improves substantially on low-resource scripts, e.g., Kannada (kan_Knda), while it performs worse on high-resource languages, e.g., Swedish (swe_Latn), which are already well covered by LLaMA 2-7B.

high end				low end
Language	LLaMA 2-7B	MaLA-500	$\Delta$	Language	LLaMA 2-7B	MaLA-500	$\Delta$
kan_Knda	17.16	57.35	40.19	swe_Latn	71.08	60.29	-10.79
ckb_Arab	19.61	60.29	40.68	rus_Cyrl	71.57	65.20	-06.37
asm_Beng	17.16	58.82	41.66	dan_Latn	69.12	63.24	-05.88
pan_Guru	14.22	58.82	44.60	pol_Latn	74.51	68.63	-05.88
sin_Sinh	15.20	60.29	45.09	ukr_Cyrl	71.57	65.69	-05.88

Table 4: Results for five languages each with the largest (high end) and smallest (low end) gains from MaLA-500 vs. LLaMA 2-7B for SIB200. $\Delta$ indicates the difference between the scores of MaLA-500 and LLaMA 2-7B. See §B for detailed results for each task.

In our comprehensive analysis of contributing factors on SIB200, we note that the corpus size of a language exhibits a weak correlation of 0.13 with its performance gain. In contrast, the corpus size of the language family to which a language belongs demonstrates a moderate correlation of 0.40. A moderately high Pearson correlation of 0.53 is observed between the effect of vocabulary extension, i.e., the reduction in segmentation length, and the performance gain. This observation holds true for languages with both non-Latin scripts, such as Kannada (kan_Knda), Malayalam (mal_Mlym), and Tigrinya (tir_Ethi), as well as Latin scripts, such as Igbo (ibo_Latn) and Yoruba (yor_Latn). It demonstrates the effectiveness of vocabulary extension.
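For reference, the Pearson correlation used in this analysis can be computed as in the sketch below. The paired values are hypothetical placeholders for the per-language reduction in segmentation length and the accuracy gain; they are not taken from the experiments.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical per-language values, for illustration only.
segmentation_reduction = [0.08, 0.25, 0.50, 0.70, 0.88]  # fraction of tokens saved
accuracy_gain = [-5.0, 12.0, 3.0, 30.0, 25.0]            # accuracy points

r = pearson(segmentation_reduction, accuracy_gain)
print(f"Pearson r = {r:.2f}")
```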

3.4 Effect of Number of Shots

Figure 4 illustrates the relationship between accuracy and the number of in-context examples (i.e., shots) on SIB200. As the number of in-context shots increases, there is a corresponding rise in accuracy. Notably, with just 1 shot, accuracy is 30.88% and exhibits randomness, indicating that 1 shot provides limited information for task learning. The transition from 1 shot to 2 shots/3 shots results in a notable improvement, with performance boosted by 19.83% and 26.14%, respectively. This highlights the effectiveness of increasing the number of shots. MaLA-500 achieves its peak performance at approximately 65% accuracy with 6-10 in-context shots. This may be attributed to the multi-class nature of the SIB200 dataset, necessitating more shots for learning intricate input-label mappings.

Figure 5 gives a more detailed view that is consistent with the observations made in Figure 4. With 1-shot in-context learning, approximately 50 languages exhibit erratic results. As the number of shots increases, the number of languages achieving low accuracy (25-50%) decreases, while the number achieving high accuracy (75-100%) grows.

Further examination into individual language trends reveals that some low-resource languages require more shots to achieve better performance (e.g., pes_Arab for Persian) or even exhibit poor performance with 10 shots (e.g., dzo_Tibt for Dzongkha and ayr_Latn for Central Aymara). In contrast, high-resource languages, such as fra_Latn for French, demonstrate impressive performance even with fewer shots, and increasing the number of shots results in only marginal improvement.

4 Related Work

4.1 Multilingual Language Models

Language model development has endeavored to broaden the scope of pretraining languages to address multilingual scenarios. Pretrained multilingual models have been able to accommodate up to a hundred or more languages. Noteworthy examples include mBERT (Devlin et al., 2019), which supports 104 languages, XLM-R (Conneau et al., 2020) covering 100 languages, mBART (Liu et al., 2020) designed for 25 languages, mT5 (Xue et al., 2021) spanning 101 languages, XGLM (Lin et al., 2021) across 30 languages, GPT-3 covering 118 languages (93% English), mGPT (Shliazhko et al., 2022) accommodating 60 languages, and BLOOM (Scao et al., 2022) supporting 46 languages and 13 programming languages.

Surprisingly, two recent multilingual language models have surpassed the conventional limit by supporting more than 400 languages. Glot500-m (ImaniGooghari et al., 2023) spans 511 languages through vocabulary extension and continued training based on XLM-R. SERENGETI (Adebara et al., 2022) goes even further by supporting 517 African languages and language varieties, written in five different scripts, employing models inspired by both ELECTRA (Clark et al., 2020) and XLM-R. MADLAD (Kudugunta et al., 2023) covers 419 languages and trains an 8B language model from scratch with an adapted UL2 objective (Tay et al., 2022). Our work is concurrent with the MADLAD-400 language model. We distinguish our work from it in three respects. 1) Language coverage: our work covers more than 500 languages, a number comparable to that of encoder-only models and surpassing MADLAD-400 by an additional 100 languages. 2) Training methods: we consider continual training to benefit from the learned knowledge of the original models. 3) Model architecture: we adopt an open model architecture, i.e., LLaMA, while MADLAD uses a decoder-only T5 architecture, which was not supported by the HuggingFace ecosystem at the time of writing, making it more difficult to use.

4.2 Language Adaptation

Before the advent of LLMs, diverse approaches were employed to adapt small-scale multilingual language models to new languages. These methods include using adapters (Pfeiffer et al., 2020; Üstün et al., 2020; Pfeiffer et al., 2020; Nguyen et al., 2021; Faisal & Anastasopoulos, 2022; Yong et al., 2022), vocabulary extension and substitution (Chau et al., 2020; Wang et al., 2020; Müller et al., 2020; 2021; Pfeiffer et al., 2021; Chen et al., 2023; Downey et al., 2023), leveraging monolingual corpora (Ebrahimi & Kann, 2021; Alabi et al., 2022), and utilizing bilingual lexicons (Wang et al., 2022).

While language models have been scaled up notably, their coverage is limited to a specific set of languages. To address this constraint, various methods have been proposed to expand the applicability of these large language models across a broader range of languages, catering to both general-purpose tasks and specific applications like machine translation. These methods also involve vocabulary extension (Cui et al., 2023), continued pretraining and instruction-tuning (Yong et al., 2022; Cui et al., 2023; Chen et al., 2024; Zhao et al., 2024), and parallel corpora exploitation (Cahyawijaya et al., 2023; Yang et al., 2023; Zhu et al., 2023; Xu et al., 2023). Despite these efforts, massive language adaptation of LLMs for general-purpose tasks across diverse languages, e.g., covering many language families and more than one hundred languages, remains an area yet to be thoroughly explored.

5 Conclusion and Future Work

We present a pioneering effort in massive language adaptation of LLMs, focusing on extending LLaMA 2-7B to our model, MaLA-500. This adaptation involves vocabulary extension and continued pretraining with LoRA. Our approach leads to MaLA-500 achieving state-of-the-art in-context learning capabilities, as demonstrated on the benchmarks of SIB200 and Taxi1500. We release the training scripts and model weights publicly to facilitate future research. This work marks a substantial advancement in applying LLMs to a diverse range of languages.

Our future work will focus on further improving the model capacity, for example, on machine translation across many language pairs. Alves et al. (2023) showed that LLMs (LLaMA-7B and LLaMA-13B) exhibited poor performance even on English-centric high-resource language pairs in some cases. Translation with LLMs on low-resource languages is more challenging. The LLaMA-7B model performed poorly in our preliminary experiments. Besides, our pretraining corpus does not intentionally include bilingual texts, and our MaLA-500 model is not instruction-tuned with translation data. We leave the inclusion of bilingual text during continual pretraining, instruction fine-tuning with translation data, and the evaluation on machine translation as future work.

Ethical Statement

LLMs have been known to exhibit biases present in their training data. When extending LLMs to low-resource languages, there is a risk of propagating biases from high-resource languages to underrepresented ones. Careful attention must be paid to mitigate bias and ensure fairness in data collection and model training. The paper aims to make LLMs more accessible for underrepresented languages. Still, there is a risk of creating a digital language divide if certain communities are left out due to limited technological access. Future work would address biases by conducting bias audits on the training data, debiasing the models during generation, and continuously monitoring model outputs.

Reproducibility Statement

We make the following efforts to ensure reproducible research. We release the model weights (https://huggingface.co/MaLA-LM) and code for training and evaluation (https://github.com/MaLA-LM/mala-500). We use publicly available evaluation benchmarks, which can be obtained freely or by request. The results are reproducible with our released model weights and evaluation scripts.

Acknowledgements

We thank José Pombal for constructive suggestions on training. This work is funded by The European Research Council (grants #740516, #771113 and #758969), EU’s Horizon Europe Research and Innovation Actions (UTTER, contract 101070631), and the European Union’s Horizon Europe research and innovation programme under grant agreement No 101070350 and from UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee [grant #10052546]. The authors wish to acknowledge CSC – IT Center for Science, Finland, for generous computational resources on the Mahti supercomputer and LUMI supercomputer through the LUMI extreme scale access (MOOMIN and LumiNMT). Shaoxiong Ji and Peiqin Lin acknowledge travel support from ELISE (GA no 951847).

References

Adebara et al. (2022) Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed, and Alcides Alcoba Inciarte. SERENGETI: Massively multilingual language models for africa. arXiv preprint arXiv:2212.10785, 2022.
Adelani et al. (2023) David Ifeoluwa Adelani, Hannah Liu, Xiaoyu Shen, Nikita Vassilyev, Jesujoba O. Alabi, Yanke Mao, Haonan Gao, and En-Shiun Annie Lee. SIB-200: A simple, inclusive, and big evaluation dataset for topic classification in 200+ languages and dialects. CoRR, abs/2309.07445, 2023. doi: 10.48550/arXiv.2309.07445. URL https://doi.org/10.48550/arXiv.2309.07445.
Ahuja et al. (2023a) Kabir Ahuja, Rishav Hada, Millicent Ochieng, Prachi Jain, Harshita Diddee, Samuel Maina, Tanuja Ganu, Sameer Segal, Maxamed Axmed, Kalika Bali, and Sunayana Sitaram. MEGA: multilingual evaluation of generative AI. CoRR, abs/2303.12528, 2023a. doi: 10.48550/arXiv.2303.12528. URL https://doi.org/10.48550/arXiv.2303.12528.
Ahuja et al. (2023b) Sanchit Ahuja, Divyanshu Aggarwal, Varun Gumma, Ishaan Watts, Ashutosh Sathe, Millicent Ochieng, Rishav Hada, Prachi Jain, Maxamed Axmed, Kalika Bali, and Sunayana Sitaram. MEGAVERSE: benchmarking large language models across languages, modalities, models and tasks. CoRR, abs/2311.07463, 2023b. doi: 10.48550/ARXIV.2311.07463. URL https://doi.org/10.48550/arXiv.2311.07463.
Alabi et al. (2022) Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, and Dietrich Klakow. Adapting pre-trained language models to african languages via multilingual adaptive fine-tuning. In Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, and Seung-Hoon Na (eds.), Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Republic of Korea, October 12-17, 2022, pp. 4336–4349. International Committee on Computational Linguistics, 2022. URL https://aclanthology.org/2022.coling-1.382.
Alves et al. (2023) Duarte Alves, Nuno Guerreiro, João Alves, José Pombal, Ricardo Rei, José de Souza, Pierre Colombo, and André FT Martins. Steering large language models for machine translation with finetuning and in-context learning. In Findings of the Association for Computational Linguistics: EMNLP 2023, pp. 11127–11148, 2023.
Bang et al. (2023) Yejin Bang, Samuel Cahyawijaya, Nayeon Lee, Wenliang Dai, Dan Su, Bryan Wilie, Holy Lovenia, Ziwei Ji, Tiezheng Yu, Willy Chung, Quyet V. Do, Yan Xu, and Pascale Fung. A multitask, multilingual, multimodal evaluation of ChatGPT on reasoning, hallucination, and interactivity. CoRR, abs/2302.04023, 2023. doi: 10.48550/arXiv.2302.04023. URL https://doi.org/10.48550/arXiv.2302.04023.
Cahyawijaya et al. (2023) Samuel Cahyawijaya, Holy Lovenia, Tiezheng Yu, Willy Chung, and Pascale Fung. Instruct-align: Teaching novel languages with to LLMs through alignment-based cross-lingual instruction. CoRR, abs/2305.13627, 2023. doi: 10.48550/arXiv.2305.13627. URL https://doi.org/10.48550/arXiv.2305.13627.
Chau et al. (2020) Ethan C. Chau, Lucy H. Lin, and Noah A. Smith. Parsing with multilingual BERT, a small treebank, and a small corpus. In Trevor Cohn, Yulan He, and Yang Liu (eds.), Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16-20 November 2020, volume EMNLP 2020 of Findings of ACL, pp. 1324–1334. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.findings-emnlp.118. URL https://doi.org/10.18653/v1/2020.findings-emnlp.118.
Chen et al. (2024) Pinzhen Chen, Shaoxiong Ji, Nikolay Bogoychev, Barry Haddow, and Kenneth Heafield. Monolingual or multilingual instruction tuning: Which makes a better Alpaca. In Findings of the Association for Computational Linguistics: EACL, 2024. URL https://doi.org/10.48550/arXiv.2309.08958.
Chen et al. (2023) Yihong Chen, Kelly Marchisio, Roberta Raileanu, David Ifeoluwa Adelani, Pontus Stenetorp, Sebastian Riedel, and Mikel Artetxe. Improving language plasticity via pretraining with active forgetting. CoRR, abs/2307.01163, 2023. doi: 10.48550/arXiv.2307.01163. URL https://doi.org/10.48550/arXiv.2307.01163.
Clark et al. (2020) Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. ELECTRA: Pre-training text encoders as discriminators rather than generators. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=r1xMH1BtvB.
Conneau et al. (2020) Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Édouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. Unsupervised cross-lingual representation learning at scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 8440–8451, 2020.
Cui et al. (2023) Yiming Cui, Ziqing Yang, and Xin Yao. Efficient and effective text encoding for Chinese LLaMA and Alpaca. CoRR, abs/2304.08177, 2023. doi: 10.48550/ARXIV.2304.08177. URL https://doi.org/10.48550/arXiv.2304.08177.
Devlin et al. (2019) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019.
Downey et al. (2023) C. M. Downey, Terra Blevins, Nora Goldfine, and Shane Steinert-Threlkeld. Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages. CoRR, abs/2309.04679, 2023. doi: 10.48550/arXiv.2309.04679. URL https://doi.org/10.48550/arXiv.2309.04679.
Ebrahimi & Kann (2021) Abteen Ebrahimi and Katharina Kann. How to adapt your pretrained multilingual model to 1600 languages. In Chengqing Zong, Fei Xia, Wenjie Li, and Roberto Navigli (eds.), Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, pp. 4555–4567. Association for Computational Linguistics, 2021. doi: 10.18653/v1/2021.acl-long.351. URL https://doi.org/10.18653/v1/2021.acl-long.351.
Faisal & Anastasopoulos (2022) Fahim Faisal and Antonios Anastasopoulos. Phylogeny-inspired adaptation of multilingual models to new languages. CoRR, abs/2205.09634, 2022. doi: 10.48550/arXiv.2205.09634. URL https://doi.org/10.48550/arXiv.2205.09634.
Hu et al. (2022) Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=nZeVKeeFYf9.
ImaniGooghari et al. (2023) Ayyoob ImaniGooghari, Peiqin Lin, Amir Hossein Kargaran, Silvia Severini, Masoud Jalili Sabet, Nora Kassner, Chunlan Ma, Helmut Schmid, André Martins, François Yvon, and Hinrich Schütze. Glot500: Scaling multilingual corpora and language models to 500 languages. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki (eds.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1082–1117, Toronto, Canada, July 2023. Association for Computational Linguistics. doi: 10.18653/v1/2023.acl-long.61. URL https://aclanthology.org/2023.acl-long.61.
Jiang et al. (2023) Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. Mistral 7b. CoRR, abs/2310.06825, 2023. doi: 10.48550/ARXIV.2310.06825. URL https://doi.org/10.48550/arXiv.2310.06825.
Jiang et al. (2024) Albert Q Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, et al. Mixtral of experts. arXiv preprint arXiv:2401.04088, 2024.
Kudo & Richardson (2018) Taku Kudo and John Richardson. SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. In Eduardo Blanco and Wei Lu (eds.), Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018: System Demonstrations, Brussels, Belgium, October 31 - November 4, 2018, pp. 66–71. Association for Computational Linguistics, 2018. doi: 10.18653/v1/d18-2012. URL https://doi.org/10.18653/v1/d18-2012.
Kudugunta et al. (2023) Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, et al. MADLAD-400: A multilingual and document-level large audited dataset. arXiv preprint arXiv:2309.04662, 2023.
Lai et al. (2023) Viet Dac Lai, Nghia Trung Ngo, Amir Pouran Ben Veyseh, Hieu Man, Franck Dernoncourt, Trung Bui, and Thien Huu Nguyen. ChatGPT beyond English: Towards a comprehensive evaluation of large language models in multilingual learning. CoRR, abs/2304.05613, 2023. doi: 10.48550/arXiv.2304.05613. URL https://doi.org/10.48550/arXiv.2304.05613.
Lin et al. (2021) Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O’Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona T. Diab, Veselin Stoyanov, and Xian Li. Few-shot learning with multilingual language models. CoRR, abs/2112.10668, 2021. URL https://arxiv.org/abs/2112.10668.
Liu et al. (2022) Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen. What makes good in-context examples for GPT-3? In Eneko Agirre, Marianna Apidianaki, and Ivan Vulic (eds.), Proceedings of Deep Learning Inside Out: The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures, DeeLIO@ACL 2022, Dublin, Ireland and Online, May 27, 2022, pp. 100–114. Association for Computational Linguistics, 2022. doi: 10.18653/V1/2022.DEELIO-1.10. URL https://doi.org/10.18653/v1/2022.deelio-1.10.
Liu et al. (2020) Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, and Luke Zettlemoyer. Multilingual denoising pre-training for neural machine translation. Transactions of the Association for Computational Linguistics, 8:726–742, 2020.
Ma et al. (2023) Chunlan Ma, Ayyoob ImaniGooghari, Haotian Ye, Ehsaneddin Asgari, and Hinrich Schütze. Taxi1500: A multilingual dataset for text classification in 1500 languages, 2023.
Mayer & Cysouw (2014) Thomas Mayer and Michael Cysouw. Creating a massively parallel bible corpus. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asunción Moreno, Jan Odijk, and Stelios Piperidis (eds.), Proceedings of the Ninth International Conference on Language Resources and Evaluation, LREC 2014, Reykjavik, Iceland, May 26-31, 2014, pp. 3158–3163. European Language Resources Association (ELRA), 2014. URL http://www.lrec-conf.org/proceedings/lrec2014/summaries/220.html.
Müller et al. (2020) Benjamin Müller, Benoît Sagot, and Djamé Seddah. Can multilingual language models transfer to an unseen dialect? A case study on North African Arabizi. CoRR, abs/2005.00318, 2020. URL https://arxiv.org/abs/2005.00318.
Müller et al. (2021) Benjamin Müller, Antonios Anastasopoulos, Benoît Sagot, and Djamé Seddah. When being unseen from mBERT is just the beginning: Handling new languages with multilingual language models. In Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tür, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, and Yichao Zhou (eds.), Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, Online, June 6-11, 2021, pp. 448–462. Association for Computational Linguistics, 2021. doi: 10.18653/v1/2021.naacl-main.38. URL https://doi.org/10.18653/v1/2021.naacl-main.38.
Nguyen et al. (2021) Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, and Thien Huu Nguyen. Trankit: A light-weight transformer-based toolkit for multilingual natural language processing. In Dimitra Gkatzia and Djamé Seddah (eds.), Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, EACL 2021, Online, April 19-23, 2021, pp. 80–90. Association for Computational Linguistics, 2021. doi: 10.18653/v1/2021.eacl-demos.10. URL https://doi.org/10.18653/v1/2021.eacl-demos.10.
Pfeiffer et al. (2020) Jonas Pfeiffer, Ivan Vulic, Iryna Gurevych, and Sebastian Ruder. MAD-X: an adapter-based framework for multi-task cross-lingual transfer. In Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu (eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, pp. 7654–7673. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.emnlp-main.617. URL https://doi.org/10.18653/v1/2020.emnlp-main.617.
Pfeiffer et al. (2021) Jonas Pfeiffer, Ivan Vulic, Iryna Gurevych, and Sebastian Ruder. UNKs everywhere: Adapting multilingual language models to new scripts. In Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (eds.), Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021, pp. 10186–10203. Association for Computational Linguistics, 2021. doi: 10.18653/v1/2021.emnlp-main.800. URL https://doi.org/10.18653/v1/2021.emnlp-main.800.
Rajbhandari et al. (2020) Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, and Yuxiong He. ZeRO: Memory optimizations toward training trillion parameter models. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1–16. IEEE, 2020.
Rasley et al. (2020) Jeff Rasley, Samyam Rajbhandari, Olatunji Ruwase, and Yuxiong He. DeepSpeed: System optimizations enable training deep learning models with over 100 billion parameters. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 3505–3506, 2020.
Sabet et al. (2020) Masoud Jalili Sabet, Philipp Dufter, François Yvon, and Hinrich Schütze. Simalign: High quality word alignments without parallel training data using static and contextualized embeddings. In Trevor Cohn, Yulan He, and Yang Liu (eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, EMNLP 2020, Online Event, 16-20 November 2020, volume EMNLP 2020 of Findings of ACL, pp. 1627–1643. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.findings-emnlp.147. URL https://doi.org/10.18653/v1/2020.findings-emnlp.147.
Scao et al. (2022) Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, et al. BLOOM: A 176b-parameter open-access multilingual language model. arXiv preprint arXiv:2211.05100, 2022.
Shliazhko et al. (2022) Oleh Shliazhko, Alena Fenogenova, Maria Tikhonova, Vladislav Mikhailov, Anastasia Kozlova, and Tatiana Shavrina. mGPT: Few-shot learners go multilingual. CoRR, abs/2204.07580, 2022. doi: 10.48550/arXiv.2204.07580. URL https://doi.org/10.48550/arXiv.2204.07580.
Tay et al. (2022) Yi Tay, Mostafa Dehghani, Vinh Q Tran, Xavier Garcia, Jason Wei, Xuezhi Wang, Hyung Won Chung, Dara Bahri, Tal Schuster, Steven Zheng, et al. UL2: Unifying language learning paradigms. In The Eleventh International Conference on Learning Representations, 2022.
Touvron et al. (2023a) Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurélien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. LLaMA: Open and efficient foundation language models. CoRR, abs/2302.13971, 2023a. doi: 10.48550/arXiv.2302.13971. URL https://doi.org/10.48550/arXiv.2302.13971.
Touvron et al. (2023b) Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton-Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurélien Rodriguez, Robert Stojnic, Sergey Edunov, and Thomas Scialom. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288, 2023b. doi: 10.48550/arXiv.2307.09288. URL https://doi.org/10.48550/arXiv.2307.09288.
Üstün et al. (2020) Ahmet Üstün, Arianna Bisazza, Gosse Bouma, and Gertjan van Noord. Udapter: Language adaptation for truly universal dependency parsing. In Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu (eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, pp. 2302–2315. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.emnlp-main.180. URL https://doi.org/10.18653/v1/2020.emnlp-main.180.
Wang et al. (2022) Xinyi Wang, Sebastian Ruder, and Graham Neubig. Expanding pretrained models to thousands more languages via lexicon-based adaptation. In Smaranda Muresan, Preslav Nakov, and Aline Villavicencio (eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, pp. 863–877. Association for Computational Linguistics, 2022. URL https://aclanthology.org/2022.acl-long.61.
Wang et al. (2020) Zihan Wang, Karthikeyan K, Stephen Mayhew, and Dan Roth. Extending multilingual BERT to low-resource languages. In Trevor Cohn, Yulan He, and Yang Liu (eds.), Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16-20 November 2020, volume EMNLP 2020 of Findings of ACL, pp. 2649–2656. Association for Computational Linguistics, 2020. doi: 10.18653/v1/2020.findings-emnlp.240. URL https://doi.org/10.18653/v1/2020.findings-emnlp.240.
Wolf et al. (2020) Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, et al. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations, pp. 38–45, 2020.
Xu et al. (2023) Haoran Xu, Young Jin Kim, Amr Sharaf, and Hany Hassan Awadalla. A paradigm shift in machine translation: Boosting translation performance of large language models. CoRR, abs/2309.11674, 2023. doi: 10.48550/ARXIV.2309.11674. URL https://doi.org/10.48550/arXiv.2309.11674.
Xue et al. (2021) Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, and Colin Raffel. mT5: A massively multilingual pre-trained text-to-text transformer. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 483–498, 2021.
Xue et al. (2022) Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, and Colin Raffel. ByT5: Towards a token-free future with pre-trained byte-to-byte models. Trans. Assoc. Comput. Linguistics, 10:291–306, 2022. doi: 10.1162/tacl_a_00461. URL https://doi.org/10.1162/tacl_a_00461.
Yang et al. (2023) Wen Yang, Chong Li, Jiajun Zhang, and Chengqing Zong. Bigtrans: Augmenting large language models with multilingual translation capability over 100 languages. CoRR, abs/2305.18098, 2023. doi: 10.48550/arXiv.2305.18098. URL https://doi.org/10.48550/arXiv.2305.18098.
Yong et al. (2022) Zheng Xin Yong, Hailey Schoelkopf, Niklas Muennighoff, Alham Fikri Aji, David Ifeoluwa Adelani, Khalid Almubarak, M. Saiful Bari, Lintang Sutawika, Jungo Kasai, Ahmed Baruwa, Genta Indra Winata, Stella Biderman, Dragomir Radev, and Vassilina Nikoulina. BLOOM+1: adding language support to BLOOM for zero-shot prompting. CoRR, abs/2212.09535, 2022. doi: 10.48550/arXiv.2212.09535. URL https://doi.org/10.48550/arXiv.2212.09535.
Yu et al. (2023) Lili Yu, Daniel Simig, Colin Flaherty, Armen Aghajanyan, Luke Zettlemoyer, and Mike Lewis. MEGABYTE: predicting million-byte sequences with multiscale transformers. CoRR, abs/2305.07185, 2023. doi: 10.48550/arXiv.2305.07185. URL https://doi.org/10.48550/arXiv.2305.07185.
Zhao et al. (2024) Jun Zhao, Zhihao Zhang, Qi Zhang, Tao Gui, and Xuanjing Huang. LLaMA beyond English: An empirical study on language capability transfer. arXiv preprint arXiv:2401.01055, 2024.
Zhu et al. (2023) Wenhao Zhu, Yunzhe Lv, Qingxiu Dong, Fei Yuan, Jingjing Xu, Shujian Huang, Lingpeng Kong, Jiajun Chen, and Lei Li. Extrapolating large language models to non-English by aligning languages. CoRR, abs/2308.04948, 2023. doi: 10.48550/arXiv.2308.04948. URL https://doi.org/10.48550/arXiv.2308.04948.

Appendix A Languages

Tables 5, 6, and 7 list the Glot500-c languages used to train MaLA-500, together with the number of available sentences and the language family of each language.

Lang	$|Sent|$	Family	Lang	$|Sent|$	Family	Lang	$|Sent|$	Family
hbs_Latn	63411156	indo1319	hin_Deva	7046700	indo1319	ton_Latn	1216118	aust1307
mal_Mlym	48098273	drav1251	kor_Hang	6468444	kore1284	tah_Latn	1190747	aust1307
aze_Latn	46300705		ory_Orya	6266475	indo1319	lat_Latn	1179913	indo1319
guj_Gujr	45738685	indo1319	urd_Arab	6009594	indo1319	srn_Latn	1172349	indo1319
ben_Beng	43514870	indo1319	swa_Latn	5989369		ewe_Latn	1161605	atla1278
kan_Knda	41836495	drav1251	sqi_Latn	5526836	indo1319	bem_Latn	1111969	atla1278
tel_Telu	41580525	drav1251	bel_Cyrl	5319675	indo1319	efi_Latn	1082621	atla1278
mlt_Latn	40654838	afro1255	afr_Latn	5157787	indo1319	bis_Latn	1070170	indo1319
fra_Latn	39197581	indo1319	nno_Latn	4899103	indo1319	orm_Latn	1067699
spa_Latn	37286756	indo1319	tat_Cyrl	4708088	turk1311	haw_Latn	1062491	aust1307
eng_Latn	36122761	indo1319	ast_Latn	4683554	indo1319	hmo_Latn	1033636	pidg1258
fil_Latn	33493255	aust1307	mon_Cyrl	4616960	mong1349	kat_Geor	1004297	kart1248
nob_Latn	32869205	indo1319	hbs_Cyrl	4598073	indo1319	pag_Latn	983637	aust1307
rus_Cyrl	31787973	indo1319	hau_Latn	4368483	afro1255	loz_Latn	964418	atla1278
deu_Latn	31015993	indo1319	sna_Latn	4019596	atla1278	fry_Latn	957422	indo1319
tur_Latn	29184662	turk1311	msa_Latn	3929084		mya_Mymr	945180	sino1245
pan_Guru	29052537	indo1319	som_Latn	3916769	afro1255	nds_Latn	944715	indo1319
mar_Deva	28748897	indo1319	srp_Cyrl	3864091	indo1319	run_Latn	943828	atla1278
por_Latn	27824391	indo1319	mlg_Latn	3715802		pnb_Arab	899895	indo1319
nld_Latn	25061426	indo1319	zul_Latn	3580113	atla1278	rar_Latn	894515	aust1307
ara_Arab	24524122		arz_Arab	3488224	afro1255	fij_Latn	887134	aust1307
zho_Hani	24143786		nya_Latn	3409030	atla1278	wls_Latn	882167	aust1307
ita_Latn	23539857	indo1319	tam_Taml	3388255	drav1251	ckb_Arab	874441	indo1319
ind_Latn	23018106	aust1307	hat_Latn	3226932	indo1319	ven_Latn	860249	atla1278
ell_Grek	22033282	indo1319	uzb_Latn	3223485	turk1311	zsm_Latn	859947	aust1307
bul_Cyrl	21823004	indo1319	sot_Latn	3205510	atla1278	chv_Cyrl	859863	turk1311
swe_Latn	20725883	indo1319	uzb_Cyrl	3029947	turk1311	lua_Latn	854359	atla1278
ces_Latn	20376340	indo1319	cos_Latn	3015055	indo1319	que_Latn	838486
isl_Latn	19547941	indo1319	als_Latn	2954874	indo1319	sag_Latn	771048	atla1278
pol_Latn	19339945	indo1319	amh_Ethi	2862985	afro1255	guw_Latn	767918	atla1278
ron_Latn	19190217	indo1319	sun_Latn	2586011	aust1307	bre_Latn	748954	indo1319
dan_Latn	19174573	indo1319	war_Latn	2584810	aust1307	toi_Latn	745385	atla1278
hun_Latn	18800025	ural1272	div_Thaa	2418687	indo1319	pus_Arab	731992	indo1319
tgk_Cyrl	18659517	indo1319	yor_Latn	2392359	atla1278	che_Cyrl	728201	nakh1245
srp_Latn	18371769	indo1319	fao_Latn	2365271	indo1319	pis_Latn	714783	indo1319
fas_Arab	18277593		uzn_Cyrl	2293672	turk1311	kon_Latn	685194
ceb_Latn	18149215	aust1307	smo_Latn	2290439	aust1307	oss_Cyrl	683517	indo1319
heb_Hebr	18128962	afro1255	bak_Cyrl	2264196	turk1311	hyw_Armn	679819	indo1319
hrv_Latn	17882932	indo1319	ilo_Latn	2106531	aust1307	iso_Latn	658789	atla1278
glg_Latn	17852274	indo1319	tso_Latn	2100708	atla1278	nan_Latn	656389	sino1245
fin_Latn	16730388	ural1272	mri_Latn	2046850	aust1307	lub_Latn	654390	atla1278
slv_Latn	15719210	indo1319	hmn_Latn	1903898		lim_Latn	652078	indo1319
vie_Latn	15697827	aust1305	asm_Beng	1882353	indo1319	tuk_Latn	649411	turk1311
mkd_Cyrl	14717004	indo1319	hil_Latn	1798875	aust1307	tir_Ethi	649117	afro1255
slk_Latn	14633631	indo1319	nso_Latn	1619354	atla1278	tgk_Latn	636541	indo1319
nor_Latn	14576191	indo1319	ibo_Latn	1543820	atla1278	yua_Latn	610052	maya1287
est_Latn	13600579		kin_Latn	1521612	atla1278	min_Latn	609065	aust1307
ltz_Latn	12997242	indo1319	hye_Armn	1463123	indo1319	lue_Latn	599429	atla1278
eus_Latn	12775959		oci_Latn	1449128	indo1319	khm_Khmr	590429	aust1305
lit_Latn	12479626	indo1319	lin_Latn	1408460	atla1278	tum_Latn	589857	atla1278
kaz_Cyrl	12378727	turk1311	tpi_Latn	1401844	indo1319	tll_Latn	586530	atla1278
lav_Latn	12143980	indo1319	twi_Latn	1400979	atla1278	ekk_Latn	582595	ural1272
bos_Latn	11014744	indo1319	kir_Cyrl	1397566	turk1311	lug_Latn	566948	atla1278
epo_Latn	8737198	arti1236	pap_Latn	1360138	indo1319	niu_Latn	566715	aust1307
cat_Latn	8648271	indo1319	nep_Deva	1317291	indo1319	tzo_Latn	540262	maya1287
tha_Thai	7735209	taik1256	azj_Latn	1315834	turk1311	mah_Latn	534614	aust1307
ukr_Cyrl	7462046	indo1319	bcl_Latn	1284493	aust1307	tvl_Latn	521556	aust1307
tgl_Latn	7411064	aust1307	xho_Latn	1262364	atla1278	jav_Latn	516833	aust1307
sin_Sinh	7293178	indo1319	cym_Latn	1244783	indo1319	vec_Latn	514240	indo1319
gle_Latn	7225513	indo1319	gaa_Latn	1222307	atla1278	jpn_Jpan	510722	japo1237

Table 5: List of languages of Glot500-c (Part I).

Lang	$|Sent|$	Family	Lang	$|Sent|$	Family	Lang	$|Sent|$	Family
lus_Latn	509250	sino1245	kmb_Latn	296269	atla1278	ncx_Latn	162558	utoa1244
crs_Latn	508755	indo1319	zai_Latn	277632	otom1299	qug_Latn	162500	quec1387
kqn_Latn	507913	atla1278	gym_Latn	274512	chib1249	rmn_Latn	162069	indo1319
ndo_Latn	496613	atla1278	bod_Tibt	273489	sino1245	cjk_Latn	160645	atla1278
snd_Arab	488730	indo1319	nde_Latn	269931	atla1278	arb_Arab	159884	afro1255
yue_Hani	484700	sino1245	fon_Latn	268566	atla1278	kea_Latn	158047	indo1319
tiv_Latn	483064	atla1278	ber_Latn	264426		mck_Latn	157521	atla1278
kua_Latn	473535	atla1278	nbl_Latn	259158	atla1278	arn_Latn	155882	arau1255
kwy_Latn	473274	atla1278	kmr_Latn	256677	indo1319	pdt_Latn	155485	indo1319
hin_Latn	466175	indo1319	guc_Latn	249044	araw1281	her_Latn	154827	atla1278
iku_Cans	465011		mam_Latn	248348	maya1287	gla_Latn	152563	indo1319
kal_Latn	462430	eski1264	nia_Latn	247406	aust1307	kmr_Cyrl	151728	indo1319
tdt_Latn	459818	aust1307	nyn_Latn	241992	atla1278	mwl_Latn	150054	indo1319
gsw_Latn	449240	indo1319	cab_Latn	240101	araw1281	nav_Latn	147702	atha1245
mfe_Latn	447435	indo1319	top_Latn	239232	toto1251	ksw_Mymr	147674	sino1245
swc_Latn	446378	atla1278	tog_Latn	231969	atla1278	mxv_Latn	147591	otom1299
mon_Latn	437950	mong1349	mco_Latn	231209	mixe1284	hif_Latn	147261	indo1319
mos_Latn	437666	atla1278	tzh_Latn	230706	maya1287	wol_Latn	146992	atla1278
kik_Latn	437228	atla1278	pms_Latn	227748	indo1319	sme_Latn	146803	ural1272
cnh_Latn	436667	sino1245	wuu_Hani	224088	sino1245	gom_Latn	143937	indo1319
gil_Latn	434529	aust1307	plt_Latn	220413	aust1307	bum_Latn	141673	atla1278
pon_Latn	434522	aust1307	yid_Hebr	220214	indo1319	mgr_Latn	138953	atla1278
umb_Latn	431589	atla1278	ada_Latn	219427	atla1278	ahk_Latn	135068	sino1245
lvs_Latn	422952	indo1319	iba_Latn	213615	aust1307	kur_Arab	134160	indo1319
sco_Latn	411591	indo1319	kek_Latn	209932	maya1287	bas_Latn	133436	atla1278
ori_Orya	410827		koo_Latn	209375	atla1278	bin_Latn	133256	atla1278
arg_Latn	410683	indo1319	sop_Latn	206501	atla1278	tsz_Latn	133251	tara1323
kur_Latn	407169	indo1319	kac_Latn	205542	sino1245	sid_Latn	130406	afro1255
dhv_Latn	405711	aust1307	qvi_Latn	205447	quec1387	diq_Latn	128908	indo1319
luo_Latn	398974	nilo1247	cak_Latn	204472	maya1287	srd_Latn	127064
lun_Latn	395764	atla1278	kbp_Latn	202877	atla1278	tcf_Latn	126050	otom1299
nzi_Latn	394247	atla1278	ctu_Latn	201662	maya1287	bzj_Latn	124958	indo1319
gug_Latn	392227	tupi1275	kri_Latn	201087	indo1319	udm_Cyrl	121705	ural1272
bar_Latn	387070	indo1319	mau_Latn	199134	otom1299	cce_Latn	120636	atla1278
bci_Latn	384059	atla1278	scn_Latn	199068	indo1319	meu_Latn	120273	aust1307
chk_Latn	380596	aust1307	tyv_Cyrl	198649	turk1311	chw_Latn	119751	atla1278
roh_Latn	377067	indo1319	ina_Latn	197315	arti1236	cbk_Latn	118789	indo1319
aym_Latn	373329	ayma1253	btx_Latn	193701	aust1307	ibg_Latn	118733	aust1307
yap_Latn	358929	aust1307	nch_Latn	193129	utoa1244	bhw_Latn	117381	aust1307
ssw_Latn	356561	atla1278	ncj_Latn	192962	utoa1244	ngu_Latn	116851	utoa1244
quz_Latn	354781	quec1387	pau_Latn	190529	aust1307	nyy_Latn	115914	atla1278
sah_Cyrl	352697	turk1311	toj_Latn	189651	maya1287	szl_Latn	112496	indo1319
tsn_Latn	350954	atla1278	pcm_Latn	187594	indo1319	ish_Latn	111814	atla1278
lmo_Latn	348135	indo1319	dyu_Latn	186367	mand1469	naq_Latn	109747	khoe1240
ido_Latn	331239	arti1236	kss_Latn	185868	atla1278	toh_Latn	107583	atla1278
abk_Cyrl	321578	abkh1242	afb_Arab	183694	afro1255	ttj_Latn	106925	atla1278
zne_Latn	318871	atla1278	urh_Latn	182214	atla1278	nse_Latn	105189	atla1278
quy_Latn	311040	quec1387	quc_Latn	181559	maya1287	hsb_Latn	104802	indo1319
kam_Latn	310659	atla1278	new_Deva	181427	sino1245	ami_Latn	104559	aust1307
bbc_Latn	310420	aust1307	yao_Latn	179965	atla1278	alz_Latn	104392	nilo1247
vol_Latn	310399	arti1236	ngl_Latn	178498	atla1278	apc_Arab	102392	afro1255
wal_Latn	309873	gong1255	nyu_Latn	177483	atla1278	vls_Latn	101900	indo1319
uig_Arab	307302	turk1311	kab_Latn	176015	afro1255	mhr_Cyrl	100474	ural1272
vmw_Latn	306899	atla1278	tuk_Cyrl	175769	turk1311	djk_Latn	99234	indo1319
kwn_Latn	305362	atla1278	xmf_Geor	174994	kart1248	wes_Latn	98492	indo1319
pam_Latn	303737	aust1307	ndc_Latn	174305	atla1278	gkn_Latn	97041	atla1278
seh_Latn	300243	atla1278	san_Deva	165616	indo1319	grc_Grek	96986	indo1319
tsc_Latn	298442	atla1278	nba_Latn	163485	atla1278	hbo_Hebr	96484	afro1255
nyk_Latn	297976	atla1278	bpy_Beng	162838	indo1319	swh_Latn	95776	atla1278

Table 6: List of languages of Glot500-c (Part II).

Lang	$|Sent|$	Family	Lang	$|Sent|$	Family	Lang	$|Sent|$	Family
alt_Cyrl	95148	turk1311	mny_Latn	50581	atla1278	csy_Latn	34126	sino1245
rmn_Grek	94533	indo1319	gkp_Latn	50549	mand1469	azb_Arab	33758	turk1311
miq_Latn	94343	misu1242	kat_Latn	50424	kart1248	csb_Latn	33743	indo1319
kaa_Cyrl	88815	turk1311	bjn_Latn	49068	aust1307	tpm_Latn	33517	atla1278
kos_Latn	88603	aust1307	acr_Latn	48886	maya1287	quw_Latn	33449	quec1387
grn_Latn	87568		dtp_Latn	48468	aust1307	rmy_Cyrl	33351	indo1319
lhu_Latn	87255	sino1245	lam_Latn	46853	atla1278	ixl_Latn	33289	maya1287
lzh_Hani	86035	sino1245	bik_Latn	46561		mbb_Latn	33240	aust1307
ajp_Arab	83297	afro1255	poh_Latn	46454	maya1287	pfl_Latn	33148	indo1319
cmn_Hani	80745	sino1245	phm_Latn	45862	atla1278	pcd_Latn	32867	indo1319
gcf_Latn	80737	indo1319	hrx_Latn	45716	indo1319	tlh_Latn	32863	arti1236
rmn_Cyrl	79925	indo1319	quh_Latn	45566	quec1387	suz_Deva	32811	sino1245
kjh_Cyrl	79262	turk1311	hyw_Cyrl	45379	indo1319	gcr_Latn	32676	indo1319
rng_Latn	78177	atla1278	rue_Cyrl	45369	indo1319	jbo_Latn	32619	arti1236
mgh_Latn	78117	atla1278	eml_Latn	44630	indo1319	tbz_Latn	32264	atla1278
xmv_Latn	77896	aust1307	acm_Arab	44505	afro1255	bam_Latn	32150	mand1469
ige_Latn	77114	atla1278	tob_Latn	44473	guai1249	prk_Latn	32085	aust1305
rmy_Latn	76991	indo1319	ach_Latn	43974	nilo1247	jam_Latn	32048	indo1319
srm_Latn	76884	indo1319	vep_Latn	43076	ural1272	twx_Latn	32028	atla1278
bak_Latn	76809	turk1311	npi_Deva	43072	indo1319	nmf_Latn	31997	sino1245
gur_Latn	76151	atla1278	tok_Latn	42820	arti1236	caq_Latn	31903	aust1305
idu_Latn	75106	atla1278	sgs_Latn	42467	indo1319	rop_Latn	31889	indo1319
yom_Latn	74818	atla1278	lij_Latn	42447	indo1319	tca_Latn	31852	ticu1244
tdx_Latn	74430	aust1307	myv_Cyrl	42147	ural1272	yan_Latn	31775	misu1242
mzn_Arab	73719	indo1319	tih_Latn	41873	aust1307	xav_Latn	31765	nucl1710
cfm_Latn	70227	sino1245	tat_Latn	41640	turk1311	bih_Deva	31658
zpa_Latn	69237	otom1299	lfn_Latn	41632	arti1236	cuk_Latn	31612	chib1249
kbd_Cyrl	67914	abkh1242	cgg_Latn	41196	atla1278	kjb_Latn	31471	maya1287
lao_Laoo	66966	taik1256	ful_Latn	41188	atla1278	hne_Deva	31465	indo1319
nap_Latn	65826	indo1319	gor_Latn	41174	aust1307	wbm_Latn	31394	aust1305
qub_Latn	64973	quec1387	ile_Latn	40984	arti1236	zlm_Latn	31345	aust1307
oke_Latn	64508	atla1278	ium_Latn	40683	hmon1336	tui_Latn	31161	atla1278
ote_Latn	64224	otom1299	teo_Latn	40203	nilo1247	ifb_Latn	30980	aust1307
bsb_Latn	63634	aust1307	kia_Latn	40035	atla1278	izz_Latn	30894	atla1278
ogo_Latn	61901	atla1278	crh_Cyrl	39985	turk1311	rug_Latn	30857	aust1307
abn_Latn	61830	atla1278	crh_Latn	39896	turk1311	aka_Latn	30704	atla1278
ldi_Latn	61827	atla1278	enm_Latn	39809	indo1319	pxm_Latn	30698	book1242
ayr_Latn	61570	ayma1253	sat_Olck	39614	aust1305	kmm_Latn	30671	sino1245
gom_Deva	61140	indo1319	mad_Latn	38993	aust1307	mcn_Latn	30666	afro1255
bba_Latn	61123	atla1278	cac_Latn	38812	maya1287	ifa_Latn	30621	aust1307
aln_Latn	60989	indo1319	hnj_Latn	38611	hmon1336	dln_Latn	30620	sino1245
leh_Latn	59944	atla1278	ksh_Latn	38130	indo1319	ext_Latn	30605	indo1319
ban_Latn	59805	aust1307	ikk_Latn	38071	atla1278	ksd_Latn	30550	aust1307
ace_Latn	59333	aust1307	sba_Latn	38040	cent2225	mzh_Latn	30517	mata1289
pes_Arab	57511	indo1319	zom_Latn	37013	sino1245	llb_Latn	30480	atla1278
skg_Latn	57228	aust1307	bqc_Latn	36881	mand1469	hra_Latn	30472	sino1245
ary_Arab	56933	afro1255	bim_Latn	36835	atla1278	mwm_Latn	30432	cent2225
hus_Latn	56176	maya1287	mdy_Ethi	36370	gong1255	krc_Cyrl	30353	turk1311
glv_Latn	55641	indo1319	bts_Latn	36216	aust1307	tuc_Latn	30349	aust1307
fat_Latn	55609	atla1278	gya_Latn	35902	atla1278	mrw_Latn	30304	aust1307
frr_Latn	55254	indo1319	ajg_Latn	35631	atla1278	pls_Latn	30136	otom1299
mwn_Latn	54805	atla1278	agw_Latn	35585	aust1307	rap_Latn	30102	aust1307
mai_Deva	54687	indo1319	kom_Cyrl	35249	ural1272	fur_Latn	30052	indo1319
dua_Latn	53392	atla1278	knv_Latn	35196		kaa_Latn	30031	turk1311
dzo_Tibt	52732	sino1245	giz_Latn	35040	afro1255	prs_Arab	26823	indo1319
ctd_Latn	52135	sino1245	hui_Latn	34926	nucl1709	san_Latn	25742	indo1319
nnb_Latn	52041	atla1278	kpg_Latn	34900	aust1307	som_Arab	14199	afro1255
sxn_Latn	51749	aust1307	zea_Latn	34426	indo1319	uig_Latn	9637	turk1311
mps_Latn	50645	tebe1251	aoj_Latn	34349	nucl1708	hau_Arab	9593	afro1255

Table 7: List of languages of Glot500-c (Part III).

Appendix B Detailed Results

Detailed evaluation results are shown in Tables 8-15 ($NLL$ on Glot500-c), Tables 16-21 ($NLL$ on PBC), Tables 22-23 (ACC on SIB200), and Tables 24-29 (ACC on Taxi1500).
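As a reading aid for the per-language tables, the following is a minimal sketch of how one might post-process them, assuming the rows are available as tab-separated text in the column order used in this appendix. The helper names `best_model_per_language` and `macro_average` are hypothetical and are not part of the released code. The sample rows are copied from Table 8, and lower $NLL$ is better.

```python
import csv
import io

# Column order as in the appendix tables; lower NLL is better.
MODELS = ["LLaMA 2-7B", "mGPT-13B", "BLOOM-7B1", "XGLM-7.5B", "MaLA-500"]

# Three rows copied from Table 8, used only for illustration.
SAMPLE = (
    "abk_Cyrl\t234.09\t249.16\t258.26\t231.44\t164.61\n"
    "acm_Arab\t119.15\t153.09\t106.29\t101.35\t135.82\n"
    "afr_Latn\t52.43\t84.47\t73.24\t75.60\t64.25\n"
)


def parse_rows(tsv_text):
    """Yield (language, [score per model]) pairs from tab-separated rows."""
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        yield row[0], [float(x) for x in row[1:]]


def best_model_per_language(tsv_text):
    """Map each language code to the model with the lowest NLL (hypothetical helper)."""
    return {
        lang: MODELS[scores.index(min(scores))]
        for lang, scores in parse_rows(tsv_text)
    }


def macro_average(tsv_text):
    """Unweighted mean of each model's score across languages (hypothetical helper)."""
    rows = [scores for _, scores in parse_rows(tsv_text)]
    return {
        model: sum(scores[i] for scores in rows) / len(rows)
        for i, model in enumerate(MODELS)
    }


if __name__ == "__main__":
    print(best_model_per_language(SAMPLE))
    print(macro_average(SAMPLE))
```

Applied to a full table, the first helper shows for which languages each model attains the lowest $NLL$, and the second gives an unweighted per-model mean over the languages passed in.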

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
abk_Cyrl	234.09	249.16	258.26	231.44	164.61
abn_Latn	140.01	197.81	153.58	152.90	111.86
ace_Latn	235.15	332.18	244.00	259.64	168.79
ach_Latn	179.03	227.84	194.55	197.05	161.01
acm_Arab	119.15	153.09	106.29	101.35	135.82
acr_Latn	301.73	399.80	321.79	316.49	194.71
ada_Latn	132.76	168.56	150.19	137.99	103.17
afb_Arab	134.03	169.73	112.55	110.59	152.58
afr_Latn	52.43	84.47	73.24	75.60	64.25
agw_Latn	228.22	318.95	246.48	242.04	152.59
ahk_Latn	229.45	377.60	245.81	241.21	163.96
ajg_Latn	146.48	185.41	170.89	155.21	113.83
ajp_Arab	153.34	199.79	129.62	124.24	164.80
aka_Latn	163.59	223.13	166.49	185.41	131.50
aln_Latn	191.62	259.76	218.75	267.34	143.64
als_Latn	191.60	271.51	219.17	260.14	155.23
alt_Cyrl	199.25	220.77	200.70	215.71	139.18
alz_Latn	167.89	214.64	185.35	171.34	155.03
amh_Ethi	328.25	834.56	407.68	550.50	268.11
ami_Latn	122.67	168.42	131.77	132.36	109.13
aoj_Latn	318.62	495.44	340.07	316.36	196.64
apc_Arab	131.19	153.97	106.78	109.24	145.81
ara_Arab	111.05	155.64	80.72	84.86	140.73
arb_Arab	166.93	318.76	135.76	137.80	173.03
arg_Latn	173.62	306.23	171.32	178.40	160.08
arn_Latn	202.09	292.40	204.32	216.04	163.87
ary_Arab	198.80	309.90	184.82	176.58	173.37
arz_Arab	122.74	248.72	95.61	100.43	131.75
asm_Beng	264.49	409.59	172.35	311.81	184.77
ast_Latn	208.41	325.35	184.93	192.86	178.77
aym_Latn	143.36	183.42	149.06	154.45	117.28
ayr_Latn	274.31	342.40	288.57	293.48	185.87
azb_Arab	254.60	293.24	273.20	285.61	162.94
aze_Latn	156.58	230.45	195.32	189.59	110.56
azj_Latn	168.12	228.08	212.31	199.86	126.98
bak_Cyrl	274.50	348.47	288.93	307.95	169.00
bak_Latn	191.06	259.97	196.98	213.41	152.50
bam_Latn	195.29	251.28	203.50	215.62	171.51
ban_Latn	205.77	297.97	213.20	213.89	186.89
bar_Latn	210.97	287.33	234.73	208.66	188.90
bas_Latn	137.53	172.78	143.37	147.13	110.71
bba_Latn	233.68	286.30	258.58	238.94	164.18
bbc_Latn	172.78	216.78	181.59	170.06	148.89
bci_Latn	176.81	223.93	190.52	189.46	171.00
bcl_Latn	149.22	209.44	162.25	174.40	132.55
bel_Cyrl	110.77	174.19	142.62	147.27	85.11
bem_Latn	182.62	222.50	198.45	150.51	158.31
ben_Beng	92.79	162.83	50.33	55.42	73.86
ber_Latn	88.37	120.03	87.79	101.52	71.90
bhw_Latn	186.42	245.14	194.41	188.81	155.12
bih_Deva	248.12	422.46	176.37	204.17	180.31
bik_Latn	151.63	218.03	173.42	187.11	137.28
bim_Latn	229.29	284.29	244.21	245.34	166.16
bin_Latn	137.28	175.41	152.32	152.02	109.51
bis_Latn	165.83	250.17	179.61	190.13	130.32
bjn_Latn	200.57	302.58	202.67	199.15	182.65
bod_Tibt	437.54	1690.09	461.35	80.21	286.05
bos_Latn	87.13	175.82	131.95	149.85	110.92
bpy_Beng	251.20	471.67	154.31	172.17	155.64
bqc_Latn	208.00	266.53	226.49	205.65	153.58
bre_Latn	222.93	276.71	208.07	260.44	184.35
bsb_Latn	236.62	358.90	275.10	306.64	204.50
bts_Latn	214.80	292.93	232.31	217.74	156.31
btx_Latn	169.13	227.44	181.86	174.25	148.25
bul_Cyrl	47.01	90.81	77.70	42.90	57.12
bum_Latn	183.88	237.35	194.64	195.91	156.33
bzj_Latn	167.62	244.15	188.25	194.46	137.81

Table 8: Detailed results of $NLL$ on Glot500-c (Part I).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
cab_Latn	222.05	292.04	234.53	237.57	168.63
cac_Latn	293.47	395.52	310.33	301.30	192.22
cak_Latn	295.24	394.87	317.52	309.03	200.69
caq_Latn	240.00	323.71	264.17	257.49	164.95
cat_Latn	94.68	212.17	83.26	86.26	130.00
cbk_Latn	143.05	221.60	145.69	159.41	137.96
cce_Latn	178.45	226.07	190.01	192.54	152.70
ceb_Latn	136.44	278.02	164.94	183.55	123.31
ces_Latn	44.83	98.77	68.48	76.15	58.42
cfm_Latn	240.20	305.25	252.92	256.79	185.94
cgg_Latn	121.16	160.92	127.35	129.19	107.91
che_Cyrl	199.15	272.63	203.57	197.17	158.57
chk_Latn	189.52	258.69	201.19	200.61	145.98
chv_Cyrl	246.19	292.36	252.81	229.56	157.91
chw_Latn	139.07	174.73	142.88	121.98	121.16
cjk_Latn	125.30	158.06	134.03	128.75	106.21
ckb_Arab	372.24	437.95	370.20	521.30	243.30
cmn_Hani	52.17	92.04	40.75	49.81	62.30
cnh_Latn	185.01	242.39	198.20	198.57	147.90
cos_Latn	192.02	323.30	210.38	211.96	185.03
crh_Cyrl	236.43	282.79	239.67	260.03	141.08
crh_Latn	149.67	240.28	168.79	157.01	131.91
crs_Latn	153.11	202.53	153.34	87.81	129.39
csb_Latn	238.86	336.99	261.46	294.41	166.29
csy_Latn	226.53	299.52	249.53	245.14	172.03
ctd_Latn	210.45	276.87	227.39	224.34	158.35
ctu_Latn	216.90	310.89	226.68	220.32	157.27
cuk_Latn	233.42	325.97	252.00	247.83	190.81
cym_Latn	233.91	369.64	306.05	332.89	217.29
dan_Latn	43.75	84.32	69.51	66.96	54.56
deu_Latn	37.46	68.68	49.65	33.88	53.45
dhv_Latn	121.21	170.85	126.68	128.57	95.81
diq_Latn	174.75	265.78	180.00	190.78	147.56
div_Thaa	314.55	565.83	314.34	17.32	153.76
djk_Latn	188.44	249.39	201.50	207.16	163.39
dln_Latn	217.51	288.73	231.93	238.10	165.40
dtp_Latn	267.22	373.92	279.80	287.18	184.75
dua_Latn	131.20	169.64	136.03	129.20	109.86
dyu_Latn	186.37	237.65	193.19	205.47	157.89
dzo_Tibt	238.61	842.40	244.70	47.40	154.48
efi_Latn	178.91	251.07	205.96	203.93	134.40
ekk_Latn	155.86	223.64	194.37	89.18	141.19
ell_Grek	52.85	86.68	67.98	36.04	54.45
eml_Latn	213.57	278.33	224.10	225.17	163.91
eng_Latn	30.45	62.73	31.32	34.36	48.60
enm_Latn	79.08	193.74	108.20	119.78	87.78
epo_Latn	68.89	99.75	79.80	87.72	70.22
est_Latn	70.18	100.28	88.33	40.53	67.38
eus_Latn	79.07	87.15	48.33	45.59	70.49
ewe_Latn	208.53	269.62	218.53	195.99	148.78
ext_Latn	216.92	338.22	211.26	231.30	177.17
fao_Latn	202.04	284.61	227.56	263.89	165.45
fas_Arab	138.13	193.21	163.46	166.76	133.69
fat_Latn	134.67	180.66	144.54	144.50	106.86
fij_Latn	159.86	219.85	191.04	137.83	147.71
fil_Latn	120.89	206.21	162.04	161.84	120.27
fin_Latn	46.88	86.18	79.58	35.79	58.35
fon_Latn	237.19	295.74	256.54	262.29	160.24
fra_Latn	32.26	63.71	31.08	32.74	49.22
frr_Latn	192.91	299.41	206.26	211.00	144.13
fry_Latn	191.87	247.81	205.02	221.64	168.86
ful_Latn	447.47	550.03	457.25	511.87	339.38
fur_Latn	231.23	313.99	234.38	250.02	183.57
gaa_Latn	188.66	232.67	222.71	158.83	146.37
gcf_Latn	132.36	173.10	130.03	91.07	103.54
gcr_Latn	113.22	157.83	115.02	79.46	94.40
gil_Latn	175.92	237.54	187.79	181.71	154.60
giz_Latn	244.47	332.32	268.61	266.29	168.09

Table 9: Detailed results of $NLL$ on Glot500-c (Part II).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
gkn_Latn	223.58	304.46	253.54	245.24	167.81
gkp_Latn	261.56	358.97	280.80	270.48	186.41
gla_Latn	220.92	382.20	293.89	315.23	210.51
gle_Latn	203.10	345.45	276.11	299.80	206.52
glg_Latn	120.88	204.76	108.43	122.45	132.58
glv_Latn	232.86	326.69	247.79	265.04	182.93
gom_Deva	328.82	462.17	324.77	358.50	233.15
gom_Latn	244.57	318.36	259.71	257.90	209.13
gor_Latn	217.70	326.26	232.98	239.37	168.23
grc_Grek	126.86	277.73	181.00	127.62	141.80
grn_Latn	293.70	382.11	298.10	316.62	204.94
gsw_Latn	180.67	226.37	199.03	171.72	157.34
guc_Latn	241.99	340.92	257.19	234.87	183.29
gug_Latn	197.04	258.55	201.92	214.05	158.39
guj_Gujr	118.82	291.38	74.12	194.71	90.02
gur_Latn	222.48	311.22	243.52	233.99	173.11
guw_Latn	210.29	215.37	235.91	246.28	146.55
gya_Latn	242.48	350.56	274.82	258.26	170.00
gym_Latn	231.32	324.92	249.32	191.13	178.06
hat_Latn	237.00	341.48	251.07	150.39	201.88
hau_Arab	173.08	330.75	130.96	129.69	230.02
hau_Latn	228.21	300.72	257.22	265.65	191.68
haw_Latn	190.18	300.25	217.54	213.20	174.30
hbo_Hebr	140.73	315.06	194.19	200.98	155.08
hbs_Cyrl	206.87	503.80	370.83	417.41	225.22
hbs_Latn	209.02	451.95	333.95	375.92	223.11
heb_Hebr	48.34	63.09	58.19	63.73	56.05
her_Latn	140.31	172.32	146.72	136.86	109.29
hif_Latn	396.80	613.23	471.65	465.81	371.92
hil_Latn	145.89	207.79	161.39	182.01	126.09
hin_Deva	142.07	289.53	105.86	106.38	166.12
hin_Latn	150.11	247.31	166.34	164.94	176.00
hmn_Latn	241.00	375.11	282.60	284.95	182.91
hmo_Latn	165.38	236.46	178.12	142.53	133.19
hne_Deva	201.66	298.38	171.37	184.06	161.30
hnj_Latn	231.56	324.18	263.80	278.94	141.56
hra_Latn	215.87	271.46	228.66	229.46	169.57
hrv_Latn	43.03	82.09	63.02	69.82	54.08
hrx_Latn	131.34	182.33	140.90	135.17	105.20
hsb_Latn	182.90	293.15	211.79	235.34	127.71
hui_Latn	297.34	388.25	319.23	318.32	197.57
hun_Latn	45.03	79.27	75.21	79.06	59.30
hus_Latn	247.96	352.19	260.90	258.32	180.85
hye_Armn	286.18	602.02	372.75	454.38	202.92
hyw_Armn	145.46	263.04	186.63	213.52	110.19
hyw_Cyrl	162.17	231.84	171.73	165.61	117.61
iba_Latn	150.03	192.62	157.75	151.54	133.67
ibg_Latn	115.37	152.94	119.10	122.19	106.08
ibo_Latn	232.57	333.59	223.37	296.17	184.97
ido_Latn	140.94	273.88	153.94	164.61	121.37
idu_Latn	153.00	209.20	162.53	157.46	106.21
ifa_Latn	252.33	328.31	270.66	266.03	172.20
ifb_Latn	257.92	340.56	278.23	272.79	183.83
ige_Latn	148.85	199.02	173.80	176.02	111.50
ikk_Latn	249.37	330.44	284.76	310.26	166.74
iku_Cans	261.21	877.71	343.18	496.50	174.80
ile_Latn	100.28	199.76	105.32	115.20	100.35
ilo_Latn	172.24	227.41	186.11	208.36	146.96
ina_Latn	209.38	408.99	230.14	236.01	201.92
ind_Latn	42.59	69.80	35.50	36.82	56.03
ish_Latn	126.54	178.71	144.92	146.15	101.29
isl_Latn	103.40	156.83	127.49	139.76	83.51
iso_Latn	148.38	175.75	168.85	167.42	104.67
ita_Latn	39.35	79.02	49.94	40.47	53.36
ium_Latn	247.28	361.48	264.84	266.46	167.10
ixl_Latn	327.09	506.74	353.08	348.23	222.05
izz_Latn	301.73	400.14	346.61	361.39	193.24
jam_Latn	204.69	291.31	223.99	231.17	157.87

Table 10: Detailed results of $NLL$ on Glot500-c (Part III).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
jav_Latn	208.92	275.29	212.60	220.31	180.00
jbo_Latn	103.91	200.82	109.86	112.61	117.25
jpn_Jpan	136.26	301.32	197.23	149.70	150.43
kaa_Cyrl	281.21	363.07	300.20	317.13	146.98
kaa_Latn	284.60	354.51	292.04	309.67	192.43
kab_Latn	192.58	264.51	185.56	216.46	161.31
kac_Latn	210.47	267.38	223.77	249.95	166.44
kal_Latn	240.15	262.90	259.85	155.45	182.71
kam_Latn	153.84	194.60	156.10	186.00	115.78
kan_Knda	216.22	556.40	146.43	355.75	175.17
kat_Geor	302.53	413.90	435.47	483.85	239.51
kat_Latn	184.94	308.06	217.07	208.25	184.20
kaz_Cyrl	257.67	341.78	280.01	297.13	187.85
kbd_Cyrl	212.12	229.63	198.20	202.85	146.86
kbp_Latn	232.17	306.53	257.45	246.16	161.08
kea_Latn	118.17	159.93	121.92	122.29	105.69
kek_Latn	234.79	332.19	244.69	228.87	164.18
khm_Khmr	257.14	815.56	317.46	437.56	167.88
kia_Latn	222.01	298.21	245.77	236.71	164.39
kik_Latn	208.26	277.92	213.92	237.26	159.49
kin_Latn	206.40	237.66	174.18	234.91	168.37
kir_Cyrl	265.65	308.15	277.34	313.50	175.71
kjb_Latn	263.79	353.35	280.16	278.13	179.76
kjh_Cyrl	200.11	251.59	211.84	217.34	147.81
kmb_Latn	132.84	166.09	137.48	118.00	112.99
kmm_Latn	246.57	330.77	263.79	266.44	180.90
kmr_Cyrl	224.23	284.40	226.51	221.22	154.70
kmr_Latn	183.95	220.51	194.67	215.02	142.36
knv_Latn	430.56	581.45	456.13	427.27	232.18
kom_Cyrl	224.18	302.71	249.08	213.41	134.88
kon_Latn	112.77	131.61	116.89	119.41	96.00
koo_Latn	132.73	167.13	144.33	134.74	111.26
kor_Hang	129.20	224.06	180.21	95.71	151.37
kos_Latn	146.15	191.23	153.05	154.26	123.85
kpg_Latn	221.52	321.94	246.33	245.73	148.93
kqn_Latn	125.33	149.57	128.12	109.60	106.08
krc_Cyrl	247.13	292.86	248.83	267.39	167.05
kri_Latn	166.50	240.92	193.15	192.19	140.20
ksd_Latn	198.81	269.96	210.59	212.57	138.81
ksh_Latn	204.72	261.51	220.93	218.50	161.62
kss_Latn	310.35	477.02	335.25	300.31	226.38
ksw_Mymr	210.34	266.24	226.59	154.55	124.78
kua_Latn	179.05	206.09	187.92	151.87	140.72
kur_Arab	402.78	464.44	400.97	550.57	253.61
kur_Latn	633.22	779.47	678.30	748.20	424.98
kwn_Latn	136.80	170.23	141.88	111.31	107.21
kwy_Latn	131.93	160.78	137.77	134.01	110.55
lam_Latn	209.07	276.89	228.12	203.17	176.61
lao_Laoo	405.48	978.35	435.37	583.11	225.06
lat_Latn	167.49	274.19	186.97	210.22	183.32
lav_Latn	193.22	257.06	227.60	252.31	162.80
ldi_Latn	178.84	230.26	185.58	191.19	160.61
leh_Latn	216.80	273.56	230.25	201.57	172.92
lfn_Latn	232.59	368.62	246.45	258.76	187.82
lhu_Latn	209.10	365.95	220.56	219.50	142.74
lij_Latn	328.66	483.81	345.62	348.28	249.64
lim_Latn	199.01	290.80	236.94	239.44	180.52
lin_Latn	161.88	173.63	158.33	180.17	135.66
lit_Latn	163.71	220.62	195.08	225.98	147.53
llb_Latn	135.01	180.06	146.51	135.39	120.02
lmo_Latn	222.22	378.21	247.54	242.80	182.01
loz_Latn	179.54	194.46	185.77	142.19	147.86
ltz_Latn	190.70	303.65	202.02	174.00	169.36
lua_Latn	126.47	147.86	131.94	102.71	102.36
lub_Latn	136.45	143.64	140.96	99.41	111.01
lue_Latn	128.48	158.40	135.27	129.94	103.72
lug_Latn	225.72	318.09	221.56	272.90	196.21
lun_Latn	135.96	170.81	142.71	136.26	113.31

Table 11: Detailed results of $NLL$ on Glot500-c (Part IV).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
luo_Latn	177.43	224.04	194.07	187.72	156.23
lus_Latn	192.97	251.35	203.37	212.72	163.95
lvs_Latn	154.85	211.40	185.87	198.99	138.63
lzh_Hani	149.57	215.19	130.32	153.38	151.15
mad_Latn	232.71	325.38	245.39	249.29	176.81
mah_Latn	178.50	246.26	188.98	183.17	145.35
mai_Deva	245.94	389.84	189.93	223.00	185.84
mal_Mlym	96.92	171.55	57.45	129.61	72.46
mam_Latn	232.38	315.16	247.28	244.43	189.28
mar_Deva	85.13	143.31	55.38	103.23	70.08
mau_Latn	186.46	333.61	204.91	193.09	161.45
mbb_Latn	282.70	410.99	309.56	307.47	175.50
mck_Latn	191.94	244.49	202.17	191.28	152.16
mcn_Latn	207.28	276.32	220.35	230.56	158.99
mco_Latn	271.45	368.23	281.55	260.70	206.54
mdy_Ethi	306.26	529.46	293.68	369.22	166.26
meu_Latn	177.74	235.19	188.09	168.43	143.62
mfe_Latn	147.50	194.41	143.47	92.23	129.23
mgh_Latn	193.72	257.45	207.05	200.68	166.17
mgr_Latn	183.96	226.09	194.25	149.77	160.18
mhr_Cyrl	230.20	298.73	235.59	236.71	167.55
min_Latn	161.40	266.18	164.13	170.30	166.91
miq_Latn	207.63	276.27	228.42	223.78	160.37
mkd_Cyrl	81.62	144.52	112.99	98.33	74.40
mlg_Latn	185.23	250.78	189.32	226.85	148.82
mlt_Latn	109.60	184.08	139.75	146.69	85.14
mny_Latn	133.04	170.16	135.30	126.14	112.38
mon_Cyrl	397.63	535.59	446.51	555.16	249.95
mon_Latn	354.75	411.54	383.60	383.02	282.85
mos_Latn	197.23	229.14	206.05	212.55	159.69
mps_Latn	347.99	496.26	378.75	366.78	213.10
mri_Latn	154.38	247.38	181.49	179.85	134.55
mrw_Latn	235.11	306.78	250.41	253.18	169.69
msa_Latn	164.05	261.28	155.14	151.77	190.44
mwl_Latn	275.26	410.83	270.47	280.98	202.89
mwm_Latn	293.40	430.46	315.11	294.17	162.95
mwn_Latn	131.84	162.91	138.48	111.20	123.37
mxv_Latn	206.13	324.92	222.48	222.86	171.82
mya_Mymr	383.74	576.49	472.04	277.91	252.84
myv_Cyrl	267.24	357.29	263.68	276.10	188.74
mzh_Latn	257.70	370.86	285.03	276.60	169.96
mzn_Arab	192.75	263.60	200.51	204.50	136.03
nan_Latn	172.36	311.98	186.78	200.62	153.96
nap_Latn	159.24	246.36	179.36	167.94	151.29
naq_Latn	195.43	261.60	207.68	207.27	150.47
nav_Latn	258.40	380.88	284.18	286.04	181.13
nba_Latn	123.68	154.25	130.25	126.08	99.29
nbl_Latn	175.10	238.64	194.74	211.98	154.90
nch_Latn	206.55	287.53	220.86	221.43	183.56
ncj_Latn	185.32	260.91	201.13	196.79	173.80
ncx_Latn	115.71	168.08	121.23	122.70	98.71
ndc_Latn	167.38	222.72	176.18	184.45	158.24
nde_Latn	169.75	235.54	185.98	211.96	151.45
ndo_Latn	192.10	227.02	204.45	150.28	149.69
nds_Latn	195.44	272.44	213.17	204.47	184.93
nep_Deva	232.93	425.83	167.54	291.83	210.52
new_Deva	169.64	330.40	128.26	135.07	103.54
ngl_Latn	134.87	177.05	140.92	115.46	104.59
ngu_Latn	205.16	282.39	215.65	213.78	167.56
nia_Latn	202.30	269.59	214.87	196.19	167.95
niu_Latn	105.04	142.53	111.71	113.36	88.11
nld_Latn	37.77	65.47	55.52	51.54	51.45
nmf_Latn	222.98	290.53	242.04	246.36	167.31
nnb_Latn	200.64	248.60	210.13	212.23	161.20
nno_Latn	138.72	234.11	192.13	199.51	146.16
nob_Latn	50.27	96.43	78.24	73.64	59.05
nor_Latn	78.04	146.26	126.19	123.99	99.50
npi_Deva	212.50	399.24	143.71	290.95	166.12

Table 12: Detailed results of $NLL$ on Glot500-c (Part V).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
nse_Latn	176.20	234.62	184.77	174.57	161.52
nso_Latn	170.49	227.97	170.96	201.29	142.72
nya_Latn	203.45	299.12	222.51	224.89	175.69
nyk_Latn	131.13	166.47	142.89	138.50	105.26
nyn_Latn	174.51	229.51	189.05	194.14	149.17
nyu_Latn	126.29	172.40	132.49	127.67	99.26
nyy_Latn	215.07	271.23	234.37	220.80	168.74
nzi_Latn	191.21	256.55	219.47	209.42	152.30
oci_Latn	202.93	343.11	207.95	210.24	185.26
ogo_Latn	134.14	185.86	149.15	143.22	118.08
oke_Latn	131.90	166.07	146.98	149.55	102.72
ori_Orya	323.51	839.80	179.33	665.17	203.94
orm_Latn	225.00	334.29	288.51	313.08	201.60
ory_Orya	232.83	572.34	134.20	474.35	164.65
oss_Cyrl	229.49	279.89	229.34	227.24	151.79
ote_Latn	237.06	362.46	254.31	241.61	176.73
pag_Latn	173.32	223.39	184.05	184.76	157.30
pam_Latn	259.01	373.98	274.16	280.47	237.10
pan_Guru	242.88	510.70	153.50	395.85	180.54
pap_Latn	162.79	213.88	174.63	173.16	138.20
pau_Latn	176.42	243.03	188.84	187.10	150.24
pcd_Latn	144.96	228.79	143.18	150.08	140.39
pcm_Latn	159.00	346.35	182.00	179.53	147.50
pdt_Latn	192.69	252.34	199.07	199.80	144.40
pes_Arab	153.46	199.83	175.97	179.97	139.01
pfl_Latn	220.11	315.47	241.84	225.74	176.25
phm_Latn	117.81	162.32	128.28	125.57	100.73
pis_Latn	153.04	237.95	173.91	179.21	130.98
pls_Latn	237.55	350.88	251.43	251.35	175.28
plt_Latn	159.36	220.84	158.06	193.44	131.96
pms_Latn	132.94	257.06	137.39	146.18	106.52
pnb_Arab	345.25	418.35	279.85	240.35	237.22
poh_Latn	389.80	589.86	417.71	416.42	230.35
pol_Latn	44.19	82.29	66.66	71.91	60.02
pon_Latn	177.92	236.38	190.47	189.62	149.49
por_Latn	37.00	66.01	35.14	33.91	48.72
prk_Latn	220.51	301.85	230.42	238.15	148.46
prs_Arab	163.01	218.38	191.40	195.64	141.99
pus_Arab	259.45	327.43	277.81	340.38	203.38
pxm_Latn	299.37	391.48	317.99	307.01	180.85
qub_Latn	210.38	265.76	222.82	172.89	152.70
quc_Latn	248.16	320.50	271.51	258.06	187.13
que_Latn	144.31	170.69	154.62	96.53	121.19
qug_Latn	176.78	225.11	187.16	136.85	143.62
quh_Latn	257.89	293.32	275.35	187.44	175.55
quw_Latn	154.10	205.67	162.83	142.63	142.35
quy_Latn	177.21	202.67	190.48	125.92	139.15
quz_Latn	180.20	211.40	192.52	123.67	142.21
qvi_Latn	178.08	234.53	188.58	156.22	145.79
rap_Latn	204.53	354.21	219.29	226.89	158.90
rar_Latn	169.22	249.96	191.91	189.56	168.88
rmn_Cyrl	129.46	181.44	143.84	137.02	102.76
rmn_Grek	135.82	190.47	141.78	125.21	92.56
rmn_Latn	133.75	175.58	146.05	143.75	112.55
rmy_Cyrl	135.65	184.00	147.87	137.05	109.18
rmy_Latn	189.65	244.12	198.92	205.44	168.77
rng_Latn	122.59	150.36	125.06	129.81	104.16
roh_Latn	235.38	312.78	242.57	253.77	161.16
ron_Latn	44.70	84.55	68.14	74.76	54.82
rop_Latn	233.05	351.35	257.34	275.70	155.36
rue_Cyrl	223.89	402.99	299.90	265.32	179.38
rug_Latn	257.50	348.10	277.13	275.47	169.94
run_Latn	184.59	218.12	161.96	207.06	157.49
rus_Cyrl	65.34	155.39	116.17	67.59	84.56
sag_Latn	162.87	194.78	175.45	155.14	149.65
sah_Cyrl	383.55	455.30	382.36	423.03	218.84
san_Deva	182.35	287.49	189.83	201.00	186.46
san_Latn	242.46	324.45	278.75	282.93	199.18

Table 13: Detailed results of $NLL$ on Glot500-c (Part VI).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
sat_Olck	654.37	3377.97	667.66	40.17	311.96
sba_Latn	272.45	372.47	303.48	293.62	167.13
scn_Latn	236.20	355.24	263.10	270.02	191.69
sco_Latn	147.94	341.79	193.24	193.20	170.39
seh_Latn	173.46	231.41	177.38	174.40	138.70
sgs_Latn	248.33	313.78	251.35	277.73	182.16
sid_Latn	135.53	180.29	147.72	139.17	114.24
sin_Sinh	82.29	173.16	114.00	137.98	70.77
skg_Latn	128.02	172.67	131.34	145.32	116.16
slk_Latn	62.89	116.82	86.67	103.39	63.93
slv_Latn	42.18	85.28	64.75	73.49	55.26
sme_Latn	288.98	357.31	301.64	295.23	205.46
smo_Latn	220.26	338.16	250.74	252.76	190.00
sna_Latn	221.02	311.60	221.92	258.38	189.74
snd_Arab	209.83	264.61	217.96	260.53	163.07
som_Arab	230.91	410.59	192.88	175.01	265.88
som_Latn	235.21	346.36	286.69	312.99	212.51
sop_Latn	176.17	207.78	188.41	157.21	167.90
sot_Latn	200.82	271.71	205.51	235.65	157.18
spa_Latn	37.28	70.48	34.26	38.65	53.39
sqi_Latn	207.58	295.58	241.22	296.90	172.78
srd_Latn	228.12	341.00	242.74	251.01	179.87
srm_Latn	229.46	318.77	250.79	246.75	173.83
srn_Latn	161.18	183.34	171.30	179.59	132.77
srp_Cyrl	45.22	100.88	77.59	81.95	57.85
srp_Latn	33.66	57.89	43.91	46.74	42.31
ssw_Latn	194.10	264.22	212.99	230.20	165.70
sun_Latn	220.72	314.99	228.18	237.07	203.32
suz_Deva	255.00	400.13	262.34	257.16	157.30
swa_Latn	156.02	208.21	125.78	94.55	151.68
swc_Latn	103.75	133.69	98.32	71.66	102.14
swe_Latn	42.72	82.20	68.89	60.92	56.18
swh_Latn	178.28	223.65	151.05	97.98	161.49
sxn_Latn	243.81	346.98	263.76	260.44	183.47
szl_Latn	132.77	348.45	156.37	177.33	111.32
tah_Latn	114.41	158.18	124.60	121.22	101.40
tam_Taml	231.12	444.83	152.34	146.94	205.53
tat_Cyrl	251.96	301.03	256.66	276.29	159.44
tat_Latn	248.71	338.00	261.10	278.92	186.84
tbz_Latn	273.90	352.17	299.25	281.11	164.62
tca_Latn	306.13	452.15	328.77	316.81	174.51
tcf_Latn	133.72	193.63	138.67	133.58	102.94
tdt_Latn	158.16	217.96	172.56	182.04	130.27
tdx_Latn	125.88	167.70	130.54	135.72	113.29
tel_Telu	94.93	152.33	54.92	47.52	72.02
teo_Latn	193.42	250.17	206.10	193.68	159.90
tgk_Cyrl	313.76	369.08	333.83	342.42	196.57
tgk_Latn	296.86	412.46	342.18	352.59	248.69
tgl_Latn	56.44	94.00	76.30	77.15	64.98
tha_Thai	192.70	331.25	242.28	116.12	175.60
tih_Latn	233.30	329.24	255.15	254.70	158.13
tir_Ethi	267.84	579.39	319.29	424.77	189.73
tiv_Latn	133.38	168.19	140.43	126.42	116.08
tlh_Latn	163.23	258.64	183.94	184.72	111.43
tll_Latn	138.57	167.75	152.44	126.10	105.23
tob_Latn	299.95	450.25	316.77	324.19	182.95
tog_Latn	127.47	165.93	133.37	115.35	102.88
toh_Latn	181.85	238.80	196.33	194.76	146.50
toi_Latn	185.04	233.23	194.93	164.94	165.33
toj_Latn	232.66	311.24	239.53	236.11	198.17
tok_Latn	46.19	61.55	50.56	43.88	47.57
ton_Latn	172.88	243.40	178.94	190.23	141.17
top_Latn	221.27	303.90	232.90	223.12	212.88
tpi_Latn	139.90	209.92	155.89	170.67	120.65
tpm_Latn	214.33	280.83	241.97	231.70	154.99
tsc_Latn	131.42	150.62	130.29	132.42	104.44
tsn_Latn	209.69	291.69	203.77	245.18	169.85
tso_Latn	182.87	208.89	176.69	194.90	142.27

Table 14: Detailed results of $NLL$ on Glot500-c (Part VII).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
tsz_Latn	183.82	253.97	200.16	176.35	153.98
ttj_Latn	133.35	174.08	142.10	146.60	112.98
tuc_Latn	325.34	444.23	346.52	291.84	180.84
tui_Latn	247.40	330.20	266.54	265.71	181.39
tuk_Cyrl	196.40	248.01	210.45	219.39	143.19
tuk_Latn	217.31	235.95	217.78	238.66	155.71
tum_Latn	184.51	236.91	190.41	153.36	153.44
tur_Latn	48.52	66.76	60.61	34.71	63.33
tvl_Latn	114.81	156.00	123.30	121.62	97.96
twi_Latn	169.99	229.39	171.42	190.50	139.81
twx_Latn	123.52	172.96	130.82	135.08	106.56
tyv_Cyrl	270.89	314.09	275.97	304.11	174.60
tzh_Latn	195.49	274.47	208.05	202.10	162.63
tzo_Latn	223.14	324.35	237.54	228.92	173.78
udm_Cyrl	222.45	277.14	231.98	219.71	160.30
uig_Arab	336.01	432.84	320.43	463.38	207.25
uig_Latn	254.59	292.36	270.85	285.12	203.29
ukr_Cyrl	101.99	240.89	173.79	160.03	136.57
umb_Latn	129.60	165.59	135.06	139.75	100.07
urd_Arab	77.96	105.77	53.61	51.92	81.62
urh_Latn	145.52	153.19	164.21	161.54	108.55
uzb_Cyrl	307.70	353.00	332.86	314.77	178.07
uzb_Latn	307.44	363.61	357.04	383.26	220.74
uzn_Cyrl	233.89	270.06	254.96	247.92	145.01
vec_Latn	163.22	261.93	181.25	168.76	170.03
ven_Latn	190.45	233.75	198.94	198.18	151.65
vep_Latn	316.12	456.77	326.76	243.40	192.08
vie_Latn	108.65	169.92	86.74	91.41	138.89
vls_Latn	200.17	292.89	242.66	253.44	171.13
vmw_Latn	141.25	176.25	143.12	107.10	102.92
vol_Latn	94.00	260.01	85.47	87.18	83.77
wal_Latn	190.62	261.79	201.98	177.73	158.07
war_Latn	127.41	249.86	146.46	166.29	153.84
wbm_Latn	222.06	311.86	234.78	240.27	150.33
wes_Latn	64.78	106.54	73.37	73.73	86.61
wls_Latn	114.80	157.93	125.63	124.58	99.38
wol_Latn	197.17	251.63	171.70	208.78	173.01
wuu_Hani	152.90	283.11	127.83	152.82	145.05
xav_Latn	350.22	619.11	379.76	371.80	201.63
xho_Latn	224.10	315.12	219.57	265.57	187.35
xmf_Geor	260.61	315.58	316.49	376.33	170.15
xmv_Latn	125.37	168.73	129.48	139.97	111.94
yan_Latn	228.46	314.62	248.18	243.68	165.66
yao_Latn	196.25	253.72	209.77	198.91	166.06
yap_Latn	197.98	274.54	212.39	209.00	169.09
yid_Hebr	437.75	571.08	480.37	590.32	295.70
yom_Latn	176.11	220.86	184.62	189.29	150.95
yor_Latn	233.75	283.33	193.55	286.20	185.60
yua_Latn	195.86	284.05	208.08	205.70	161.16
yue_Hani	74.79	131.83	62.91	83.80	74.28
zai_Latn	170.49	223.03	179.18	188.38	148.03
zea_Latn	174.18	271.42	212.95	222.74	155.52
zho_Hani	57.89	99.40	48.19	55.24	70.80
zlm_Latn	106.37	176.09	92.63	93.81	118.56
zne_Latn	127.57	167.13	134.43	115.53	104.95
zom_Latn	214.60	277.57	233.64	228.48	170.06
zpa_Latn	127.29	180.39	129.07	132.30	107.04
zsm_Latn	102.42	171.64	92.39	94.59	123.31
zul_Latn	208.94	340.58	235.91	257.18	192.84
all	190.58	282.46	202.95	205.07	151.25

Table 15: Detailed results of $NLL$ on Glot500-c (Part VIII).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
ace_Latn	137.43	196.93	144.50	152.49	97.89
ach_Latn	113.66	152.08	123.29	125.31	102.45
acr_Latn	177.86	233.22	188.27	182.33	114.27
afr_Latn	80.43	132.25	116.34	129.21	95.33
agw_Latn	130.32	186.58	136.17	138.27	95.93
ahk_Latn	175.75	291.31	187.63	179.54	116.76
aka_Latn	98.41	135.74	99.46	108.01	78.20
aln_Latn	101.54	147.77	115.11	139.73	82.71
als_Latn	93.47	134.99	106.68	127.57	78.53
alt_Cyrl	122.23	146.47	125.04	134.88	90.21
alz_Latn	107.41	139.39	116.48	109.62	102.48
amh_Ethi	100.60	255.43	121.36	161.34	98.24
aoj_Latn	175.25	270.87	185.66	171.29	114.19
arb_Arab	94.03	186.47	77.67	78.12	104.22
arn_Latn	141.08	205.53	143.63	154.61	113.59
ary_Arab	128.97	212.49	125.35	118.25	104.58
arz_Arab	80.59	185.56	64.91	66.52	92.22
asm_Beng	123.16	196.03	79.80	147.49	101.89
ayr_Latn	149.96	188.09	154.45	157.13	106.66
azb_Arab	134.00	160.29	139.49	144.68	93.09
aze_Latn	97.68	131.80	113.03	106.38	90.96
bak_Cyrl	134.49	169.96	133.79	150.46	93.49
bam_Latn	109.68	147.72	110.24	118.13	91.88
ban_Latn	138.98	195.92	147.93	149.04	111.51
bar_Latn	114.49	154.37	121.74	113.26	108.65
bba_Latn	132.00	166.51	146.31	131.24	96.87
bbc_Latn	110.66	143.87	117.12	107.02	100.17
bci_Latn	117.42	156.47	125.70	124.26	126.80
bcl_Latn	101.39	146.46	109.03	116.78	88.46
bel_Cyrl	92.30	137.26	110.12	118.30	88.89
bem_Latn	125.52	158.97	135.60	104.16	107.57
ben_Beng	111.68	194.50	68.00	77.83	105.61
bhw_Latn	124.94	169.40	130.40	123.48	101.65
bim_Latn	124.64	162.78	132.77	130.01	96.33
bis_Latn	126.46	196.19	136.72	148.29	95.85
bod_Tibt	138.16	525.70	144.33	30.40	105.99
bqc_Latn	113.18	149.13	122.07	112.76	91.11
bre_Latn	120.49	151.33	111.97	139.13	105.99
bts_Latn	111.90	154.57	120.61	110.17	89.16
btx_Latn	118.13	163.25	128.43	125.76	103.19
bul_Cyrl	66.25	124.78	104.01	42.33	85.30
bum_Latn	116.16	153.66	121.83	121.74	101.82
bzj_Latn	115.75	175.63	128.70	135.59	93.15
cab_Latn	164.07	215.31	172.22	174.35	123.20
cac_Latn	169.42	231.73	176.29	175.63	116.03
cak_Latn	185.42	246.76	193.62	191.54	123.65
caq_Latn	128.13	174.12	141.21	138.17	95.54
cat_Latn	54.93	118.69	44.29	45.98	76.47
cbk_Latn	103.50	154.23	105.08	108.15	91.19
cce_Latn	124.20	159.68	133.40	132.89	106.02
ceb_Latn	99.37	146.70	113.69	132.72	94.43
ces_Latn	62.40	133.26	101.82	114.59	86.91
cfm_Latn	138.07	179.26	142.58	143.43	107.20
che_Cyrl	152.68	188.52	146.76	148.42	126.87
chk_Latn	128.34	180.14	133.81	134.01	97.76
chv_Cyrl	132.89	166.12	138.37	128.58	91.96
ckb_Arab	126.47	155.90	125.59	164.22	100.65
cmn_Hani	63.67	121.22	51.49	60.91	76.95
cnh_Latn	129.26	175.83	134.65	139.53	104.21
crh_Cyrl	128.56	166.14	128.91	139.13	82.61
crs_Latn	100.72	139.95	101.88	57.70	80.86
csy_Latn	125.81	172.44	138.22	132.16	100.90
ctd_Latn	120.85	163.07	128.99	125.52	92.79
ctu_Latn	156.04	220.78	162.45	157.63	112.41
cuk_Latn	151.95	213.08	159.59	156.10	119.01
cym_Latn	110.34	165.10	135.91	147.72	103.89

Table 16: Detailed results of $NLL$ on PBC (Part I).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
dan_Latn	63.65	114.43	97.95	101.13	86.68
deu_Latn	57.09	109.69	84.08	54.00	80.90
djk_Latn	143.19	192.80	147.13	153.74	120.66
dln_Latn	113.43	155.19	118.82	125.73	92.37
dtp_Latn	158.44	222.01	165.46	169.63	111.77
dyu_Latn	122.24	161.61	126.53	132.73	103.04
dzo_Tibt	157.37	550.44	162.42	36.35	99.22
efi_Latn	121.73	173.61	139.58	136.78	90.15
ell_Grek	80.65	169.16	109.07	57.11	105.74
eng_Latn	28.40	93.81	40.01	42.56	46.91
enm_Latn	45.43	113.74	62.99	66.87	55.22
epo_Latn	79.83	125.27	88.81	100.79	85.24
est_Latn	93.49	128.66	109.45	45.04	99.10
eus_Latn	133.89	145.19	101.06	78.92	150.43
ewe_Latn	140.69	190.49	147.85	133.36	103.15
fao_Latn	101.92	150.02	113.16	134.84	93.21
fas_Arab	87.19	121.18	99.32	104.15	77.85
fij_Latn	110.29	158.90	130.18	97.89	97.65
fil_Latn	74.66	130.09	106.51	109.03	84.32
fin_Latn	68.42	125.52	116.35	38.52	91.75
fon_Latn	160.80	210.76	176.23	178.40	107.16
fra_Latn	46.01	105.73	38.57	44.16	73.33
fry_Latn	111.69	146.88	111.45	123.28	100.32
gaa_Latn	128.54	165.53	145.88	107.90	100.68
gil_Latn	125.22	171.28	131.71	130.68	106.24
giz_Latn	131.75	183.07	145.21	143.35	97.84
gkn_Latn	151.99	210.57	167.40	166.78	116.75
gkp_Latn	159.33	219.00	168.31	166.05	110.30
gla_Latn	102.90	174.06	129.10	138.42	100.51
gle_Latn	102.09	161.80	132.57	146.14	116.86
glv_Latn	122.94	172.06	126.34	134.37	98.35
gom_Latn	149.35	199.59	155.54	159.44	129.81
gor_Latn	156.67	215.11	170.13	167.89	115.02
grc_Grek	64.91	153.70	93.39	68.67	81.49
guc_Latn	193.75	271.60	202.31	190.66	138.75
gug_Latn	139.06	183.84	146.45	151.28	114.14
guj_Gujr	121.18	329.05	86.23	202.19	107.88
gur_Latn	143.42	208.51	152.80	148.18	106.41
guw_Latn	142.60	155.00	158.22	166.16	98.92
gya_Latn	130.25	197.61	146.23	137.31	99.85
gym_Latn	180.93	262.58	196.74	161.03	135.73
hat_Latn	112.20	159.68	116.00	48.45	90.71
hau_Latn	105.95	146.45	117.21	127.18	96.63
haw_Latn	91.42	140.04	102.87	102.50	91.03
heb_Hebr	86.85	197.96	113.81	125.21	143.56
hif_Latn	104.78	161.10	114.69	116.63	107.93
hil_Latn	103.93	151.84	112.82	130.28	90.13
hin_Deva	87.35	175.19	62.49	63.21	103.09
hin_Latn	102.01	144.04	112.84	112.96	109.68
hmo_Latn	119.64	179.32	128.46	103.09	91.86
hne_Deva	124.72	183.69	106.59	120.10	94.27
hnj_Latn	126.88	186.09	144.08	149.87	89.64
hra_Latn	116.66	151.27	122.49	122.14	96.72
hrv_Latn	62.52	125.68	96.82	107.18	73.96
hui_Latn	151.46	203.46	161.05	161.36	108.54
hun_Latn	69.17	118.92	117.60	125.55	94.04
hus_Latn	170.91	241.76	179.70	177.42	120.81
hye_Armn	111.94	219.94	141.97	171.24	89.75
iba_Latn	102.40	135.32	109.00	102.90	87.43
ibo_Latn	131.16	189.15	130.12	172.79	112.01
ifa_Latn	140.53	194.86	151.53	148.37	102.38
ifb_Latn	149.93	198.42	157.12	156.49	107.60
ikk_Latn	132.84	186.95	150.31	163.16	95.14
ilo_Latn	119.72	162.55	127.85	146.58	102.18
ind_Latn	66.39	121.78	58.14	58.36	80.77
isl_Latn	92.39	137.42	113.54	123.83	94.12
ita_Latn	54.57	116.50	73.53	52.57	78.23

Table 17: Detailed results of $NLL$ on PBC (Part II).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
ium_Latn	150.62	222.39	155.20	157.52	99.54
ixl_Latn	190.07	299.20	206.08	202.92	127.52
izz_Latn	167.28	228.45	195.19	198.57	118.78
jam_Latn	119.85	181.93	134.52	139.07	96.42
jav_Latn	134.11	171.16	136.06	140.01	109.34
jpn_Jpan	67.67	114.11	84.64	61.57	88.53
kaa_Cyrl	136.14	179.48	138.63	153.33	84.79
kaa_Latn	134.02	172.76	135.14	145.18	99.15
kab_Latn	137.81	193.54	129.87	159.45	117.96
kac_Latn	141.33	187.59	150.24	163.68	110.99
kal_Latn	120.90	143.71	134.44	90.38	109.58
kan_Knda	128.60	336.06	93.77	210.99	110.09
kat_Geor	103.81	132.04	144.43	155.32	93.39
kaz_Cyrl	129.49	166.60	137.43	150.12	108.56
kbp_Latn	151.83	205.24	166.76	156.21	105.09
kek_Latn	161.79	230.77	168.62	155.46	110.43
khm_Khmr	141.48	453.97	161.21	233.38	100.53
kia_Latn	122.81	171.17	136.22	131.73	95.76
kik_Latn	141.34	189.92	143.91	155.69	106.53
kin_Latn	110.75	137.92	101.14	123.88	99.96
kir_Cyrl	125.74	148.16	127.29	148.79	94.02
kjb_Latn	152.31	205.47	156.49	160.88	109.02
kjh_Cyrl	133.84	168.82	142.31	145.53	97.43
kmm_Latn	137.88	185.46	149.16	145.91	107.81
kmr_Cyrl	139.23	182.66	137.99	142.19	103.56
kmr_Latn	120.54	149.31	124.74	136.78	96.93
knv_Latn	249.77	346.55	266.68	245.87	135.66
kor_Hang	66.58	119.14	92.53	42.28	82.45
kpg_Latn	128.18	190.92	139.68	135.05	90.65
krc_Cyrl	123.42	149.60	119.82	130.97	89.22
kri_Latn	118.15	172.62	134.67	131.33	96.69
ksd_Latn	108.75	155.22	117.82	116.91	84.44
kss_Latn	248.46	385.70	269.58	224.81	174.70
ksw_Mymr	145.34	187.44	155.38	107.97	94.71
kua_Latn	118.00	142.31	125.97	104.16	99.83
lam_Latn	145.51	199.78	154.91	139.07	115.48
lao_Laoo	163.17	414.25	172.28	234.46	116.39
lat_Latn	56.98	102.85	65.60	73.77	73.15
lav_Latn	90.61	119.37	103.75	114.03	94.92
ldi_Latn	118.61	161.07	122.03	124.26	112.27
leh_Latn	131.72	169.51	140.67	124.57	104.47
lhu_Latn	147.32	262.73	152.94	153.96	100.83
lin_Latn	113.81	128.30	110.20	123.56	92.52
lit_Latn	92.16	120.15	107.63	123.52	97.69
loz_Latn	119.93	140.69	125.00	98.71	99.46
ltz_Latn	114.62	156.92	114.49	104.79	96.89
lug_Latn	117.59	174.83	115.12	143.99	107.58
luo_Latn	118.37	158.54	129.88	126.61	108.96
lus_Latn	122.17	159.02	125.37	133.65	103.21
lzh_Hani	62.06	88.07	54.92	60.19	66.36
mad_Latn	136.26	192.90	146.43	145.94	103.63
mah_Latn	113.96	159.42	120.91	110.27	97.45
mai_Deva	136.92	209.91	108.91	126.23	100.39
mal_Mlym	111.12	210.81	72.27	126.62	105.00
mam_Latn	173.35	227.62	181.33	179.63	138.57
mar_Deva	105.80	184.52	83.30	141.12	106.37
mau_Latn	139.06	259.48	153.06	140.49	148.96
mbb_Latn	160.77	237.84	174.36	171.35	101.96
mck_Latn	124.95	161.95	131.37	123.87	99.72
mcn_Latn	110.95	153.55	120.44	123.48	96.39
mco_Latn	203.59	285.23	205.92	192.68	159.16
mdy_Ethi	164.72	284.41	157.66	188.38	92.89
meu_Latn	111.26	152.92	120.09	103.47	91.50
mfe_Latn	99.68	136.00	98.60	55.39	80.86
mgh_Latn	131.75	181.11	140.22	136.00	118.72
mgr_Latn	126.60	154.99	129.40	106.55	108.42

Table 18: Detailed results of NLL on PBC (Part III).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
mhr_Cyrl	122.42	160.48	119.36	127.42	100.09
min_Latn	139.41	194.79	136.22	138.87	133.30
miq_Latn	129.28	182.98	144.36	141.12	104.92
mkd_Cyrl	85.29	151.22	112.67	89.21	89.46
mlg_Latn	107.66	135.73	106.88	128.75	86.60
mlt_Latn	108.58	168.92	134.24	137.26	107.12
mos_Latn	129.97	161.07	135.30	138.61	112.98
mps_Latn	196.29	283.21	212.92	204.15	126.56
mri_Latn	87.56	138.21	103.82	111.33	88.68
mrw_Latn	127.39	174.88	134.59	133.06	99.21
msa_Latn	104.71	152.69	97.60	93.04	113.32
mwm_Latn	159.30	238.34	171.46	159.27	99.80
mxv_Latn	146.98	235.76	162.84	164.53	126.52
mya_Mymr	162.62	248.51	185.69	84.78	107.92
myv_Cyrl	148.95	192.16	140.65	152.76	110.76
mzh_Latn	146.28	217.81	160.03	153.09	101.97
nan_Latn	130.85	204.44	144.08	138.29	118.30
naq_Latn	126.47	179.33	139.25	135.80	100.90
nav_Latn	167.01	233.91	176.25	183.89	119.97
nbl_Latn	109.14	148.07	114.06	127.75	96.55
nch_Latn	155.09	212.74	165.40	171.74	144.74
ncj_Latn	131.14	184.55	137.38	140.86	129.63
ndc_Latn	106.50	151.16	111.70	117.94	107.15
nde_Latn	106.83	152.97	114.79	133.97	100.83
ndo_Latn	132.12	162.83	138.17	107.11	105.82
nds_Latn	123.29	166.55	124.21	125.62	123.87
nep_Deva	109.47	199.05	81.70	141.10	103.11
ngu_Latn	148.78	204.17	156.73	156.92	120.15
nia_Latn	135.60	192.37	143.95	130.11	111.86
nld_Latn	58.81	114.31	96.47	97.28	82.78
nmf_Latn	122.39	165.17	130.43	134.30	98.07
nnb_Latn	122.26	163.28	127.40	131.83	98.36
nno_Latn	80.33	133.43	102.92	112.53	86.17
nob_Latn	61.45	126.38	100.25	98.89	80.02
nor_Latn	56.27	104.11	87.94	86.18	71.86
npi_Deva	115.63	219.43	78.62	159.24	96.97
nse_Latn	116.86	157.47	124.34	116.64	109.55
nso_Latn	116.55	160.63	114.34	132.40	97.49
nya_Latn	112.30	160.76	116.85	124.27	101.20
nyn_Latn	120.67	159.71	127.46	131.05	106.34
nyy_Latn	153.10	189.04	164.69	160.66	121.06
nzi_Latn	130.01	179.62	150.60	141.28	101.29
ori_Orya	148.25	392.96	91.33	296.43	98.06
ory_Orya	143.02	352.28	95.95	282.70	106.99
oss_Cyrl	140.22	182.75	141.83	139.80	97.09
ote_Latn	160.20	247.13	175.42	168.28	119.40
pag_Latn	123.49	163.26	131.05	133.56	109.90
pam_Latn	117.54	163.02	121.69	130.95	103.78
pan_Guru	130.07	286.36	90.80	208.67	106.44
pap_Latn	110.09	149.82	118.80	114.73	92.08
pau_Latn	125.22	178.67	132.62	131.85	104.80
pcm_Latn	76.80	127.05	89.92	91.28	79.44
pdt_Latn	124.83	175.03	129.63	126.61	97.29
pes_Arab	91.68	129.42	105.39	105.84	84.63
pis_Latn	118.50	180.76	130.26	133.14	95.70
pls_Latn	147.97	217.00	152.42	153.90	104.29
plt_Latn	113.52	139.96	112.22	139.93	89.18
poh_Latn	240.95	363.61	257.65	256.21	140.88
pol_Latn	61.88	111.24	97.90	107.87	85.46
pon_Latn	123.40	164.68	131.48	125.12	105.87
por_Latn	53.69	106.83	42.29	45.85	75.88
prk_Latn	118.66	167.68	121.37	128.55	94.48
prs_Arab	88.26	123.81	99.63	105.09	80.28
pxm_Latn	154.30	207.27	160.81	160.27	102.81
qub_Latn	133.85	172.77	139.98	107.69	93.49
quc_Latn	176.66	222.18	191.60	178.35	124.30

Table 19: Detailed results of NLL on PBC (Part IV).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
qug_Latn	124.11	158.66	131.62	95.23	95.07
quh_Latn	148.83	174.81	154.34	107.04	106.24
quw_Latn	104.78	139.63	109.91	95.69	92.58
quy_Latn	119.84	140.49	127.93	85.16	94.14
quz_Latn	126.18	149.08	134.60	85.68	96.32
qvi_Latn	134.03	177.51	139.73	114.64	100.81
rap_Latn	139.27	239.23	152.39	157.18	100.81
rar_Latn	136.30	205.48	152.87	149.88	123.36
rmy_Latn	124.05	164.59	129.35	132.97	108.84
ron_Latn	71.75	145.55	113.22	136.42	92.16
rop_Latn	141.24	218.11	152.37	163.59	93.46
rug_Latn	144.21	200.64	155.21	151.73	99.72
run_Latn	111.11	140.11	101.95	120.78	99.61
rus_Cyrl	57.09	115.06	85.44	48.93	78.66
sag_Latn	118.57	144.91	123.32	113.66	101.74
sah_Cyrl	140.83	175.34	139.86	155.36	99.78
san_Deva	120.11	183.77	128.71	131.38	123.45
san_Latn	133.78	188.35	151.17	152.84	112.82
sba_Latn	147.44	205.90	167.36	154.66	98.05
seh_Latn	116.71	159.73	123.08	121.65	100.51
sin_Sinh	133.79	283.43	166.13	228.72	113.72
slk_Latn	75.89	141.01	105.13	123.74	89.45
slv_Latn	75.67	140.40	111.88	127.31	95.15
sme_Latn	134.17	166.51	131.28	132.85	103.75
smo_Latn	113.64	165.04	126.68	127.57	96.65
sna_Latn	107.03	157.69	112.48	124.30	99.14
snd_Arab	141.08	183.48	144.96	173.65	107.47
som_Latn	114.80	163.60	131.06	149.83	110.34
sop_Latn	120.92	148.69	129.81	113.62	117.37
sot_Latn	112.14	155.55	113.59	127.04	95.35
spa_Latn	49.64	107.41	43.22	48.95	69.30
sqi_Latn	106.17	145.44	116.56	140.41	92.13
srm_Latn	172.30	242.13	187.65	185.42	124.54
srn_Latn	112.06	137.43	113.65	121.74	91.24
srp_Cyrl	57.16	129.53	97.17	99.06	71.36
srp_Latn	61.53	124.70	95.02	105.00	71.54
ssw_Latn	120.48	172.73	132.69	140.25	104.17
sun_Latn	123.92	165.15	124.81	129.90	111.93
suz_Deva	141.06	222.01	143.66	139.17	93.18
swe_Latn	60.78	124.59	105.53	99.90	86.99
swh_Latn	97.92	131.52	87.87	54.27	90.87
sxn_Latn	173.04	249.27	188.33	183.25	124.96
tam_Taml	109.64	213.05	70.91	64.81	100.45
tat_Cyrl	136.52	167.63	136.15	147.39	94.42
tbz_Latn	135.12	176.55	145.64	137.50	88.42
tca_Latn	202.44	294.68	215.39	207.66	112.33
tdt_Latn	114.70	164.66	123.50	129.26	93.86
tel_Telu	122.18	196.41	91.03	65.40	115.17
teo_Latn	115.64	157.71	122.24	117.99	99.95
tgk_Cyrl	128.86	144.47	127.10	140.25	101.04
tgl_Latn	74.71	130.27	109.14	110.56	85.29
tha_Thai	107.69	187.16	134.02	58.96	101.09
tih_Latn	129.95	188.97	139.91	137.24	89.82
tir_Ethi	122.75	258.00	143.76	190.93	99.03
tlh_Latn	87.59	142.90	97.02	97.38	62.55
tob_Latn	179.90	269.36	189.99	191.60	107.12
toh_Latn	127.60	171.62	136.60	136.06	104.79
toi_Latn	124.75	166.04	133.32	114.25	114.54
toj_Latn	175.52	237.75	181.56	177.72	148.72
ton_Latn	120.61	179.89	125.92	137.67	98.37
top_Latn	165.19	223.46	174.82	173.24	164.19
tpi_Latn	105.47	161.38	117.02	128.35	84.22
tpm_Latn	120.52	166.93	131.95	129.07	89.91
tsn_Latn	112.13	163.36	113.76	129.32	96.63
tso_Latn	125.25	155.16	120.37	134.66	103.51
tsz_Latn	129.96	184.88	142.29	126.39	110.66
tuc_Latn	187.91	261.93	196.20	166.43	106.46

Table 20: Detailed results of NLL on PBC (Part V).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
tui_Latn	135.41	187.71	146.21	146.43	107.57
tuk_Cyrl	127.20	168.02	136.69	145.86	94.72
tuk_Latn	123.72	144.22	124.67	135.21	97.42
tum_Latn	127.49	165.45	130.05	109.19	102.35
tur_Latn	76.97	118.11	102.96	57.95	99.22
twi_Latn	110.21	159.12	110.54	122.43	93.81
tyv_Cyrl	165.82	197.37	164.25	181.87	107.33
tzh_Latn	147.06	205.37	157.00	148.16	118.46
tzo_Latn	166.45	248.81	178.03	173.52	122.42
udm_Cyrl	138.00	176.90	140.39	137.21	102.56
uig_Arab	166.57	226.61	157.03	229.43	114.04
uig_Latn	145.11	165.68	156.02	157.78	121.77
ukr_Cyrl	68.45	134.95	101.40	93.65	92.95
urd_Arab	99.74	141.13	74.20	63.53	110.49
uzb_Cyrl	128.48	149.94	136.85	135.60	88.90
uzb_Latn	118.83	138.99	132.96	145.00	95.19
uzn_Cyrl	136.04	160.00	145.95	142.34	94.12
ven_Latn	131.18	172.05	138.68	137.88	104.82
vie_Latn	74.42	115.85	56.37	59.30	91.10
wal_Latn	129.99	180.12	134.43	122.12	105.68
war_Latn	111.26	159.23	118.85	131.06	113.74
wbm_Latn	120.96	174.48	126.11	128.99	94.78
wol_Latn	115.67	154.99	101.97	127.09	102.93
xav_Latn	243.35	430.83	263.49	257.10	137.76
xho_Latn	112.96	155.63	109.11	135.39	107.43
yan_Latn	125.81	179.37	136.34	131.31	98.47
yao_Latn	143.68	187.96	148.92	143.16	114.54
yap_Latn	150.28	207.86	157.67	157.91	123.07
yom_Latn	118.61	155.58	121.64	126.20	100.18
yor_Latn	129.77	166.68	100.87	155.88	105.79
yua_Latn	148.34	218.12	155.65	156.40	118.30
yue_Hani	64.57	122.47	54.42	62.71	87.78
zai_Latn	121.90	161.31	121.61	129.12	108.99
zho_Hani	64.02	115.19	51.79	63.22	69.53
zlm_Latn	57.83	101.11	48.96	51.87	64.76
zom_Latn	119.86	159.31	128.06	125.99	98.96
zsm_Latn	60.40	110.60	51.75	52.51	70.43
zul_Latn	103.20	157.55	113.26	130.35	98.06
all	122.10	180.54	129.55	131.31	101.67

Table 21: Detailed results of NLL on PBC (Part VI).

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500 1-shot	MaLA-500 2-shot	MaLA-500 3-shot	MaLA-500 4-shot	MaLA-500 5-shot	MaLA-500 6-shot	MaLA-500 7-shot	MaLA-500 8-shot	MaLA-500 9-shot	MaLA-500 10-shot
ace_Latn	44.12	47.55	50.00	36.76	34.31	52.94	60.29	60.78	65.69	67.65	64.22	65.20	68.63	71.57
acm_Arab	52.45	65.69	69.12	58.33	32.35	53.43	59.31	63.73	63.73	67.16	66.67	69.12	66.67	66.67
afr_Latn	68.14	55.39	53.92	40.20	41.18	62.25	65.69	69.12	71.08	74.02	73.53	74.51	76.96	78.92
ajp_Arab	47.55	64.22	68.63	53.43	33.33	56.86	59.80	59.80	65.20	63.73	63.24	69.12	68.14	66.67
als_Latn	41.67	46.57	45.59	28.43	27.94	51.96	63.73	62.25	69.12	71.08	69.61	72.06	75.98	77.45
amh_Ethi	15.69	18.63	16.67	13.24	25.00	36.76	45.59	51.47	51.96	53.92	51.96	53.43	54.90	53.92
apc_Arab	46.57	65.69	68.14	53.43	31.37	55.88	60.29	65.69	65.69	67.16	65.20	68.63	67.65	72.06
arb_Arab	53.43	63.24	68.14	57.35	32.35	54.90	60.29	63.73	65.20	68.14	67.16	69.12	68.63	70.59
ary_Arab	45.10	57.84	69.12	50.49	26.47	52.45	55.39	56.37	60.29	59.80	64.22	63.73	59.31	64.71
arz_Arab	50.98	64.22	68.14	56.86	30.88	52.45	59.31	60.78	64.22	66.18	68.14	69.12	66.67	69.12
asm_Beng	17.16	49.02	61.27	37.25	31.37	53.43	58.82	65.20	67.65	67.65	68.14	67.65	67.65	67.65
ast_Latn	69.12	60.78	69.12	55.39	34.31	65.69	70.10	70.59	74.02	75.00	75.98	77.94	79.90	79.90
ayr_Latn	25.00	26.96	32.35	19.61	20.10	29.41	38.24	38.73	38.24	43.14	40.20	43.14	44.61	42.16
azb_Arab	25.49	41.67	32.84	24.51	25.98	41.18	45.59	45.10	46.57	49.51	50.00	49.02	50.49	49.51
azj_Latn	34.80	64.22	37.25	32.84	30.88	57.84	64.71	68.63	64.22	72.55	69.61	70.59	74.02	72.55
bak_Cyrl	38.73	61.27	32.35	32.35	34.80	51.47	60.29	61.27	69.12	68.63	68.14	68.63	73.53	70.10
bam_Latn	25.49	24.51	29.41	20.10	22.55	25.98	34.80	42.16	43.14	44.12	45.10	42.16	46.08	44.12
ban_Latn	58.82	51.47	58.82	43.14	28.92	55.39	63.24	65.69	66.67	72.06	72.06	72.55	72.06	71.57
bel_Cyrl	47.55	59.80	28.92	30.39	40.69	60.29	63.24	66.18	67.65	70.10	72.55	72.06	72.55	73.04
bem_Latn	31.37	28.92	38.24	25.49	21.08	34.80	43.14	48.04	50.49	50.49	53.43	53.43	52.45	53.92
ben_Beng	25.49	61.27	64.22	52.45	31.37	54.90	63.24	62.25	67.65	70.10	70.10	69.12	66.18	68.63
bjn_Latn	48.53	51.96	61.76	42.65	32.35	62.75	66.18	68.14	71.57	75.98	73.04	72.55	75.00	77.45
bod_Tibt	15.20	12.75	15.69	15.69	22.06	34.80	37.75	37.75	38.73	39.71	39.71	39.71	41.67	44.12
bos_Latn	65.20	64.71	45.59	33.82	37.75	65.20	72.06	70.10	71.57	75.98	75.00	76.47	76.96	77.45
bul_Cyrl	66.18	63.24	38.73	52.94	45.10	62.25	67.65	67.16	68.14	75.00	71.57	72.06	73.53	75.00
cat_Latn	71.08	60.78	66.67	60.29	34.31	59.31	68.14	68.63	71.57	72.55	69.12	73.04	76.96	76.47
ceb_Latn	50.49	50.98	49.02	39.71	39.22	60.29	66.67	66.67	68.63	73.04	72.06	71.57	74.51	74.02
ces_Latn	69.12	62.75	47.55	40.69	39.22	62.25	69.61	70.59	72.55	76.47	72.55	74.02	80.88	76.96
cjk_Latn	27.94	30.39	34.31	26.47	22.55	30.88	31.86	32.84	38.24	38.24	38.24	35.78	39.22	42.65
ckb_Arab	19.61	22.55	23.04	12.25	28.92	53.43	60.29	57.35	65.20	65.20	62.75	65.69	65.69	70.59
cmn_Hani	73.04	65.20	67.65	54.90	39.71	69.12	74.02	72.06	76.47	77.45	76.47	75.98	79.41	76.47
crh_Latn	38.24	56.37	40.20	36.76	29.41	51.96	62.25	61.76	64.22	69.12	64.22	70.10	69.12	71.08
cym_Latn	39.22	28.43	34.80	21.57	28.43	55.39	62.75	63.73	66.18	72.06	74.02	72.06	75.00	77.45
dan_Latn	69.12	64.22	55.39	44.12	38.24	54.41	63.24	65.20	70.10	71.08	72.06	71.57	73.53	74.51
deu_Latn	74.02	60.29	61.27	55.39	41.18	63.73	68.63	71.57	69.12	75.00	75.49	76.47	77.45	77.45
dyu_Latn	28.43	28.92	32.35	20.10	21.08	29.90	38.73	39.71	46.57	44.12	41.67	47.06	44.61	43.63
dzo_Tibt	14.71	10.29	13.73	12.75	21.57	30.39	36.76	37.25	39.71	36.76	39.22	37.75	43.14	39.22
ell_Grek	47.55	63.73	28.43	60.29	43.63	62.75	69.61	66.67	69.12	69.61	70.59	73.04	72.06	72.06
eng_Latn	71.57	59.80	71.08	67.65	48.04	63.24	70.59	69.12	69.12	74.02	73.04	74.51	76.96	75.98
epo_Latn	52.94	50.49	52.94	43.63	27.94	49.51	66.18	66.67	68.63	72.55	73.53	71.57	74.51	75.98
est_Latn	48.04	54.90	41.18	57.35	29.41	55.88	62.75	66.67	70.10	72.06	70.10	69.12	71.57	73.04
eus_Latn	36.27	59.80	64.22	55.88	27.94	52.45	61.76	66.18	66.18	73.53	73.53	72.55	73.53	75.49
ewe_Latn	23.53	23.53	29.90	17.16	23.53	28.43	38.24	35.29	41.18	43.63	38.73	44.61	39.71	43.63
fao_Latn	41.18	43.63	38.73	29.90	35.29	52.94	56.86	57.84	62.25	61.27	62.25	61.76	64.22	69.12
fij_Latn	27.45	27.45	36.27	24.02	24.51	37.75	48.53	47.55	53.92	48.53	50.00	51.47	51.47	53.92
fin_Latn	67.65	63.24	38.73	56.86	36.76	61.76	70.10	70.10	72.55	74.51	72.06	73.53	75.00	75.49
fon_Latn	25.49	22.06	31.37	19.12	23.53	29.90	30.88	37.25	35.78	38.24	39.71	38.24	37.25	46.57
fra_Latn	72.06	64.71	66.18	59.80	36.76	58.82	71.08	67.65	71.08	74.51	71.57	74.02	77.45	77.45
ful_Latn	27.45	31.37	32.84	24.02	21.57	31.86	36.76	40.69	43.14	45.10	44.12	47.06	47.06	46.08
fur_Latn	58.82	50.00	55.88	38.73	35.78	53.92	55.88	62.25	66.67	68.63	68.14	67.16	75.00	72.55
gla_Latn	37.75	24.51	27.45	17.16	24.51	50.98	55.88	57.84	57.35	59.80	63.24	61.27	62.75	65.20
gle_Latn	39.71	25.49	25.00	17.16	27.94	53.43	57.84	63.24	64.22	67.65	64.71	63.24	62.75	73.53
glg_Latn	68.63	62.25	64.22	55.39	29.41	66.67	70.10	71.08	74.02	76.96	73.53	76.47	77.45	80.39
grn_Latn	42.16	47.06	48.53	33.82	25.98	52.45	62.75	64.71	62.25	67.16	64.22	65.69	61.76	69.61
guj_Gujr	15.20	9.31	62.25	12.75	28.43	50.98	54.90	60.29	63.73	63.73	64.22	65.69	62.75	67.16
hat_Latn	41.67	38.73	42.16	45.10	34.80	57.35	63.73	62.25	65.20	72.06	69.61	70.10	73.53	73.04
hau_Latn	25.49	28.43	29.90	20.59	28.43	47.06	55.39	57.84	61.27	65.69	62.75	65.69	66.18	65.20
heb_Hebr	37.75	63.24	20.59	11.76	26.47	41.67	47.06	51.47	54.90	51.96	50.98	54.90	53.92	54.41
hin_Deva	44.61	62.75	62.75	51.96	33.82	55.39	60.78	66.67	65.20	69.61	70.59	74.02	73.04	72.06
hne_Deva	37.75	58.82	59.80	49.02	28.43	55.39	55.88	65.20	62.75	68.63	66.18	65.69	68.14	71.08
hrv_Latn	66.18	65.20	44.12	36.27	40.69	63.73	73.04	71.08	73.53	76.47	72.06	74.51	78.43	78.43
hun_Latn	71.08	63.24	41.67	27.94	30.88	60.78	67.16	70.59	68.63	75.49	73.53	74.02	73.04	76.47
hye_Armn	20.59	17.16	13.73	12.75	32.84	58.82	59.80	67.16	65.20	69.61	68.14	69.12	69.12	72.55
ibo_Latn	24.02	26.47	38.24	19.12	30.39	51.47	57.35	63.73	67.16	69.12	68.14	69.12	68.63	72.06
ilo_Latn	45.10	45.59	48.04	32.35	27.45	54.90	61.76	61.76	68.14	68.14	70.10	73.04	73.53	70.10
ind_Latn	74.02	62.75	70.10	54.90	40.69	62.75	68.63	71.57	70.59	75.49	75.49	76.96	80.39	77.94
isl_Latn	35.29	36.76	28.92	24.51	38.73	55.88	60.78	58.33	60.29	64.71	63.73	63.24	62.75	65.20
ita_Latn	69.61	62.25	62.75	57.84	40.20	64.22	70.59	70.59	74.51	77.94	75.98	76.96	80.39	76.96
jav_Latn	50.49	52.94	55.39	38.24	31.86	53.43	60.78	64.22	65.20	72.55	69.12	73.04	68.63	73.04
jpn_Jpan	73.53	60.29	63.24	55.88	38.73	67.16	72.06	75.49	78.92	79.41	80.39	78.92	81.86	81.37
kab_Latn	16.18	16.67	20.10	12.25	20.59	24.02	22.55	30.39	31.86	34.80	28.43	33.33	32.35	34.31
kac_Latn	25.98	24.51	28.43	20.59	20.10	24.02	29.90	35.78	35.78	43.14	37.75	37.25	43.63	39.71
kam_Latn	26.96	34.31	34.80	26.47	22.06	36.76	38.73	37.75	40.69	41.67	46.57	41.67	42.16	42.16
kan_Knda	17.16	11.27	61.27	11.27	25.49	50.98	57.35	60.29	61.27	65.20	63.24	64.22	65.69	67.16
kat_Geor	29.41	61.27	18.14	14.71	32.84	56.86	60.78	62.25	65.20	67.65	70.59	70.59	70.10	74.51
kaz_Cyrl	37.75	62.75	29.90	28.43	34.31	53.43	57.35	62.25	65.69	67.16	65.20	65.69	69.12	67.65
kbp_Latn	24.51	22.06	30.39	16.18	21.08	28.43	36.76	38.73	40.69	39.22	40.69	39.71	38.73	40.20
kea_Latn	53.43	51.96	56.86	39.71	32.84	56.86	63.73	65.20	67.16	69.61	69.12	71.57	71.57	72.06
khm_Khmr	27.45	11.27	25.49	15.20	39.22	61.76	67.16	67.65	68.63	72.06	73.53	75.00	76.47	76.47
kik_Latn	29.41	32.84	38.73	26.96	21.57	37.75	49.51	50.49	50.49	56.86	56.37	56.37	52.94	56.86
kin_Latn	26.47	32.35	50.49	24.51	27.45	40.69	49.02	52.94	57.84	58.33	60.78	56.86	59.31	59.80
kir_Cyrl	35.78	60.78	34.80	27.45	29.90	45.59	58.33	60.78	60.29	65.20	60.29	64.71	66.18	66.18
kmb_Latn	26.47	28.43	33.82	25.00	21.08	31.86	35.29	39.71	38.24	41.18	41.67	37.25	41.67	44.61
kmr_Latn	29.41	33.82	33.33	21.57	25.98	37.75	47.06	47.55	52.45	52.45	54.90	54.41	58.82	61.76
kon_Latn	33.33	33.82	40.69	32.35	22.06	39.71	46.57	51.96	53.92	64.71	64.22	60.78	64.22	65.20
kor_Hang	67.65	63.24	43.14	56.37	45.10	63.24	67.65	69.12	71.57	73.04	70.59	75.98	76.96	76.47
lao_Laoo	24.02	14.22	26.47	16.67	39.71	55.39	62.25	63.73	68.63	70.59	70.10	68.63	70.59	70.59
lij_Latn	55.88	53.43	56.37	44.61	37.25	58.82	67.65	66.67	69.61	71.57	71.08	74.02	76.47	74.02

Table 22: Detailed results on SIB200 (Part I). For previous LLMs, 3-shot results are presented.

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500 1-shot	MaLA-500 2-shot	MaLA-500 3-shot	MaLA-500 4-shot	MaLA-500 5-shot	MaLA-500 6-shot	MaLA-500 7-shot	MaLA-500 8-shot	MaLA-500 9-shot	MaLA-500 10-shot
lim_Latn	60.78	50.00	50.00	33.82	32.84	56.37	60.78	63.24	66.67	67.16	69.12	72.55	70.59	72.06
lin_Latn	36.76	40.20	43.14	34.31	23.04	38.24	47.06	53.92	57.84	61.27	56.86	60.29	62.75	65.69
lit_Latn	40.20	60.29	41.18	30.39	32.35	55.39	62.75	65.20	64.71	70.59	68.14	66.67	69.61	73.04
lmo_Latn	57.84	50.98	55.88	41.67	34.80	59.31	65.69	66.18	70.10	71.57	70.59	70.59	75.00	75.49
ltz_Latn	55.88	47.06	52.94	39.22	39.22	56.37	65.20	61.27	70.59	68.14	71.08	70.59	71.08	74.02
lua_Latn	32.35	33.33	39.22	28.43	20.10	33.82	40.20	42.65	49.02	51.96	50.00	50.00	49.51	50.00
lug_Latn	27.94	25.00	33.82	19.61	22.06	35.78	40.20	43.63	48.04	51.47	47.06	43.63	49.02	49.02
luo_Latn	28.43	28.43	32.84	25.49	21.57	31.37	37.25	42.65	48.04	49.51	46.57	51.47	49.51	51.47
lus_Latn	43.63	42.16	49.02	31.37	25.49	45.59	51.96	53.92	52.45	54.90	57.84	60.29	58.33	60.29
lvs_Latn	43.14	67.16	43.63	29.41	31.37	57.84	65.20	63.24	68.14	72.55	67.65	69.61	71.08	72.55
mai_Deva	40.69	59.31	60.29	51.47	33.33	57.84	61.76	66.67	67.65	69.12	69.12	70.10	71.57	69.12
mal_Mlym	20.10	60.29	64.71	13.24	25.98	52.45	59.31	60.29	62.75	62.25	65.69	63.24	63.73	68.14
mar_Deva	29.90	56.86	63.73	37.75	36.27	51.96	57.35	64.22	63.73	63.73	63.73	66.67	66.18	68.14
min_Latn	48.04	55.39	59.80	39.71	31.37	57.35	69.12	68.14	68.63	77.94	72.06	75.98	75.49	77.45
mkd_Cyrl	60.78	52.45	32.84	44.12	44.12	66.18	68.63	69.12	68.63	73.04	73.04	72.55	73.04	76.96
mlt_Latn	49.51	45.10	46.08	29.90	35.78	64.71	67.16	68.14	67.16	77.45	75.49	76.96	77.45	76.96
mon_Cyrl	23.53	54.90	20.10	18.63	38.24	50.00	56.86	55.88	63.24	64.22	63.24	63.24	65.20	67.16
mos_Latn	25.49	23.53	29.90	20.59	20.59	27.94	36.76	37.75	37.75	40.69	41.67	45.10	41.67	45.10
mri_Latn	30.39	24.02	30.88	17.65	28.43	44.12	49.02	51.47	51.47	57.84	55.88	58.82	56.37	58.33
mya_Mymr	19.12	60.29	19.61	60.29	23.53	38.73	43.14	53.43	53.43	50.98	52.45	54.90	51.96	54.90
nld_Latn	70.10	59.80	55.88	46.08	45.59	64.71	69.12	68.63	73.04	73.53	75.49	74.02	79.41	80.88
nno_Latn	64.71	61.76	52.45	45.59	35.29	52.94	64.22	62.75	66.18	68.63	68.63	70.10	69.12	73.04
npi_Deva	39.22	51.96	64.71	40.69	33.82	57.84	61.76	68.14	67.65	67.65	68.63	70.10	68.63	75.49
nso_Latn	27.94	30.88	33.33	22.55	21.08	33.82	43.14	46.08	49.02	52.94	51.96	53.92	54.90	53.92
nya_Latn	32.35	34.31	40.69	27.94	23.04	35.29	45.59	49.02	50.98	51.47	52.94	53.92	52.94	58.33
oci_Latn	68.63	56.37	65.69	48.53	34.31	60.29	69.12	65.20	67.65	73.04	73.53	71.57	75.49	76.47
orm_Latn	17.16	18.14	22.06	16.67	20.10	30.39	35.29	41.18	41.67	47.55	41.67	44.12	43.14	51.47
ory_Orya	13.24	13.73	64.22	11.76	24.51	45.10	52.45	57.84	53.92	61.27	57.84	56.86	60.78	60.78
pag_Latn	52.45	49.51	53.92	40.20	31.86	54.90	62.75	60.78	67.65	64.71	70.10	70.10	69.12	69.61
pan_Guru	14.22	11.27	62.25	11.76	33.82	54.90	58.82	63.73	64.22	67.65	67.16	66.67	68.63	67.16
pap_Latn	55.39	50.00	52.94	38.24	30.39	56.86	64.71	66.67	69.61	74.51	69.12	73.53	70.59	75.49
pes_Arab	47.06	58.82	52.94	32.84	39.22	61.27	71.08	63.73	70.59	72.55	72.55	73.53	76.47	76.47
plt_Latn	28.43	32.84	37.25	21.57	29.41	51.96	58.82	57.84	59.31	60.78	60.29	60.29	64.22	60.29
pol_Latn	74.51	60.78	47.06	32.84	36.76	61.76	68.63	69.12	71.08	75.00	74.02	74.02	77.45	75.98
por_Latn	70.10	61.76	65.20	59.31	36.76	64.71	72.06	70.10	74.51	75.00	76.96	75.49	78.43	82.84
prs_Arab	50.49	55.39	49.51	33.33	37.25	60.78	64.22	67.16	69.12	72.55	72.55	73.53	72.55	75.49
pus_Arab	30.39	34.80	38.73	21.08	30.39	47.06	50.98	52.45	54.41	53.92	53.92	55.88	55.88	57.84
quy_Latn	32.84	35.29	40.69	35.29	22.06	36.27	44.12	45.59	49.02	52.45	49.51	49.02	50.98	50.98
ron_Latn	69.12	61.76	57.84	42.65	41.18	61.27	70.10	65.20	70.10	74.51	73.53	75.00	78.92	78.43
run_Latn	25.49	27.94	44.12	25.49	23.53	37.25	46.57	50.49	51.96	59.31	51.96	56.37	57.84	60.29
rus_Cyrl	71.57	63.24	53.43	60.29	38.73	64.22	65.20	69.12	72.06	75.98	75.00	76.47	75.49	78.92
sag_Latn	29.90	27.94	31.37	21.08	20.59	30.88	43.63	47.06	48.53	55.88	52.45	54.41	55.88	58.82
san_Deva	27.94	47.55	54.90	42.65	24.51	48.04	60.29	57.84	62.25	66.67	65.20	61.76	66.18	65.20
scn_Latn	51.96	50.00	53.43	40.69	37.25	63.73	73.04	70.59	74.02	77.45	75.49	75.00	80.39	76.47
sin_Sinh	15.20	10.78	20.10	12.75	29.90	56.37	60.29	65.20	66.18	68.14	64.71	66.67	63.73	67.16
slk_Latn	68.14	60.29	47.55	39.71	34.31	58.33	68.63	66.67	70.59	75.00	70.59	71.57	74.51	75.00
slv_Latn	68.14	60.78	44.12	32.84	38.73	63.24	68.14	68.14	70.59	73.53	73.53	74.51	78.43	76.47
smo_Latn	30.39	25.00	31.86	18.14	29.41	52.45	60.29	62.25	62.25	65.69	67.16	65.20	66.18	69.61
sna_Latn	28.43	29.41	36.27	23.53	24.51	39.71	44.61	45.59	44.61	49.51	45.59	47.55	47.06	50.00
snd_Arab	27.94	37.25	39.22	23.53	27.94	42.65	47.06	50.49	52.45	54.41	54.41	52.94	55.88	56.86
som_Latn	23.53	25.49	27.94	17.16	22.06	36.27	44.61	47.55	51.47	52.94	52.94	53.92	54.41	55.39
sot_Latn	29.41	28.43	33.82	18.63	22.55	36.76	43.14	47.06	50.49	51.96	52.45	55.39	54.41	56.86
spa_Latn	72.55	58.33	67.65	56.37	35.29	64.22	69.61	72.06	74.51	74.02	72.06	76.47	78.43	78.43
srd_Latn	53.92	52.45	50.98	37.25	31.37	60.29	66.18	68.63	75.98	74.02	77.94	77.45	79.41	79.41
srp_Cyrl	63.73	55.39	33.33	39.22	45.59	65.20	70.59	69.61	73.04	76.47	74.02	75.00	77.94	79.41
ssw_Latn	29.41	25.00	31.37	21.57	24.02	44.12	46.57	50.00	52.94	51.96	53.92	56.86	53.92	60.78
sun_Latn	55.39	59.31	63.73	44.61	37.25	60.29	68.63	70.10	71.08	73.53	72.55	73.53	75.00	75.98
swe_Latn	71.08	61.27	52.94	48.04	33.82	53.43	60.29	64.71	64.71	69.12	70.10	69.61	72.55	70.59
swh_Latn	32.35	63.24	61.27	56.86	29.41	50.49	59.31	58.82	62.75	60.29	62.25	68.63	66.67	66.67
szl_Latn	56.86	50.49	45.59	29.41	30.88	51.47	59.80	63.73	64.71	67.16	69.61	68.63	71.08	69.12
tam_Taml	20.59	63.24	67.16	58.82	30.88	50.49	55.88	62.75	63.24	63.73	65.20	69.61	66.67	68.63
tat_Cyrl	37.75	60.29	35.29	28.92	33.33	54.90	64.22	64.71	65.69	74.51	70.10	71.57	71.57	73.53
tel_Telu	18.14	60.78	61.27	59.80	25.98	50.00	52.45	59.80	58.82	63.73	60.78	60.29	63.73	61.76
tgk_Cyrl	26.96	57.84	23.53	17.16	36.76	54.90	60.29	60.78	61.27	69.12	64.71	66.18	68.14	70.59
tgl_Latn	55.88	58.33	49.02	40.20	43.14	64.22	69.61	64.71	70.10	75.49	74.51	77.45	78.43	77.45
tha_Thai	44.61	60.78	23.53	57.35	41.18	63.24	67.16	68.63	70.10	72.06	70.59	70.10	72.06	75.49
tir_Ethi	13.24	16.18	16.18	13.73	21.57	34.80	39.22	41.67	47.06	46.08	45.59	47.06	45.59	47.55
tpi_Latn	63.24	46.57	56.86	33.33	31.86	58.82	65.69	68.14	70.10	72.06	74.51	73.53	75.98	74.51
tsn_Latn	28.92	29.90	32.35	24.51	24.51	40.20	44.12	47.06	47.06	53.43	51.96	50.00	50.49	54.41
tso_Latn	30.88	31.37	36.76	28.92	22.55	35.29	41.18	45.10	46.08	48.04	43.14	43.14	45.59	49.02
tuk_Latn	34.31	46.08	39.22	27.45	24.02	45.59	53.43	58.82	57.84	63.73	63.73	66.18	65.69	66.18
tum_Latn	26.96	34.31	33.82	27.94	21.57	39.22	43.14	45.59	46.08	47.55	44.61	49.02	49.51	49.51
tur_Latn	52.94	62.75	40.20	52.94	36.76	60.78	68.63	70.10	72.06	74.02	75.00	76.47	75.98	76.96
uig_Arab	18.63	18.14	20.10	11.27	21.08	33.33	36.76	39.71	43.14	44.61	43.14	48.53	47.55	48.04
ukr_Cyrl	71.57	63.73	41.18	43.63	39.71	60.29	65.69	66.18	69.12	71.08	75.00	73.53	72.55	75.00
umb_Latn	25.00	26.47	29.90	23.04	21.57	30.88	32.84	36.76	35.78	40.20	38.24	34.80	36.76	35.29
urd_Arab	38.73	53.43	63.24	54.41	36.27	55.39	62.75	64.22	64.22	68.63	65.20	68.63	67.16	67.16
uzb_Latn	30.39	62.75	35.78	23.53	22.06	50.49	56.37	57.84	63.24	72.06	63.73	69.12	72.55	71.57
vec_Latn	65.69	59.80	56.86	52.45	39.22	62.25	66.18	69.61	70.10	69.12	74.51	75.98	75.00	76.47
vie_Latn	67.65	63.24	67.16	60.78	39.71	60.78	67.16	68.63	74.51	75.98	76.47	75.00	78.43	79.90
war_Latn	51.47	51.47	51.47	37.25	37.75	61.27	65.69	65.20	69.61	73.04	71.57	71.57	74.02	74.02
wol_Latn	32.35	34.80	43.14	25.49	23.53	36.76	42.16	45.59	48.53	53.43	47.55	52.94	54.90	53.43
xho_Latn	30.39	29.90	38.24	22.55	25.98	46.57	51.96	56.37	58.33	60.78	61.76	60.29	62.75	64.22
yid_Hebr	23.04	22.06	16.18	12.25	24.02	34.80	39.22	39.71	40.20	46.57	41.18	40.69	44.61	45.10
yor_Latn	21.57	29.41	47.55	21.57	26.47	32.35	41.18	39.22	41.67	48.04	42.16	43.14	44.61	43.14
yue_Hani	75.00	64.71	67.16	55.88	40.69	69.12	71.57	76.47	76.96	81.37	77.94	79.41	81.86	79.41
zsm_Latn	65.69	61.27	64.71	50.00	36.76	60.29	68.14	69.12	67.65	73.53	73.53	76.96	77.45	75.00
zul_Latn	25.00	25.49	35.29	15.20	25.49	51.47	49.02	54.41	56.37	57.84	60.29	61.76	59.31	62.75
all	42.08	45.34	44.63	34.36	30.88	50.71	57.02	58.95	61.20	64.04	63.15	64.13	65.19	66.32

Table 23: Detailed results on SIB200 (Part II). For previous LLMs, 3-shot results are presented.
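
The `all` row of the SIB200 table is consistent with the macro-average accuracy across languages mentioned in the abstract, where every language counts equally. As a minimal sketch (not the authors' evaluation code), the snippet below averages the 3-shot columns over three rows copied from the table above and recomputes the margin over the strongest baseline from the `all` row. The names `SYSTEMS`, `ROWS`, `ALL_ROW`, `macro_average`, and `margin_over_best_baseline` are hypothetical and exist only for this illustration.

```python
from statistics import mean

# Column order used in this sketch (an assumption for illustration only).
SYSTEMS = ["LLaMA 2-7B", "mGPT-13B", "BLOOM-7B1", "XGLM-7.5B", "MaLA-500"]

# Three rows copied from the SIB200 table; the MaLA-500 entry is its 3-shot score.
ROWS = {
    "lim_Latn": [60.78, 50.00, 50.00, 33.82, 60.78],
    "lin_Latn": [36.76, 40.20, 43.14, 34.31, 47.06],
    "lit_Latn": [40.20, 60.29, 41.18, 30.39, 62.75],
}

# 3-shot values from the `all` row of the SIB200 table.
ALL_ROW = {
    "LLaMA 2-7B": 42.08,
    "mGPT-13B": 45.34,
    "BLOOM-7B1": 44.63,
    "XGLM-7.5B": 34.36,
    "MaLA-500": 57.02,
}


def macro_average(rows):
    """Average each system's accuracy over languages, one vote per language."""
    return [mean(column) for column in zip(*rows.values())]


def margin_over_best_baseline(all_row, ours="MaLA-500"):
    """Difference between our score and the best score among the other systems."""
    best_baseline = max(score for name, score in all_row.items() if name != ours)
    return all_row[ours] - best_baseline


if __name__ == "__main__":
    for name, score in zip(SYSTEMS, macro_average(ROWS)):
        print(f"{name}: {score:.2f}")
    print(f"margin: {margin_over_best_baseline(ALL_ROW):.2f}")
```

With the `all` row values, the margin is 57.02 - 45.34 = 11.68 points, which matches the SIB200 figure quoted in the abstract.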

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
ace_Latn	46.85	47.75	49.55	41.44	48.65
ach_Latn	45.05	37.84	41.44	40.54	36.04
acr_Latn	47.75	51.35	50.45	47.75	50.45
afr_Latn	54.05	38.74	51.35	49.55	55.86
agw_Latn	48.65	45.05	41.44	42.34	49.55
ahk_Latn	43.24	36.04	36.04	35.14	45.05
aka_Latn	42.34	32.43	38.74	42.34	54.95
aln_Latn	34.23	35.14	36.94	35.14	44.14
als_Latn	38.74	36.94	42.34	42.34	47.75
alt_Cyrl	44.14	44.14	45.05	51.35	51.35
alz_Latn	36.94	35.14	31.53	28.83	37.84
aoj_Latn	50.93	37.96	45.37	46.30	49.07
arb_Arab	43.24	45.05	49.55	44.14	50.45
arn_Latn	38.74	42.34	34.23	36.04	43.24
ary_Arab	32.43	33.33	38.74	32.43	44.14
arz_Arab	31.53	39.64	45.05	36.94	46.85
asm_Beng	45.95	42.34	54.95	40.54	54.05
ayr_Latn	47.75	37.84	44.14	44.14	54.05
azb_Arab	39.64	40.54	47.75	45.05	47.75
aze_Latn	45.05	46.85	45.05	43.24	49.55
bak_Cyrl	45.05	52.25	49.55	56.76	56.76
bam_Latn	42.34	37.84	49.55	39.64	47.75
ban_Latn	36.04	41.44	34.23	34.23	42.34
bar_Latn	49.55	46.85	44.14	48.65	53.15
bba_Latn	45.05	32.43	45.95	46.85	46.85
bci_Latn	36.94	35.14	36.94	33.33	44.14
bcl_Latn	42.34	48.65	39.64	39.64	54.95
bel_Cyrl	47.75	45.95	48.65	43.24	57.66
bem_Latn	47.75	37.84	42.34	41.44	51.35
ben_Beng	40.54	41.44	52.25	51.35	47.75
bhw_Latn	37.84	43.24	41.44	46.85	47.75
bim_Latn	38.74	39.64	33.33	36.94	45.05
bis_Latn	44.14	49.55	44.14	39.64	48.65
bqc_Latn	39.64	36.04	34.23	33.33	40.54
bre_Latn	39.64	36.04	35.14	36.04	40.54
btx_Latn	49.55	36.94	42.34	41.44	43.24
bul_Cyrl	45.05	42.34	48.65	45.05	54.95
bum_Latn	42.34	39.64	37.84	37.84	44.14
bzj_Latn	53.15	46.85	47.75	50.45	52.25
cab_Latn	39.64	38.74	37.84	36.94	36.04
cac_Latn	43.24	37.84	40.54	38.74	45.05
cak_Latn	45.95	35.14	44.14	40.54	50.45
caq_Latn	39.64	38.74	38.74	44.14	37.84
cat_Latn	52.25	45.05	46.85	48.65	52.25
cbk_Latn	54.05	40.54	56.76	54.05	55.86
cce_Latn	49.55	45.05	50.45	48.65	48.65
ceb_Latn	44.14	42.34	48.65	45.05	51.35
ces_Latn	44.14	43.24	45.05	46.85	51.35
cfm_Latn	49.55	41.44	49.55	53.15	48.65
che_Cyrl	37.84	33.33	36.94	37.84	38.74
chk_Latn	45.05	41.44	41.44	36.04	45.95
chv_Cyrl	43.24	45.05	45.05	49.55	58.56
ckb_Arab	44.14	36.94	45.05	42.34	51.35
cmn_Hani	48.65	45.05	53.15	48.65	53.15
cnh_Latn	46.85	46.85	46.85	49.55	46.85
crh_Cyrl	49.55	40.54	47.75	54.95	54.05
crs_Latn	52.25	44.14	49.55	55.86	59.46
csy_Latn	47.75	41.44	54.95	53.15	45.95
ctd_Latn	50.45	48.65	56.76	53.15	56.76
ctu_Latn	41.44	35.14	38.74	40.54	43.24
cuk_Latn	42.34	42.34	38.74	39.64	37.84
cym_Latn	39.64	38.74	39.64	43.24	41.44
dan_Latn	53.15	41.44	39.64	38.74	54.95
deu_Latn	45.05	36.04	37.84	38.74	43.24
djk_Latn	42.34	35.14	42.34	46.85	40.54
dln_Latn	48.65	40.54	51.35	54.05	47.75

Table 24: Detailed results on Taxi1500 (Part I). 3-shot results are presented.

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
dtp_Latn	39.64	35.14	42.34	46.85	50.45
dyu_Latn	41.44	39.64	42.34	38.74	46.85
dzo_Tibt	45.05	40.54	41.44	45.05	45.05
efi_Latn	39.64	36.04	38.74	41.44	45.95
ell_Grek	49.55	45.95	49.55	48.65	51.35
eng_Latn	55.86	42.34	58.56	54.05	59.46
enm_Latn	50.45	41.44	56.76	50.45	54.95
epo_Latn	49.55	40.54	47.75	42.34	49.55
est_Latn	46.85	42.34	38.74	52.25	46.85
eus_Latn	38.74	36.04	36.94	39.64	39.64
ewe_Latn	51.35	43.24	50.45	46.85	45.05
fao_Latn	53.15	44.14	52.25	53.15	58.56
fas_Arab	49.55	50.45	57.66	51.35	55.86
fij_Latn	48.65	43.24	41.44	43.24	53.15
fil_Latn	48.65	41.44	46.85	51.35	51.35
fin_Latn	47.75	45.05	41.44	45.95	54.95
fon_Latn	38.74	35.14	37.84	40.54	45.05
fra_Latn	60.36	51.35	62.16	52.25	59.46
fry_Latn	37.84	33.33	36.04	27.03	46.85
gaa_Latn	41.44	33.33	37.84	35.14	40.54
gil_Latn	36.70	31.19	41.28	32.11	41.28
giz_Latn	46.85	44.14	43.24	38.74	45.05
gkn_Latn	38.74	34.23	34.23	36.94	41.44
gkp_Latn	30.63	33.33	41.44	29.73	48.65
gla_Latn	33.33	39.64	44.14	45.05	49.55
gle_Latn	33.33	35.14	36.04	34.23	39.64
glv_Latn	43.24	41.44	37.84	38.74	42.34
gom_Latn	34.23	31.53	33.33	40.54	42.34
gor_Latn	43.24	34.23	43.24	40.54	46.85
guc_Latn	44.14	36.04	37.84	41.44	45.05
gug_Latn	45.05	44.14	42.34	41.44	50.45
guj_Gujr	45.95	37.84	52.25	44.14	56.76
gur_Latn	45.95	45.95	44.14	47.75	48.65
guw_Latn	45.05	37.84	47.75	46.85	48.65
gya_Latn	37.84	37.84	41.44	34.23	42.34
gym_Latn	41.44	39.64	39.64	43.24	50.45
hat_Latn	50.45	43.24	44.14	41.44	56.76
hau_Latn	44.14	37.84	41.44	44.14	48.65
haw_Latn	45.95	39.64	38.74	34.23	49.55
heb_Hebr	38.74	35.14	34.23	36.94	44.14
hif_Latn	42.34	43.24	49.55	47.75	48.65
hil_Latn	49.55	41.44	40.54	36.94	54.95
hin_Deva	51.35	50.45	49.55	46.85	56.76
hmo_Latn	46.85	45.05	46.85	45.05	53.15
hne_Deva	55.86	54.05	54.05	58.56	58.56
hnj_Latn	48.65	45.05	53.15	51.35	60.36
hra_Latn	49.55	41.44	43.24	46.85	45.95
hrv_Latn	55.86	51.35	52.25	54.95	61.26
hui_Latn	51.35	40.54	41.44	45.05	46.85
hun_Latn	46.85	44.14	41.44	43.24	48.65
hus_Latn	32.43	32.43	34.23	37.84	42.34
hye_Armn	45.95	40.54	45.95	49.55	61.26
iba_Latn	49.55	46.85	51.35	48.65	55.86
ibo_Latn	38.74	33.33	43.24	38.74	44.14
ifa_Latn	36.04	30.63	35.14	38.74	43.24
ifb_Latn	34.23	35.14	39.64	34.23	53.15
ikk_Latn	43.24	36.94	39.64	39.64	43.24
ilo_Latn	39.64	36.04	41.44	37.84	43.24
ind_Latn	49.55	50.45	53.15	53.15	54.95
isl_Latn	48.65	44.14	43.24	48.65	54.05
ita_Latn	50.45	49.55	56.76	58.56	54.95
ium_Latn	45.95	44.14	49.55	50.45	45.05
ixl_Latn	42.34	39.64	41.44	43.24	40.54
izz_Latn	38.74	47.75	39.64	42.34	54.95
jam_Latn	41.44	43.24	53.15	50.45	61.26
jav_Latn	41.44	47.75	44.14	37.84	45.95

Table 25: Detailed results on Taxi1500 (Part II). 3-shot results are presented.

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
jpn_Jpan	46.85	46.85	47.75	50.45	51.35
kaa_Latn	43.24	53.15	47.75	51.35	54.05
kab_Latn	27.93	36.04	30.63	34.23	35.14
kac_Latn	44.14	34.23	43.24	42.34	52.25
kal_Latn	41.44	37.84	36.04	35.14	40.54
kan_Knda	48.65	37.84	52.25	45.95	54.05
kat_Geor	41.44	41.44	42.34	46.85	48.65
kaz_Cyrl	49.55	45.05	51.35	53.15	55.86
kbp_Latn	40.54	35.14	36.94	31.53	47.75
kek_Latn	45.95	42.34	45.05	44.14	51.35
khm_Khmr	52.25	38.74	48.65	49.55	64.86
kia_Latn	36.94	36.04	40.54	41.44	48.65
kik_Latn	45.05	43.24	45.05	44.14	50.45
kin_Latn	42.34	37.84	41.44	38.74	50.45
kir_Cyrl	51.35	46.85	47.75	63.06	64.86
kjb_Latn	48.65	46.85	44.14	44.14	48.65
kjh_Cyrl	44.14	41.44	45.05	41.44	45.95
kmm_Latn	45.95	45.05	47.75	51.35	45.95
kmr_Cyrl	39.64	35.14	45.05	42.34	43.24
knv_Latn	44.55	44.55	45.45	42.73	44.55
kor_Hang	48.65	48.65	49.55	51.35	62.16
kpg_Latn	44.14	52.25	51.35	42.34	54.95
krc_Cyrl	45.95	36.04	48.65	48.65	53.15
kri_Latn	49.55	48.65	49.55	51.35	54.95
ksd_Latn	36.94	33.33	40.54	33.33	49.55
kss_Latn	32.43	28.83	34.23	29.73	47.75
ksw_Mymr	44.14	45.95	42.34	37.84	52.25
kua_Latn	41.44	42.34	36.94	35.14	40.54
lam_Latn	43.24	36.94	45.95	43.24	40.54
lao_Laoo	45.05	39.64	46.85	50.45	50.45
lat_Latn	53.15	41.44	53.15	56.76	57.66
lav_Latn	39.64	33.33	36.04	39.64	45.05
ldi_Latn	35.14	32.43	36.94	34.23	36.04
leh_Latn	47.75	37.84	33.33	32.43	41.44
lhu_Latn	27.93	34.23	34.23	37.84	42.34
lin_Latn	47.75	37.84	39.64	39.64	48.65
lit_Latn	42.34	40.54	44.14	48.65	49.55
loz_Latn	45.95	42.34	36.04	44.14	40.54
ltz_Latn	46.85	45.95	47.75	41.44	49.55
lug_Latn	40.54	32.43	39.64	38.74	45.95
luo_Latn	40.54	36.94	34.23	38.74	40.54
lus_Latn	39.64	40.54	42.34	41.44	50.45
lzh_Hani	54.95	48.65	54.05	43.24	56.76
mad_Latn	47.75	52.25	47.75	47.75	53.15
mah_Latn	43.24	36.04	42.34	45.95	45.05
mai_Deva	45.05	41.44	49.55	54.05	51.35
mam_Latn	43.24	33.33	41.44	45.05	45.95
mar_Deva	49.55	44.14	53.15	45.95	56.76
mau_Latn	29.73	29.73	36.94	37.84	32.43
mbb_Latn	44.14	42.34	38.74	39.64	49.55
mck_Latn	40.54	34.23	36.04	39.64	49.55
mcn_Latn	35.14	27.93	33.33	33.33	38.74
mco_Latn	41.44	33.33	43.24	33.33	43.24
mdy_Ethi	39.64	46.85	43.24	43.24	51.35
meu_Latn	53.15	38.74	45.05	48.65	52.25
mfe_Latn	51.35	48.65	52.25	50.45	56.76
mgh_Latn	42.34	33.33	41.44	35.14	38.74
mgr_Latn	39.64	34.23	33.33	41.44	38.74
mhr_Cyrl	47.27	42.73	45.45	42.73	48.18
min_Latn	37.84	45.95	53.15	45.05	53.15
miq_Latn	51.35	46.85	43.24	54.95	49.55
mkd_Cyrl	52.25	48.65	56.76	57.66	66.67
mlg_Latn	35.14	36.04	36.94	37.84	45.95
mlt_Latn	37.84	33.33	42.34	43.24	46.85
mos_Latn	39.64	42.34	39.64	36.04	36.04
mps_Latn	47.75	45.05	42.34	45.05	51.35
mri_Latn	45.05	42.34	38.74	42.34	44.14

Table 26: Detailed results on Taxi1500 (Part III). 3-shot results are presented.

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
mrw_Latn	40.54	39.64	41.44	37.84	49.55
msa_Latn	44.14	41.44	45.95	37.84	46.85
mwm_Latn	36.94	31.53	39.64	38.74	47.75
mxv_Latn	33.33	35.14	39.64	38.74	40.54
mya_Mymr	45.05	48.65	44.14	44.14	46.85
myv_Cyrl	39.64	43.24	40.54	41.44	45.05
mzh_Latn	45.05	45.95	42.34	40.54	44.14
nan_Latn	32.43	35.14	48.65	49.55	44.14
naq_Latn	36.94	36.94	37.84	39.64	41.44
nav_Latn	27.03	28.83	30.63	33.33	38.74
nbl_Latn	21.62	18.02	21.62	25.23	27.93
nch_Latn	37.84	34.23	33.33	40.54	40.54
ncj_Latn	46.85	45.95	42.34	41.44	42.34
ndc_Latn	44.14	36.04	43.24	36.94	49.55
nde_Latn	33.33	29.73	33.33	36.04	41.44
ndo_Latn	41.28	34.86	37.61	33.94	46.79
nds_Latn	41.44	38.74	37.84	34.23	43.24
nep_Deva	45.05	49.55	63.06	51.35	60.36
ngu_Latn	47.75	39.64	43.24	42.34	49.55
nld_Latn	47.75	39.64	47.75	43.24	56.76
nmf_Latn	44.14	40.54	42.34	41.44	44.14
nnb_Latn	45.05	42.34	36.94	44.14	40.54
nno_Latn	56.76	46.85	45.95	52.25	54.95
nob_Latn	52.25	41.44	44.14	45.95	56.76
nor_Latn	50.45	35.14	46.85	47.75	53.15
npi_Deva	51.35	54.95	55.86	45.95	54.95
nse_Latn	38.74	28.83	39.64	38.74	42.34
nso_Latn	45.05	43.24	45.05	45.05	50.45
nya_Latn	48.65	39.64	44.14	42.34	54.95
nyn_Latn	39.64	33.33	37.84	36.94	45.05
nyy_Latn	43.24	42.34	43.24	40.54	47.75
nzi_Latn	36.94	32.43	33.33	32.43	35.14
ori_Orya	43.24	34.23	51.35	46.85	45.95
ory_Orya	44.14	44.14	49.55	46.85	55.86
oss_Cyrl	49.55	49.55	49.55	44.14	54.05
ote_Latn	34.23	31.53	34.23	36.04	49.55
pag_Latn	44.14	48.65	48.65	42.34	50.45
pam_Latn	45.95	36.04	44.14	47.75	45.05
pan_Guru	41.44	33.33	46.85	40.54	47.75
pap_Latn	50.45	44.14	52.25	49.55	53.15
pau_Latn	38.74	45.05	37.84	36.94	46.85
pcm_Latn	58.56	47.75	56.76	53.15	57.66
pdt_Latn	53.15	45.95	45.95	48.65	54.05
pes_Arab	50.91	46.36	59.09	48.18	53.64
pis_Latn	57.66	47.75	50.45	45.95	55.86
pls_Latn	43.24	43.24	43.24	39.64	45.95
plt_Latn	36.94	35.14	37.84	43.24	47.75
poh_Latn	42.34	42.34	45.05	39.64	48.65
pol_Latn	41.44	43.24	46.85	55.86	56.76
pon_Latn	45.95	39.64	43.24	39.64	42.34
por_Latn	56.76	54.95	56.76	54.05	58.56
prk_Latn	44.14	43.24	49.55	40.54	46.85
prs_Arab	50.45	51.35	55.86	56.76	57.66
pxm_Latn	48.65	44.14	41.44	41.44	47.75
qub_Latn	46.85	44.14	43.24	48.65	45.05
quc_Latn	45.05	41.44	43.24	38.74	50.45
qug_Latn	45.95	46.85	50.45	45.05	56.76
quh_Latn	49.55	49.55	46.85	42.34	51.35
quw_Latn	43.24	36.94	45.05	44.14	53.15
quy_Latn	58.56	48.65	54.95	50.45	57.66
quz_Latn	51.35	38.74	60.36	54.95	59.46
qvi_Latn	46.79	46.79	49.54	45.87	47.71
rap_Latn	43.24	35.14	41.44	39.64	46.85
rar_Latn	40.54	32.43	31.53	29.73	45.95
rmy_Latn	37.84	37.84	38.74	40.54	43.24
ron_Latn	45.05	51.35	44.14	47.75	57.66

Table 27: Detailed results on Taxi1500 (Part IV). 3-shot results are presented.

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
rop_Latn	45.95	45.05	42.34	42.34	55.86
rug_Latn	43.24	38.74	46.85	44.14	45.05
run_Latn	46.85	40.54	45.05	40.54	52.25
rus_Cyrl	49.55	41.44	50.45	47.75	53.15
sag_Latn	43.24	43.24	41.44	40.54	47.75
sah_Cyrl	40.54	35.14	44.14	44.14	54.95
sba_Latn	42.34	43.24	45.05	40.54	49.55
seh_Latn	45.05	35.14	40.54	42.34	45.95
sin_Sinh	39.64	38.74	39.64	42.34	45.95
slk_Latn	53.15	50.45	44.14	47.75	53.15
slv_Latn	47.75	45.05	55.86	51.35	49.55
sme_Latn	45.95	45.05	42.34	41.44	48.65
smo_Latn	38.74	40.54	43.24	44.14	53.15
sna_Latn	50.45	30.63	43.24	45.95	60.36
snd_Arab	44.14	45.05	56.76	51.35	56.76
som_Latn	33.33	36.94	35.14	34.23	39.64
sop_Latn	40.54	34.23	40.54	35.14	35.14
sot_Latn	47.75	41.44	40.54	43.24	49.55
spa_Latn	51.35	49.55	51.35	51.35	56.76
sqi_Latn	42.34	43.24	52.25	52.25	57.66
srm_Latn	35.14	41.44	39.64	37.84	45.05
srn_Latn	45.95	53.15	54.05	48.65	51.35
srp_Latn	59.46	48.65	58.56	54.05	58.56
ssw_Latn	38.74	45.05	36.94	40.54	48.65
sun_Latn	43.24	40.54	45.05	44.14	48.65
suz_Deva	46.85	42.34	42.34	43.24	49.55
swe_Latn	58.56	48.65	53.15	54.95	61.26
swh_Latn	46.85	49.55	49.55	48.65	56.76
sxn_Latn	42.34	36.94	44.14	44.14	46.85
tam_Taml	44.14	53.15	59.46	48.65	60.36
tat_Cyrl	47.75	47.75	45.95	48.65	54.05
tbz_Latn	36.04	35.14	34.23	35.14	42.34
tca_Latn	39.64	40.54	43.24	41.44	45.05
tdt_Latn	40.54	38.74	48.65	45.05	52.25
tel_Telu	33.33	45.95	50.45	45.95	49.55
teo_Latn	33.33	37.84	26.13	31.53	41.44
tgk_Cyrl	42.34	44.14	48.65	49.55	57.66
tgl_Latn	48.65	41.44	46.85	51.35	51.35
tha_Thai	43.24	42.34	43.24	37.84	47.75
tih_Latn	43.24	37.84	40.54	36.04	54.05
tir_Ethi	29.73	36.94	27.93	34.23	41.44
tlh_Latn	51.35	45.95	45.95	41.44	53.15
tob_Latn	44.55	43.64	41.82	38.18	50.00
toh_Latn	42.34	39.64	40.54	40.54	42.34
toi_Latn	44.14	45.05	34.23	36.04	45.05
toj_Latn	43.24	40.54	36.94	43.24	42.34
ton_Latn	42.34	42.34	42.34	44.14	52.25
top_Latn	46.85	34.23	37.84	38.74	36.94
tpi_Latn	48.65	44.14	52.25	48.65	49.55
tpm_Latn	37.84	41.44	38.74	32.43	42.34
tsn_Latn	40.54	36.04	38.74	34.23	37.84
tsz_Latn	37.84	32.43	37.84	38.74	46.85
tuc_Latn	45.95	44.14	47.75	44.14	48.65
tui_Latn	42.34	38.74	38.74	37.84	50.45
tuk_Latn	36.04	42.34	45.05	43.24	50.45
tum_Latn	47.75	39.64	46.85	52.25	50.45
tur_Latn	46.79	44.04	40.37	43.12	45.87
twi_Latn	41.44	43.24	41.44	37.84	46.85
tyv_Cyrl	38.74	38.74	43.24	44.14	45.05
tzh_Latn	41.82	36.36	41.82	41.82	38.18
tzo_Latn	39.64	43.24	34.23	29.73	41.44
udm_Cyrl	36.94	38.74	42.34	44.14	47.75
ukr_Cyrl	52.25	48.65	51.35	55.86	53.15

Table 28: Detailed results on Taxi1500 (Part V). 3-shot results are presented.

Lang	LLaMA 2-7B	mGPT-13B	BLOOM-7B1	XGLM-7.5B	MaLA-500
uzb_Latn	45.05	49.55	37.84	46.85	54.05
uzn_Cyrl	45.95	40.54	45.05	45.05	49.55
ven_Latn	45.05	44.14	42.34	41.44	54.05
vie_Latn	53.15	45.95	62.16	45.95	54.95
wal_Latn	35.14	33.33	35.14	35.14	39.64
war_Latn	48.65	39.64	37.84	45.05	54.95
wbm_Latn	48.65	39.64	46.85	46.85	48.65
wol_Latn	36.04	34.23	32.43	34.23	36.94
xav_Latn	50.45	33.33	46.85	44.14	45.95
xho_Latn	43.24	37.84	40.54	39.64	46.85
yan_Latn	45.05	46.85	52.25	41.44	53.15
yao_Latn	42.34	41.44	43.24	44.14	48.65
yap_Latn	38.74	40.54	35.14	32.43	41.44
yom_Latn	35.14	31.53	33.33	25.23	36.94
yor_Latn	41.44	38.74	39.64	44.14	47.75
yua_Latn	41.44	32.43	43.24	41.44	36.04
yue_Hani	43.24	48.65	53.15	38.74	57.66
zai_Latn	45.05	35.14	40.54	43.24	44.14
zho_Hani	47.75	51.35	51.35	44.14	58.56
zlm_Latn	54.05	49.55	57.66	56.76	64.86
zom_Latn	50.45	42.34	44.14	43.24	48.65
zsm_Latn	58.56	59.46	63.96	55.86	66.67
zul_Latn	46.85	42.34	46.85	46.85	51.35
all	44.07	40.98	43.98	43.24	48.89

Table 29: Detailed results on Taxi1500 (Part VI). 3-shot results are presented.
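
The `all` row above reads as a per-model summary over the language rows. A minimal sketch of how such a summary could be computed, assuming it is the unweighted (macro) average of the per-language accuracies, is given below. The helper name `macro_average` and the four-row sample are illustrative only; the sample is copied from Part VI and will not reproduce the reported `all` values, which cover every language in Tables 24 to 29.

```python
from statistics import mean

# Column order as in the table header.
MODELS = ["LLaMA 2-7B", "mGPT-13B", "BLOOM-7B1", "XGLM-7.5B", "MaLA-500"]

# Illustrative subset of rows (tab-separated), copied from Part VI.
ROWS = """\
zlm_Latn\t54.05\t49.55\t57.66\t56.76\t64.86
zom_Latn\t50.45\t42.34\t44.14\t43.24\t48.65
zsm_Latn\t58.56\t59.46\t63.96\t55.86\t66.67
zul_Latn\t46.85\t42.34\t46.85\t46.85\t51.35
"""


def macro_average(rows: str) -> dict[str, float]:
    """Average each model's accuracy over languages, weighting every language equally."""
    per_model: dict[str, list[float]] = {m: [] for m in MODELS}
    for line in rows.strip().splitlines():
        _lang, *scores = line.split("\t")
        for model, score in zip(MODELS, scores):
            per_model[model].append(float(score))
    return {m: round(mean(v), 2) for m, v in per_model.items()}


averages = macro_average(ROWS)
for model, value in averages.items():
    print(f"{model}\t{value:.2f}")
```

Because every language contributes one value regardless of its test-set size, a summary of this kind gives low-resource languages the same weight as high-resource ones.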