Gautier Dagan is qualified to endorse.

Getting the most out of your tokenizer for pre-training and domain adaptation

Gautier Dagan: Is registered as an author of this paper.
Can endorse for cs.CL.

Gabriel Synnaeve and Baptiste Rozière are not registered as owners of this paper.