Search | arXiv e-print repository

Skill Check: Some Considerations on the Evaluation of Gamemastering Models for Role-playing Games

Authors: Santiago Góngora, Luis Chiruzzo, Gonzalo Méndez, Pablo Gervás

Abstract: In role-playing games a Game Master (GM) is the player in charge of the game, who must design the challenges the players face and narrate the outcomes of their actions. In this work we discuss some challenges to model GMs from an Interactive Storytelling and Natural Language Processing perspective. Following those challenges we propose three test categories to evaluate such dialogue systems, and w… ▽ More In role-playing games a Game Master (GM) is the player in charge of the game, who must design the challenges the players face and narrate the outcomes of their actions. In this work we discuss some challenges to model GMs from an Interactive Storytelling and Natural Language Processing perspective. Following those challenges we propose three test categories to evaluate such dialogue systems, and we use them to test ChatGPT, Bard and OpenAssistant as out-of-the-box GMs. △ Less

Submitted 30 September, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

Comments: 11 pages. Accepted at GALA 2023 (Games and Learning Alliance 12th International Conference)

arXiv:2309.06163 [pdf, ps, other]

Overview of GUA-SPA at IberLEF 2023: Guarani-Spanish Code Switching Analysis

Authors: Luis Chiruzzo, Marvin Agüero-Torales, Gustavo Giménez-Lugo, Aldo Alvarez, Yliana Rodríguez, Santiago Góngora, Thamar Solorio

Abstract: We present the first shared task for detecting and analyzing code-switching in Guarani and Spanish, GUA-SPA at IberLEF 2023. The challenge consisted of three tasks: identifying the language of a token, NER, and a novel task of classifying the way a Spanish span is used in the code-switched context. We annotated a corpus of 1500 texts extracted from news articles and tweets, around 25 thousand toke… ▽ More We present the first shared task for detecting and analyzing code-switching in Guarani and Spanish, GUA-SPA at IberLEF 2023. The challenge consisted of three tasks: identifying the language of a token, NER, and a novel task of classifying the way a Spanish span is used in the code-switched context. We annotated a corpus of 1500 texts extracted from news articles and tweets, around 25 thousand tokens, with the information for the tasks. Three teams took part in the evaluation phase, obtaining in general good results for Task 1, and more mixed results for Tasks 2 and 3. △ Less

Submitted 12 September, 2023; originally announced September 2023.

Journal ref: Procesamiento del Lenguaje Natural, Revista no. 71, septiembre de 2023, pp. 321-328

arXiv:2302.07912 [pdf, other]

Meeting the Needs of Low-Resource Languages: The Value of Automatic Alignments via Pretrained Models

Authors: Abteen Ebrahimi, Arya D. McCarthy, Arturo Oncevay, Luis Chiruzzo, John E. Ortega, Gustavo A. Giménez-Lugo, Rolando Coto-Solano, Katharina Kann

Abstract: Large multilingual models have inspired a new class of word alignment methods, which work well for the model's pretraining languages. However, the languages most in need of automatic alignment are low-resource and, thus, not typically included in the pretraining data. In this work, we ask: How do modern aligners perform on unseen languages, and are they better than traditional methods? We contribu… ▽ More Large multilingual models have inspired a new class of word alignment methods, which work well for the model's pretraining languages. However, the languages most in need of automatic alignment are low-resource and, thus, not typically included in the pretraining data. In this work, we ask: How do modern aligners perform on unseen languages, and are they better than traditional methods? We contribute gold-standard alignments for Bribri--Spanish, Guarani--Spanish, Quechua--Spanish, and Shipibo-Konibo--Spanish. With these, we evaluate state-of-the-art aligners with and without model adaptation to the target language. Finally, we also evaluate the resulting alignments extrinsically through two downstream tasks: named entity recognition and part-of-speech tagging. We find that although transformer-based methods generally outperform traditional models, the two classes of approach remain competitive with each other. △ Less

Submitted 15 February, 2023; originally announced February 2023.

Comments: EACL 2023

arXiv:2208.10898 [pdf, other]

Don't Take it Personally: Analyzing Gender and Age Differences in Ratings of Online Humor

Authors: J. A. Meaney, Steven R. Wilson, Luis Chiruzzo, Walid Magdy

Abstract: Computational humor detection systems rarely model the subjectivity of humor responses, or consider alternative reactions to humor - namely offense. We analyzed a large dataset of humor and offense ratings by male and female annotators of different age groups. We find that women link these two concepts more strongly than men, and they tend to give lower humor ratings and higher offense scores. We… ▽ More Computational humor detection systems rarely model the subjectivity of humor responses, or consider alternative reactions to humor - namely offense. We analyzed a large dataset of humor and offense ratings by male and female annotators of different age groups. We find that women link these two concepts more strongly than men, and they tend to give lower humor ratings and higher offense scores. We also find that the correlation between humor and offense increases with age. Although there were no gender or age differences in humor detection, women and older annotators signalled that they did not understand joke texts more often than men. We discuss implications for computational humor detection and downstream tasks. △ Less

Submitted 23 August, 2022; originally announced August 2022.

arXiv:2104.08726 [pdf, other]

AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages

Authors: Abteen Ebrahimi, Manuel Mager, Arturo Oncevay, Vishrav Chaudhary, Luis Chiruzzo, Angela Fan, John Ortega, Ricardo Ramos, Annette Rios, Ivan Meza-Ruiz, Gustavo A. Giménez-Lugo, Elisabeth Mager, Graham Neubig, Alexis Palmer, Rolando Coto-Solano, Ngoc Thang Vu, Katharina Kann

Abstract: Pretrained multilingual models are able to perform cross-lingual transfer in a zero-shot setting, even for languages unseen during pretraining. However, prior work evaluating performance on unseen languages has largely been limited to low-level, syntactic tasks, and it remains unclear if zero-shot learning of high-level, semantic tasks is possible for unseen languages. To explore this question, we… ▽ More Pretrained multilingual models are able to perform cross-lingual transfer in a zero-shot setting, even for languages unseen during pretraining. However, prior work evaluating performance on unseen languages has largely been limited to low-level, syntactic tasks, and it remains unclear if zero-shot learning of high-level, semantic tasks is possible for unseen languages. To explore this question, we present AmericasNLI, an extension of XNLI (Conneau et al., 2018) to 10 indigenous languages of the Americas. We conduct experiments with XLM-R, testing multiple zero-shot and translation-based approaches. Additionally, we explore model adaptation via continued pretraining and provide an analysis of the dataset by considering hypothesis-only models. We find that XLM-R's zero-shot performance is poor for all 10 languages, with an average performance of 38.62%. Continued pretraining offers improvements, with an average accuracy of 44.05%. Surprisingly, training on poorly translated data by far outperforms all other methods with an accuracy of 48.72%. △ Less

Submitted 16 March, 2022; v1 submitted 18 April, 2021; originally announced April 2021.

Comments: Accepted to ACL 2022

arXiv:1710.06393 [pdf, other]

RETUYT in TASS 2017: Sentiment Analysis for Spanish Tweets using SVM and CNN

Authors: Aiala Rosá, Luis Chiruzzo, Mathias Etcheverry, Santiago Castro

Abstract: This article presents classifiers based on SVM and Convolutional Neural Networks (CNN) for the TASS 2017 challenge on tweets sentiment analysis. The classifier with the best performance in general uses a combination of SVM and CNN. The use of word embeddings was particularly useful for improving the classifiers performance. This article presents classifiers based on SVM and Convolutional Neural Networks (CNN) for the TASS 2017 challenge on tweets sentiment analysis. The classifier with the best performance in general uses a combination of SVM and CNN. The use of word embeddings was particularly useful for improving the classifiers performance. △ Less

Submitted 17 October, 2017; originally announced October 2017.

Comments: in Spanish. Published in http://ceur-ws.org/Vol-1896/p9_retuyt_tass2017.pdf

Journal ref: ISSN 1613-0073, TASS 2017: Workshop on Semantic Analysis at SEPLN, Sep 2017, pages 77-83

arXiv:1710.00477 [pdf, other]

A Crowd-Annotated Spanish Corpus for Humor Analysis

Authors: Santiago Castro, Luis Chiruzzo, Aiala Rosá, Diego Garat, Guillermo Moncecchi

Abstract: Computational Humor involves several tasks, such as humor recognition, humor generation, and humor scoring, for which it is useful to have human-curated data. In this work we present a corpus of 27,000 tweets written in Spanish and crowd-annotated by their humor value and funniness score, with about four annotations per tweet, tagged by 1,300 people over the Internet. It is equally divided between… ▽ More Computational Humor involves several tasks, such as humor recognition, humor generation, and humor scoring, for which it is useful to have human-curated data. In this work we present a corpus of 27,000 tweets written in Spanish and crowd-annotated by their humor value and funniness score, with about four annotations per tweet, tagged by 1,300 people over the Internet. It is equally divided between tweets coming from humorous and non-humorous accounts. The inter-annotator agreement Krippendorff's alpha value is 0.5710. The dataset is available for general use and can serve as a basis for humor detection and as a first step to tackle subjectivity. △ Less

Submitted 19 July, 2018; v1 submitted 2 October, 2017; originally announced October 2017.

Comments: Camera-ready version of the paper submitted to SocialNLP 2018, with a fixed typo

Showing 1–7 of 7 results for author: Chiruzzo, L