-
The Economic Implications of Large Language Model Selection on Earnings and Return on Investment: A Decision Theoretic Model
Authors:
Geraldo Xexéo,
Filipe Braida,
Marcus Parreiras,
Paulo Xavier
Abstract:
Selecting language models in business contexts requires a careful analysis of the final financial benefits of the investment. However, academic and industry analyses of LLMs focus solely on performance. This work introduces a framework to evaluate LLMs, focusing on the earnings and return on investment (ROI) aspects that should be taken into account in business decision making. We use a decision-theoretic approach to compare the financial impact of different LLMs, considering variables such as the cost per token, the probability of success in the specific task, and the gains and losses associated with LLM use. The study reveals how the superior accuracy of more expensive models can, under certain conditions, justify a greater investment through more significant earnings but not necessarily a larger ROI. This article provides a framework for companies looking to optimize their technology choices, ensuring that investment in cutting-edge technology aligns with strategic financial objectives. In addition, we discuss how changes in operational variables influence the economics of using LLMs, offering practical insights for enterprise settings. We find that the predicted gain and loss and the probabilities of success and failure are the variables that most affect the sensitivity of the models.
Submitted 27 May, 2024;
originally announced May 2024.
-
A Conceptual Model for the Analysis of Investigation Elements in Games
Authors:
Pedro Marques,
Marcus Parreiras,
Joshua Kritz,
Geraldo Xexéo
Abstract:
This paper presents the 4E conceptual model, developed to formally analyze investigation games from a game design perspective. The model encompasses four components: Exploration, Elicitation, Experimentation, and Evaluation. Grounded Theory was employed as the methodology for constructing the model, allowing for an in-depth understanding of the underlying concepts. The resulting model was then compared to existing literature, and its contributions were thoroughly discussed. Overall, the 4E model presents a comprehensive framework for understanding the elements of investigation games. Its application in two real-world scenarios demonstrates its practical relevance.
Submitted 15 March, 2024;
originally announced March 2024.
-
A Vocabulary of Board Game Dynamics
Authors:
Joshua Kritz,
Geraldo Xexéo
Abstract:
In recent years, significant advances have been made in the field of game research. However, there has been a noticeable dearth of scholarly research focused on the domain of dynamics, despite the widespread recognition among researchers of its existence and importance. The objective of this paper is to address this research gap by presenting a vocabulary dedicated to board game dynamics. To achieve this goal, we employ a focus group to generate a set of dynamic concepts that are subsequently subjected to validation and refinement through a survey. The resulting concepts are then organized into a vocabulary using a taxonomic structure, allowing the grouping of these concepts into broader and more general ideas.
Submitted 15 March, 2024;
originally announced March 2024.
-
Word Embeddings: A Survey
Authors:
Felipe Almeida,
Geraldo Xexéo
Abstract:
This work lists and describes the main recent strategies for building fixed-length, dense and distributed representations for words, based on the distributional hypothesis. These representations are now commonly called word embeddings and, in addition to encoding surprisingly good syntactic and semantic information, have been proven useful as extra features in many downstream NLP tasks.
Submitted 1 May, 2023; v1 submitted 25 January, 2019;
originally announced January 2019.
-
RDF2PT: Generating Brazilian Portuguese Texts from RDF Data
Authors:
Diego Moussallem,
Thiago Castro Ferreira,
Marcos Zampieri,
Maria Claudia Cavalcanti,
Geraldo Xexéo,
Mariana Neves,
Axel-Cyrille Ngonga Ngomo
Abstract:
The generation of natural language from Resource Description Framework (RDF) data has recently gained significant attention due to the continuous growth of Linked Data. A number of these approaches generate natural language in languages other than English; however, no work has been proposed to generate Brazilian Portuguese texts from RDF. We address this research gap by presenting RDF2PT, an approach that verbalizes RDF data into Brazilian Portuguese. We evaluated RDF2PT in an open questionnaire with 44 native speakers divided into experts and non-experts. Our results suggest that RDF2PT is able to generate text that is similar to that generated by humans and can hence be easily understood.
Submitted 22 February, 2018;
originally announced February 2018.
-
A distributed system for SearchOnMath based on the Microsoft BizSpark program
Authors:
Ricardo M. Oliveira,
Flavio B. Gonzaga,
Valmir C. Barbosa,
Geraldo B. Xexéo
Abstract:
Mathematical information retrieval is a relatively new area, so the first search tools capable of retrieving mathematical formulas began to appear only a few years ago. The proposals made public so far mostly implement searches on internal university databases, small sets of scientific papers, or Wikipedia in English. As such, only modest computing power is required. In this context, SearchOnMath has emerged as a pioneering tool in that it indexes several different databases and is compatible with several mathematical representation languages. Given the significantly greater number of formulas it handles, a distributed system becomes necessary to support it. The present study is based on the Microsoft BizSpark program and has aimed, for 38 different distributed-system scenarios, to pinpoint the one affording the best response times when searching the SearchOnMath databases for a collection of 120 formulas.
Submitted 11 November, 2017;
originally announced November 2017.
-
The network structure of mathematical knowledge according to the Wikipedia, MathWorld, and DLMF online libraries
Authors:
Flavio B. Gonzaga,
Valmir C. Barbosa,
Geraldo B. Xexéo
Abstract:
We study the network structure of Wikipedia (restricted to its mathematical portion), MathWorld, and DLMF. We approach these three online mathematical libraries from the perspective of several global and local network-theoretic features, providing for each one the appropriate value or distribution, along with comparisons that, if possible, also include the whole of the Wikipedia or the Web. We identify some distinguishing characteristics of all three libraries, most of them supposedly traceable to the libraries' shared nature of relating to a very specialized domain. Among these characteristics are the presence of a very large strongly connected component in each of the corresponding directed graphs, the complete absence of any clear power laws describing the distribution of local features, and the rise to prominence of some local features (e.g., stress centrality) that can be used to effectively search for keywords in the libraries.
Submitted 14 December, 2012;
originally announced December 2012.