-
Question Answering with Texts and Tables through Deep Reinforcement Learning
Authors:
Marcos M. José,
Flávio N. Cação,
Maria F. Ribeiro,
Rafael M. Cheang,
Paulo Pirozelli,
Fabio G. Cozman
Abstract:
This paper proposes a novel architecture to generate multi-hop answers to open domain questions that require information from texts and tables, using the Open Table-and-Text Question Answering dataset for validation and training. One of the most common ways to generate answers in this setting is to retrieve information sequentially, where a selected piece of data helps searching for the next piece. As different models can have distinct behaviors when called in this sequential information search, a challenge is how to select models at each step. Our architecture employs reinforcement learning to choose between different state-of-the-art tools sequentially until, in the end, a desired answer is generated. This system achieved an F1-score of 19.03, comparable to iterative systems in the literature.
Submitted 5 July, 2024;
originally announced July 2024.
-
Assessing Good, Bad and Ugly Arguments Generated by ChatGPT: a New Dataset, its Methodology and Associated Tasks
Authors:
Victor Hugo Nascimento Rocha,
Igor Cataneo Silveira,
Paulo Pirozelli,
Denis Deratani Mauá,
Fabio Gagliardi Cozman
Abstract:
The recent success of Large Language Models (LLMs) has sparked concerns about their potential to spread misinformation. As a result, there is a pressing need for tools to identify "fake arguments" generated by such models. To create these tools, examples of texts generated by LLMs are needed. This paper introduces a methodology to obtain good, bad and ugly arguments from argumentative essays produced by ChatGPT, OpenAI's LLM. We then describe a novel dataset containing a set of diverse arguments, ArGPT. We assess the effectiveness of our dataset and establish baselines for several argumentation-related tasks. Finally, we show that the artificially generated data relates well to human argumentation and thus is useful as a tool to train and test systems for the defined tasks.
Submitted 21 June, 2024;
originally announced June 2024.
-
Assessing Logical Reasoning Capabilities of Encoder-Only Transformer Models
Authors:
Paulo Pirozelli,
Marcos M. José,
Paulo de Tarso P. Filho,
Anarosa A. F. Brandão,
Fabio G. Cozman
Abstract:
Logical reasoning is central to complex human activities, such as thinking, debating, and planning; it is also a central component of many AI systems. In this paper, we investigate the extent to which encoder-only transformer language models (LMs) can reason according to logical rules. We ask whether those LMs can deduce theorems in propositional calculus and first-order logic; whether their relative success in these problems reflects general logical capabilities; and which layers contribute the most to the task. First, we show for several encoder-only LMs that they can be trained, to a reasonable degree, to determine logical validity on various datasets. Next, by cross-probing fine-tuned models on these datasets, we show that LMs have difficulty in transferring their putative logical reasoning ability, which suggests that they may have learned dataset-specific features, instead of a general capability. Finally, we conduct a layerwise probing experiment, which shows that the hypothesis classification task is mostly solved through higher layers.
Submitted 1 July, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Benchmarks for Pirá 2.0, a Reading Comprehension Dataset about the Ocean, the Brazilian Coast, and Climate Change
Authors:
Paulo Pirozelli,
Marcos M. José,
Igor Silveira,
Flávio Nakasato,
Sarajane M. Peres,
Anarosa A. F. Brandão,
Anna H. R. Costa,
Fabio G. Cozman
Abstract:
Pirá is a reading comprehension dataset focused on the ocean, the Brazilian coast, and climate change, built from a collection of scientific abstracts and reports on these topics. This dataset represents a versatile language resource, particularly useful for testing the ability of current machine learning models to acquire expert scientific knowledge. Despite its potential, a detailed set of baselines has not yet been developed for Pirá. By creating these baselines, researchers can more easily utilize Pirá as a resource for testing machine learning models across a wide range of question answering tasks. In this paper, we define six benchmarks over the Pirá dataset, covering closed generative question answering, machine reading comprehension, information retrieval, open question answering, answer triggering, and multiple choice question answering. As part of this effort, we have also produced a curated version of the original dataset, where we fixed a number of grammar issues, repetitions, and other shortcomings. Furthermore, the dataset has been extended in several new directions, so as to face the aforementioned benchmarks: translation of supporting texts from English into Portuguese, classification labels for answerability, automatic paraphrases of questions and answers, and multiple choice candidates. The results described in this paper provide several points of reference for researchers interested in exploring the challenges provided by the Pirá dataset.
Submitted 19 September, 2023;
originally announced September 2023.
-
dPASP: A Comprehensive Differentiable Probabilistic Answer Set Programming Environment For Neurosymbolic Learning and Reasoning
Authors:
Renato Lui Geh,
Jonas Gonçalves,
Igor Cataneo Silveira,
Denis Deratani Mauá,
Fabio Gagliardi Cozman
Abstract:
We present dPASP, a novel declarative probabilistic logic programming framework for differentiable neuro-symbolic reasoning. The framework allows for the specification of discrete probabilistic models with neural predicates, logic constraints and interval-valued probabilistic choices, thus supporting models that combine low-level perception (images, texts, etc.), common-sense reasoning, and (vague) statistical knowledge. To support all such features, we discuss several semantics for probabilistic logic programs that can express nondeterministic, contradictory, incomplete and/or statistical knowledge. We also discuss how gradient-based learning can be performed with neural predicates and probabilistic choices under selected semantics. We then describe an implemented package that supports inference and learning in the language, along with several example programs. The package requires minimal user knowledge of the inner workings of deep learning systems, while allowing end-to-end training of rather sophisticated models and loss functions.
Submitted 5 August, 2023;
originally announced August 2023.
-
A Multilingual Translator to SQL with Database Schema Pruning to Improve Self-Attention
Authors:
Marcelo Archanjo Jose,
Fabio Gagliardi Cozman
Abstract:
Long sequences of text are challenging in the context of transformers, due to quadratic memory increase in the self-attention mechanism. As this issue directly affects the translation from natural language to SQL queries (as techniques usually take as input a concatenated text with the question and the database schema), we present techniques that allow long text sequences to be handled by transformers with up to 512 input tokens. We propose a training process with database schema pruning (removal of table and column names that are useless for the query of interest). In addition, we used a multilingual approach with the mT5-large model fine-tuned with a data-augmented Spider dataset in four languages simultaneously: English, Portuguese, Spanish, and French. Our proposed technique used the Spider dataset and increased the exact set match accuracy results from 0.718 to 0.736 in a validation dataset (Dev). Source code, evaluations, and checkpoints are available at: https://github.com/C4AI/gap-text2sql.
Submitted 25 June, 2023;
originally announced June 2023.
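As an illustration of the schema pruning mentioned in the abstract above, the following is a minimal sketch, not the authors' implementation. It assumes the target SQL query is known (as it would be at training time) and keeps only the tables and columns whose names occur in it. The function name `prune_schema`, the token-matching heuristic, and the toy schema are all hypothetical.

```python
import re

def prune_schema(schema, sql):
    """Keep only tables and columns whose names occur as tokens in the SQL query.

    `schema` maps table names to lists of column names. This simple
    name-matching rule is an assumed stand-in for the paper's pruning step.
    """
    tokens = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", sql.lower()))
    pruned = {}
    for table, columns in schema.items():
        if table.lower() in tokens:
            pruned[table] = [c for c in columns if c.lower() in tokens]
    return pruned

# Toy schema and query, invented for illustration only.
schema = {
    "singer": ["singer_id", "name", "country", "age"],
    "concert": ["concert_id", "venue", "year"],
    "stadium": ["stadium_id", "capacity"],
}
sql = "SELECT name, age FROM singer WHERE country = 'France'"
pruned = prune_schema(schema, sql)
```

Serializing `pruned` instead of `schema` shortens the text concatenated with the question, which is the motivation the abstract gives for staying within a fixed input-token budget.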
-
Markov Conditions and Factorization in Logical Credal Networks
Authors:
Fabio Gagliardi Cozman
Abstract:
We examine the recently proposed language of Logical Credal Networks, in particular investigating the consequences of various Markov conditions. We introduce the notion of structure for a Logical Credal Network and show that a structure without directed cycles leads to a well-known factorization result. For networks with directed cycles, we analyze the differences between Markov conditions, factorization results, and specification requirements.
Submitted 17 March, 2023; v1 submitted 27 February, 2023;
originally announced February 2023.
-
Augmenting a Physics-Informed Neural Network for the 2D Burgers Equation by Addition of Solution Data Points
Authors:
Marlon Sproesser Mathias,
Wesley Pereira de Almeida,
Marcel Rodrigues de Barros,
Jefferson Fialho Coelho,
Lucas Palmiro de Freitas,
Felipe Marino Moreno,
Caio Fabricio Deberaldini Netto,
Fabio Gagliardi Cozman,
Anna Helena Reali Costa,
Eduardo Aoun Tannuri,
Edson Satoshi Gomi,
Marcelo Dottori
Abstract:
We implement a Physics-Informed Neural Network (PINN) for solving the two-dimensional Burgers equations. This type of model can be trained with no previous knowledge of the solution; instead, it relies on evaluating the governing equations of the system in points of the physical domain. It is also possible to use points with a known solution during training. In this paper, we compare PINNs trained with different amounts of governing equation evaluation points and known solution points. Comparing models that were trained purely with known solution points to those that have also used the governing equations, we observe an improvement in the overall observance of the underlying physics in the latter. We also investigate how changing the number of each type of point affects the resulting models differently. Finally, we argue that the addition of the governing equations during training may provide a way to improve the overall performance of the model without relying on additional data, which is especially important for situations where the number of known solution points is limited.
Submitted 18 January, 2023;
originally announced January 2023.
-
A Physics-Informed Neural Network to Model Port Channels
Authors:
Marlon S. Mathias,
Marcel R. de Barros,
Jefferson F. Coelho,
Lucas P. de Freitas,
Felipe M. Moreno,
Caio F. D. Netto,
Fabio G. Cozman,
Anna H. R. Costa,
Eduardo A. Tannuri,
Edson S. Gomi,
Marcelo Dottori
Abstract:
We describe a Physics-Informed Neural Network (PINN) that simulates the flow induced by the astronomical tide in a synthetic port channel, with dimensions based on the Santos - São Vicente - Bertioga Estuarine System. PINN models aim to combine the knowledge of physical systems and data-driven machine learning models. This is done by training a neural network to minimize the residuals of the governing equations in sample points. In this work, our flow is governed by the Navier-Stokes equations with some approximations. There are two main novelties in this paper. First, we design our model to assume that the flow is periodic in time, which is not feasible in conventional simulation methods. Second, we evaluate the benefit of resampling the function evaluation points during training, which has a near zero computational cost and has been verified to improve the final model, especially for small batch sizes. Finally, we discuss some limitations of the approximations used in the Navier-Stokes equations regarding the modeling of turbulence and how it interacts with PINNs.
Submitted 20 December, 2022;
originally announced December 2022.
-
BLAB Reporter: Automated journalism covering the Blue Amazon
Authors:
Yan V. Sym,
João Gabriel M. Campos,
Fabio G. Cozman
Abstract:
This demo paper introduces the BLAB Reporter, a robot-journalist covering the Brazilian Blue Amazon. The Reporter is based on a pipeline architecture for Natural Language Generation; it offers daily reports, news summaries and curious facts in Brazilian Portuguese. By collecting, storing and analysing structured data from publicly available sources, the robot-journalist uses domain knowledge to generate and publish texts on Twitter. Code and corpus are publicly available.
Submitted 8 October, 2022;
originally announced October 2022.
-
Comparing Computational Architectures for Automated Journalism
Authors:
Yan V. Sym,
João Gabriel M. Campos,
Marcos M. José,
Fabio G. Cozman
Abstract:
The majority of NLG systems have been designed following either a template-based or a pipeline-based architecture. Recent neural models for data-to-text generation have been proposed with an end-to-end deep learning flavor, which handles non-linguistic input in natural language without explicit intermediary representations. This study compares the most often employed methods for generating Brazilian Portuguese texts from structured data. Results suggest that explicit intermediate steps in the generation process produce better texts than the ones generated by neural end-to-end architectures, avoiding data hallucination while better generalizing to unseen inputs. Code and corpus are publicly available.
Submitted 8 October, 2022;
originally announced October 2022.
-
The BLue Amazon Brain (BLAB): A Modular Architecture of Services about the Brazilian Maritime Territory
Authors:
Paulo Pirozelli,
Ais B. R. Castro,
Ana Luiza C. de Oliveira,
André S. Oliveira,
Flávio N. Cação,
Igor C. Silveira,
João G. M. Campos,
Laura C. Motheo,
Leticia F. Figueiredo,
Lucas F. A. O. Pellicer,
Marcelo A. José,
Marcos M. José,
Pedro de M. Ligabue,
Ricardo S. Grava,
Rodrigo M. Tavares,
Vinícius B. Matos,
Yan V. Sym,
Anna H. R. Costa,
Anarosa A. F. Brandão,
Denis D. Mauá,
Fabio G. Cozman,
Sarajane M. Peres
Abstract:
We describe the first steps in the development of an artificial agent focused on the Brazilian maritime territory, a large region within the South Atlantic also known as the Blue Amazon. The "BLue Amazon Brain" (BLAB) integrates a number of services aimed at disseminating information about this region and its importance, functioning as a tool for environmental awareness. The main service provided by BLAB is a conversational facility that deals with complex questions about the Blue Amazon, called BLAB-Chat; its central component is a controller that manages several task-oriented natural language processing modules (e.g., question answering and summarizer systems). These modules have access to an internal data lake as well as to third-party databases. A news reporter (BLAB-Reporter) and a purposely-developed wiki (BLAB-Wiki) are also part of the BLAB service architecture. In this paper, we describe our current version of BLAB's architecture (interface, backend, web services, NLP modules, and resources) and comment on the challenges we have faced so far, such as the lack of training data and the scattered state of domain information. Solving these issues presents a considerable challenge in the development of artificial intelligence for technical domains.
Submitted 6 September, 2022;
originally announced September 2022.
-
Enhancing Oceanic Variables Forecast in the Santos Channel by Estimating Model Error with Random Forests
Authors:
Felipe M. Moreno,
Caio F. D. Netto,
Marcel R. de Barros,
Jefferson F. Coelho,
Lucas P. de Freitas,
Marlon S. Mathias,
Luiz A. Schiaveto Neto,
Marcelo Dottori,
Fabio G. Cozman,
Anna H. R. Costa,
Edson S. Gomi,
Eduardo A. Tannuri
Abstract:
In this work we improve forecasting of Sea Surface Height (SSH) and current velocity (speed and direction) in oceanic scenarios. We do so by resorting to Random Forests so as to predict the error of a numerical forecasting system developed for the Santos Channel in Brazil. We have used the Santos Operational Forecasting System (SOFS) and data collected in situ between the years of 2019 and 2021. In previous studies we applied similar methods to current velocity at the channel entrance; in this work we expand the application to improve the SSH forecast and include four other stations in the channel. We have obtained an average reduction of 11.9% in forecasting Root-Mean Square Error (RMSE) and 38.7% in bias with our approach. We also obtained an increase in the Index of Agreement (IOA) in 10 of the 14 combinations of forecasted variables and stations.
Submitted 22 July, 2022;
originally announced August 2022.
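The error-correction idea in the abstract above (predict the numerical model's error, then remove it from the forecast) can be sketched as follows. This is a minimal illustration under stated assumptions: the error is defined as forecast minus observation, the list `predicted_error` is a hand-written stand-in for the output of a trained regressor such as a Random Forest, and all numbers are invented rather than taken from the paper.

```python
import math

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def bias(pred, obs):
    """Mean signed error (prediction minus observation)."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)

def correct(forecast, predicted_error):
    """Subtract the predicted model error (forecast - observation) from the forecast."""
    return [f - e for f, e in zip(forecast, predicted_error)]

# Invented sea-surface-height values in metres, for illustration only.
observed = [1.0, 1.2, 0.8, 1.1]
forecast = [1.3, 1.4, 1.1, 1.5]
# Hypothetical output of an error model trained on past forecast errors.
predicted_error = [0.25, 0.25, 0.25, 0.25]

corrected = correct(forecast, predicted_error)
print("RMSE before/after:", rmse(forecast, observed), rmse(corrected, observed))
print("bias before/after:", bias(forecast, observed), bias(corrected, observed))
```

In this toy case both RMSE and bias shrink after correction; in practice the gain depends on how well the error model generalizes.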
-
Modeling Oceanic Variables with Dynamic Graph Neural Networks
Authors:
Caio F. D. Netto,
Marcel R. de Barros,
Jefferson F. Coelho,
Lucas P. de Freitas,
Felipe M. Moreno,
Marlon S. Mathias,
Marcelo Dottori,
Fábio G. Cozman,
Anna H. R. Costa,
Edson S. Gomi,
Eduardo A. Tannuri
Abstract:
Researchers typically resort to numerical methods to understand and predict ocean dynamics, a key task in mastering environmental phenomena. Such methods may not be suitable in scenarios where the topographic map is complex, knowledge about the underlying processes is incomplete, or the application is time critical. On the other hand, if ocean dynamics are observed, they can be exploited by recent machine learning methods. In this paper we describe a data-driven method to predict environmental variables such as current velocity and sea surface height in the region of the Santos-São Vicente-Bertioga Estuarine System on the southeastern coast of Brazil. Our model exploits both temporal and spatial inductive biases by joining state-of-the-art sequence models (LSTM and Transformers) and relational models (Graph Neural Networks) in an end-to-end framework that learns both the temporal features and the spatial relationship shared among observation sites. We compare our results with the Santos Operational Forecasting System (SOFS). Experiments show that better results are attained by our model, while maintaining flexibility and little domain knowledge dependency.
Submitted 25 June, 2022;
originally announced June 2022.
-
Integrating question answering and text-to-SQL in Portuguese
Authors:
Marcos Menon José,
Marcelo Archanjo José,
Denis Deratani Mauá,
Fábio Gagliardi Cozman
Abstract:
Deep learning transformers have drastically improved systems that automatically answer questions in natural language. However, different questions demand different answering techniques; here we propose, build and validate an architecture that integrates different modules to answer two distinct kinds of queries. Our architecture takes a free-form natural language text and classifies it to send it either to a Neural Question Answering Reasoner or a Natural Language parser to SQL. We implemented a complete system for the Portuguese language, using some of the main tools available for the language and translating training and testing datasets. Experiments show that our system selects the appropriate answering method with high accuracy (over 99%), thus validating a modular question answering strategy.
Submitted 21 September, 2022; v1 submitted 8 February, 2022;
originally announced February 2022.
-
Pirá: A Bilingual Portuguese-English Dataset for Question-Answering about the Ocean
Authors:
André F. A. Paschoal,
Paulo Pirozelli,
Valdinei Freire,
Karina V. Delgado,
Sarajane M. Peres,
Marcos M. José,
Flávio Nakasato,
André S. Oliveira,
Anarosa A. F. Brandão,
Anna H. R. Costa,
Fabio G. Cozman
Abstract:
Current research in natural language processing is highly dependent on carefully produced corpora. Most existing resources focus on English; some resources focus on languages such as Chinese and French; few resources deal with more than one language. This paper presents the Pirá dataset, a large set of questions and answers about the ocean and the Brazilian coast both in Portuguese and English. Pirá is, to the best of our knowledge, the first QA dataset with supporting texts in Portuguese, and, perhaps more importantly, the first bilingual QA dataset that includes this language. The Pirá dataset consists of 2261 properly curated question/answer (QA) sets in both languages. The QA sets were manually created based on two corpora: abstracts related to the Brazilian coast and excerpts of United Nations reports about the ocean. The QA sets were validated in a peer-review process with the dataset contributors. We discuss some of the advantages as well as limitations of Pirá, as this new resource can support a set of tasks in NLP such as question-answering, information retrieval, and machine translation.
Submitted 4 February, 2022;
originally announced February 2022.
-
DEEPAGÉ: Answering Questions in Portuguese about the Brazilian Environment
Authors:
Flávio Nakasato Cação,
Marcos Menon José,
André Seidel Oliveira,
Stefano Spindola,
Anna Helena Reali Costa,
Fábio Gagliardi Cozman
Abstract:
The challenge of climate change and biome conservation is one of the most pressing issues of our time - particularly in Brazil, where key environmental reserves are located. Given the availability of large textual databases on ecological themes, it is natural to resort to question answering (QA) systems to increase social awareness and understanding about these topics. In this work, we introduce multiple QA systems that combine in novel ways the BM25 algorithm, a sparse retrieval technique, with PTT5, a pre-trained state-of-the-art language model. Our QA systems focus on the Portuguese language, thus offering resources not found elsewhere in the literature. As training data, we collected questions from open-domain datasets, as well as content from the Portuguese Wikipedia and news from the press. We thus contribute with innovative architectures and novel applications, attaining an F1-score of 36.2 with our best model.
Submitted 19 October, 2021;
originally announced October 2021.
-
mRAT-SQL+GAP:A Portuguese Text-to-SQL Transformer
Authors:
Marcelo Archanjo José,
Fabio Gagliardi Cozman
Abstract:
The translation of natural language questions to SQL queries has attracted growing attention, in particular in connection with transformers and similar language models. A large number of techniques are geared towards the English language; in this work, we thus investigated translation to SQL when input questions are given in the Portuguese language. To do so, we properly adapted state-of-the-art tools and resources. We changed the RAT-SQL+GAP system by relying on a multilingual BART model (we report tests with other language models), and we produced a translated version of the Spider dataset. Our experiments expose interesting phenomena that arise when non-English languages are targeted; in particular, it is better to train with original and translated training datasets together, even if a single target language is desired. This multilingual BART model fine-tuned with a double-size training dataset (English and Portuguese) achieved 83% of the baseline, making inferences for the Portuguese test dataset. This investigation can help other researchers to produce results in Machine Learning in a language different from English. Our multilingual ready version of RAT-SQL+GAP and the data are available, open-sourced as mRAT-SQL+GAP at: https://github.com/C4AI/gap-text2sql
Submitted 29 November, 2021; v1 submitted 7 October, 2021;
originally announced October 2021.
-
Why should I not follow you? Reasons For and Reasons Against in Responsible Recommender Systems
Authors:
Gustavo Padilha Polleti,
Douglas Luan de Souza,
Fabio Cozman
Abstract:
A few Recommender Systems (RS) resort to explanations so as to enhance trust in recommendations. However, current techniques for explanation generation tend to strongly uphold the recommended products instead of presenting both reasons for and reasons against them. We argue that an RS can better enhance overall trust and transparency by frankly displaying both kinds of reasons to users. We have developed such an RS by exploiting knowledge graphs and by applying Snedegar's theory of practical reasoning. We show that our implemented RS has excellent performance and we report on an experiment with human subjects that shows the value of presenting both reasons for and against, with significant improvements in trust, engagement, and persuasion.
Submitted 8 September, 2020; v1 submitted 3 September, 2020;
originally announced September 2020.
-
An Empirical Accuracy Law for Sequential Machine Translation: the Case of Google Translate
Authors:
Lucas Nunes Sequeira,
Bruno Moreschi,
Fabio Gagliardi Cozman,
Bernardo Fontes
Abstract:
In this research, we have established, through empirical testing, a law that relates the number of translating hops to translation accuracy in sequential machine translation in Google Translate. Both accuracy and size decrease with the number of hops; the former displays a decrease closely following a power law. Such a law allows one to predict the behavior of translation chains that may be built as society increasingly depends on automated devices.
Submitted 8 April, 2020; v1 submitted 5 March, 2020;
originally announced March 2020.
-
Measuring Unfairness through Game-Theoretic Interpretability
Authors:
Juliana Cesaro,
Fabio G. Cozman
Abstract:
One often finds in the literature connections between measures of fairness and measures of feature importance employed to interpret trained classifiers. However, there seems to be no study that compares fairness measures and feature importance measures. In this paper we propose ways to evaluate and compare such measures. We focus in particular on SHAP, a game-theoretic measure of feature importance; we present results for a number of unfairness-prone datasets.
Submitted 12 October, 2019;
originally announced October 2019.
-
A Fully Attention-Based Information Retriever
Authors:
Alvaro Henrique Chaim Correia,
Jorge Luiz Moreira Silva,
Thiago de Castro Martins,
Fabio Gagliardi Cozman
Abstract:
Recurrent neural networks are now the state-of-the-art in natural language processing because they can build rich contextual representations and process texts of arbitrary length. However, recent developments on attention mechanisms have equipped feedforward networks with similar capabilities, hence enabling faster computations due to the increase in the number of operations that can be parallelized. We explore this new type of architecture in the domain of question-answering and propose a novel approach that we call Fully Attention Based Information Retriever (FABIR). We show that FABIR achieves competitive results in the Stanford Question Answering Dataset (SQuAD) while having fewer parameters and being faster at both learning and inference than rival methods.
Submitted 22 October, 2018;
originally announced October 2018.
-
Interpreting Embedding Models of Knowledge Bases: A Pedagogical Approach
Authors:
Arthur Colombini Gusmão,
Alvaro Henrique Chaim Correia,
Glauber De Bona,
Fabio Gagliardi Cozman
Abstract:
Knowledge bases are employed in a variety of applications from natural language processing to semantic web search; alas, in practice their usefulness is hurt by their incompleteness. Embedding models attain state-of-the-art accuracy in knowledge base completion, but their predictions are notoriously hard to interpret. In this paper, we adapt "pedagogical approaches" (from the literature on neural networks) so as to interpret embedding models by extracting weighted Horn rules from them. We show how pedagogical approaches have to be adapted to take on the large-scale relational aspects of knowledge bases and show experimentally their strengths and weaknesses.
Submitted 20 June, 2018;
originally announced June 2018.
-
Speeding-up ProbLog's Parameter Learning
Authors:
Francisco H. O. V. de Faria,
Arthur C. Gusmão,
Fabio G. Cozman,
Denis D. Mauá
Abstract:
ProbLog is a state-of-the-art combination of logic programming and probabilities; in particular, ProbLog offers parameter learning through a variant of the EM algorithm. However, the resulting learning algorithm is rather slow, even when the data are complete. In this short paper we offer some insights that lead to orders of magnitude improvements in ProbLog's parameter learning speed with complete data.
Submitted 1 August, 2017; v1 submitted 25 July, 2017;
originally announced July 2017.
-
On the Semantics and Complexity of Probabilistic Logic Programs
Authors:
Fabio Gagliardi Cozman,
Denis Deratani Mauá
Abstract:
We examine the meaning and the complexity of probabilistic logic programs that consist of a set of rules and a set of independent probabilistic facts (that is, programs based on Sato's distribution semantics). We focus on two semantics, respectively based on stable and on well-founded models. We show that the semantics based on stable models (referred to as the "credal semantics") produces sets of probability models that dominate infinitely monotone Choquet capacities; we describe several useful consequences of this result. We then examine the complexity of inference with probabilistic logic programs. We distinguish between the complexity of inference when a probabilistic program and a query are given (the inferential complexity), and the complexity of inference when the probabilistic program is fixed and the query is given (the query complexity, akin to data complexity as used in database theory). We obtain results on the inferential and query complexity for acyclic, stratified, and cyclic propositional and relational programs; complexity reaches various levels of the counting hierarchy and even exponential levels.
Submitted 31 January, 2017;
originally announced January 2017.
-
The Complexity of Bayesian Networks Specified by Propositional and Relational Languages
Authors:
Fabio Gagliardi Cozman,
Denis Deratani Mauá
Abstract:
We examine the complexity of inference in Bayesian networks specified by logical languages. We consider representations that range from fragments of propositional logic to function-free first-order logic with equality; in doing so we cover a variety of plate models and of probabilistic relational models. We study the complexity of inferences when network, query and domain are the input (the inferential and the combined complexity), when the network is fixed and query and domain are the input (the query/data complexity), and when the network and query are fixed and the domain is the input (the domain complexity). We draw connections with probabilistic databases and liftability results, and obtain complexity classes that range from polynomial to exponential levels.
Submitted 6 January, 2017; v1 submitted 4 December, 2016;
originally announced December 2016.
-
First-Order Bayesian Network Specifications Capture the Complexity Class PP
Authors:
Fabio Gagliardi Cozman
Abstract:
The point of this note is to prove that a language is in the complexity class PP if and only if the strings of the language encode valid inferences in a Bayesian network defined using function-free first-order logic with equality.
Submitted 12 September, 2016;
originally announced September 2016.
-
Quasi-Bayesian Strategies for Efficient Plan Generation: Application to the Planning to Observe Problem
Authors:
Fabio Gagliardi Cozman,
Eric Krotkov
Abstract:
Quasi-Bayesian theory uses convex sets of probability distributions and expected loss to represent preferences about plans. The theory focuses on decision robustness, i.e., the extent to which plans are affected by deviations in subjective assessments of probability. The present work presents solutions for plan generation when robustness of probability assessments must be included: plans contain information about the robustness of certain actions. The surprising result is that some problems can be solved faster in the Quasi-Bayesian framework than within usual Bayesian theory. We investigate this on the planning to observe problem, i.e., an agent must decide whether to take new observations or not. The fundamental question is: How, and how much, to search for a "best" plan, based on the robustness of probability assessments? Plan generation algorithms are derived in the context of material classification with an acoustic robotic probe. A package that constructs Quasi-Bayesian plans is available through anonymous ftp.
Submitted 13 February, 2013;
originally announced February 2013.
-
Robustness Analysis of Bayesian Networks with Local Convex Sets of Distributions
Authors:
Fabio Gagliardi Cozman
Abstract:
Robust Bayesian inference is the calculation of posterior probability bounds given perturbations in a probabilistic model. This paper focuses on perturbations that can be expressed locally in Bayesian networks through convex sets of distributions. Two approaches for combination of local models are considered. The first approach takes the largest set of joint distributions that is compatible with the local sets of distributions; we show how to reduce this type of robust inference to a linear programming problem. The second approach takes the convex hull of joint distributions generated from the local sets of distributions; we demonstrate how to apply interior-point optimization methods to generate posterior bounds and how to generate approximations that are guaranteed to converge to correct posterior bounds. We also discuss calculation of bounds for expected utilities and variances, and global perturbation models.
Submitted 6 February, 2013;
originally announced February 2013.
-
Irrelevance and Independence Relations in Quasi-Bayesian Networks
Authors:
Fabio Gagliardi Cozman
Abstract:
This paper analyzes irrelevance and independence relations in graphical models associated with convex sets of probability distributions (called Quasi-Bayesian networks). The basic question in Quasi-Bayesian networks is, How can irrelevance/independence relations in Quasi-Bayesian networks be detected, enforced and exploited? This paper addresses these questions through Walley's definitions of irrelevance and independence. Novel algorithms and results are presented for inferences with the so-called natural extensions using fractional linear programming, and the properties of the so-called type-1 extensions are clarified through a new generalization of d-separation.
Submitted 30 January, 2013;
originally announced January 2013.
-
Separation Properties of Sets of Probability Measures
Authors:
Fabio Gagliardi Cozman
Abstract:
This paper analyzes independence concepts for sets of probability measures associated with directed acyclic graphs. The paper shows that epistemic independence and the standard Markov condition violate desirable separation properties. The adoption of a contraction condition leads to d-separation but still fails to guarantee a belief separation property. To overcome this unsatisfactory situation, a strong Markov condition is proposed, based on epistemic independence. The main result is that the strong Markov condition leads to strong independence and does enforce separation properties; this result implies that (1) separation properties of Bayesian networks do extend to epistemic independence and sets of probability measures, and (2) strong independence has a clear justification based on epistemic independence and the strong Markov condition.
Submitted 16 January, 2013;
originally announced January 2013.
-
Inference with Separately Specified Sets of Probabilities in Credal Networks
Authors:
Jose Carlos Ferreira da Rocha,
Fabio Gagliardi Cozman
Abstract:
We present new algorithms for inference in credal networks --- directed acyclic graphs associated with sets of probabilities. Credal networks are here interpreted as encoding strong independence relations among variables. We first present a theory of credal networks based on separately specified sets of probabilities. We also show that inference with polytrees is NP-hard in this setting. We then introduce new techniques that reduce the computational effort demanded by inference, particularly in polytrees, by exploring separability of credal sets.
Submitted 12 December, 2012;
originally announced January 2013.
-
Inference in Polytrees with Sets of Probabilities
Authors:
Jose Carlos Ferreira da Rocha,
Fabio Gagliardi Cozman,
Cassio Polpo de Campos
Abstract:
Inferences in directed acyclic graphs associated with probability sets and probability intervals are NP-hard, even for polytrees. In this paper we focus on such inferences, and propose: 1) a substantial improvement on Tessem's A/R algorithm for polytrees with probability intervals; 2) a new algorithm for direction-based local search (in sets of probability) that improves on existing methods; 3) a collection of branch-and-bound algorithms that combine the previous techniques. The first two techniques lead to approximate solutions, while branch-and-bound procedures can produce either exact or approximate solutions. We report on dramatic improvements on existing techniques for inference with probability sets and intervals, in some cases reducing the computational effort by many orders of magnitude.
Submitted 19 October, 2012;
originally announced December 2012.
-
Propositional and Relational Bayesian Networks Associated with Imprecise and Qualitative Probabilistic Assessments
Authors:
Fabio Gagliardi Cozman,
Cassio Polpo de Campos,
Jaime Ide,
Jose Carlos Ferreira da Rocha
Abstract:
This paper investigates a representation language with flexibility inspired by probabilistic logic and compactness inspired by relational Bayesian networks. The goal is to handle propositional and first-order constructs together with precise, imprecise, indeterminate and qualitative probabilistic assessments. The paper shows how this can be achieved through the theory of credal networks. New exact and approximate inference algorithms based on multilinear programming and iterated/loopy propagation of interval probabilities are presented; their superior performance, compared to existing ones, is shown empirically.
Submitted 11 July, 2012;
originally announced July 2012.
-
Belief Updating and Learning in Semi-Qualitative Probabilistic Networks
Authors:
Cassio Polpo de Campos,
Fabio Gagliardi Cozman
Abstract:
This paper explores semi-qualitative probabilistic networks (SQPNs) that combine numeric and qualitative information. We first show that exact inferences with SQPNs are NP^PP-complete. We then show that existing qualitative relations in SQPNs (plus probabilistic logic and imprecise assessments) can be dealt with effectively through multilinear programming. We then discuss learning: we consider a maximum likelihood method that generates point estimates given an SQPN and empirical data, and we describe a Bayesian-minded method that employs the Imprecise Dirichlet Model to generate set-valued estimates.
Submitted 4 July, 2012;
originally announced July 2012.
-
Complexity Analysis and Variational Inference for Interpretation-based Probabilistic Description Logic
Authors:
Fabio Gagliardi Cozman,
Rodrigo Bellizia Polastro
Abstract:
This paper presents complexity analysis and variational methods for inference in probabilistic description logics featuring Boolean operators, quantification, qualified number restrictions, nominals, inverse roles and role hierarchies. Inference is shown to be PEXP-complete, and variational methods are designed so as to exploit logical inference whenever possible.
Submitted 9 May, 2012;
originally announced May 2012.
-
Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence (2011)
Authors:
Fabio Cozman,
Avi Pfeffer
Abstract:
This is the Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, which was held in Barcelona, Spain, July 14-17, 2011.
Submitted 28 August, 2014; v1 submitted 11 May, 2012;
originally announced May 2012.