-
Contract Usage and Evolution in Android Mobile Applications
Authors:
David R. Ferreira,
Alexandra Mendes,
João F. Ferreira
Abstract:
Formal contracts and assertions are effective methods to enhance software quality by enforcing preconditions, postconditions, and invariants. Previous research has demonstrated the value of contracts in traditional software development contexts. However, the adoption and impact of contracts in the context of mobile application development, particularly of Android applications, remain unexplored.
To address this, we present the first large-scale empirical study on the presence and use of contracts in Android applications, written in Java or Kotlin. We consider different types of contract elements divided into five categories: conditional runtime exceptions, APIs, annotations, assertions, and other. We analyzed 2,390 Android applications from the F-Droid repository and processed more than 51,749 KLOC to determine 1) how and to what extent contracts are used, 2) how contract usage evolves, and 3) whether contracts are used safely in the context of program evolution and inheritance. Our findings include: 1) although most applications do not specify contracts, annotation-based approaches are the most popular among practitioners; 2) applications that use contracts continue to use them in later versions, but the number of methods increases at a higher rate than the number of contracts; and 3) there are many potentially unsafe specification changes when applications evolve and in subtyping relationships, which indicates a lack of specification stability. Our findings show that it would be desirable to have libraries that standardize contract specifications in Java and Kotlin, and tools that aid practitioners in writing stronger contracts and in detecting contract violations in the context of program evolution and inheritance.
Submitted 25 January, 2024;
originally announced January 2024.
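The abstract above distinguishes contract elements such as conditional runtime exceptions and assertions, and flags unsafe specification changes in subtyping relationships. The following is a minimal sketch of those ideas, written in Python for illustration only (the study itself covers Java and Kotlin). The classes `Account` and `CappedAccount` are hypothetical, and the example does not reproduce the paper's classification of unsafe changes; it shows only the textbook case of a subtype strengthening a precondition.

```python
class Account:
    """Hypothetical class whose deposit method carries a simple contract."""

    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        # Precondition expressed as a conditional runtime exception.
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balance += amount
        # Postcondition expressed as an assertion.
        assert self.balance >= amount


class CappedAccount(Account):
    """Hypothetical subtype that strengthens the inherited precondition."""

    def deposit(self, amount):
        # Stronger precondition than the supertype: callers that satisfy
        # Account's contract (amount > 0) can now be rejected.
        if amount > 1000:
            raise ValueError("amount must not exceed 1000")
        super().deposit(amount)
```

A caller written against `Account` may pass 5000 safely, yet the same call fails on a `CappedAccount`, which is why strengthening a precondition in a subtype is generally considered unsafe.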
-
Leveraging Large Language Models to Boost Dafny's Developers Productivity
Authors:
Álvaro Silva,
Alexandra Mendes,
João F. Ferreira
Abstract:
This research idea paper proposes leveraging Large Language Models (LLMs) to enhance the productivity of Dafny developers. Although the use of verification-aware languages, such as Dafny, has increased considerably in the last decade, these are still not widely adopted. Often the cost of using such languages is too high, due to the level of expertise required from the developers and challenges that they often face when trying to prove a program correct. Even though Dafny automates a lot of the verification process, sometimes there are steps that are too complex for Dafny to perform on its own. One such case is that of missing lemmas, i.e. Dafny is unable to prove a result without being given further help in the form of a theorem that can assist it in the proof of the step.
In this paper, we describe preliminary work on a new Dafny plugin that leverages LLMs to assist developers by generating suggestions for relevant lemmas that Dafny is unable to discover and use. Moreover, for the lemmas that cannot be proved automatically, the plugin also attempts to provide accompanying calculational proofs. We also discuss ideas for future work by describing a research agenda on using LLMs to increase the adoption of verification-aware languages in general, by increasing developers' productivity and by reducing the level of expertise required for crafting formal specifications and proving program properties.
Submitted 1 January, 2024;
originally announced January 2024.
-
Using the SP!CE Framework to Code Influence Campaign Activity on Social Media: Case Study on the 2022 Brazilian Presidential Election
Authors:
Alexander Gocso,
Claudia Perez Brito,
Bryan Ruesca,
Allen Mendes,
Mark A. Finlayson
Abstract:
We describe a case study in the use of the Structured Process for Information Campaign Enhancement (SP!CE, version 2.1) to evaluate influence campaigns present in the 2nd round of the Brazilian presidential election in 2022 October. SP!CE is a US-military focused framework for describing both friendly and adversary actions in influence campaigns, and is inter-operable with the Disinformation Analysis and Risk Management (DISARM) framework. The purpose of the case study is to demonstrate how SP!CE can be used to describe influence campaign behaviors. We selected the Brazilian election as the target of the case study as it is known that there were significant amounts of mis- and disinformation present on social media during the campaigns. Our goal was to demonstrate how SP!CE could be applied in such a context, showing how social media content could be aligned with information campaign behaviors and how such an alignment can be used to analyze which mis- and disinformation narratives were in play. Additionally, we aim to provide insights on best practices regarding how to apply the framework in further research. We release the coding and screenshots of the relevant social media posts to support future research.
Submitted 6 December, 2023; v1 submitted 5 December, 2023;
originally announced December 2023.
-
Supervising the Centroid Baseline for Extractive Multi-Document Summarization
Authors:
Simão Gonçalves,
Gonçalo Correia,
Diogo Pernes,
Afonso Mendes
Abstract:
The centroid method is a simple approach for extractive multi-document summarization and many improvements to its pipeline have been proposed. We further refine it by adding a beam search process to the sentence selection and also a centroid estimation attention model that leads to improved results. We demonstrate this in several multi-document summarization datasets, including in a multilingual scenario.
Submitted 29 November, 2023;
originally announced November 2023.
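As a rough illustration of the plain centroid baseline that the abstract above refines, here is a minimal sketch assuming bag-of-words vectors, cosine similarity, and simple top-k selection. The beam search and the centroid estimation attention model described in the abstract are not reproduced, and all function names are hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid_summary(sentences, k):
    """Return the k sentences closest to the centroid, in original order."""
    vectors = [bow(s) for s in sentences]
    # The centroid is the sum of all sentence vectors; scaling it to a mean
    # would not change any cosine similarity.
    centroid = Counter()
    for v in vectors:
        centroid.update(v)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: cosine(vectors[i], centroid),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:k])]
```

Scoring each sentence independently is the simplest variant; it ignores redundancy between selected sentences, which is one reason a search over candidate summaries can help.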
-
Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game
Authors:
Sam Toyer,
Olivia Watkins,
Ethan Adrian Mendes,
Justin Svegliato,
Luke Bailey,
Tiffany Wang,
Isaac Ong,
Karim Elmaaroufi,
Pieter Abbeel,
Trevor Darrell,
Alan Ritter,
Stuart Russell
Abstract:
While Large Language Models (LLMs) are increasingly being used in real-world applications, they remain vulnerable to prompt injection attacks: malicious third-party prompts that subvert the intent of the system designer. To help researchers study this problem, we present a dataset of over 126,000 prompt injection attacks and 46,000 prompt-based "defenses" against prompt injection, all created by players of an online game called Tensor Trust. To the best of our knowledge, this is currently the largest dataset of human-generated adversarial examples for instruction-following LLMs. The attacks in our dataset have a lot of easily interpretable structure, and shed light on the weaknesses of LLMs. We also use the dataset to create a benchmark for resistance to two types of prompt injection, which we refer to as prompt extraction and prompt hijacking. Our benchmark results show that many models are vulnerable to the attack strategies in the Tensor Trust dataset. Furthermore, we show that some attack strategies from the dataset generalize to deployed LLM-based applications, even though they have a very different set of constraints to the game. We release all data and source code at https://tensortrust.ai/paper
Submitted 2 November, 2023;
originally announced November 2023.
-
The Director: A Composable Behaviour System with Soft Transitions
Authors:
Ysobel Sims,
Trent Houliston,
Thomas O'Brien,
Alexandre Mendes,
Stephan Chalup
Abstract:
Software frameworks for behaviour are critical in robotics as they enable the correct and efficient execution of functions. While modern behaviour systems have improved their composability, they do not focus on smooth transitions and often lack functionality. In this work, we present the Director, a novel behaviour framework that addresses these problems. It has functionality for soft transitions, multiple implementations of the same action chosen based on conditionals, and strict resource control. The system was successfully used in the 2022/2023 Virtual Season and RoboCup 2023 Bordeaux, in the Humanoid Kid Size League. It is implemented at https://github.com/NUbots/DirectorSoccer, which also contains over thirty automated tests and technical documentation on its implementation in NUClear.
Submitted 1 May, 2024; v1 submitted 17 September, 2023;
originally announced September 2023.
-
Polyglot Code Smell Detection for Infrastructure as Code with GLITCH
Authors:
Nuno Saavedra,
João Gonçalves,
Miguel Henriques,
João F. Ferreira,
Alexandra Mendes
Abstract:
This paper presents GLITCH, a new technology-agnostic framework that enables automated polyglot code smell detection for Infrastructure as Code scripts. GLITCH uses an intermediate representation on which different code smell detectors can be defined. It currently supports the detection of nine security smells and nine design & implementation smells in scripts written in Ansible, Chef, Docker, Puppet, or Terraform. Studies conducted with GLITCH not only show that GLITCH can reduce the effort of writing code smell analyses for multiple IaC technologies, but also that it has higher precision and recall than current state-of-the-art tools. A video describing and demonstrating GLITCH is available at: https://youtu.be/E4RhCcZjWbk
Submitted 18 August, 2023;
originally announced August 2023.
-
Patient-centric health data sovereignty: an approach using Proxy re-encryption
Authors:
Bruno Rodrigues,
Ivone Amorim,
Ivan Costa,
Alexandra Mendes
Abstract:
The exponential growth in the digitisation of services implies the handling and storage of large volumes of data. Businesses and services see data sharing and crossing as an opportunity to improve and produce new business opportunities. The health sector is one area where this proves to be true, enabling better and more innovative treatments. Notwithstanding, this raises concerns regarding personal data being treated and processed. In this paper, we present a patient-centric platform for the secure sharing of health records by shifting the control over the data to the patient, thereby providing a step further towards data sovereignty. Data sharing is performed only with the consent of the patient, who can revoke access at any given time. Furthermore, we also provide a break-glass approach, resorting to Proxy Re-encryption (PRE) and the concept of a centralised trusted entity that possesses instant access to patients' medical records. Lastly, an analysis is made to assess the performance of the platform's key operations, and the impact that a PRE scheme has on those operations.
Submitted 3 July, 2023;
originally announced July 2023.
-
Quantifying Valence and Arousal in Text with Multilingual Pre-trained Transformers
Authors:
Gonçalo Azevedo Mendes,
Bruno Martins
Abstract:
The analysis of emotions expressed in text has numerous applications. In contrast to categorical analysis, focused on classifying emotions according to a pre-defined set of common classes, dimensional approaches can offer a more nuanced way to distinguish between different emotions. Still, dimensional methods have been less studied in the literature. Considering a valence-arousal dimensional space, this work assesses the use of pre-trained Transformers to predict these two dimensions on a continuous scale, with input texts from multiple languages and domains. We specifically combined multiple annotated datasets from previous studies, corresponding to either emotional lexica or short text documents, and evaluated models of multiple sizes and trained under different settings. Our results show that model size can have a significant impact on the quality of predictions, and that by fine-tuning a large model we can confidently predict valence and arousal in multiple languages. We make available the code, models, and supporting data.
Submitted 27 February, 2023;
originally announced February 2023.
-
Low Latency Video Denoising for Online Conferencing Using CNN Architectures
Authors:
Altanai Bisht,
Ana Carolina de Souza Mendes,
Justin David Thoreson II,
Shadrokh Samavi
Abstract:
In this paper, we propose a pipeline for real-time video denoising with low runtime cost and high perceptual quality. The vast majority of denoising studies focus on image denoising. The minority of research works that do focus on video denoising incur higher performance costs to obtain higher quality while maintaining temporal coherence. The approach we introduce in this paper leverages the advantages of both image and video denoising architectures. Our pipeline first denoises the keyframes, or one-fifth of the frames, using the HI-GAN blind image denoising architecture. Then, the remaining four-fifths of the noisy frames and the denoised keyframe data are fed into the FastDVDnet video denoising model. The final output is rendered on the user's display in real time. The combination of these low-latency neural network architectures produces real-time denoising with high perceptual quality, with applications in video conferencing and other real-time media streaming systems. A custom noise detector analyzer provides real-time feedback to adapt the weights and improve the models' output.
Submitted 16 February, 2023;
originally announced February 2023.
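The frame routing described in the abstract above (one-fifth of the frames through an image denoiser, the other four-fifths through a video denoiser together with the denoised keyframe) can be sketched as follows. This is only a scheduling sketch under stated assumptions: the two denoisers are passed in as hypothetical callables rather than the actual HI-GAN or FastDVDnet models, and treating the first frame of every group of five as the keyframe is an assumption, not something the abstract specifies.

```python
def denoise_stream(frames, image_denoiser, video_denoiser, keyframe_interval=5):
    """Route frames to two denoisers, one keyframe per group of frames.

    image_denoiser(frame) -> clean frame                  (hypothetical callable)
    video_denoiser(frame, clean_keyframe) -> clean frame  (hypothetical callable)
    """
    output = []
    for start in range(0, len(frames), keyframe_interval):
        group = frames[start:start + keyframe_interval]
        # Assumption: the first frame of each group acts as the keyframe.
        clean_key = image_denoiser(group[0])
        output.append(clean_key)
        # The remaining frames are denoised with the clean keyframe as context.
        for noisy in group[1:]:
            output.append(video_denoiser(noisy, clean_key))
    return output
```

With the default interval of five, one frame in five takes the image path and four in five take the video path, matching the proportions given in the abstract.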
-
Improving abstractive summarization with energy-based re-ranking
Authors:
Diogo Pernes,
Afonso Mendes,
André F. T. Martins
Abstract:
Current abstractive summarization systems present important weaknesses which prevent their deployment in real-world applications, such as the omission of relevant information and the generation of factual inconsistencies (also known as hallucinations). At the same time, automatic evaluation metrics such as CTC scores have been recently proposed that exhibit a higher correlation with human judgments than traditional lexical-overlap metrics such as ROUGE. In this work, we intend to close the loop by leveraging the recent advances in summarization metrics to create quality-aware abstractive summarizers. Namely, we propose an energy-based model that learns to re-rank summaries according to one or a combination of these metrics. We experiment using several metrics to train our energy-based re-ranker and show that it consistently improves the scores achieved by the predicted summaries. Nonetheless, human evaluation results show that the re-ranking approach should be used with care for highly abstractive summaries, as the available metrics are not yet sufficiently reliable for this purpose.
Submitted 7 November, 2022; v1 submitted 27 October, 2022;
originally announced October 2022.
-
Simplifying Multilingual News Clustering Through Projection From a Shared Space
Authors:
João Santos,
Afonso Mendes,
Sebastião Miranda
Abstract:
The task of organizing and clustering multilingual news articles for media monitoring is essential to follow news stories in real time. Most approaches to this task focus on high-resource languages (mostly English), with low-resource languages being disregarded. With that in mind, we present a much simpler online system that is able to cluster an incoming stream of documents without depending on language-specific features. We empirically demonstrate that the use of multilingual contextual embeddings as the document representation significantly improves clustering quality. We challenge previous crosslingual approaches by removing the precondition of building monolingual clusters. We model the clustering process as a set of linear classifiers to aggregate similar documents, and correct closely-related multilingual clusters through merging in an online fashion. Our system achieves state-of-the-art results on a multilingual news stream clustering dataset, and we introduce a new evaluation for zero-shot news clustering in multiple languages. We make our code available as open-source.
Submitted 28 April, 2022;
originally announced April 2022.
-
An NLP Solution to Foster the Use of Information in Electronic Health Records for Efficiency in Decision-Making in Hospital Care
Authors:
Adelino Leite-Moreira,
Afonso Mendes,
Afonso Pedrosa,
Amândio Rocha-Sousa,
Ana Azevedo,
André Amaral-Gomes,
Cláudia Pinto,
Helena Figueira,
Nuno Rocha Pereira,
Pedro Mendes,
Tiago Pimenta
Abstract:
The project aimed to define the rules and develop a technological solution to automatically identify a set of attributes within free-text clinical records written in Portuguese. The first application developed and implemented on this basis was a structured summary of a patient's clinical history, including previous diagnoses and procedures, usual medication, and relevant characteristics or conditions for clinical decisions, such as allergies, being under anticoagulant therapy, etc. The project's goal was achieved by a multidisciplinary team that included clinicians, epidemiologists, computational linguists, machine learning researchers and software engineers, bringing together the expertise and perspectives of a public hospital, the university and the private sector. Relevant benefits to users and patients stem from easier access to the patient's history, which translates into a more exhaustive understanding of the patient's clinical past and greater efficiency through time savings.
Submitted 24 February, 2022;
originally announced February 2022.
-
Exploring Usable Security to Improve the Impact of Formal Verification: A Research Agenda
Authors:
Carolina Carreira,
João F. Ferreira,
Alexandra Mendes,
Nicolas Christin
Abstract:
As software becomes more complex and assumes an even greater role in our lives, formal verification is set to become the gold standard in securing software systems into the future, since it can guarantee the absence of errors and entire classes of attack. Recent advances in formal verification are being used to secure everything from unmanned drones to the internet.
At the same time, the usable security research community has made huge progress in improving the usability of security products and end users' comprehension of security issues. However, there have been no human-centered studies focused on the impact of formal verification on the use and adoption of formally verified software products. We propose a research agenda to fill this gap and to contribute with the first collection of studies on people's mental models of formal verification and associated security and privacy guarantees and threats. The proposed research has the potential to increase the adoption of more secure products and it can be directly used by the security and formal methods communities to create more effective and secure software tools.
Submitted 15 November, 2021;
originally announced November 2021.
-
Priberam at MESINESP Multi-label Classification of Medical Texts Task
Authors:
Ruben Cardoso,
Zita Marinho,
Afonso Mendes,
Sebastião Miranda
Abstract:
Medical articles provide current state of the art treatments and diagnostics to many medical practitioners and professionals. Existing public databases such as MEDLINE contain over 27 million articles, making it difficult to extract relevant content without the use of efficient search engines. Information retrieval tools are crucial in order to navigate and provide meaningful recommendations for articles and treatments. Classifying these articles into broader medical topics can improve the retrieval of related articles. The set of medical labels considered for the MESINESP task is on the order of several thousands of labels (DeCS codes), which falls under the extreme multi-label classification problem. The heterogeneous and highly hierarchical structure of medical topics makes the task of manually classifying articles extremely laborious and costly. It is, therefore, crucial to automate the process of classification. Typical machine learning algorithms become computationally demanding with such a large number of labels and achieving better recall on such datasets becomes an unsolved problem.
This work presents Priberam's participation at the BioASQ task Mesinesp. We address the large multi-label classification problem through the use of four different models: a Support Vector Machine (SVM), a customised search engine (Priberam Search), a BERT based classifier, and a SVM-rank ensemble of all the previous models. Results demonstrate that all three individual models perform well and the best performance is achieved by their ensemble, granting Priberam the 6th place in the present challenge and making it the 2nd best team.
Submitted 12 May, 2021;
originally announced May 2021.
-
Priberam Labs at the NTCIR-15 SHINRA2020-ML: Classification Task
Authors:
Ruben Cardoso,
Afonso Mendes,
Andre Lamurias
Abstract:
Wikipedia is an online encyclopedia available in 285 languages. It composes an extremely relevant Knowledge Base (KB), which could be leveraged by automatic systems for several purposes. However, the structure and organisation of such information are not prone to automatic parsing and understanding and it is, therefore, necessary to structure this knowledge. The goal of the current SHINRA2020-ML task is to leverage Wikipedia pages in order to categorise their corresponding entities across 268 hierarchical categories, belonging to the Extended Named Entity (ENE) ontology. In this work, we propose three distinct models based on the contextualised embeddings yielded by Multilingual BERT. We explore the performances of a linear layer with and without explicit usage of the ontology's hierarchy, and a Gated Recurrent Units (GRU) layer. We also test several pooling strategies to leverage BERT's embeddings and selection criteria based on the labels' scores. We were able to achieve good performance across a large variety of languages, including those not seen during the fine-tuning process (zero-shot languages).
Submitted 12 May, 2021;
originally announced May 2021.
-
Skeptic: Automatic, Justified and Privacy-Preserving Password Composition Policy Selection
Authors:
Saul Johnson,
João F. Ferreira,
Alexandra Mendes,
Julien Cordry
Abstract:
The choice of password composition policy to enforce on a password-protected system represents a critical security decision, and has been shown to significantly affect the vulnerability of user-chosen passwords to guessing attacks. In practice, however, this choice is not usually rigorous or justifiable, with a tendency for system administrators to choose password composition policies based on intuition alone. In this work, we propose a novel methodology that draws on password probability distributions constructed from large sets of real-world password data which have been filtered according to various password composition policies. Password probabilities are then redistributed to simulate different user password reselection behaviours in order to automatically determine the password composition policy that will induce the distribution of user-chosen passwords with the greatest uniformity, a metric which we show to be a useful proxy to measure overall resistance to password guessing attacks. Further, we show that by fitting power-law equations to the password probability distributions we generate, we can justify our choice of password composition policy without any direct access to user password data. Finally, we present Skeptic -- a software toolkit that implements this methodology, including a DSL to enable system administrators with no background in password security to compare and rank password composition policies without resorting to expensive and time-consuming user studies. Drawing on 205,176,321 passwords across 3 datasets, we lend validity to our approach by demonstrating that the results we obtain align closely with findings from a previous empirical study into password composition policy effectiveness.
Submitted 15 March, 2024; v1 submitted 7 July, 2020;
originally announced July 2020.
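To make the filter-then-redistribute idea in the abstract above concrete, here is a minimal sketch under loudly stated assumptions. It models only one reselection behaviour (users whose password is rejected reselect in proportion to the remaining distribution), and it uses normalized Shannon entropy as a stand-in uniformity measure, which is not necessarily the metric used by Skeptic. The function names and the toy counts are hypothetical, and this is not Skeptic's DSL or API.

```python
import math


def filter_and_renormalize(counts, policy):
    """Keep passwords the policy allows and renormalize their frequencies.

    Renormalizing spreads the rejected probability mass proportionally over
    the allowed passwords (an assumed reselection behaviour).
    """
    allowed = {pw: n for pw, n in counts.items() if policy(pw)}
    total = sum(allowed.values())
    if total == 0:
        return {}
    return {pw: n / total for pw, n in allowed.items()}


def normalized_entropy(probs):
    """Shannon entropy divided by its maximum, giving a value in [0, 1].

    Stand-in uniformity measure (assumption): 1.0 means perfectly uniform.
    """
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log2(p) for p in probs.values() if p > 0)
    return entropy / math.log2(len(probs))
```

Ranking candidate policies then amounts to computing the uniformity score of the distribution each policy induces and preferring the highest, for example comparing `lambda pw: len(pw) >= 8` against a policy that also requires a digit.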
-
Multi-Stage Transfer Learning with an Application to Selection Process
Authors:
Andre Mendes,
Julian Togelius,
Leandro dos Santos Coelho
Abstract:
In multi-stage processes, decisions happen in an ordered sequence of stages. Many of them have the structure of a dual funnel problem: as the sample size decreases from one stage to the next, the information increases. A related example is a selection process, where applicants apply for a position, prize, or grant. In each stage, more applicants are evaluated and filtered out, and from the remaining ones, more information is collected. In the last stage, decision-makers use all available information to make their final decision. Training a classifier for each stage becomes impractical, as classifiers can underfit due to the low dimensionality in early stages or overfit due to the small sample size in the later stages. In this work, we propose a Multi-StaGe Transfer Learning (MSGTL) approach that uses knowledge from simple classifiers trained in early stages to improve the performance of classifiers in the later stages. By transferring weights from simpler neural networks trained on larger datasets, we are able to fine-tune more complex neural networks in the later stages without overfitting due to the small sample size. We show that it is possible to control the trade-off between conserving knowledge and fine-tuning using a simple probabilistic map. Experiments using real-world data demonstrate the efficacy of our approach as it outperforms other state-of-the-art methods for transfer learning and regularization.
Submitted 1 June, 2020;
originally announced June 2020.
-
Accuracy of MRI Classification Algorithms in a Tertiary Memory Center Clinical Routine Cohort
Authors:
Alexandre Morin,
Jorge Samper-González,
Anne Bertrand,
Sebastian Stroer,
Didier Dormont,
Aline Mendes,
Pierrick Coupé,
Jamila Ahdidan,
Marcel Lévy,
Dalila Samri,
Harald Hampel,
Bruno Dubois,
Marc Teichmann,
Stéphane Epelbaum,
Olivier Colliot
Abstract:
BACKGROUND: Automated volumetry software (AVS) has recently become widely available to neuroradiologists. MRI volumetry with AVS may support the diagnosis of dementias by identifying regional atrophy. Moreover, automatic classifiers using machine learning techniques have recently emerged as promising approaches to assist diagnosis. However, the performance of both AVS and automatic classifiers has been evaluated mostly in the artificial setting of research datasets. OBJECTIVE: Our aim was to evaluate the performance of two AVS and an automatic classifier in the clinical routine condition of a memory clinic. METHODS: We studied 239 patients with cognitive troubles from a single memory center cohort. Using clinical routine T1-weighted MRI, we evaluated the classification performance of: 1) univariate volumetry using two AVS (volBrain and Neuroreader™); 2) Support Vector Machine (SVM) automatic classifier, using either the AVS volumes (SVM-AVS), or whole gray matter (SVM-WGM); 3) reading by two neuroradiologists. The performance measure was the balanced diagnostic accuracy. The reference standard was consensus diagnosis by three neurologists using clinical, biological (cerebrospinal fluid) and imaging data and following international criteria. RESULTS: Univariate AVS volumetry provided only moderate accuracies (46% to 71% with hippocampal volume). The accuracy improved when using SVM-AVS classifier (52% to 85%), becoming close to that of SVM-WGM (52% to 90%). Visual classification by neuroradiologists ranged between SVM-AVS and SVM-WGM. CONCLUSION: In the routine practice of a memory clinic, the use of volumetric measures provided by AVS yields only moderate accuracy. Automatic classifiers can improve accuracy and could be a useful tool to assist diagnosis.
Submitted 19 March, 2020;
originally announced March 2020.
-
Unified Multi-Domain Learning and Data Imputation using Adversarial Autoencoder
Authors:
Andre Mendes,
Julian Togelius,
Leandro dos Santos Coelho
Abstract:
We present a novel framework that can combine multi-domain learning (MDL), data imputation (DI) and multi-task learning (MTL) to improve performance for classification and regression tasks in different domains. The core of our method is an adversarial autoencoder that can: (1) learn to produce domain-invariant embeddings to reduce the difference between domains; (2) learn the data distribution for each domain and correctly perform data imputation on missing data. For MDL, we use the Maximum Mean Discrepancy (MMD) measure to align the domain distributions. For DI, we use an adversarial approach where a generator fills in information for missing data and a discriminator tries to distinguish between real and imputed values. Finally, using the universal feature representation in the embeddings, we train a classifier using MTL that, given input from any domain, can predict labels for all domains. We demonstrate the superior performance of our approach compared to other state-of-the-art methods in three distinct settings: DG-DI in image recognition with unstructured data, MTL-DI in grade estimation with structured data, and MDMTL-DI in a selection process using mixed data.
Submitted 15 March, 2020;
originally announced March 2020.
-
Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes
Authors:
Andre Mendes,
Julian Togelius,
Leandro dos Santos Coelho
Abstract:
In multi-stage processes, decisions occur in an ordered sequence of stages. Early stages usually have more observations with general information (easier/cheaper to collect), while later stages have fewer observations but more specific data. This situation can be represented by a dual funnel structure, in which the sample size decreases from one stage to the other while the information increases. Training classifiers in this scenario is challenging since information in the early stages may not contain distinct patterns to learn (underfitting). In contrast, the small sample size in later stages can cause overfitting. We address both cases by introducing a framework that combines adversarial autoencoders (AAE), multi-task learning (MTL), and multi-label semi-supervised learning (MLSSL). We improve the decoder of the AAE with an MTL component so it can jointly reconstruct the original input and use feature nets to predict the features for the next stages. We also introduce a sequence constraint in the output of an MLSSL classifier to guarantee the sequential pattern in the predictions. Using real-world data from different domains (selection process, medical diagnosis), we show that our approach outperforms other state-of-the-art methods.
Submitted 15 March, 2020;
originally announced March 2020.
-
Lost in Disclosure: On The Inference of Password Composition Policies
Authors:
Saul Johnson,
João Ferreira,
Alexandra Mendes,
Julien Cordry
Abstract:
Large-scale password data breaches are becoming increasingly commonplace, which has enabled researchers to produce a substantial body of password security research utilising real-world password datasets, which often contain numbers of records in the tens or even hundreds of millions. While much study has been conducted on how password composition policies (sets of rules that a user must abide by when creating a password) influence the distribution of user-chosen passwords on a system, much less research has been done on inferring the password composition policy that a given set of user-chosen passwords was created under. In this paper, we state the problem with the naive approach to this challenge, and suggest a simple approach that produces more reliable results. We also present pol-infer, a tool that implements this approach, and demonstrate its use in inferring password composition policies.
Submitted 15 March, 2024; v1 submitted 12 March, 2020;
originally announced March 2020.
-
Data integration and prediction models of photovoltaic production from Brazilian northeastern
Authors:
Hugo Abreu Mendes,
Henrique Ferreira Nunes,
Manoel da Nobrega Marinho,
Paulo Salgado Gomes de Mattos Neto
Abstract:
All productive branches of society need an estimate to be able to control their expenses well. In the energy business, electric utilities use this information to control the power flow in the grid. For better energy production estimation of photovoltaic systems, it is necessary to join multiple geospatial and meteorological variables. This work proposes the creation of a satellite data integration platform, with production estimation models, base station measurements and actual production capacity. It presents statistical, probabilistic and artificial intelligence models that generate spatial and temporal production estimates, which could improve production gains as well as facilitate the monitoring and supervision of new enterprises.
Submitted 6 March, 2020; v1 submitted 29 January, 2020;
originally announced January 2020.
-
Multi-objective Evolutionary Approach to Grey-Box Identification of Buck Converter
Authors:
Faizal Hafiz,
Akshya Swain,
Eduardo M. A. M. Mendes,
Luis Aguirre
Abstract:
The present study proposes a simple grey-box identification approach to model a real DC-DC buck converter operating in continuous conduction mode. The problem associated with the information void in the observed dynamical data, which is often obtained over a relatively narrow input range, is alleviated by exploiting the known static behavior of the buck converter as a priori knowledge. A simple method is developed based on the concept of term clusters to determine the static response of the candidate models. The error in the static behavior is then directly embedded into the multi-objective framework for structure selection. In essence, the proposed approach casts the grey-box identification problem into a multi-objective framework to balance the bias-variance dilemma of model building while explicitly integrating a priori knowledge into the structure selection process. The results of the investigation, considering the case of a practical buck converter, demonstrate that it is possible to identify parsimonious models which can capture both the dynamic and static behavior of the system over a wide input range.
Submitted 20 February, 2020; v1 submitted 10 September, 2019;
originally announced September 2019.
-
Using Near Infrared Spectroscopy and Machine Learning to diagnose Systemic Sclerosis
Authors:
Joelle Feijó de França,
Hugo Abreu Mendes,
Lucas Gallindo Costa,
Andrea Tavares Dantas,
Angela Luzia Branco Pinto Duarte,
Anderson Stevens Leônidas Gomes,
Emery Cleiton Cabral Correia Lins
Abstract:
The motivation of this work is the use of non-invasive and low-cost techniques to obtain a faster and more accurate diagnosis of systemic sclerosis (SSc), a rheumatic, autoimmune, chronic and rare disease. The technique in question is Near Infrared Spectroscopy (NIRS). Spectra were acquired from three different regions of the volunteers' hands. Machine learning algorithms are used to classify and search for the best optical wavelength. The results demonstrate that it is easy to obtain the wavelength bands most important for the diagnosis. We use the RFECV and SVC algorithms. The results suggest that the most important wavelength band is at 1270 nm, referring to the luminescence of Singlet Oxygen. The results indicate that the Proximal Interphalangeal Joints region returns better accuracy scores. Optical spectrometers can be found at low prices and can be easily used in clinical evaluations, while the algorithms used are widely available on open-source platforms.
Submitted 16 August, 2019;
originally announced August 2019.
-
Automated Fact Checking in the News Room
Authors:
Sebastião Miranda,
David Nogueira,
Afonso Mendes,
Andreas Vlachos,
Andrew Secker,
Rebecca Garrett,
Jeff Mitchel,
Zita Marinho
Abstract:
Fact checking is an essential task in journalism; its importance has been highlighted due to recently increased concerns and efforts in combating misinformation. In this paper, we present an automated fact-checking platform which, given a claim, retrieves relevant textual evidence from a document collection, predicts whether each piece of evidence supports or refutes the claim, and returns a final verdict. We describe the architecture of the system and the user interface, focusing on the choices made to improve its user-friendliness and transparency. We conduct a user study of the fact-checking platform in a journalistic setting: we integrate it with a collection of news articles and provide an evaluation of the platform using feedback from journalists in their workflow. We found that the predictions of our platform were correct 58% of the time, and 59% of the returned evidence was relevant.
Submitted 3 April, 2019;
originally announced April 2019.
-
Jointly Extracting and Compressing Documents with Summary State Representations
Authors:
Afonso Mendes,
Shashi Narayan,
Sebastião Miranda,
Zita Marinho,
André F. T. Martins,
Shay B. Cohen
Abstract:
We present a new neural model for text summarization that first extracts sentences from a document and then compresses them. The proposed model offers a balance that sidesteps the difficulties in abstractive methods while generating more concise summaries than extractive methods. In addition, our model dynamically determines the length of the output summary based on the gold summaries it observes during training and does not require length constraints typical to extractive summarization. The model achieves state-of-the-art results on the CNN/DailyMail and Newsroom datasets, improving over current extractive and abstractive methods. Human evaluations demonstrate that our model generates concise and informative summaries. We also make available a new dataset of oracle compressive summaries derived automatically from the CNN/DailyMail reference summaries.
Submitted 5 April, 2019; v1 submitted 3 April, 2019;
originally announced April 2019.
-
Comparing Computing Platforms for Deep Learning on a Humanoid Robot
Authors:
Alexander Biddulph,
Trent Houliston,
Alexandre Mendes,
Stephan K. Chalup
Abstract:
The goal of this study is to test two different computing platforms with respect to their suitability for running deep networks as part of a humanoid robot software system. One of the platforms is the CPU-centered Intel NUC7i7BNH and the other is an NVIDIA Jetson TX2 system that puts more emphasis on GPU processing. The experiments addressed a number of benchmarking tasks including pedestrian detection using deep neural networks. Some of the results were unexpected but demonstrate that both platforms exhibit advantages and disadvantages when taking computational performance and electrical power requirements of such a system into account.
Submitted 20 January, 2019; v1 submitted 10 September, 2018;
originally announced September 2018.
-
Maximising Throughput in a Complex Coal Export System
Authors:
Mateus Rocha de Paula,
Natashia Boland,
Andreas Ernst,
Alexandre Mendes,
Martin Savelsbergh
Abstract:
The Port of Newcastle features three coal export terminals, operating primarily in cargo assembly mode, that share a rail network on their inbound side, and a channel on their outbound side. Maximising throughput at a single coal terminal, taking into account its layout, its equipment, and its operating policies, is already challenging, but maximising throughput of the Hunter Valley coal export system as a whole requires that terminals and inbound and outbound shared resources be considered simultaneously. Existing approaches to do so either lack realism or are too computationally demanding to be useful as an everyday planning tool. We present a parallel genetic algorithm to optimise the integrated system. The algorithm models activities in continuous time, can handle practical planning horizons efficiently, and generates solutions that match or improve solutions obtained with the state-of-the-art solvers, whilst vastly outperforming them both in memory usage and running time.
Submitted 22 August, 2018; v1 submitted 18 August, 2018;
originally announced August 2018.
-
The NUbots Team Description Paper 2015
Authors:
Josiah Walker,
Trent Houliston,
Brendan Annable,
Alex Biddulph,
Jake Fountain,
Mitchell Metcalfe,
Anita Sugo,
Monica Olejniczak,
Stephan K. Chalup,
Robert A. R. King,
Alexandre Mendes,
Peter Turner
Abstract:
The NUbots are an interdisciplinary RoboCup team from The University of Newcastle, Australia. The team has a history of strong contributions in the areas of machine learning and computer vision. The NUbots have participated in RoboCup leagues since 2002, placing first several times in the past. In 2014 the NUbots also partnered with the University of Newcastle Mechatronics Laboratory to participate in the RobotX Marine Robotics Challenge, which resulted in several new ideas and improvements to the NUbots vision system for RoboCup. This paper summarizes the history of the NUbots team, describes the roles and research of the team members, gives an overview of the NUbots' robots, their software system, and several associated research projects.
Submitted 11 February, 2015;
originally announced February 2015.
-
FlexDM: Enabling robust and reliable parallel data mining using WEKA
Authors:
Madison Flannery,
David M Budden,
Alexandre Mendes
Abstract:
Performing massive data mining experiments with multiple datasets and methods is a common task faced by most bioinformatics and computational biology laboratories. WEKA is a machine learning package designed to facilitate this task by providing tools that allow researchers to select from several classification methods and specific test strategies. Despite its popularity, the current WEKA environment for batch experiments, namely Experimenter, has four limitations that impact its usability: the selection of value ranges for methods options lacks flexibility and is not intuitive; there is no support for parallelisation when running large-scale data mining tasks; the XML schema is difficult to read, necessitating the use of the Experimenter's graphical user interface for generation and modification; and robustness is limited by the fact that results are not saved until the last test has concluded.
FlexDM implements an interface to WEKA to run batch processing tasks in a simple and intuitive way. In a short and easy-to-understand XML file, one can define hundreds of tests to be performed on several datasets. FlexDM also allows those tests to be executed asynchronously in parallel to take advantage of multi-core processors, significantly increasing usability and productivity. Results are saved incrementally for better robustness and reliability.
FlexDM is implemented in Java and runs on Windows, Linux and OSX. As we encourage other researchers to explore and adopt our software, FlexDM is made available as a pre-configured bootable reference environment. All code, supporting documentation and usage examples are also available for download at http://sourceforge.net/projects/flexdm.
Submitted 18 December, 2014;
originally announced December 2014.
-
Addressing the non-functional requirements of computer vision systems: A case study
Authors:
Shannon Fenn,
Alexandre Mendes,
David Budden
Abstract:
Computer vision plays a major role in the robotics industry, where vision data is frequently used for navigation and high-level decision making. Although there is significant research in algorithms and functional requirements, there is a comparative lack of emphasis on how best to map these abstract concepts onto an appropriate software architecture.
In this study, we distinguish between the functional and non-functional requirements of a computer vision system. Using a RoboCup humanoid robot system as a case study, we propose and develop a software architecture that fulfills the latter criteria.
The modifiability of the proposed architecture is demonstrated by detailing a number of feature detection algorithms and emphasizing which aspects of the underlying framework were modified to support their integration. To demonstrate portability, we port our vision system (designed for an application-specific DARwIn-OP humanoid robot) to a general-purpose, Raspberry Pi computer. We evaluate performance on both platforms and compare them to a vision system optimised for functional requirements only.
The architecture and implementation presented in this study provide a highly generalisable framework for computer vision system design that is of particular benefit in research and development, competition and other environments in which rapid system evolution is necessary.
Submitted 30 October, 2014;
originally announced October 2014.
-
Towards a good notion of categories of logics
Authors:
Caio de Andrade Mendes,
Hugo Luiz Mariano
Abstract:
We consider (finitary, propositional) logics through the original use of Category Theory: the study of the "sociology of mathematical objects", aligning us with a recent, and growing, trend of studying logics through their relations with other logics (e.g. processes of combining logics such as fibring [Gab] and possible translation semantics [Car]). The objects of study are thus classes of logics, i.e. categories whose objects are logical systems (i.e., a signature with a Tarskian consequence relation) and whose morphisms are related to (some concept of) translations between these systems. The present work provides the first steps of a project of considering categories of logical systems satisfying simultaneously certain natural requirements: it seems that in the literature ([AFLM1], [AFLM2], [AFLM3], [BC], [BCC1], [BCC2], [CG], [FC]) this is achieved only partially.
Submitted 27 March, 2016; v1 submitted 14 April, 2014;
originally announced April 2014.
-
The NUbots Team Description Paper 2014
Authors:
Josiah Walker,
Trent Houliston,
Brendan Annable,
Alex Biddulph,
Andrew Dabson,
Jake Fountain,
Taylor Johnson,
Jordan Johnson,
Mitchell Metcalfe,
Anita Sugo,
Stephan K. Chalup,
Robert A. R. King,
Alexandre Mendes,
Peter Turner
Abstract:
The NUbots team, from The University of Newcastle, Australia, has had a strong record of success in the RoboCup Standard Platform League since first entering in 2002. The team has also competed within the RoboCup Humanoid Kid-Size League since 2012. The 2014 team brings a renewed focus on software architecture, modularity, and the ability to easily share code. This paper summarizes the history of the NUbots team, describes the roles and research of the team members, gives an overview of the NUbots' robots and software system, and addresses relevant research projects within the Newcastle Robotics Laboratory.
Submitted 27 March, 2014;
originally announced March 2014.
-
Guessing games
Authors:
Anthony Mendes,
Kent E. Morrison
Abstract:
In a guessing game, players guess the value of a random real number selected using some probability density function. The winner may be determined in various ways; for example, a winner can be a player whose guess is closest in magnitude to the target or a winner can be a player coming closest without guessing higher than the target. We study optimal strategies for players in these games and determine some of them for two, three, and four players.
Submitted 10 January, 2014;
originally announced January 2014.
-
Learning to answer questions
Authors:
Ana Cristina Mendes,
Luísa Coheur,
Sérgio Curto
Abstract:
We present an open-domain Question-Answering system that learns to answer questions based on successful past interactions. We follow a pattern-based approach to Answer-Extraction, where (lexico-syntactic) patterns that relate a question to its answer are automatically learned and used to answer future questions. Results show that our approach contributes to the system's best performance when it is combined with typical Answer-Extraction strategies. Moreover, it allows the system to learn from the answered questions and to rectify wrong or unsolved past questions.
Submitted 4 September, 2013;
originally announced September 2013.
-
Towards the Rapid Development of a Natural Language Understanding Module
Authors:
Catarina Moreira,
Ana Cristina Mendes,
Luísa Coheur,
Bruno Martins
Abstract:
When developing a conversational agent, there is often an urgent need to have a prototype available in order to test the application with real users. A Wizard of Oz is a possibility, but sometimes the agent should be simply deployed in the environment where it will be used. Here, the agent should be able to capture as many interactions as possible and to understand how people react to failure. In this paper, we focus on the rapid development of a natural language understanding module by non-experts. Our approach follows the learning paradigm and sees the process of understanding natural language as a classification problem. We test our module with a conversational agent that answers questions in the art domain. Moreover, we show how our approach can be used by a natural language interface to a cinema database.
Submitted 6 February, 2013;
originally announced February 2013.