-
Hiperwalk: Simulation of Quantum Walks with Heterogeneous High-Performance Computing
Authors:
Paulo Motta,
Gustavo A. Bezerra,
Anderson F. P. Santos,
Renato Portugal
Abstract:
The Hiperwalk package is designed to facilitate the simulation of quantum walks using heterogeneous high-performance computing, taking advantage of the parallel processing power of diverse processors such as CPUs, GPUs, and acceleration cards. This package enables the simulation of both the continuous-time and discrete-time quantum walk models, effectively modeling the behavior of quantum systems on large graphs. Hiperwalk features a user-friendly Python package frontend with comprehensive documentation, as well as a high-performance C-based inner core that leverages parallel computing for efficient linear algebra calculations. This versatile tool empowers researchers to better understand quantum walk behavior, optimize implementation, and explore a wide range of potential applications, including spatial search algorithms.
Submitted 12 June, 2024;
originally announced June 2024.
-
The Role of Generative AI in Software Development Productivity: A Pilot Case Study
Authors:
Mariana Coutinho,
Lorena Marques,
Anderson Santos,
Marcio Dahia,
Cesar Franca,
Ronnie de Souza Santos
Abstract:
With software development increasingly reliant on innovative technologies, there is a growing interest in exploring the potential of generative AI tools to streamline processes and enhance productivity. In this scenario, this paper investigates the integration of generative AI tools within software development, focusing on understanding their uses, benefits, and challenges to software professionals, in particular, looking at aspects of productivity. Through a pilot case study involving software practitioners working in different roles, we gathered valuable experiences on the integration of generative AI tools into their daily work routines. Our findings reveal a generally positive perception of these tools in individual productivity while also highlighting the need to address identified limitations. Overall, our research sets the stage for further exploration into the evolving landscape of software development practices with the integration of generative AI tools.
Submitted 1 June, 2024;
originally announced June 2024.
-
Clustered Retrieved Augmented Generation (CRAG)
Authors:
Simon Akesson,
Frances A. Santos
Abstract:
Providing external knowledge to Large Language Models (LLMs) is key to using these models in real-world applications for several reasons, such as incorporating up-to-date content in real time, providing access to domain-specific knowledge, and helping to prevent hallucinations. The vector database-based Retrieval Augmented Generation (RAG) approach has been widely adopted to this end: any part of the external knowledge can be retrieved and provided to an LLM as input context. Despite the RAG approach's success, it may still be infeasible for some applications, because the retrieved context can demand a longer context window than the LLM supports. Even when the retrieved context fits into the context window, the number of tokens may be large and, consequently, impact costs and processing time, becoming impractical for most applications. To address these issues, we propose CRAG, a novel approach able to effectively reduce the number of prompting tokens without degrading the quality of the generated response compared to a solution using RAG. Through our experiments, we show that CRAG can reduce the number of tokens by at least 46%, achieving more than 90% in some cases, compared to RAG. Moreover, the number of tokens with CRAG does not increase considerably when the number of reviews analyzed is higher, unlike RAG, where the number of tokens is almost 9x higher when there are 75 reviews compared to 4 reviews.
Submitted 24 May, 2024;
originally announced June 2024.
-
A Vision on Open Science for the Evolution of Software Engineering Research and Practice
Authors:
Edson OliveiraJr,
Fernanda Madeiral,
Alcemir Rodrigues Santos,
Christina von Flach,
Sergio Soares
Abstract:
Open Science aims to foster openness and collaboration in research, leading to more significant scientific and social impact. However, practicing Open Science comes with several challenges and is currently not properly rewarded. In this paper, we share our vision for addressing those challenges through a conceptual framework that connects essential building blocks for a change in the Software Engineering community, both culturally and technically. The idea behind this framework is that Open Science is treated as a first-class requirement for better Software Engineering research, practice, recognition, and relevant social impact. There is a long road for us, as a community, to truly embrace and gain from the benefits of Open Science. Nevertheless, we shed light on the directions for promoting the necessary culture shift and empowering the Software Engineering community.
Submitted 20 May, 2024;
originally announced May 2024.
-
Exploring the Potential of Data-Driven Spatial Audio Enhancement Using a Single-Channel Model
Authors:
Arthur N. dos Santos,
Bruno S. Masiero,
Túlio C. L. Mateus
Abstract:
One key aspect differentiating data-driven single- and multi-channel speech enhancement and dereverberation methods is that both the problem formulation and complexity of the solutions are considerably more challenging in the latter case. Additionally, with limited computational resources, it is cumbersome to train models that require the management of larger datasets or those with more complex designs. In this scenario, an unverified hypothesis that single-channel methods can be adapted to multi-channel scenarios simply by processing each channel independently holds significant implications, boosting compatibility between sound scene capture and system input-output formats, while also allowing modern research to focus on other challenging aspects, such as full-bandwidth audio enhancement, competitive noise suppression, and unsupervised learning. This study verifies this hypothesis by comparing the enhancement promoted by a basic single-channel speech enhancement and dereverberation model with two other multi-channel models tailored to separate clean speech from noisy 3D mixes. A direction of arrival estimation model was used to objectively evaluate its capacity to preserve spatial information by comparing the output signals with ground-truth coordinate values. Consequently, a trade-off arises between preserving spatial information with a more straightforward single-channel solution at the cost of obtaining lower gains in intelligibility scores.
Submitted 22 April, 2024;
originally announced April 2024.
-
Goal Recognition via Linear Programming
Authors:
Felipe Meneguzzi,
Luísa R. de A. Santos,
Ramon Fraga Pereira,
André G. Pereira
Abstract:
Goal Recognition is the task by which an observer aims to discern the goals that correspond to plans that comply with the perceived behavior of subject agents given as a sequence of observations. Research on Goal Recognition as Planning encompasses reasoning about the model of a planning task, the observations, and the goals using planning techniques, resulting in very efficient recognition approaches. In this article, we design novel recognition approaches that rely on the Operator-Counting framework, proposing new constraints, and analyze their constraints' properties both theoretically and empirically. The Operator-Counting framework is a technique that efficiently computes heuristic estimates of cost-to-goal using Integer/Linear Programming (IP/LP). In the realm of theory, we prove that the new constraints provide lower bounds on the cost of plans that comply with observations. We also provide an extensive empirical evaluation to assess how the new constraints improve the quality of the solution, and we found that they are especially informed in deciding which goals are unlikely to be part of the solution. Our novel recognition approaches have two pivotal advantages: first, they employ new IP/LP constraints for efficiently recognizing goals; second, we show how the new IP/LP constraints can improve the recognition of goals under both partial and noisy observability.
Submitted 11 April, 2024;
originally announced April 2024.
-
ProtoAL: Interpretable Deep Active Learning with prototypes for medical imaging
Authors:
Iury B. de A. Santos,
André C. P. L. F. de Carvalho
Abstract:
The adoption of Deep Learning algorithms in the medical imaging field is a prominent area of research, with high potential for advancing AI-based Computer-aided diagnosis (AI-CAD) solutions. However, current solutions face challenges due to a lack of interpretability features and high data demands, prompting recent efforts to address these issues. In this study, we propose the ProtoAL method, where we integrate an interpretable DL model into the Deep Active Learning (DAL) framework. This approach aims to address both challenges by focusing on the medical imaging context and utilizing an inherently interpretable model based on prototypes. We evaluated ProtoAL on the Messidor dataset, achieving an area under the precision-recall curve of 0.79 while utilizing only 76.54% of the available labeled data. These capabilities can enhance the practical usability of a DL model in the medical field, providing a means of trust calibration in domain experts and a suitable solution for learning in the data-scarce contexts often found in this field.
Submitted 6 April, 2024;
originally announced April 2024.
-
ROBUST: 221 Bugs in the Robot Operating System
Authors:
Christopher S. Timperley,
Gijs van der Hoorn,
André Santos,
Harshavardhan Deshpande,
Andrzej Wąsowski
Abstract:
As robotic systems such as autonomous cars and delivery drones assume greater roles and responsibilities within society, the likelihood and impact of catastrophic software failure within those systems increase. To aid researchers in the development of new methods to measure and assure the safety and quality of robotics software, we systematically curated a dataset of 221 bugs across 7 popular and diverse software systems implemented via the Robot Operating System (ROS). We produce historically accurate recreations of each of the 221 defective software versions in the form of Docker images, and use a grounded theory approach to examine and categorize their corresponding faults, failures, and fixes. Finally, we reflect on the implications of our findings and outline future research directions for the community.
Submitted 4 April, 2024;
originally announced April 2024.
-
Efficiently Estimating Mutual Information Between Attributes Across Tables
Authors:
Aécio Santos,
Flip Korn,
Juliana Freire
Abstract:
Relational data augmentation is a powerful technique for enhancing data analytics and improving machine learning models by incorporating columns from external datasets. However, it is challenging to efficiently discover relevant external tables to join with a given input table. Existing approaches rely on data discovery systems to identify joinable tables from external sources, typically based on overlap or containment. However, the sheer number of tables obtained from these systems results in irrelevant joins that need to be performed; this can be computationally expensive or even infeasible in practice. We address this limitation by proposing the use of efficient mutual information (MI) estimation for finding relevant joinable tables. We introduce a new sketching method that enables efficient evaluation of relationship discovery queries by estimating MI without materializing the joins and returning a smaller set of tables that are more likely to be relevant. We also demonstrate the effectiveness of our approach at approximating MI in extensive experiments using synthetic and real-world datasets.
Submitted 22 March, 2024;
originally announced March 2024.
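For readers unfamiliar with the quantity this abstract refers to, the following is a minimal sketch of the plug-in (empirical) mutual information between two discrete attributes after an explicit join. It is not the paper's sketching method: it materializes the join, which is exactly the cost the paper aims to avoid, and the function names `join_columns` and `mutual_information` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2


def join_columns(left, right):
    """Inner-join two {key: value} tables on their keys.

    Returns two aligned lists of values, one per table.
    (Illustrative helper; not taken from the paper.)
    """
    keys = sorted(left.keys() & right.keys())
    return [left[k] for k in keys], [right[k] for k in keys]


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information, in bits, between two
    equal-length sequences of discrete values."""
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))   # counts of (x, y) pairs
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Toy tables keyed by row id; key 5 has no match and is dropped by the join.
left = {1: "a", 2: "a", 3: "b", 4: "b"}
dependent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 0}
independent = {1: 0, 2: 1, 3: 0, 4: 1}

mi_dependent = mutual_information(*join_columns(left, dependent))
mi_independent = mutual_information(*join_columns(left, independent))
```

In this toy data the first external column fully determines the input attribute and the second is independent of it, so a ranking by estimated MI would place the first table ahead of the second.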
-
Leveraging Self-Supervised Learning for Scene Recognition in Child Sexual Abuse Imagery
Authors:
Pedro H. V. Valois,
João Macedo,
Leo S. F. Ribeiro,
Jefersson A. dos Santos,
Sandra Avila
Abstract:
Crime in the 21st century is split into a virtual and real world. However, the former has become a global menace to people's well-being and security in the latter. The challenges it presents must be faced with unified global cooperation, and we must rely more than ever on automated yet trustworthy tools to combat the ever-growing nature of online offenses. Over 10 million child sexual abuse reports are submitted to the US National Center for Missing & Exploited Children every year, and over 80% originated from online sources. Therefore, investigation centers and clearinghouses cannot manually process and correctly investigate all imagery. In light of that, reliable automated tools that can securely and efficiently deal with this data are paramount. In this sense, the scene recognition task looks for contextual cues in the environment, being able to group and classify child sexual abuse data without requiring to be trained on sensitive material. The scarcity and limitations of working with child sexual abuse images lead to self-supervised learning, a machine-learning methodology that leverages unlabeled data to produce powerful representations that can be more easily transferred to target tasks. This work shows that self-supervised deep learning models pre-trained on scene-centric data can reach 71.6% balanced accuracy on our indoor scene classification task and, on average, 2.2 percentage points better performance than a fully supervised version. We cooperate with Brazilian Federal Police experts to evaluate our indoor classification model on actual child abuse material. The results demonstrate a notable discrepancy between the features observed in widely used scene datasets and those depicted on sensitive materials.
Submitted 2 March, 2024;
originally announced March 2024.
-
Self-calibrated convolution towards glioma segmentation
Authors:
Felipe C. R. Salvagnini,
Gerson O. Barbosa,
Alexandre X. Falcao,
Cid A. N. Santos
Abstract:
Accurate brain tumor segmentation in the early stages of the disease is crucial for the treatment's effectiveness, avoiding exhaustive visual inspection by a qualified specialist of 3D MR brain images of multiple protocols (e.g., T1, T2, T2-FLAIR, T1-Gd). Several networks exist for glioma segmentation, with nnU-Net being one of the best. In this work, we evaluate self-calibrated convolutions in different parts of the nnU-Net network to demonstrate that self-calibrated modules in skip connections can significantly improve the enhanced-tumor and tumor-core segmentation accuracy while preserving the whole-tumor segmentation accuracy.
Submitted 7 February, 2024;
originally announced February 2024.
-
A Survey of Designs for Combined 2D+3D Visual Representations
Authors:
Jiayi Hong,
Rostyslav Hnatyshyn,
Ebrar A. D. Santos,
Ross Maciejewski,
Tobias Isenberg
Abstract:
We examine visual representations of data that make use of combinations of both 2D and 3D data mappings. Combining 2D and 3D representations is a common technique that allows viewers to understand multiple facets of the data with which they are interacting. While 3D representations focus on the spatial character of the data or the dedicated 3D data mapping, 2D representations often show abstract data properties and take advantage of the unique benefits of mapping to a plane. Many systems have used unique combinations of both types of data mappings effectively. Yet there are no systematic reviews of the methods in linking 2D and 3D representations. We systematically survey the relationships between 2D and 3D visual representations in major visualization publications -- IEEE VIS, IEEE TVCG, and EuroVis -- from 2012 to 2022. We closely examined 105 papers where 2D and 3D representations are connected visually, interactively, or through animation. These approaches are designed based on their visual environment, the relationships between their visual representations, and their possible layouts. Through our analysis, we introduce a design space as well as provide design guidelines for effectively linking 2D and 3D visual representations.
Submitted 12 January, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
A Case Study on Test Case Construction with Large Language Models: Unveiling Practical Insights and Challenges
Authors:
Roberto Francisco de Lima Junior,
Luiz Fernando Paes de Barros Presta,
Lucca Santos Borborema,
Vanderson Nogueira da Silva,
Marcio Leal de Melo Dahia,
Anderson Carlos Sousa e Santos
Abstract:
This paper presents a detailed case study examining the application of Large Language Models (LLMs) in the construction of test cases within the context of software engineering. LLMs, characterized by their advanced natural language processing capabilities, are increasingly garnering attention as tools to automate and enhance various aspects of the software development life cycle. Leveraging a case study methodology, we systematically explore the integration of LLMs in the test case construction process, aiming to shed light on their practical efficacy, challenges encountered, and implications for software quality assurance. The study encompasses the selection of a representative software application, the formulation of test case construction methodologies employing LLMs, and the subsequent evaluation of outcomes. Through a blend of qualitative and quantitative analyses, this study assesses the impact of LLMs on test case comprehensiveness, accuracy, and efficiency. Additionally, it delves into challenges such as model interpretability and adaptation to diverse software contexts. The findings from this case study contribute nuanced insights into the practical utility of LLMs in the domain of test case construction, elucidating their potential benefits and limitations. By addressing real-world scenarios and complexities, this research aims to inform software practitioners and researchers alike about the tangible implications of incorporating LLMs into the software testing landscape, fostering a more comprehensive understanding of their role in optimizing the software development process.
Submitted 21 December, 2023; v1 submitted 19 December, 2023;
originally announced December 2023.
-
Inferring the Graph of Networked Dynamical Systems under Partial Observability and Spatially Colored Noise
Authors:
Augusto Santos,
Diogo Rente,
Rui Seabra,
José M. F. Moura
Abstract:
In a Networked Dynamical System (NDS), each node is a system whose dynamics are coupled with the dynamics of neighboring nodes. The global dynamics naturally builds on this network of couplings and it is often excited by a noise input with nontrivial structure. The underlying network is unknown in many applications and should be inferred from observed data. We assume: i) Partial observability -- time series data is only available over a subset of the nodes; ii) Input noise -- it is correlated across distinct nodes while temporally independent, i.e., it is spatially colored. We present a feasibility condition on the noise correlation structure wherein there exists a consistent network inference estimator to recover the underlying fundamental dependencies among the observed nodes. Further, we describe a structure identification algorithm that exhibits competitive performance across distinct regimes of network connectivity, observability, and noise correlation.
Submitted 18 December, 2023;
originally announced December 2023.
-
Learning the Causal Structure of Networked Dynamical Systems under Latent Nodes and Structured Noise
Authors:
Augusto Santos,
Diogo Rente,
Rui Seabra,
José M. F. Moura
Abstract:
This paper considers learning the hidden causal network of a linear networked dynamical system (NDS) from the time series data at some of its nodes -- partial observability. The dynamics of the NDS are driven by colored noise that generates spurious associations across pairs of nodes, rendering the problem much harder. To address the challenge of noise correlation and partial observability, we assign to each pair of nodes a feature vector computed from the time series data of observed nodes. The feature embedding is engineered to yield structural consistency: there exists an affine hyperplane that consistently partitions the set of features, separating the feature vectors corresponding to connected pairs of nodes from those corresponding to disconnected pairs. The causal inference problem is thus addressed via clustering the designed features. We demonstrate with simple baseline supervised methods the competitive performance of the proposed causal inference mechanism under broad connectivity regimes and noise correlation levels, including a real world network. Further, we devise novel technical guarantees of structural consistency for linear NDS under the considered regime.
Submitted 12 February, 2024; v1 submitted 10 December, 2023;
originally announced December 2023.
-
Better, Not Just More: Data-Centric Machine Learning for Earth Observation
Authors:
Ribana Roscher,
Marc Rußwurm,
Caroline Gevaert,
Michael Kampffmeyer,
Jefersson A. dos Santos,
Maria Vakalopoulou,
Ronny Hänsch,
Stine Hansen,
Keiller Nogueira,
Jonathan Prexl,
Devis Tuia
Abstract:
Recent developments and research in modern machine learning have led to substantial improvements in the geospatial field. Although numerous deep learning architectures and models have been proposed, the majority of them have been solely developed on benchmark datasets that lack strong real-world relevance. Furthermore, the performance of many methods has already saturated on these datasets. We argue that a shift from a model-centric view to a complementary data-centric perspective is necessary for further improvements in accuracy, generalization ability, and real impact on end-user applications. Furthermore, considering the entire machine learning cycle - from problem definition to model deployment with feedback - is crucial for enhancing machine learning models that can be reliable in unforeseen situations. This work presents a definition as well as a precise categorization and overview of automated data-centric learning approaches for geospatial data. It highlights the complementary role of data-centric learning with respect to model-centric in the larger machine learning deployment cycle. We review papers across the entire geospatial field and categorize them into different groups. A set of representative experiments shows concrete implementation examples. These examples provide concrete steps to act on geospatial data with data-centric machine learning approaches.
Submitted 22 June, 2024; v1 submitted 8 December, 2023;
originally announced December 2023.
-
Quantum Counting on the Complete Bipartite Graph
Authors:
Gustavo A. Bezerra,
Raqueline A. M. Santos,
Renato Portugal
Abstract:
Quantum counting is a key quantum algorithm that aims to determine the number of marked elements in a database. This algorithm is based on the quantum phase estimation algorithm and uses the evolution operator of Grover's algorithm because its non-trivial eigenvalues are dependent on the number of marked elements. Since Grover's algorithm can be viewed as a quantum walk on a complete graph, a natural way to extend quantum counting is to use the evolution operator of quantum-walk-based search on non-complete graphs instead of Grover's operator. In this paper, we explore this extension by analyzing the coined quantum walk on the complete bipartite graph with an arbitrary number of marked vertices. We show that some eigenvalues of the evolution operator depend on the number of marked vertices and using this fact we show that the quantum phase estimation can be used to obtain the number of marked vertices. The time complexity for estimating the number of marked vertices in the bipartite graph with our algorithm aligns closely with that of the original quantum counting algorithm.
Submitted 8 December, 2023; v1 submitted 17 November, 2023;
originally announced November 2023.
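The abstract's starting point, that Grover's evolution operator encodes the number of marked elements, can be checked classically on a small instance. The sketch below is an illustration under stated assumptions, not the paper's algorithm: it uses the standard Grover operator (not the coined walk on the complete bipartite graph), and it reads the rotation angle from the overlap of the uniform state with its image rather than running quantum phase estimation. The names `grover_step` and `count_marked_from_overlap` are hypothetical.

```python
from math import sqrt


def grover_step(state, marked):
    """Apply one Grover iteration to a real state vector.

    The oracle flips the sign of marked amplitudes; the diffusion
    operator 2|s><s| - I reflects every amplitude about the mean.
    """
    flipped = [-a if i in marked else a for i, a in enumerate(state)]
    mean = sum(flipped) / len(flipped)
    return [2 * mean - a for a in flipped]


def count_marked_from_overlap(n, marked):
    """Recover the number of marked elements k from <s|G|s>.

    With sin^2(theta) = k / n, the operator G rotates the uniform
    state |s> by 2*theta, so <s|G|s> = cos(2*theta) = 1 - 2k/n.
    """
    s = [1 / sqrt(n)] * n                      # uniform superposition
    g_s = grover_step(s, marked)               # G|s>
    overlap = sum(a * b for a, b in zip(s, g_s))
    return n * (1 - overlap) / 2


k_est = count_marked_from_overlap(16, {1, 5, 9})
```

On a quantum device the same angle is obtained from the operator's non-trivial eigenvalues via phase estimation; the classical overlap used here only serves to show the dependence on the number of marked elements.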
-
YOLOv7 for Mosquito Breeding Grounds Detection and Tracking
Authors:
Camila Laranjeira,
Daniel Andrade,
Jefersson A. dos Santos
Abstract:
With the looming threat of climate change, neglected tropical diseases such as dengue, zika, and chikungunya have the potential to become an even greater global concern. Remote sensing technologies can aid in controlling the spread of Aedes aegypti, the transmission vector of such diseases, by automating the detection and mapping of mosquito breeding sites, such that local entities can properly intervene. In this work, we leverage YOLOv7, a state-of-the-art and computationally efficient detection approach, to localize and track mosquito foci in videos captured by unmanned aerial vehicles. We experiment on a dataset released to the public as part of the ICIP 2023 grand challenge entitled Automatic Detection of Mosquito Breeding Grounds. We show that YOLOv7 can be directly applied to detect larger foci categories such as pools, tires, and water tanks and that a cheap and straightforward aggregation of frame-by-frame detection can incorporate time consistency into the tracking process.
△ Less
Submitted 16 October, 2023;
originally announced October 2023.
-
An experiment on an automated literature survey of data-driven speech enhancement methods
Authors:
Arthur dos Santos,
Jayr Pereira,
Rodrigo Nogueira,
Bruno Masiero,
Shiva Sander-Tavallaey,
Elias Zea
Abstract:
The increasing number of scientific publications in acoustics, in general, presents difficulties in conducting traditional literature surveys. This work explores the use of a generative pre-trained transformer (GPT) model to automate a literature survey of 116 articles on data-driven speech enhancement methods. The main objective is to evaluate the capabilities and limitations of the model in providing accurate responses to specific queries about the papers selected from a reference human-based survey. While we see great potential to automate literature surveys in acoustics, improvements are needed to address technical questions more clearly and accurately.
Submitted 9 October, 2023;
originally announced October 2023.
-
Multilingual Natural Language Processing Model for Radiology Reports -- The Summary is all you need!
Authors:
Mariana Lindo,
Ana Sofia Santos,
André Ferreira,
Jianning Li,
Gijs Luijten,
Gustavo Correia,
Moon Kim,
Benedikt Michael Schaarschmidt,
Cornelius Deuschl,
Johannes Haubold,
Jens Kleesiek,
Jan Egger,
Victor Alves
Abstract:
The impression section of a radiology report summarizes important radiology findings and plays a critical role in communicating these findings to physicians. However, the preparation of these summaries is time-consuming and error-prone for radiologists. Recently, numerous models for radiology report summarization have been developed. Nevertheless, there is currently no model that can summarize these reports in multiple languages. Such a model could greatly improve future research and the development of Deep Learning models that incorporate data from patients with different ethnic backgrounds. In this study, the generation of radiology impressions in different languages was automated by fine-tuning a model, publicly available, based on a multilingual text-to-text Transformer to summarize findings available in English, Portuguese, and German radiology reports. In a blind test, two board-certified radiologists indicated that for at least 70% of the system-generated summaries, the quality matched or exceeded the corresponding human-written summaries, suggesting substantial clinical reliability. Furthermore, this study showed that the multilingual model outperformed other models that specialized in summarizing radiology reports in only one language, as well as models that were not specifically designed for summarizing radiology reports, such as ChatGPT.
Submitted 13 January, 2024; v1 submitted 29 September, 2023;
originally announced October 2023.
-
Multi-Bellman operator for convergence of $Q$-learning with linear function approximation
Authors:
Diogo S. Carvalho,
Pedro A. Santos,
Francisco S. Melo
Abstract:
We study the convergence of $Q$-learning with linear function approximation. Our key contribution is the introduction of a novel multi-Bellman operator that extends the traditional Bellman operator. By exploring the properties of this operator, we identify conditions under which the projected multi-Bellman operator becomes contractive, providing improved fixed-point guarantees compared to the Bellman operator. To leverage these insights, we propose the multi $Q$-learning algorithm with linear function approximation. We demonstrate that this algorithm converges to the fixed-point of the projected multi-Bellman operator, yielding solutions of arbitrary accuracy. Finally, we validate our approach by applying it to well-known environments, showcasing the effectiveness and applicability of our findings.
Submitted 28 September, 2023;
originally announced September 2023.
-
Sampling Methods for Inner Product Sketching
Authors:
Majid Daliri,
Juliana Freire,
Christopher Musco,
Aécio Santos,
Haoxiang Zhang
Abstract:
Recently, Bessa et al. (PODS 2023) showed that sketches based on coordinated weighted sampling theoretically and empirically outperform popular linear sketching methods like Johnson-Lindenstrauss projection and CountSketch for the ubiquitous problem of inner product estimation. We further develop this finding by introducing and analyzing two alternative sampling-based methods. In contrast to the computationally expensive algorithm in Bessa et al., our methods run in linear time (to compute the sketch) and perform better in practice, significantly beating linear sketching on a variety of tasks. For example, they provide state-of-the-art results for estimating the correlation between columns in unjoined tables, a problem that we show how to reduce to inner product estimation in a black-box way. While based on known sampling techniques (threshold and priority sampling), we introduce significant new theoretical analysis to prove approximation guarantees for our methods.
Submitted 15 January, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision
Authors:
Jianning Li,
Zongwei Zhou,
Jiancheng Yang,
Antonio Pepe,
Christina Gsaxner,
Gijs Luijten,
Chongyu Qu,
Tiezheng Zhang,
Xiaoxi Chen,
Wenxuan Li,
Marek Wodzinski,
Paul Friedrich,
Kangxian Xie,
Yuan Jin,
Narmada Ambigapathy,
Enrico Nasca,
Naida Solak,
Gian Marco Melito,
Viet Duc Vu,
Afaque R. Memon,
Christopher Schlachta,
Sandrine De Ribaupierre,
Rajnikant Patel,
Roy Eagleson,
Xiaojun Chen, et al. (132 additional authors not shown)
Abstract:
Prior to the deep learning era, shape was commonly used to describe objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from numerous shape-related publications in premier vision conferences as well as the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems. As a unique feature, we directly model the majority of shapes on the imaging data of real patients. As of today, MedShapeNet includes 23 datasets with more than 100,000 shapes that are paired with annotations (ground truth). Our data is freely accessible via a web interface and a Python application programming interface (API) and can be used for discriminative, reconstructive, and variational benchmarks as well as various applications in virtual, augmented, or mixed reality, and 3D printing. As examples, we present use cases in the fields of classification of brain tumors, facial and skull reconstructions, multi-class anatomy completion, education, and 3D printing. In the future, we will extend the data and improve the interfaces. The project pages are: https://medshapenet.ikim.nrw/ and https://github.com/Jianningli/medshapenet-feedback
Submitted 12 December, 2023; v1 submitted 30 August, 2023;
originally announced August 2023.
-
Efficient set-theoretic algorithms for computing high-order Forman-Ricci curvature on abstract simplicial complexes
Authors:
Danillo Barros de Souza,
Jonatas T. S. da Cunha,
Fernando A. N. Santos,
Jürgen Jost,
Serafim Rodrigues
Abstract:
Forman-Ricci curvature (FRC) is a powerful tool for analysing empirical networks, as the distribution of the curvature values can identify structural information that is not readily detected by other geometrical methods. Crucially, FRC captures higher-order structural information of clique complexes of a graph or Vietoris-Rips complexes, which is not readily accessible to alternative methods. However, existing FRC platforms are prohibitively computationally expensive. Therefore, herein we develop an efficient set-theoretic formulation for computing such high-order FRC in simplicial complexes. Significantly, our set theory representation reveals previous computational bottlenecks and also accelerates the computation of FRC. Finally, we provide pseudo-code, a software implementation named FastForman, and a benchmark comparison with alternative implementations. We envisage that FastForman will be used in topological and geometrical data analysis for high-dimensional complex data sets. Moreover, our development paves the way for future generalisations towards efficient computations of FRC on cell complexes.
Submitted 9 May, 2024; v1 submitted 22 August, 2023;
originally announced August 2023.
-
Simple Analysis of Priority Sampling
Authors:
Majid Daliri,
Juliana Freire,
Christopher Musco,
Aécio Santos,
Haoxiang Zhang
Abstract:
We prove a tight upper bound on the variance of the priority sampling method (aka sequential Poisson sampling). Our proof is significantly shorter and simpler than the original proof given by Mario Szegedy at STOC 2006, which resolved a conjecture by Duffield, Lund, and Thorup.
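As a reminder of the method being analysed, the following is a minimal sketch of priority sampling in its standard textbook form; it does not reproduce anything from the paper's proof. The function name `priority_sample` and its parameters are hypothetical, and weights are assumed to be positive.

```python
import random


def priority_sample(weights, k, rng=random):
    """Return {index: estimated_weight} for a priority sample of size k.

    Illustrative sketch only: each item i gets priority w_i / u_i with
    u_i uniform in (0, 1]; the k highest-priority items are kept and each
    is assigned the estimate max(w_i, tau), where tau is the (k+1)-th
    highest priority. Items outside the sample implicitly estimate to 0.
    """
    if k >= len(weights):
        # Everything fits in the sample, so the estimates are exact.
        return dict(enumerate(weights))
    # 1 - rng.random() lies in (0, 1], which avoids division by zero.
    priorities = [(w / (1.0 - rng.random()), i) for i, w in enumerate(weights)]
    priorities.sort(reverse=True)
    tau = priorities[k][0]  # threshold: the (k+1)-th highest priority
    return {i: max(weights[i], tau) for _, i in priorities[:k]}
```

Summing the returned estimates over any subset of indices gives an unbiased estimate of that subset's total weight, which is the quantity whose variance the abstract refers to.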
Submitted 10 August, 2023;
originally announced August 2023.
-
Manual Tests Do Smell! Cataloging and Identifying Natural Language Test Smells
Authors:
Elvys Soares,
Manoel Aranda,
Naelson Oliveira,
Márcio Ribeiro,
Rohit Gheyi,
Emerson Souza,
Ivan Machado,
André Santos,
Baldoino Fonseca,
Rodrigo Bonifácio
Abstract:
Background: Test smells indicate potential problems in the design and implementation of automated software tests that may negatively impact test code maintainability, coverage, and reliability. When poorly described, manual tests written in natural language may suffer from related problems, which enables their analysis from the point of view of test smells. Despite the possible harm to manually tested software products, little is known about test smells in manual tests, which results in many open questions regarding their types, frequency, and harm to tests written in natural language. Aims: Therefore, this study aims to contribute to a catalog of test smells for manual tests. Method: We perform a two-fold empirical strategy. First, we conduct an exploratory study of manual tests from three systems: the Ubuntu Operating System, the Brazilian Electronic Voting Machine, and the User Interface of a large smartphone manufacturer. We use our findings to propose a catalog of eight test smells and identification rules based on syntactical and morphological text analysis, validating our catalog with 24 in-company test engineers. Second, using our proposals, we create a tool based on Natural Language Processing (NLP) to analyze the subject systems' tests, validating the results. Results: We observed the occurrence of eight test smells. A survey of 24 in-company test professionals showed that 80.7% agreed with our catalog definitions and examples. Our NLP-based tool achieved a precision of 92%, recall of 95%, and f-measure of 93.5%, and its execution evidenced 13,169 occurrences of our cataloged test smells in the analyzed systems. Conclusion: We contribute a catalog of natural language test smells and novel detection strategies that better explore the capabilities of current NLP mechanisms with promising results and reduced effort to analyze tests written in different languages.
Submitted 2 August, 2023;
originally announced August 2023.
-
Building Persuasive Robots with Social Power Strategies
Authors:
Mojgan Hashemian,
Marta Couto,
Samuel Mascarenhas,
Ana Paiva,
Pedro A. Santos,
Rui Prada
Abstract:
Can social power endow social robots with the capacity to persuade? This paper represents our recent endeavor to design persuasive social robots. We have designed and run three different user studies to investigate the effectiveness of different bases of social power (inspired by French and Raven's theory) on people's compliance with the requests of social robots. The results show that robotic persuaders that exert social power (specifically from expert, reward, and coercion bases) demonstrate increased ability to influence humans. The first study provides a positive answer and shows that under the same circumstances, people with different personalities prefer robots using a specific social power base. In addition, social rewards can be useful in persuading individuals. The second study suggests that by employing social power, social robots are capable of persuading people objectively to select a less desirable choice among others. Finally, the third study shows that the effect of power on persuasion does not decay over time and might strengthen under specific circumstances. Moreover, exerting stronger social power does not necessarily lead to higher persuasion. Overall, we argue that the results of these studies are relevant for designing human-robot interaction scenarios, especially those aimed at behavioral change.
Submitted 1 September, 2023; v1 submitted 12 July, 2023;
originally announced July 2023.
-
Hinting Pipeline and Multivariate Regression CNN for Maize Kernel Counting on the Ear
Authors:
Felipe Araújo,
Igor Gadelha,
Rodrigo Tsukahara,
Luiz Pita,
Filipe Costa,
Igor Vaz,
Andreza Santos,
Guilherme Fôlego
Abstract:
Maize is a highly nutritional cereal widely used for human and animal consumption and also as raw material by the biofuels industries. This highlights the importance of precisely quantifying the corn grain productivity in season, helping the commercialization process, operationalization, and critical decision-making. Considering the manual labor cost of counting maize kernels, we propose in this work a novel preprocessing pipeline named hinting that guides the attention of the model to the center of the corn kernels and enables a deep learning model to deliver better performance, given a picture of one side of the corn ear. Also, we propose a multivariate CNN regressor that outperforms single regression results. Experiments indicated that the proposed approach outperforms the current manual estimates, obtaining an MAE of 34.4 and an $R^2$ of 0.74 against 35.38 and 0.72 for the manual estimate, respectively.
Submitted 10 June, 2023;
originally announced June 2023.
-
Computing Education in the Era of Generative AI
Authors:
Paul Denny,
James Prather,
Brett A. Becker,
James Finnie-Ansley,
Arto Hellas,
Juho Leinonen,
Andrew Luxton-Reilly,
Brent N. Reeves,
Eddie Antonio Santos,
Sami Sarsa
Abstract:
The computing education community has a rich history of pedagogical innovation designed to support students in introductory courses, and to support teachers in facilitating student learning. Very recent advances in artificial intelligence have resulted in code generation models that can produce source code from natural language problem descriptions -- with impressive accuracy in many cases. The wide availability of these models and their ease of use has raised concerns about potential impacts on many aspects of society, including the future of computing education. In this paper, we discuss the challenges and opportunities such models present to computing educators, with a focus on introductory programming classrooms. We summarize the results of two recent articles, the first evaluating the performance of code generation models on typical introductory-level programming problems, and the second exploring the quality and novelty of learning resources generated by these models. We consider likely impacts of such models upon pedagogical practice in the context of the most recent advances at the time of writing.
Submitted 5 June, 2023;
originally announced June 2023.
-
Design and implementation of intelligent packet filtering in IoT microcontroller-based devices
Authors:
Gustavo de Carvalho Bertoli,
Gabriel Victor C. Fernandes,
Pedro H. Borges Monici,
César H. de Araujo Guibo,
Lourenço Alves Pereira Jr.,
Aldri Santos
Abstract:
Internet of Things (IoT) devices are increasingly pervasive and essential components in enabling new applications and services. However, their widespread use also exposes them to exploitable vulnerabilities and flaws that can lead to significant losses. In this context, ensuring robust cybersecurity measures is essential to protect IoT devices from malicious attacks. However, the current solutions that provide flexible policy specifications and higher security levels for IoT devices are scarce. To address this gap, we introduce T800, a low-resource packet filter that utilizes machine learning (ML) algorithms to classify packets in IoT devices. We present a detailed performance benchmarking framework and demonstrate T800's effectiveness on the ESP32 system-on-chip microcontroller and ESP-IDF framework. Our evaluation shows that T800 is an efficient solution that increases device computational capacity by excluding unsolicited malicious traffic from the processing pipeline. Additionally, T800 is adaptable to different systems and provides a well-documented performance evaluation strategy for security ML-based mechanisms on ESP32-based IoT systems. Our research contributes to improving the cybersecurity of resource-constrained IoT devices and provides a scalable, efficient solution that can be used to enhance the security of IoT systems.
Submitted 30 May, 2023;
originally announced May 2023.
-
Fashion Object Detection for Tops & Bottoms
Authors:
Andreas Petridis,
Mirela Popa,
Filipa Peleja,
Dario Dotti,
Alberto de Santos
Abstract:
Fashion is one of the world's largest industries, and computer vision techniques have become more popular in recent years, in particular for tasks such as object detection and apparel segmentation. Even with the rapid growth in computer vision solutions specifically for the fashion industry, many problems are far from being resolved. Therefore, adjusting out-of-the-box pre-trained computer vision models will not always provide the desired solution. This paper proposes a pipeline that takes a noisy image containing a person and specifically detects the regions with garments that are bottoms or tops. Our solution implements models that are capable of finding human parts in an image, e.g., full-body vs. half-body, or determining that no human is present. Then, other models, knowing that there is a human and its composition (e.g., a full body is not always visible), find the bounding boxes/regions of the image that very likely correspond to a bottom or a top. A benchmark dataset was specifically prepared for the bounding box/region creation task. The results show that the Mask RCNN solution is robust, general enough to be used, and scalable to unseen apparel/fashion data.
Submitted 29 May, 2023;
originally announced May 2023.
-
Human Body Shape Classification Based on a Single Image
Authors:
Cameron Trotter,
Filipa Peleja,
Dario Dotti,
Alberto de Santos
Abstract:
There is high demand for online fashion recommender systems that incorporate the needs of the consumer's body shape. As such, we present a methodology to classify human body shape from a single image. This is achieved through the use of instance segmentation and keypoint estimation models, trained only on open-source benchmarking datasets. The system is capable of performing in noisy environments owing to robust background subtraction. The proposed methodology does not require 3D body recreation, as classification is based on estimated keypoints, nor does it require historical information about a user to operate, calculating all required measurements at the point of use. We evaluate our methodology both qualitatively against existing body shape classifiers and quantitatively against a novel dataset of images, which we provide for use to the community. The resultant body shape classification can be utilised in a variety of downstream tasks, such as input to size and fit recommendation or virtual try-on systems.
Submitted 29 May, 2023;
originally announced May 2023.
-
"It's Weird That it Knows What I Want": Usability and Interactions with Copilot for Novice Programmers
Authors:
James Prather,
Brent N. Reeves,
Paul Denny,
Brett A. Becker,
Juho Leinonen,
Andrew Luxton-Reilly,
Garrett Powell,
James Finnie-Ansley,
Eddie Antonio Santos
Abstract:
Recent developments in deep learning have resulted in code-generation models that produce source code from natural language and code-based prompts with high accuracy. This is likely to have profound effects in the classroom, where novices learning to code can now use free tools to automatically suggest solutions to programming exercises and assignments. However, little is currently known about how novices interact with these tools in practice. We present the first study that observes students at the introductory level using one such code auto-generating tool, GitHub Copilot, on a typical introductory programming (CS1) assignment. Through observations and interviews we explore student perceptions of the benefits and pitfalls of this technology for learning, present new observed interaction patterns, and discuss cognitive and metacognitive difficulties faced by students. We consider design implications of these findings, specifically in terms of how tools like Copilot can better support and scaffold the novice programming experience.
Submitted 5 April, 2023;
originally announced April 2023.
-
Incremental Value and Interpretability of Radiomics Features of Both Lung and Epicardial Adipose Tissue for Detecting the Severity of COVID-19 Infection
Authors:
Ni Yao,
Yanhui Tian,
Daniel Gama das Neves,
Chen Zhao,
Claudio Tinoco Mesquita,
Wolney de Andrade Martins,
Alair Augusto Sarmet Moreira Damas dos Santos,
Yanting Li,
Chuang Han,
Fubao Zhu,
Neng Dai,
Weihua Zhou
Abstract:
Epicardial adipose tissue (EAT) is known for its pro-inflammatory properties and association with Coronavirus Disease 2019 (COVID-19) severity. However, current EAT segmentation methods do not consider positional information. Additionally, the detection of COVID-19 severity lacks consideration for EAT radiomics features, which limits interpretability. This study investigates the use of radiomics features from EAT and lungs to detect the severity of COVID-19 infections. A retrospective analysis of 515 patients with COVID-19 (Cohort1: 415, Cohort2: 100) was conducted using a proposed three-stage deep learning approach for EAT extraction. Lung segmentation was achieved using a published method. A hybrid model for detecting the severity of COVID-19 was built in a derivation cohort, and its performance and uncertainty were evaluated in internal (125, Cohort1) and external (100, Cohort2) validation cohorts. For EAT extraction, the Dice similarity coefficients (DSC) of the two centers were 0.972 (±0.011) and 0.968 (±0.005), respectively. For severity detection, the hybrid model with radiomics features of both lungs and EAT showed improvements in AUC, net reclassification improvement (NRI), and integrated discrimination improvement (IDI) compared to the model with only lung radiomics features. The hybrid model exhibited an increase of 0.1 (p<0.001), 19.3%, and 18.0%, respectively, in the internal validation cohort and an increase of 0.09 (p<0.001), 18.0%, and 18.0%, respectively, in the external validation cohort while outperforming existing detection methods. Uncertainty quantification and radiomics features analysis confirmed the interpretability of case prediction after inclusion of EAT features.
Submitted 6 December, 2023; v1 submitted 28 January, 2023;
originally announced January 2023.
-
RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs
Authors:
André Santos,
João Dinis Ferreira,
Onur Mutlu,
Gabriel Falcao
Abstract:
In recent years, Convolutional Neural Networks (CNNs) have become the standard class of deep neural network for image processing, classification and segmentation tasks. However, the large strides in accuracy obtained by CNNs have been derived from increasing the complexity of network topologies, which incurs sizeable performance and energy penalties in the training and inference of CNNs. Many recent works have validated the effectiveness of parameter quantization, which consists in reducing the bit width of the network's parameters, to enable the attainment of considerable performance and energy efficiency gains without significantly compromising accuracy. However, it is difficult to compare the relative effectiveness of different quantization methods. To address this problem, we introduce RedBit, an open-source framework that provides a transparent, extensible and easy-to-use interface to evaluate the effectiveness of different algorithms and parameter configurations on network accuracy. We use RedBit to perform a comprehensive survey of five state-of-the-art quantization methods applied to the MNIST, CIFAR-10 and ImageNet datasets. We evaluate a total of 2300 individual bit width combinations, independently tuning the width of the network's weight and input activation parameters, from 32 bits down to 1 bit (e.g., 8/8, 2/2, 1/32, 1/1, for weights/activations). Upwards of 20000 hours of computing time in a pool of state-of-the-art GPUs were used to generate all the results in this paper. For 1-bit quantization, the accuracy losses for the MNIST, CIFAR-10 and ImageNet datasets range between [0.26%, 0.79%], [9.74%, 32.96%] and [10.86%, 47.36%] top-1, respectively. We actively encourage the reader to download the source code and experiment with RedBit, and to submit their own observed results to our public repository, available at https://github.com/IT-Coimbra/RedBit.
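To make "reducing the bit width of the network's parameters" concrete, here is a minimal sketch of generic uniform quantization. It is not one of the five methods surveyed with RedBit and does not use RedBit's API; the function name `quantize_uniform` and the assumption that values are clipped to [-1, 1] are illustrative choices only.

```python
def quantize_uniform(values, bits):
    """Snap each value to the nearest of 2**bits evenly spaced levels in [-1, 1].

    Illustrative sketch only: values outside [-1, 1] are clipped first.
    With bits=1 every value becomes either -1.0 or 1.0.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    steps = 2 ** bits - 1  # number of intervals between -1 and 1
    quantized = []
    for v in values:
        v = max(-1.0, min(1.0, v))                # clip to [-1, 1]
        index = round((v + 1.0) / 2.0 * steps)    # nearest level index
        quantized.append(index / steps * 2.0 - 1.0)
    return quantized
```

The rounding error of each clipped value is at most half a step, i.e. `1 / (2**bits - 1)`, which gives a rough sense of why accuracy tends to degrade as the bit width shrinks.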
Submitted 15 January, 2023;
originally announced January 2023.
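The bit-width reduction described in the RedBit abstract above can be illustrated with a minimal sketch. This is not RedBit's code or any of the five surveyed methods: the function name `quantize_uniform`, the symmetric uniform grid, and the sign-times-mean-magnitude rule for the 1-bit case are all assumptions chosen for illustration.

```python
def quantize_uniform(values, bits):
    """Quantize a list of floats to `bits` bits on a symmetric uniform grid.

    Hypothetical helper for illustration only; real quantization methods
    differ in how they pick the grid and how they train through it.
    """
    if bits < 1:
        raise ValueError("bits must be >= 1")
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0.0 for _ in values]
    if bits == 1:
        # Assumed 1-bit rule: keep only the sign, scaled by the mean magnitude.
        mean_abs = sum(abs(v) for v in values) / len(values)
        return [mean_abs if v >= 0 else -mean_abs for v in values]
    # Signed grid with integer codes in [-levels, +levels].
    levels = 2 ** (bits - 1) - 1
    scale = max_abs / levels
    quantized = []
    for v in values:
        code = max(-levels, min(levels, round(v / scale)))
        quantized.append(code * scale)
    return quantized


weights = [0.5, -1.0, 0.25, 0.0]
print(quantize_uniform(weights, 8))  # fine grid, small rounding error
print(quantize_uniform(weights, 1))  # only two distinct values remain
```

A weights/activations setting such as 8/8 or 1/32 would, under these assumptions, correspond to calling the function with different `bits` values for the two kinds of tensors.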
-
Weighted Minwise Hashing Beats Linear Sketching for Inner Product Estimation
Authors:
Aline Bessa,
Majid Daliri,
Juliana Freire,
Cameron Musco,
Christopher Musco,
Aécio Santos,
Haoxiang Zhang
Abstract:
We present a new approach for computing compact sketches that can be used to approximate the inner product between pairs of high-dimensional vectors. Based on the Weighted MinHash algorithm, our approach admits strong accuracy guarantees that improve on the guarantees of popular linear sketching approaches for inner product estimation, such as CountSketch and Johnson-Lindenstrauss projection. Specifically, while our method admits guarantees that exactly match linear sketching for dense vectors, it yields significantly lower error for sparse vectors with limited overlap between non-zero entries. Such vectors arise in many applications involving sparse data. They are also important in increasingly popular dataset search applications, where inner product sketches are used to estimate data covariance, conditional means, and other quantities involving columns in unjoined tables. We complement our theoretical results by showing that our approach empirically outperforms existing linear sketches and unweighted hashing-based sketches for sparse vectors.
Submitted 5 May, 2023; v1 submitted 13 January, 2023;
originally announced January 2023.
-
Bayesian Additive Main Effects and Multiplicative Interaction Models using Tensor Regression for Multi-environmental Trials
Authors:
Antonia A. L. Dos Santos,
Danilo A. Sarti,
Rafael A. Moral,
Andrew C. Parnell
Abstract:
We propose a Bayesian tensor regression model to accommodate the effect of multiple factors on phenotype prediction. We adopt a set of prior distributions that resolve identifiability issues that may arise between the parameters in the model. Simulation experiments show that our method out-performs previous related models and machine learning algorithms under different sample sizes and degrees of complexity. We further explore the applicability of our model by analysing real-world data related to wheat production across Ireland from 2010 to 2019. Our model performs competitively and overcomes key limitations found in other analogous approaches. Finally, we adapt a set of visualisations for the posterior distribution of the tensor effects that facilitate the identification of optimal interactions between the tensor variables whilst accounting for the uncertainty in the posterior distribution.
Submitted 9 January, 2023;
originally announced January 2023.
-
GAN-Based Content Generation of Maps for Strategy Games
Authors:
Vasco Nunes,
João Dias,
Pedro A. Santos
Abstract:
Maps are a very important component of strategy games, and creating them by hand is a time-consuming task. Maps generated by traditional PCG techniques such as Perlin noise or tile-based PCG techniques look unnatural and unappealing, thus not providing the best user experience for the players. However, it is possible to have a generator that can create realistic and natural images of maps, provided that it is trained to do so. We propose a model for the generation of maps based on Generative Adversarial Networks (GAN). In our implementation we tested different variants of GAN-based networks on a dataset of heightmaps. We conducted extensive empirical evaluation to determine the advantages and properties of each approach. The results obtained are promising, showing that it is indeed possible to generate realistic-looking maps using this type of approach.
Submitted 7 January, 2023;
originally announced January 2023.
-
Programming Is Hard -- Or at Least It Used to Be: Educational Opportunities And Challenges of AI Code Generation
Authors:
Brett A. Becker,
Paul Denny,
James Finnie-Ansley,
Andrew Luxton-Reilly,
James Prather,
Eddie Antonio Santos
Abstract:
The introductory programming sequence has been the focus of much research in computing education. The recent advent of several viable and freely-available AI-driven code generation tools presents several immediate opportunities and challenges in this domain. In this position paper we argue that the community needs to act quickly in deciding what possible opportunities can and should be leveraged and how, while also working on how to overcome or otherwise mitigate the possible challenges. Assuming that the effectiveness and proliferation of these tools will continue to progress rapidly, without quick, deliberate, and concerted efforts, educators will lose advantage in helping shape what opportunities come to be, and what challenges will endure. With this paper we aim to seed this discussion within the computing education community.
Submitted 2 December, 2022;
originally announced December 2022.
-
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Authors:
BigScience Workshop,
Teven Le Scao,
Angela Fan,
Christopher Akiki,
Ellie Pavlick,
Suzana Ilić,
Daniel Hesslow,
Roman Castagné,
Alexandra Sasha Luccioni,
François Yvon,
Matthias Gallé,
Jonathan Tow,
Alexander M. Rush,
Stella Biderman,
Albert Webson,
Pawan Sasanka Ammanamanchi,
Thomas Wang,
Benoît Sagot,
Niklas Muennighoff,
Albert Villanova del Moral,
Olatunji Ruwase,
Rachel Bawden,
Stas Bekman,
Angelina McMillan-Major
, et al. (369 additional authors not shown)
Abstract:
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Submitted 27 June, 2023; v1 submitted 9 November, 2022;
originally announced November 2022.
-
Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning
Authors:
Pedro P. Santos,
Diogo S. Carvalho,
Miguel Vasco,
Alberto Sardinha,
Pedro A. Santos,
Ana Paiva,
Francisco S. Melo
Abstract:
We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully complete cooperative tasks with arbitrary communication levels at execution time by taking advantage of information-sharing among the agents. Under hybrid execution, the communication level can range from a setting in which no communication is allowed between agents (fully decentralized), to a setting featuring full communication (fully centralized), but the agents do not know beforehand which communication level they will encounter at execution time. To formalize our setting, we define a new class of multi-agent partially observable Markov decision processes (POMDPs) that we name hybrid-POMDPs, which explicitly model a communication process between the agents. We contribute MARO, an approach that makes use of an auto-regressive predictive model, trained in a centralized manner, to estimate missing agents' observations at execution time. We evaluate MARO on standard scenarios and extensions of previous benchmarks tailored to emphasize the negative impact of partial observability in MARL. Experimental results show that our method consistently outperforms relevant baselines, allowing agents to act with faulty communication while successfully exploiting shared information.
Submitted 5 June, 2023; v1 submitted 12 October, 2022;
originally announced October 2022.
-
Robotic Learning the Sequence of Packing Irregular Objects from Human Demonstrations
Authors:
André Santos,
Nuno Ferreira Duarte,
Atabak Dehban,
José Santos-Victor
Abstract:
We tackle the challenge of robotic bin packing with irregular objects, such as groceries. Given the diverse physical attributes of these objects and the complex constraints governing their placement and manipulation, employing preprogrammed strategies becomes unfeasible. Our approach is to learn directly from expert demonstrations in order to extract implicit task knowledge and strategies to ensure safe object positioning, efficient use of space, and the generation of human-like behaviors that enhance human-robot trust.
We rely on human demonstrations to learn a Markov chain for predicting the object packing sequence for a given set of items and then compare it with human performance. Our experimental results show that the model outperforms human performance by generating sequence predictions that humans classify as human-like more frequently than human-generated sequences.
The human demonstrations were collected using our proposed VR platform, BoxED, which is a box packaging environment for simulating real-world objects and scenarios for fast and streamlined data collection with the purpose of teaching robots. We collected data from 43 participants packing a total of 263 boxes with supermarket-like objects, yielding 4644 object manipulations. Our VR platform can be easily adapted to new scenarios and objects, and is publicly available, alongside our dataset, at https://github.com/andrejfsantos4/BoxED.
Submitted 8 November, 2023; v1 submitted 4 October, 2022;
originally announced October 2022.
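The idea of learning a Markov chain over packing order from demonstrations, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: a first-order chain, raw transition counts, and greedy decoding are choices made here, and the names `fit_transitions`, `predict_sequence`, and the item labels are hypothetical rather than taken from the paper or the BoxED repository.

```python
from collections import defaultdict

START = "<start>"  # hypothetical marker for "no item packed yet"


def fit_transitions(demonstrations):
    """Count how often each item is packed directly after another item."""
    counts = defaultdict(lambda: defaultdict(int))
    for sequence in demonstrations:
        previous = START
        for item in sequence:
            counts[previous][item] += 1
            previous = item
    return counts


def predict_sequence(counts, items):
    """Greedily order `items` by the most frequent observed next item.

    Ties (including unseen transitions, which count as 0) are broken
    by the order of `items`.
    """
    remaining = list(items)
    previous = START
    order = []
    while remaining:
        best = max(remaining, key=lambda item: counts[previous][item])
        order.append(best)
        remaining.remove(best)
        previous = best
    return order


demos = [
    ["milk", "cereal", "eggs"],
    ["milk", "eggs", "cereal"],
    ["milk", "cereal", "eggs"],
]
model = fit_transitions(demos)
print(predict_sequence(model, ["eggs", "cereal", "milk"]))
```

Greedy decoding keeps the sketch short; sampling from the normalized counts would be an equally plausible way to generate sequences from the same chain.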
-
Explanation-by-Example Based on Item Response Theory
Authors:
Lucas F. F. Cardoso,
José de S. Ribeiro,
Vitor C. A. Santos,
Raíssa L. Silva,
Marcelle P. Mota,
Ricardo B. C. Prudêncio,
Ronnie C. O. Alves
Abstract:
Intelligent systems that use Machine Learning classification algorithms are increasingly common in everyday society. However, many systems use black-box models that do not have characteristics that allow for self-explanation of their predictions. This situation leads researchers in the field and society to the following question: How can I trust the prediction of a model I cannot understand? In this sense, XAI emerges as a field of AI that aims to create techniques capable of explaining the decisions of the classifier to the end-user. As a result, several techniques have emerged, such as Explanation-by-Example, which has a few initiatives consolidated by the community currently working with XAI. This research explores the Item Response Theory (IRT) as a tool for explaining the models and measuring the level of reliability of the Explanation-by-Example approach. To this end, four datasets with different levels of complexity were used, and the Random Forest model was used as a hypothesis test. From the test set, 83.8% of the errors are from instances in which the IRT points out the model as unreliable.
Submitted 4 October, 2022;
originally announced October 2022.
-
Song Emotion Recognition: a Performance Comparison Between Audio Features and Artificial Neural Networks
Authors:
Karen Rosero,
Arthur Nicholas dos Santos,
Pedro Benevenuto Valadares,
Bruno Sanches Masiero
Abstract:
When songs are composed or performed, there is often an intent by the singer/songwriter to express feelings or emotions through them. For humans, matching the emotiveness in a musical composition or performance with the subjective perception of an audience can be quite challenging. Fortunately, the machine learning approach for this problem is simpler. Usually, it takes a data-set from which audio features are extracted and presented to a data-driven model, which is, in turn, trained to predict the probability that a given song matches a target emotion. In this paper, we studied the most common features and models used in recent publications to tackle this problem, revealing which ones are best suited for recognizing emotion in a cappella songs.
Submitted 24 September, 2022;
originally announced September 2022.
-
Generalizing intrusion detection for heterogeneous networks: A stacked-unsupervised federated learning approach
Authors:
Gustavo de Carvalho Bertoli,
Lourenço Alves Pereira Junior,
Aldri Luiz dos Santos,
Osamu Saotome
Abstract:
The constantly evolving digital transformation imposes new requirements on our society. Aspects relating to reliance on the networking domain and the difficulty of achieving security by design pose a challenge today. As a result, data-centric and machine-learning approaches arose as feasible solutions for securing large networks. However, in the network security domain, ML-based solutions face a challenge regarding the capability to generalize between different contexts. In other words, solutions based on specific network data usually do not perform satisfactorily on other networks. This paper describes the stacked-unsupervised federated learning (FL) approach to generalize on a cross-silo configuration for a flow-based network intrusion detection system (NIDS). The proposed approach we have examined comprises a deep autoencoder in conjunction with an energy flow classifier in an ensemble learning task. Our approach performs better than traditional local learning and naive cross-evaluation (training in one context and testing on another network data). Remarkably, the proposed approach demonstrates a sound performance in the case of non-iid data silos. In conjunction with an informative feature in an ensemble architecture for unsupervised learning, we advise that the proposed FL-based NIDS results in a feasible approach for generalization between heterogeneous networks. To the best of our knowledge, our proposal is the first successful approach to applying unsupervised FL on the problem of network intrusion detection generalization using flow-based data.
Submitted 28 November, 2022; v1 submitted 1 September, 2022;
originally announced September 2022.
-
Recovering the Graph Underlying Networked Dynamical Systems under Partial Observability: A Deep Learning Approach
Authors:
Sérgio Machado,
Anirudh Sridhar,
Paulo Gil,
Jorge Henriques,
José M. F. Moura,
Augusto Santos
Abstract:
We study the problem of graph structure identification, i.e., of recovering the graph of dependencies among time series. We model these time series data as components of the state of linear stochastic networked dynamical systems. We assume partial observability, where the state evolution of only a subset of nodes comprising the network is observed. We devise a new feature vector computed from the observed time series and prove that these features are linearly separable, i.e., there exists a hyperplane that separates the cluster of features associated with connected pairs of nodes from those associated with disconnected pairs. This makes the features suitable for training a variety of classifiers to perform causal inference. In particular, we use these features to train Convolutional Neural Networks (CNNs). The resulting causal inference mechanism outperforms state-of-the-art counterparts w.r.t. sample-complexity. The trained CNNs generalize well over structurally distinct networks (dense or sparse) and noise-level profiles. Remarkably, they also generalize well to real-world networks while trained over a synthetic network (realization of a random graph). Finally, the proposed method consistently reconstructs the graph in a pairwise manner, that is, by deciding if an edge or arrow is present or absent in each pair of nodes, from the corresponding time series of each pair. This fits the framework of large-scale systems, where observation or processing of all nodes in the network is prohibitive.
Submitted 12 April, 2023; v1 submitted 8 August, 2022;
originally announced August 2022.
-
Emergent social NPC interactions in the Social NPCs Skyrim mod and beyond
Authors:
Manuel Guimarães,
Pedro A. Santos,
Arnav Jhala
Abstract:
This work presents an implementation of a social architecture model for authoring Non-Player Characters (NPCs) in open-world games, inspired by academic research on agent-based modeling. Believable NPC authoring is burdensome in terms of rich dialogue and responsive behaviors.
We briefly present the characteristics and advantages of using a social agent architecture for this task and describe an implementation of a social agent architecture, CiF-CK, released as the mod Social NPCs for The Elder Scrolls V: Skyrim.
Submitted 20 January, 2023; v1 submitted 27 July, 2022;
originally announced July 2022.
-
LudVision -- Remote Detection of Exotic Invasive Aquatic Floral Species using Drone-Mounted Multispectral Data
Authors:
António J. Abreu,
Luís A. Alexandre,
João A. Santos,
Filippo Basso
Abstract:
Remote sensing is the process of detecting and monitoring the physical characteristics of an area by measuring its reflected and emitted radiation at a distance. It is being broadly used to monitor ecosystems, mainly for their preservation. Ever-growing reports of invasive species have affected the natural balance of ecosystems. Exotic invasive species have a critical impact when introduced into new ecosystems and may lead to the extinction of native species. In this study, we focus on Ludwigia peploides, considered by the European Union as an aquatic invasive species. Its presence can negatively impact the surrounding ecosystem and human activities such as agriculture, fishing, and navigation. Our goal was to develop a method to identify the presence of the species. We used images collected by a drone-mounted multispectral sensor to achieve this, creating our LudVision data set. To identify the targeted species on the collected images, we propose a new method for detecting Ludwigia p. in multispectral images. The method is based on existing state-of-the-art semantic segmentation methods modified to handle multispectral data. The proposed method achieved a producer's accuracy of 79.9% and a user's accuracy of 95.5%.
Submitted 13 July, 2022; v1 submitted 12 July, 2022;
originally announced July 2022.
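The producer's and user's accuracy figures reported in the abstract above follow the usual remote-sensing definitions: producer's accuracy is the fraction of reference pixels of a class that were detected, and user's accuracy is the fraction of pixels predicted as that class that are correct. A minimal sketch, assuming flat per-pixel label lists and a hypothetical function name `producer_user_accuracy`, with made-up labels rather than LudVision data:

```python
def producer_user_accuracy(reference, predicted, target=1):
    """Return (producer's accuracy, user's accuracy) for the `target` class.

    `reference` and `predicted` are equal-length sequences of per-pixel
    labels; the values used below are illustrative, not real results.
    """
    if len(reference) != len(predicted):
        raise ValueError("label sequences must have the same length")
    true_positives = sum(
        1 for r, p in zip(reference, predicted) if r == target and p == target
    )
    reference_total = sum(1 for r in reference if r == target)
    predicted_total = sum(1 for p in predicted if p == target)
    # Producer's accuracy: correct detections over all reference pixels of the class.
    producers = true_positives / reference_total if reference_total else float("nan")
    # User's accuracy: correct detections over all pixels predicted as the class.
    users = true_positives / predicted_total if predicted_total else float("nan")
    return producers, users


reference = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 0]
print(producer_user_accuracy(reference, predicted))  # (0.5, 1.0)
```

In this toy case the detector misses half of the target pixels but never raises a false alarm, which is the same qualitative pattern as a user's accuracy that is higher than the producer's accuracy.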
-
An adaptive music generation architecture for games based on the deep learning Transformer model
Authors:
Gustavo Amaral Costa dos Santos,
Augusto Baffa,
Jean-Pierre Briot,
Bruno Feijó,
Antonio Luz Furtado
Abstract:
This paper presents an architecture for generating music for video games based on the Transformer deep learning model. Our motivation is to be able to customize the generation according to the taste of the player, who can select a corpus of training examples, corresponding to his preferred musical style. The system generates various musical layers, following the standard layering strategy currently used by composers designing video game music. To adapt the music generated to the game play and to the player(s) situation, we are using an arousal-valence model of emotions, in order to control the selection of musical layers. We discuss current limitations and prospects for the future, such as collaborative and interactive control of the musical components.
Submitted 10 September, 2022; v1 submitted 4 July, 2022;
originally announced July 2022.
-
Variational Inference for Additive Main and Multiplicative Interaction Effects Models
Authors:
Antônia A. L. Dos Santos,
Rafael A. Moral,
Danilo A. Sarti,
Andrew C. Parnell
Abstract:
In plant breeding the presence of a genotype by environment (GxE) interaction has a strong impact on cultivation decision making and the introduction of new crop cultivars. The combination of linear and bilinear terms has been shown to be very useful in modelling this type of data. A widely-used approach to identify GxE is the Additive Main Effects and Multiplicative Interaction Effects (AMMI) model. However, as data frequently can be high-dimensional, Markov chain Monte Carlo (MCMC) approaches can be computationally infeasible. In this article, we consider a variational inference approach for such a model. We derive variational approximations for estimating the parameters and we compare the approximations to MCMC using both simulated and real data. The new inferential framework we propose is on average two times faster whilst maintaining the same predictive performance as MCMC.
Submitted 29 June, 2022;
originally announced July 2022.