Search | arXiv e-print repository

Analyzing constrained LLM through PDFA-learning

Authors: Matías Carrasco, Franz Mayr, Sergio Yovine, Johny Kidd, Martín Iturbide, Juan Pedro da Silva, Alejo Garat

Abstract: We define a congruence that copes with null next-symbol probabilities that arise when the output of a language model is constrained by some means during text generation. We develop an algorithm for efficiently learning the quotient with respect to this congruence and evaluate it on case studies for analyzing statistical properties of LLM.

Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

Comments: Workshop Paper

arXiv:2402.18511 [pdf]

Leveraging Compliant Tactile Perception for Haptic Blind Surface Reconstruction

Authors: Laurent Yves Emile Ramos Cheret, Vinicius Prado da Fonseca, Thiago Eustaquio Alves de Oliveira

Abstract: Non-flat surfaces pose difficulties for robots operating in unstructured environments. Reconstructions of uneven surfaces may only be partially possible due to non-compliant end-effectors and limitations on vision systems such as transparency, reflections, and occlusions. This study achieves blind surface reconstruction by harnessing the robotic manipulator's kinematic data and a compliant tactile sensing module, which incorporates inertial, magnetic, and pressure sensors. The module's flexibility enables us to estimate contact positions and surface normals by analyzing its deformation during interactions with unknown objects. While previous works collect only positional information, we include the local normals in a geometrical approach to estimate curvatures between adjacent contact points. These parameters then guide a spline-based patch generation, which allows us to recreate larger surfaces without an increase in complexity while reducing the time-consuming step of probing the surface. Experimental validation demonstrates that this approach outperforms an off-the-shelf vision system in estimation accuracy. Moreover, this compliant haptic method works effectively even when the manipulator's approach angle is not aligned with the surface normals, which is ideal for unknown non-flat surfaces.

Submitted 28 February, 2024; originally announced February 2024.

Comments: 7 pages, 9 figures, 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)

arXiv:2401.06790 [pdf, other]

Using Zero-shot Prompting in the Automatic Creation and Expansion of Topic Taxonomies for Tagging Retail Banking Transactions

Authors: Daniel de S. Moraes, Pedro T. C. Santos, Polyana B. da Costa, Matheus A. S. Pinto, Ivan de J. P. Pinto, Álvaro M. G. da Veiga, Sergio Colcher, Antonio J. G. Busson, Rafael H. Rocha, Rennan Gaio, Rafael Miceli, Gabriela Tourinho, Marcos Rabaioli, Leandro Santos, Fellipe Marques, David Favaro

Abstract: This work presents an unsupervised method for automatically constructing and expanding topic taxonomies using instruction-based fine-tuned LLMs (Large Language Models). We apply topic modeling and keyword extraction techniques to create initial topic taxonomies and LLMs to post-process the resulting terms and create a hierarchy. To expand an existing taxonomy with new terms, we use zero-shot prompting to find out where to add new nodes, which, to our knowledge, is the first work to present such an approach to taxonomy tasks. We use the resulting taxonomies to assign tags that characterize merchants from a retail bank dataset. To evaluate our work, we asked 12 volunteers to answer a two-part form in which we first assessed the quality of the taxonomies created and then the tags assigned to merchants based on that taxonomy. The evaluation revealed a coherence rate exceeding 90% for the chosen taxonomies. The taxonomies' expansion with LLMs also showed exciting results for parent node prediction, with an f1-score above 70% in our taxonomies.

Submitted 11 February, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

arXiv:2401.06601 [pdf, other]

A proposal to increase data utility on Global Differential Privacy data based on data use predictions

Authors: Henry C. Nunes, Marlon P. da Silva, Charles V. Neu, Avelino F. Zorzo

Abstract: This paper presents ongoing research focused on improving the utility of data protected by Global Differential Privacy (DP) in the scenario of summary statistics. Our approach is based on predictions on how an analyst will use statistics released under DP protection, so that a developer can optimise data utility on further usage of the data in the privacy budget allocation. This novel approach can potentially improve the utility of data without compromising privacy constraints. We also propose a metric that can be used by the developer to optimise the budget allocation process.

Submitted 12 January, 2024; originally announced January 2024.

arXiv:2401.01200 [pdf, other]

Skin cancer diagnosis using NIR spectroscopy data of skin lesions in vivo using machine learning algorithms

Authors: Flavio P. Loss, Pedro H. da Cunha, Matheus B. Rocha, Madson Poltronieri Zanoni, Leandro M. de Lima, Isadora Tavares Nascimento, Isabella Rezende, Tania R. P. Canuto, Luciana de Paula Vieira, Renan Rossoni, Maria C. S. Santos, Patricia Lyra Frasson, Wanderson Romão, Paulo R. Filgueiras, Renato A. Krohling

Abstract: Skin lesions are classified as benign or malignant. Among the malignant, melanoma is a very aggressive cancer and the major cause of deaths. So, early diagnosis of skin cancer is highly desirable. In the last few years, there has been a growing interest in computer aided diagnostic (CAD) using mostly image and clinical data of the lesion. These sources of information present limitations due to their inability to provide information of the molecular structure of the lesion. NIR spectroscopy may provide an alternative source of information to automated CAD of skin lesions. The most commonly used techniques and classification algorithms used in spectroscopy are Principal Component Analysis (PCA), Partial Least Squares - Discriminant Analysis (PLS-DA), and Support Vector Machines (SVM). Nonetheless, there is a growing interest in applying the modern techniques of machine and deep learning (MDL) to spectroscopy. One of the main limitations to apply MDL to spectroscopy is the lack of public datasets. Since there is no public dataset of NIR spectral data to skin lesions, as far as we know, an effort has been made and a new dataset named NIR-SC-UFES has been collected, annotated and analyzed, generating the gold-standard for classification of NIR spectral data to skin cancer. Next, the machine learning algorithms XGBoost, CatBoost, LightGBM, and 1D-convolutional neural network (1D-CNN) were investigated to classify cancer and non-cancer skin lesions. Experimental results indicate the best performance obtained by LightGBM with pre-processing using standard normal variate (SNV), feature extraction providing values of 0.839 for balanced accuracy, 0.851 for recall, 0.852 for precision, and 0.850 for F-score. The obtained results indicate the first steps in CAD of skin lesions aiming the automated triage of patients with skin lesions in vivo using NIR spectral data.

Submitted 2 January, 2024; originally announced January 2024.

arXiv:2312.11805 [pdf, other]

Gemini: A Family of Highly Capable Multimodal Models

Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.

Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2310.14553 [pdf, other]

Denoising Opponents Position in Partial Observation Environment

Authors: Aref Sayareh, Aria Sardari, Vahid Khoddami, Nader Zare, Vinicius Prado da Fonseca, Amilcar Soares

Abstract: The RoboCup competitions hold various leagues, and the Soccer Simulation 2D League is a major one among them. A Soccer Simulation 2D (SS2D) match involves two teams, including 11 players and a coach for each team, competing against each other. The players can only communicate with the Soccer Simulation Server during the game. Several code bases are released publicly to simplify team development, so researchers can easily focus on decision-making and implementing machine learning methods. SS2D actions and behaviors are only partially accurate due to different challenges, such as noise and partial observation. Therefore, one strategy is to implement alternative denoising methods to tackle observation inaccuracy. Our idea is to predict opponent positions while they have yet to be seen in a finite number of cycles using machine learning methods to make more accurate actions such as pass. We will explain our position prediction idea powered by Long Short-Term Memory models (LSTM) and Deep Neural Networks (DNN). The results show that the LSTM and DNN predict the opponents' position more accurately than the standard algorithm, such as the last-seen method.

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2310.09709 [pdf, other]

New Advances in Body Composition Assessment with ShapedNet: A Single Image Deep Regression Approach

Authors: Navar Medeiros M. Nascimento, Pedro Cavalcante de Sousa Junior, Pedro Yuri Rodrigues Nunes, Suane Pires Pinheiro da Silva, Luiz Lannes Loureiro, Victor Zaban Bittencourt, Valden Luis Matos Capistrano Junior, Pedro Pedrosa Rebouças Filho

Abstract: We introduce a novel technique called ShapedNet to enhance body composition assessment. This method employs a deep neural network capable of estimating Body Fat Percentage (BFP), performing individual identification, and enabling localization using a single photograph. The accuracy of ShapedNet is validated through comprehensive comparisons against the gold standard method, Dual-Energy X-ray Absorptiometry (DXA), utilizing 1273 healthy adults spanning various ages, sexes, and BFP levels. The results demonstrate that ShapedNet outperforms in 19.5% state of the art computer vision-based approaches for body fat estimation, achieving a Mean Absolute Percentage Error (MAPE) of 4.91% and Mean Absolute Error (MAE) of 1.42. The study evaluates both gender-based and gender-neutral approaches, with the latter showcasing superior performance. The method estimates BFP with 95% confidence within an error margin of 4.01% to 5.81%. This research advances multi-task learning and body composition assessment theory through ShapedNet.

Submitted 14 October, 2023; originally announced October 2023.

Comments: Preprinted version in October 2023. The paper is under consideration at Pattern Recognition Letters

arXiv:2307.15208 [pdf, other]

Generative AI for Medical Imaging: extending the MONAI Framework

Authors: Walter H. L. Pinaya, Mark S. Graham, Eric Kerfoot, Petru-Daniel Tudosiu, Jessica Dafflon, Virginia Fernandez, Pedro Sanchez, Julia Wolleb, Pedro F. da Costa, Ashay Patel, Hyungjin Chung, Can Zhao, Wei Peng, Zelong Liu, Xueyan Mei, Oeslle Lucena, Jong Chul Ye, Sotirios A. Tsaftaris, Prerna Dogra, Andrew Feng, Marc Modat, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Abstract: Recent advances in generative AI have brought incredible breakthroughs in several areas, including medical imaging. These generative models have tremendous potential not only to help safely share medical data via synthetic datasets but also to perform an array of diverse applications, such as anomaly detection, image-to-image translation, denoising, and MRI reconstruction. However, due to the complexity of these models, their implementation and reproducibility can be difficult. This complexity can hinder progress, act as a use barrier, and dissuade the comparison of new methods with existing works. In this study, we present MONAI Generative Models, a freely available open-source platform that allows researchers and developers to easily train, evaluate, and deploy generative models and related applications. Our platform reproduces state-of-art studies in a standardised way involving different architectures (such as diffusion models, autoregressive transformers, and GANs), and provides pre-trained models for the community. We have implemented these models in a generalisable fashion, illustrating that their results can be extended to 2D or 3D scenarios, including medical images with different modalities (like CT, MRI, and X-Ray data) and from different anatomical areas. Finally, we adopt a modular and extensible approach, ensuring long-term maintainability and the extension of current applications for future features.

Submitted 27 July, 2023; originally announced July 2023.

arXiv:2307.10018 [pdf, other]

RobôCIn Small Size League Extended Team Description Paper for RoboCup 2023

Authors: Aline Lima de Oliveira, Cauê Addae da Silva Gomes, Cecília Virginia Santos da Silva, Charles Matheus de Sousa Alves, Danilo Andrade Martins de Souza, Driele Pires Ferreira Araújo Xavier, Edgleyson Pereira da Silva, Felipe Bezerra Martins, Lucas Henrique Cavalcanti Santos, Lucas Dias Maciel, Matheus Paixão Gumercindo dos Santos, Matheus Lafayette Vasconcelos, Matheus Vinícius Teotonio do Nascimento Andrade, João Guilherme Oliveira Carvalho de Melo, João Pedro Souza Pereira de Moura, José Ronald da Silva, José Victor Silva Cruz, Pedro Henrique Santana de Morais, Pedro Paulo Salman de Oliveira, Riei Joaquim Matos Rodrigues, Roberto Costa Fernandes, Ryan Vinicius Santos Morais, Tamara Mayara Ramos Teobaldo, Washington Igor dos Santos Silva, Edna Natividade Silva Barros

Abstract: RobôCIn has participated in RoboCup Small Size League since 2019, won its first world title in 2022 (Division B), and is currently a three-time Latin-American champion. This paper presents our improvements to defend the Small Size League (SSL) Division B title in RoboCup 2023 in Bordeaux, France. This paper aims to share some of the academic research that our team developed over the past year. Our team has successfully published 2 articles related to SSL at two high-impact conferences: the 25th RoboCup International Symposium and the 19th IEEE Latin American Robotics Symposium (LARS 2022). Over the last year, we have been continuously migrating from our past codebase to Unification. We will describe the new architecture implemented and some points of software and AI refactoring. In addition, we discuss the process of integrating machined components into the mechanical system, our development for participating in the vision blackout challenge last year and what we are preparing for this year.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2306.00766 [pdf, other]

Impact of using a privacy model on smart buildings data for CO2 prediction

Authors: Marlon P. da Silva, Henry C. Nunes, Charles V. Neu, Luana T. Thomas, Avelino F. Zorzo, Charles Morisset

Abstract: There is a constant trade-off between the utility of the data collected and processed by the many systems forming the Internet of Things (IoT) revolution and the privacy concerns of the users living in the spaces hosting these sensors. Privacy models, such as the SITA (Spatial, Identity, Temporal, and Activity) model, can help address this trade-off. In this paper, we focus on the problem of $CO_2$ prediction, which is crucial for health monitoring but can be used to monitor occupancy, which might reveal some private information. We apply a number of transformations on a real dataset from a Smart Building to simulate different SITA configurations on the collected data. We use the transformed data with multiple Machine Learning (ML) techniques to analyse the performance of the models to predict $CO_{2}$ levels. Our results show that, for different algorithms, different SITA configurations do not make one algorithm perform better or worse than others, compared to the baseline data; also, in our experiments, the temporal dimension was particularly sensitive, with scores decreasing up to $18.9\%$ between the original and the transformed data. The results can be useful to show the effect of different levels of data privacy on the data utility of IoT applications, and can also help to identify which parameters are more relevant for those systems so that higher privacy settings can be adopted while data utility is still preserved.

Submitted 1 June, 2023; originally announced June 2023.

arXiv:2304.09064 [pdf, other]

LLM-based Interaction for Content Generation: A Case Study on the Perception of Employees in an IT department

Authors: Alexandre Agossah, Frédérique Krupa, Matthieu Perreira Da Silva, Patrick Le Callet

Abstract: In the past years, AI has seen many advances in the field of NLP. This has led to the emergence of LLMs, such as the now famous GPT-3.5, which revolutionise the way humans can access or generate content. Current studies on LLM-based generative tools are mainly interested in the performance of such tools in generating relevant content (code, text or image). However, ethical concerns related to the design and use of generative tools seem to be growing, impacting the public acceptability for specific tasks. This paper presents a questionnaire survey to identify the intention to use generative tools by employees of an IT company in the context of their work. This survey is based on empirical models measuring intention to use (TAM by Davis, 1989, and UTAUT2 by Venkatesh et al., 2008). Our results indicate a rather average acceptability of generative tools, although the more useful the tool is perceived to be, the higher the intention to use seems to be. Furthermore, our analyses suggest that the frequency of use of generative tools is likely to be a key factor in understanding how employees perceive these tools in the context of their work. Following on from this work, we plan to investigate the nature of the requests that may be made to these tools by specific audiences.

Submitted 18 April, 2023; originally announced April 2023.

Comments: 14 pages (bibliography included), 6 figures, preprint submitted to Work-In-Progress session of ACM IMX'23 Interactive Media Experience

ACM Class: I.2.7; J.7

arXiv:2212.10913 [pdf]

Ensemble learning techniques for intrusion detection system in the context of cybersecurity

Authors: Andricson Abeline Moreira, Carlos A. C. Tojeiro, Carlos J. Reis, Gustavo Henrique Massaro, Igor Andrade Brito, Kelton A. P. da Costa

Abstract: Recently, there has been an interest in improving the resources available in Intrusion Detection System (IDS) techniques. In this sense, several studies related to cybersecurity show that the environment invasions and information kidnapping are increasingly recurrent and complex. The criticality of the business involving operations in an environment using computing resources does not allow the vulnerability of the information. Cybersecurity has taken on a dimension within the universe of indispensable technology in corporations, and the prevention of risks of invasions into the environment is dealt with daily by Security teams. Thus, the main objective of the study was to investigate the Ensemble Learning technique using the Stacking method, supported by the Support Vector Machine (SVM) and k-Nearest Neighbour (kNN) algorithms aiming at an optimization of the results for DDoS attack detection. For this, the Intrusion Detection System concept was used with the application of the Data Mining and Machine Learning Orange tool to obtain better results.

Submitted 21 December, 2022; originally announced December 2022.

Comments: in Portuguese language. CIACA - Conferencia Ibero-Americana Computação Aplicada 2022 Proceedings

arXiv:2212.10707 [pdf, other]

doi 10.5220/0011664100003417

Extractive Text Summarization Using Generalized Additive Models with Interactions for Sentence Selection

Authors: Vinícius Camargo da Silva, João Paulo Papa, Kelton Augusto Pontara da Costa

Abstract: Automatic Text Summarization (ATS) is becoming relevant with the growth of textual data; however, with the popularization of public large-scale datasets, some recent machine learning approaches have focused on dense models and architectures that, despite producing notable results, usually result in models that are difficult to interpret. Given the challenge behind interpretable learning-based text summarization and the importance it may have for evolving the current state of the ATS field, this work studies the application of two modern Generalized Additive Models with interactions, namely Explainable Boosting Machine and GAMI-Net, to the extractive summarization problem based on linguistic features and binary classification.

Submitted 20 December, 2022; originally announced December 2022.

arXiv:2212.04984 [pdf, other]

Transformer-based normative modelling for anomaly detection of early schizophrenia

Authors: Pedro F Da Costa, Jessica Dafflon, Sergio Leonardo Mendes, João Ricardo Sato, M. Jorge Cardoso, Robert Leech, Emily JH Jones, Walter H. L. Pinaya

Abstract: Despite the impact of psychiatric disorders on clinical health, early-stage diagnosis remains a challenge. Machine learning studies have shown that classifiers tend to be overly narrow in the diagnosis prediction task. The overlap between conditions leads to high heterogeneity among participants that is not adequately captured by classification models. To address this issue, normative approaches have surged as an alternative method. By using a generative model to learn the distribution of healthy brain data patterns, we can identify the presence of pathologies as deviations or outliers from the distribution learned by the model. In particular, deep generative models showed great results as normative models to identify neurological lesions in the brain. However, unlike most neurological lesions, psychiatric disorders present subtle changes widespread in several brain regions, making these alterations challenging to identify. In this work, we evaluate the performance of transformer-based normative models to detect subtle brain changes expressed in adolescents and young adults. We trained our model on 3D MRI scans of neurotypical individuals (N=1,765). Then, we obtained the likelihood of neurotypical controls and psychiatric patients with early-stage schizophrenia from an independent dataset (N=93) from the Human Connectome Project. Using the predicted likelihood of the scans as a proxy for a normative score, we obtained an AUROC of 0.82 when assessing the difference between controls and individuals with early-stage schizophrenia. Our approach surpassed recent normative methods based on brain age and Gaussian Process, showing the promising use of deep generative models to help in individualised analyses.

Submitted 8 December, 2022; originally announced December 2022.

Comments: 10 pages, 2 figures, 2 tables, presented at NeurIPS22@PAI4MH

arXiv:2211.14372 [pdf, other]

Interpretability Analysis of Deep Models for COVID-19 Detection

Authors: Daniel Peixoto Pinto da Silva, Edresson Casanova, Lucas Rafael Stefanel Gris, Arnaldo Candido Junior, Marcelo Finger, Flaviane Svartman, Beatriz Raposo, Marcus Vinícius Moreira Martins, Sandra Maria Aluísio, Larissa Cristina Berti, João Paulo Teixeira

Abstract: During the outbreak of the COVID-19 pandemic, several research areas joined efforts to mitigate the damages caused by SARS-CoV-2. In this paper we present an interpretability analysis of a convolutional neural network based model for COVID-19 detection in audios. We investigate which features are important for the model decision process, investigating spectrograms, F0, F0 standard deviation, sex and age. Following, we analyse model decisions by generating heat maps for the trained models to capture their attention during the decision process. Focusing on an explainable Artificial Intelligence approach, we show that the studied models can take unbiased decisions even in the presence of spurious data in the training set, given the adequate preprocessing steps. Our best model has 94.44% of accuracy in detection, with results indicating that models favor spectrograms for the decision process, particularly high energy areas in the spectrogram related to prosodic domains, while F0 also leads to efficient COVID-19 detection.

Submitted 25 November, 2022; originally announced November 2022.

Comments: 14 pages, 4 figures

arXiv:2209.07162 [pdf, other]

Brain Imaging Generation with Latent Diffusion Models

Authors: Walter H. L. Pinaya, Petru-Daniel Tudosiu, Jessica Dafflon, Pedro F da Costa, Virginia Fernandez, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Abstract: Deep neural networks have brought remarkable breakthroughs in medical image analysis. However, due to their data-hungry nature, the modest dataset sizes in medical imaging projects might be hindering their full potential. Generating synthetic data provides a promising alternative, allowing to complement training datasets and conducting medical image research at a larger scale. Diffusion models recently have caught the attention of the computer vision community by producing photorealistic synthetic images. In this study, we explore using Latent Diffusion Models to generate synthetic images from high-resolution 3D brain images. We used T1w MRI images from the UK Biobank dataset (N=31,740) to train our models to learn about the probabilistic distribution of brain images, conditioned on covariables, such as age, sex, and brain structure volumes. We found that our models created realistic data, and we could use the conditioning variables to control the data generation effectively. Besides that, we created a synthetic dataset with 100,000 brain images and made it openly available to the scientific community.

Submitted 15 September, 2022; originally announced September 2022.

Comments: 10 pages, 3 figures, Accepted in the Deep Generative Models workshop @ MICCAI 2022

arXiv:2208.01712 [pdf, other]

No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling

Authors: Marília Costa Rosendo Silva, Felipe Alves Siqueira, João Pedro Mantovani Tarrega, João Vitor Pataca Beinotti, Augusto Sousa Nunes, Miguel de Mattos Gardini, Vinícius Adolfo Pereira da Silva, Nádia Félix Felipe da Silva, André Carlos Ponce de Leon Ferreira de Carvalho

Abstract: Extracting knowledge from unlabeled texts using machine learning algorithms can be complex. Document categorization and information retrieval are two applications that may benefit from unsupervised learning (e.g., text clustering and topic modeling), including exploratory data analysis. However, the unsupervised learning paradigm poses reproducibility issues. The initialization can lead to variability depending on the machine learning algorithm. Furthermore, the distortions can be misleading when regarding cluster geometry. Amongst the causes, the presence of outliers and anomalies can be a determining factor. Despite the relevance of initialization and outlier issues for text clustering and topic modeling, the authors did not find an in-depth analysis of them. This survey provides a systematic literature review (2011-2022) of these subareas and proposes a common terminology since similar procedures have different terms. The authors describe research opportunities, trends, and open issues. The appendices summarize the theoretical background of the text vectorization, the factorization, and the clustering algorithms that are directly or indirectly related to the reviewed works.

Submitted 2 August, 2022; originally announced August 2022.

ACM Class: I.2; I.2.7; I.5.3

arXiv:2207.06366 [pdf, other]

N-Grammer: Augmenting Transformers with latent n-grams

Authors: Aurko Roy, Rohan Anil, Guangda Lai, Benjamin Lee, Jeffrey Zhao, Shuyuan Zhang, Shibo Wang, Ye Zhang, Shen Wu, Rigel Swavely, Tao, Yu, Phuong Dao, Christopher Fifty, Zhifeng Chen, Yonghui Wu

Abstract: Transformer models have recently emerged as one of the foundational models in natural language processing, and as a byproduct, there is significant recent interest and investment in scaling these models. However, the training and inference costs of these large Transformer language models are prohibitive, thus necessitating more research in identifying more efficient variants. In this work, we propose a simple yet effective modification to the Transformer architecture inspired by the literature in statistical language modeling, by augmenting the model with n-grams that are constructed from a discrete latent representation of the text sequence. We evaluate our model, the N-Grammer on language modeling on the C4 data-set as well as text classification on the SuperGLUE data-set, and find that it outperforms several strong baselines such as the Transformer and the Primer. We open-source our model for reproducibility purposes in Jax.

Submitted 13 July, 2022; originally announced July 2022.

Comments: 8 pages, 2 figures

arXiv:2206.03461 [pdf, other]

Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models

Authors: Walter H. L. Pinaya, Mark S. Graham, Robert Gray, Pedro F Da Costa, Petru-Daniel Tudosiu, Paul Wright, Yee H. Mah, Andrew D. MacKinnon, James T. Teo, Rolf Jager, David Werring, Geraint Rees, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

Abstract: Deep generative models have emerged as promising tools for detecting arbitrary anomalies in data, dispensing with the necessity for manual labelling. Recently, autoregressive transformers have achieved state-of-the-art performance for anomaly detection in medical imaging. Nonetheless, these models still have some intrinsic weaknesses, such as requiring images to be modelled as 1D sequences, the accumulation of errors during the sampling process, and the significant inference times associated with transformers. Denoising diffusion probabilistic models are a class of non-autoregressive generative models recently shown to produce excellent samples in computer vision (surpassing Generative Adversarial Networks), and to achieve log-likelihoods that are competitive with transformers while having fast inference times. Diffusion models can be applied to the latent representations learnt by autoencoders, making them easily scalable and great candidates for application to high dimensional data, such as medical images. Here, we propose a method based on diffusion models to detect and segment anomalies in brain imaging. By training the models on healthy data and then exploring its diffusion and reverse steps across its Markov chain, we can identify anomalous areas in the latent space and hence identify anomalies in the pixel space. Our diffusion models achieve competitive performance compared with autoregressive approaches across a series of experiments with 2D CT and MRI data involving synthetic and real pathological lesions with much reduced inference times, making their usage clinically viable.

Submitted 7 June, 2022; originally announced June 2022.

arXiv:2205.09185 [pdf, other]

doi 10.1016/j.nima.2022.167748

AI-assisted Optimization of the ECCE Tracking System at the Electron Ion Collider

Authors: C. Fanelli, Z. Papandreou, K. Suresh, J. K. Adkins, Y. Akiba, A. Albataineh, M. Amaryan, I. C. Arsene, C. Ayerbe Gayoso, J. Bae, X. Bai, M. D. Baker, M. Bashkanov, R. Bellwied, F. Benmokhtar, V. Berdnikov, J. C. Bernauer, F. Bock, W. Boeglin, M. Borysova, E. Brash, P. Brindza, W. J. Briscoe, M. Brooks, S. Bueltmann , et al. (258 additional authors not shown)

Abstract: The Electron-Ion Collider (EIC) is a cutting-edge accelerator facility that will study the nature of the "glue" that binds the building blocks of the visible matter in the universe. The proposed experiment will be realized at Brookhaven National Laboratory in approximately 10 years from now, with detector design and R&D currently ongoing. Notably, EIC is one of the first large-scale facilities to leverage Artificial Intelligence (AI) already starting from the design and R&D phases. The EIC Comprehensive Chromodynamics Experiment (ECCE) is a consortium that proposed a detector design based on a 1.5T solenoid. The EIC detector proposal review concluded that the ECCE design will serve as the reference design for an EIC detector. Herein we describe a comprehensive optimization of the ECCE tracker using AI. The work required a complex parametrization of the simulated detector system. Our approach dealt with an optimization problem in a multidimensional design space driven by multiple objectives that encode the detector performance, while satisfying several mechanical constraints. We describe our strategy and show results obtained for the ECCE tracking system. The AI-assisted design is agnostic to the simulation framework and can be extended to other sub-detectors or to a system of sub-detectors to further optimize the performance of the EIC detector.

Submitted 19 May, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

Comments: 16 pages, 18 figures, 2 appendices, 3 tables

arXiv:2205.06670 [pdf]

doi 10.22533/at.ed.8702118084

Rectangular mesh contour generation algorithm for finite differences calculus

Authors: Pedro Zaffalon da Silva, Neyva Maria Lopes Romeiro, Iury Pereira de Souza, Paulo Laerte Natti, Eliandro Rodrigues Cirilo

Abstract: In this work, a 2D contour generation algorithm is proposed for irregular regions. The contour of the physical domain is approximated by mesh segments using the known coordinates of the contour. For this purpose, the algorithm uses a repeating structure that analyzes the known irregular contour coordinates to approximate the physical domain contour by mesh segments. To this end, the algorithm calculates the slope of the line defined by the known point of the irregular contours and the neighboring vertices. In this way, the algorithm calculates the points of the line and its distance to the closest known nodes of the mesh, allowing to obtain the points of the approximate contour. This process is repeated until the approximate contour is obtained. Therefore, this approximate contour generation algorithm, from known nodes of a mesh, is suitable for describing meshes involving geometries with irregular contours and for calculating finite differences in numerical simulations. The contour is evaluated through three geometries, the difference between the areas delimited by the given contour and the approximate contour, the number of nodes and the number of internal points. It can be seen that the increase in geometry complexity implies the need for a greater number of nodes in the contour, generating more refined meshes that allow reaching differences in areas below 2%.

Submitted 10 May, 2022; originally announced May 2022.

Comments: In Portuguese language

Journal ref: In: Coleção desafios das engenharias: Engenharia de computação. Capítulo 4, 1ed. Ponta Grossa: Atena Editora, 2021, p. 38-52

arXiv:2203.02728 [pdf, other]

An End-to-End Approach for Seam Carving Detection using Deep Neural Networks

Authors: Thierry P. Moreira, Marcos Cleison S. Santana, Leandro A. Passos, João Paulo Papa, Kelton Augusto P. da Costa

Abstract: Seam carving is a computational method capable of resizing images for both reduction and expansion based on its content, instead of the image geometry. Although the technique is mostly employed to deal with redundant information, i.e., regions composed of pixels with similar intensity, it can also be used for tampering images by inserting or removing relevant objects. Therefore, detecting such a process is of extreme importance regarding the image security domain. However, recognizing seam-carved images does not represent a straightforward task even for human eyes, and robust computation tools capable of identifying such alterations are very desirable. In this paper, we propose an end-to-end approach to cope with the problem of automatic seam carving detection that can obtain state-of-the-art results. Experiments conducted over public and private datasets with several tampering configurations evidence the suitability of the proposed model.

Submitted 5 March, 2022; originally announced March 2022.

arXiv:2202.06095 [pdf, other]

doi 10.1111/EXSY.13570

A Review of Deep Learning-based Approaches for Deepfake Content Detection

Authors: Leandro A. Passos, Danilo Jodas, Kelton A. P. da Costa, Luis A. Souza Júnior, Douglas Rodrigues, Javier Del Ser, David Camacho, João Paulo Papa

Abstract: Recent advancements in deep learning generative models have raised concerns as they can create highly convincing counterfeit images and videos. This poses a threat to people's integrity and can lead to social instability. To address this issue, there is a pressing need to develop new computational models that can efficiently detect forged content and alert users to potential image and video manipulations. This paper presents a comprehensive review of recent studies for deepfake content detection using deep learning-based approaches. We aim to broaden the state-of-the-art research by systematically reviewing the different categories of fake content detection. Furthermore, we report the advantages and drawbacks of the examined works, and prescribe several future directions towards the issues and shortcomings still unsolved on deepfake detection.

Submitted 15 February, 2024; v1 submitted 12 February, 2022; originally announced February 2022.

arXiv:2201.10453 [pdf, other]

The First AI4TSP Competition: Learning to Solve Stochastic Routing Problems

Authors: Laurens Bliek, Paulo da Costa, Reza Refaei Afshar, Yingqian Zhang, Tom Catshoek, Daniël Vos, Sicco Verwer, Fynn Schmitt-Ulms, André Hottung, Tapan Shah, Meinolf Sellmann, Kevin Tierney, Carl Perreault-Lafleur, Caroline Leboeuf, Federico Bobbio, Justine Pepin, Warley Almeida Silva, Ricardo Gama, Hugo L. Fernandes, Martin Zaefferer, Manuel López-Ibáñez, Ekhine Irurozki

Abstract: This paper reports on the first international competition on AI for the traveling salesman problem (TSP) at the International Joint Conference on Artificial Intelligence 2021 (IJCAI-21). The TSP is one of the classical combinatorial optimization problems, with many variants inspired by real-world applications. This first competition asked the participants to develop algorithms to solve a time-dependent orienteering problem with stochastic weights and time windows (TD-OPSWTW). It focused on two types of learning approaches: surrogate-based optimization and deep reinforcement learning. In this paper, we describe the problem, the setup of the competition, the winning methods, and give an overview of the results. The winning methods described in this work have advanced the state-of-the-art in using AI for stochastic routing problems. Overall, by organizing this competition we have introduced routing problems as an interesting problem setting for AI researchers. The simulator of the problem has been made open-source and can be used by other researchers as a benchmark for new AI methods.

Submitted 25 January, 2022; originally announced January 2022.

Comments: 21 pages

MSC Class: 68T05

arXiv:2110.15731 [pdf, other]

CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese

Authors: Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Frederico Santos de Oliveira, Lucas Oliveira, Ricardo Corso Fernandes Junior, Daniel Peixoto Pinto da Silva, Fernando Gorgulho Fayet, Bruno Baldissera Carlotto, Lucas Rafael Stefanel Gris, Sandra Maria Aluísio

Abstract: Automatic Speech Recognition (ASR) is a complex and challenging task. In recent years, there have been significant advances in the area. In particular, for the Brazilian Portuguese (BP) language, there were about 376 hours publicly available for the ASR task until the second half of 2020. With the release of new datasets in early 2021, this number increased to 574 hours. The existing resources, however, are composed of audios containing only read and prepared speech. There is a lack of datasets including spontaneous speech, which are essential in different ASR applications. This paper presents CORAA (Corpus of Annotated Audios) v1. with 290.77 hours, a publicly available dataset for ASR in BP containing validated pairs (audio-transcription). CORAA also contains European Portuguese audios (4.69 hours). We also present a public ASR model based on Wav2Vec 2.0 XLSR-53 and fine-tuned over CORAA. Our model achieved a Word Error Rate of 24.18% on CORAA test set and 20.08% on Common Voice test set. When measuring the Character Error Rate, we obtained 11.02% and 6.34% for CORAA and Common Voice, respectively. CORAA corpora were assembled to both improve ASR models in BP with phenomena from spontaneous speech and motivate young researchers to start their studies on ASR for Portuguese. All the corpora are publicly available at https://github.com/nilc-nlp/CORAA under the CC BY-NC-ND 4.0 license.

Submitted 18 November, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: This paper is under consideration at Language Resources and Evaluation (LREV)

arXiv:2109.08051 [pdf, ps, other]

Frame by frame completion probability of an NFL pass

Authors: Gustavo Pompeu da Silva, Rafael de Andrade Moral

Abstract: American football is an increasingly popular sport, with a growing audience in many countries in the world. The most watched American football league in the world is the United States' National Football League (NFL), where every offensive play can be either a run or a pass, and in this work we focus on passes. Many factors can affect the probability of pass completion, such as receiver separation from the nearest defender, distance from receiver to passer, offense formation, among many others. When predicting the completion probability of a pass, it is essential to know who the target of the pass is. By using distance measures between players and the ball, it is possible to calculate empirical probabilities and predict very accurately who the target will be. The big question is: how likely is it for a pass to be completed in an NFL match while the ball is in the air? We developed a machine learning algorithm to answer this based on several predictors. Using data from the 2018 NFL season, we obtained conditional and marginal predictions for pass completion probability based on a random forest model. This is based on a two-stage procedure: first, we calculate the probability of each offensive player being the pass target, then, conditional on the target, we predict completion probability based on the random forest model. Finally, the general completion probability can be calculated using the law of total probability. We present animations for selected plays and show the pass completion probability evolution.

Submitted 16 September, 2021; originally announced September 2021.

Comments: 26 pages, 13 figures, 5 tables

arXiv:2109.02148 [pdf, other]

Adaptive Turbo Equalization for Nonlinearity Compensation in WDM Systems

Authors: Edson Porto da Silva, Metodi Plamenov Yankov

Abstract: In this paper, the performance of adaptive turbo equalization for nonlinearity compensation (NLC) is investigated. A turbo equalization scheme is proposed where a recursive least-squares (RLS) algorithm is used as an adaptive channel estimator to track the time-varying intersymbol interference (ISI) coefficients associated with inter-channel nonlinear interference (NLI) model. The estimated channel coefficients are used by a MIMO 2x2 soft-input soft-output (SISO) linear minimum mean square error (LMMSE) equalizer to compensate for the time-varying ISI. The SISO LMMSE equalizer and the SISO forward error correction (FEC) decoder exchange extrinsic information in every turbo iteration, allowing the receiver to improve the performance of the channel estimation and the equalization, achieving lower bit-error-rate (BER) values. The proposed scheme is investigated for polarization multiplexed 64QAM and 256QAM, although it applies to any proper modulation format. Extensive numerical results are presented. It is shown that the scheme allows up to 0.7 dB extra gain in effectively received signal-to-noise ratio (SNR) and up to 0.2 bits/symbol/pol in generalized mutual information (GMI), on top of the gain provided by single-channel digital backpropagation.

Submitted 5 September, 2021; originally announced September 2021.

arXiv:2108.13202 [pdf, other]

PTRAIL -- A python package for parallel trajectory data preprocessing

Authors: Salman Haidri, Yaksh J. Haranwala, Vania Bogorny, Chiara Renso, Vinicius Prado da Fonseca, Amilcar Soares

Abstract: Trajectory data represent a trace of an object that changes its position in space over time. This kind of data is complex to handle and analyze, since it is generally produced in huge quantities, often prone to errors generated by the geolocation device, human mishandling, or area coverage limitation. Therefore, there is a need for software specifically tailored to preprocess trajectory data. In this work we propose PTRAIL, a python package offering several trajectory preprocessing steps, including filtering, feature extraction, and interpolation. PTRAIL uses parallel computation and vectorization, being suitable for large datasets and fast compared to other python libraries.

Submitted 26 August, 2021; originally announced August 2021.

arXiv:2108.12214 [pdf, other]

Machine Learning for Performance Prediction of Spark Cloud Applications

Authors: Alexandre Maros, Fabricio Murai, Ana Paula Couto da Silva, Jussara M. Almeida, Marco Lattuada, Eugenio Gianniti, Marjan Hosseini, Danilo Ardagna

Abstract: Big data applications and analytics are employed in many sectors for a variety of goals: improving customers satisfaction, predicting market behavior or improving processes in public health. These applications consist of complex software stacks that are often run on cloud systems. Predicting execution times is important for estimating the cost of cloud services and for effectively managing the underlying resources at runtime. Machine Learning (ML), providing black box solutions to model the relationship between application performance and system configuration without requiring in-detail knowledge of the system, has become a popular way of predicting the performance of big data applications. We investigate the cost-benefits of using supervised ML models for predicting the performance of applications on Spark, one of today's most widely used frameworks for big data analysis. We compare our approach with Ernest (an ML-based technique proposed in the literature by the Spark inventors) on a range of scenarios, application workloads, and cloud system configurations. Our experiments show that Ernest can accurately estimate the performance of very regular applications, but it fails when applications exhibit more irregular patterns and/or when extrapolating on bigger data set sizes. Results show that our models match or exceed Ernest's performance, sometimes enabling us to reduce the prediction error from 126-187% to only 5-19%.

Submitted 27 August, 2021; originally announced August 2021.

Comments: Published in 2019 IEEE 12th International Conference on Cloud Computing (CLOUD)

ACM Class: B.8.2; I.2

arXiv:2108.09778 [pdf]

"Sharing Wisdoms from the East": Developing a Native Theory of ICT4D Using Grounded Theory Methodology (GTM) -- Experience from Timor-Leste

Authors: Abel Pires da Silva

Abstract: There have been repeated calls made for theory-building studies in ICT4D research to solidify the existence of this research field. However, theory-building studies are not yet common, even though ICT4D as a research domain is a promising venue to develop native and indigenous theories. To this end, this paper outlines a theory-building study in ICT4D, based on the author's experience in developing a mid-range theory called 'Cultivating-Sustainability' of E-government projects, a native mid-range theory of ICT4D. The paper synthesizes the GTM literature and provides a step-by-step illustration of GTM use in practice for research students and early career ICT4D academics. It introduces the key strategies and principles of GTM, such as the theoretical sampling strategy, the constant comparison strategy, the concept-emergent principle, and the use of literature throughout the study process. It then discusses the steps involved in the data collection and analysis process to develop a theory using case studies as sources of empirical data; it concludes with a discussion on using the strategies and principles in the three case studies. It is expected that this paper contributes to the diversification of research methodology, particularly to our collective quest for developing native and indigenous theories in the ICT4D research domain.

Submitted 22 August, 2021; originally announced August 2021.

Comments: In proceedings of the 1st Virtual Conference on Implications of Information and Digital Technologies for Development, 2021

arXiv:2107.13589 [pdf, other]

Improved quantum error correction using soft information

Authors: Christopher A. Pattison, Michael E. Beverland, Marcus P. da Silva, Nicolas Delfosse

Abstract: The typical model for measurement noise in quantum error correction is to randomly flip the binary measurement outcome. In experiments, measurements yield much richer information - e.g., continuous current values, discrete photon counts - which is then mapped into binary outcomes by discarding some of this information. In this work, we consider methods to incorporate all of this richer information, typically called soft information, into the decoding of quantum error correction codes, and in particular the surface code. We describe how to modify both the Minimum Weight Perfect Matching and Union-Find decoders to leverage soft information, and demonstrate these soft decoders outperform the standard (hard) decoders that can only access the binary measurement outcomes. Moreover, we observe that the soft decoder achieves a threshold 25% higher than any hard decoder for phenomenological noise with Gaussian soft measurement outcomes. We also introduce a soft measurement error model with amplitude damping, in which measurement time leads to a trade-off between measurement resolution and additional disturbance of the qubits. Under this model we observe that the performance of the surface code is very sensitive to the choice of the measurement time - for a distance-19 surface code, a five-fold increase in measurement time can lead to a thousand-fold increase in logical error rate. Moreover, the measurement time that minimizes the physical error rate is distinct from the one that minimizes the logical performance, pointing to the benefits of jointly optimizing the physical and quantum error correction layers.

Submitted 28 July, 2021; originally announced July 2021.

Comments: 27 pages, 13 figures

arXiv:2107.04702 [pdf]

Um Metodo para Busca Automatica de Redes Neurais Artificiais

Authors: Anderson P. da Silva, Teresa B. Ludermir, Leandro M. Almeida

Abstract: This paper describes a method that automatically searches for Artificial Neural Networks using Cellular Genetic Algorithms. The main difference between this method and a common genetic algorithm is the use of a cellular automaton capable of providing the location for individuals, reducing the possibility of local minima in the search space. This method employs an evolutionary search for simultaneous choices of initial weights, transfer functions, architectures and learning rules. Experimental results have shown that the developed method can find compact, efficient networks with a satisfactory generalization power and with shorter training times when compared to other methods found in the literature.

Submitted 9 July, 2021; originally announced July 2021.

Comments: 13 pages, in Portuguese, 4 figures, 2 tables

arXiv:2105.15119 [pdf, other]

Policies for the Dynamic Traveling Maintainer Problem with Alerts

Authors: Paulo da Costa, Peter Verleijsdonk, Simon Voorberg, Alp Akcay, Stella Kapodistria, Willem van Jaarsveld, Yingqian Zhang

Abstract: Downtime of industrial assets such as wind turbines and medical imaging devices comes at a sharp cost. To avoid such downtime costs, companies seek to initiate maintenance just before failure. Unfortunately, this is challenging for the following two reasons: On the one hand, because asset failures are notoriously difficult to predict, even in the presence of real-time monitoring devices which signal early degradation. On the other hand, because the available resources to serve a network of geographically dispersed assets are typically limited. In this paper, we propose a novel dynamic traveling maintainer problem with alerts model that incorporates these two challenges and we provide three solution approaches on how to dispatch the limited resources. Namely, we propose: (i) Greedy heuristic approaches that rank assets on urgency, proximity and economic risk; (ii) A novel traveling maintainer heuristic approach that optimizes short-term costs; and (iii) A deep reinforcement learning (DRL) approach that optimizes long-term costs. Each approach has different requirements concerning the available alert information. Experiments with small asset networks show that all methods can approximate the optimal policy when given access to complete condition information. For larger networks, the proposed methods yield competitive policies, with DRL consistently achieving the lowest costs.

Submitted 20 May, 2022; v1 submitted 31 May, 2021; originally announced May 2021.

arXiv:2104.00788 [pdf, other]

doi 10.1371/journal.pone.0269174

The Effects of Spectral Dimensionality Reduction on Hyperspectral Pixel Classification: A Case Study

Authors: Kiran Mantripragada, Phuong D. Dao, Yuhong He, Faisal Z. Qureshi

Abstract: This paper presents a systematic study of the effects of hyperspectral pixel dimensionality reduction on the pixel classification task. We use five dimensionality reduction methods -- PCA, KPCA, ICA, AE, and DAE -- to compress 301-dimensional hyperspectral pixels. Compressed pixels are subsequently used to perform pixel classifications. Pixel classification accuracies together with compression method, compression rates, and reconstruction errors provide a new lens to study the suitability of a compression method for the task of pixel classification. We use three high-resolution hyperspectral image datasets, representing three common landscape types (i.e. urban, transitional suburban, and forests) collected by the Remote Sensing and Spatial Ecosystem Modeling laboratory of the University of Toronto. We found that PCA, KPCA, and ICA post greater signal reconstruction capability; however, when compression rates are more than 90% these methods show lower classification scores. AE and DAE methods post better classification accuracy at 95% compression rate, however their performance drops as compression rate approaches 97%. Our results suggest that both the compression method and the compression rate are important considerations when designing a hyperspectral pixel classification pipeline.

Submitted 27 January, 2022; v1 submitted 1 April, 2021; originally announced April 2021.

Comments: 15 pages

arXiv:2012.09575 [pdf, other]

Task Uncertainty Loss Reduce Negative Transfer in Asymmetric Multi-task Feature Learning

Authors: Rafael Peres da Silva, Chayaporn Suphavilai, Niranjan Nagarajan

Abstract: Multi-task learning (MTL) is frequently used in settings where a target task has to be learnt based on limited training data, but knowledge can be leveraged from related auxiliary tasks. While MTL can improve task performance overall relative to single-task learning (STL), these improvements can hide negative transfer (NT), where STL may deliver better performance for many individual tasks. Asymmetric multitask feature learning (AMTFL) is an approach that tries to address this by allowing tasks with higher loss values to have smaller influence on feature representations for learning other tasks. Task loss values do not necessarily indicate reliability of models for a specific task. We present examples of NT in two orthogonal datasets (image recognition and pharmacogenomics) and tackle this challenge by using aleatoric homoscedastic uncertainty to capture the relative confidence between tasks, and set weights for task loss. Our results show that this approach reduces NT, providing a new approach to enable robust MTL.

Submitted 17 December, 2020; originally announced December 2020.

Comments: Accepted in AAAI 2021 Student Abstract and Poster Program

arXiv:2007.09989 [pdf, other]

Bayesian optimization for automatic design of face stimuli

Authors: Pedro F. da Costa, Romy Lorenz, Ricardo Pio Monti, Emily Jones, Robert Leech

Abstract: Investigating the cognitive and neural mechanisms involved with face processing is a fundamental task in modern neuroscience and psychology. To date, the majority of such studies have focused on the use of pre-selected stimuli. The absence of personalized stimuli presents a serious limitation as it fails to account for how each individual face processing system is tuned to cultural embeddings or how it is disrupted in disease. In this work, we propose a novel framework which combines generative adversarial networks (GANs) with Bayesian optimization to identify individual response patterns to many different faces. Formally, we employ Bayesian optimization to efficiently search the latent space of state-of-the-art GAN models, with the aim to automatically generate novel faces, to maximize an individual subject's response. We present results from a web-based proof-of-principle study, where participants rated images of themselves generated via performing Bayesian optimization over the latent space of a GAN. We show how the algorithm can efficiently locate an individual's optimal face while mapping out their response across different semantic transformations of a face; inter-individual analyses suggest how the approach can provide rich information about individual differences in face processing.

Submitted 20 July, 2020; originally announced July 2020.

Comments: Accepted at ICML2020 workshop track

arXiv:2006.15401 [pdf, other]

You Shall not Pass: Avoiding Spurious Paths in Shortest-Path Based Centralities in Multidimensional Complex Networks

Authors: Klaus Wehmuth, Artur Ziviani, Leonardo Chinelate Costa, Ana Paula Couto da Silva, Alex Borges Vieira

Abstract: In complex network analysis, centralities based on shortest paths, such as betweenness and closeness, are widely used. More recently, many complex systems are being represented by time-varying, multilayer, and time-varying multilayer networks, i.e. multidimensional (or high order) networks. Nevertheless, it is well-known that the aggregation process may create spurious paths on the aggregated view of such multidimensional (high order) networks. Consequently, these spurious paths may then cause shortest-path based centrality metrics to produce incorrect results, thus undermining the network centrality analysis. In this context, we propose a method able to avoid taking into account spurious paths when computing centralities based on shortest paths in multidimensional (or high order) networks. Our method is based on MultiAspect Graphs (MAG) to represent the multidimensional networks and we show that well-known centrality algorithms can be straightforwardly adapted to the MAG environment. Moreover, we show that, by using this MAG representation, pitfalls usually associated with spurious paths resulting from aggregation in multidimensional networks can be avoided at the time of the aggregation process. As a result, shortest-path based centralities are assured to be computed correctly for multidimensional networks, without taking into account spurious paths that could otherwise lead to incorrect results. We also present a case study that shows the impact of spurious paths in the computing of shortest paths and consequently of shortest-path based centralities, such as betweenness and closeness, thus illustrating the importance of this contribution.

Submitted 19 August, 2020; v1 submitted 27 June, 2020; originally announced June 2020.

Comments: 17 pages, 6 figures

arXiv:2005.14650 [pdf, ps, other]

WhylSon: Proving your Michelson Smart Contracts in Why3

Authors: Luís Pedro Arrojado da Horta, João Santos Reis, Mário Pereira, Simão Melo de Sousa

Abstract: This paper introduces WhylSon, a deductive verification tool for smart contracts written in Michelson, which is the low-level language of the Tezos blockchain. WhylSon accepts a formally specified Michelson contract and automatically translates it to an equivalent program written in WhyML, the programming and specification language of the Why3 framework. Smart contract instructions are mapped into a corresponding WhyML shallow-embedding of their axiomatic semantics, which we also developed in the context of this work. One major advantage of this approach is that it allows an out-of-the-box integration with the Why3 framework, namely its VCGen and the backend support for several automated theorem provers. We also discuss the use of WhylSon to automatically prove the correctness of diverse annotated smart contracts.

Submitted 29 May, 2020; originally announced May 2020.

arXiv:2005.07473 [pdf, other]

doi 10.1016/j.future.2021.07.014

Predicting User Emotional Tone in Mental Disorder Online Communities

Authors: Bárbara Silveira, Henrique S. Silva, Fabricio Murai, Ana Paula Couto da Silva

Abstract: In recent years, Online Social Networks have become an important medium for people who suffer from mental disorders to share moments of hardship, and receive emotional and informational support. In this work, we analyze how discussions in Reddit communities related to mental disorders can help improve the health conditions of their users. Using the emotional tone of users' writing as a proxy for emotional state, we uncover relationships between user interactions and state changes. First, we observe that authors of negative posts often write rosier comments after engaging in discussions, indicating that users' emotional state can improve due to social support. Second, we build models based on SOTA text embedding techniques and RNNs to predict shifts in emotional tone. This differs from most related work, which focuses primarily on detecting mental disorders from user activity. We demonstrate the feasibility of accurately predicting the users' reactions to the interactions experienced in these platforms, and present some examples which illustrate that the models are correctly capturing the effects of comments on the author's emotional tone. Our models hold promising implications for interventions to provide support for people struggling with mental illnesses.

Submitted 27 July, 2021; v1 submitted 15 May, 2020; originally announced May 2020.

Comments: 8 pages, 3 figures, 3 tables

ACM Class: J.3; I.2.7

Journal ref: Future Generation Computer Systems, Volume 125, 2021, Pages 641-651, ISSN 0167-739X

arXiv:2005.01291 [pdf, other]

doi 10.1145/3340631.3394883

Human Strategic Steering Improves Performance of Interactive Optimization

Authors: Fabio Colella, Pedram Daee, Jussi Jokinen, Antti Oulasvirta, Samuel Kaski

Abstract: A central concern in an interactive intelligent system is optimization of its actions, to be maximally helpful to its human user. In recommender systems for instance, the action is to choose what to recommend, and the optimization task is to recommend items the user prefers. The optimization is done based on the user's earlier feedback (e.g. "likes" and "dislikes"), and the algorithms assume the feedback to be faithful. That is, when the user clicks "like," they actually prefer the item. We argue that this fundamental assumption can be extensively violated by human users, who are not passive feedback sources. Instead, they are in control, actively steering the system towards their goal. To verify this hypothesis, that humans steer and are able to improve performance by steering, we designed a function optimization task where a human and an optimization algorithm collaborate to find the maximum of a 1-dimensional function. At each iteration, the optimization algorithm queries the user for the value of a hidden function $f$ at a point $x$, and the user, who sees the hidden function, provides an answer about $f(x)$. Our study on 21 participants shows that users who understand how the optimization works strategically provide biased answers (answers not equal to $f(x)$), which results in the algorithm finding the optimum significantly faster. Our work highlights that next-generation intelligent systems will need user models capable of helping users who steer systems to pursue their goals.

Submitted 4 May, 2020; originally announced May 2020.

Comments: 10 pages, 5 figures, The paper is published in the proceedings of UMAP 2020. Codes available at https://github.com/fcole90/interactive_bayesian_optimisation

arXiv:2004.09397 [pdf, other]

Multi-label Stream Classification with Self-Organizing Maps

Authors: Ricardo Cerri, Joel David Costa Júnior, Elaine Ribeiro de Faria Paiva, João Manuel Portela da Gama

Abstract: Several learning algorithms have been proposed for offline multi-label classification. However, applications in areas such as traffic monitoring, social networks, and sensors produce data continuously, the so-called data streams, posing challenges to batch multi-label learning. With the lack of stationarity in the distribution of data streams, new algorithms are needed to adapt online to such changes (concept drift). Also, in realistic applications, changes occur in scenarios of infinitely delayed labels, where the true classes of the arriving instances are never available. We propose an online unsupervised incremental method based on self-organizing maps for multi-label stream classification with infinitely delayed labels. In the classification phase, we use a k-nearest neighbors strategy to compute the winning neurons in the maps, adapting to concept drift by adjusting neuron weight vectors and dataset label cardinality online. We predict labels for each instance using the Bayes rule and the outputs of each neuron, adapting the probabilities and conditional probabilities of the classes in the stream. Experiments using synthetic and real datasets show that our method is highly competitive with several ones from the literature, in both stationary and concept drift scenarios.

Submitted 20 April, 2020; originally announced April 2020.

Comments: 7 pages, 14 figures

ACM Class: I.2.6

arXiv:2004.09298 [pdf, ps, other]

doi 10.1016/j.jde.2021.07.045

An Efficient Method for Computing Liouvillian First Integrals of Planar Polynomial Vector Fields

Authors: L. G. S. Duarte, L. A. C. P. da Mota

Abstract: Here we present an efficient method to compute Darboux polynomials for polynomial vector fields in the plane. This approach is restricted to polynomial vector fields presenting a Liouvillian first integral (or, equivalently, to rational first order differential equations (rational 1ODEs) presenting a Liouvillian general solution). The key to obtaining this method was to separate the procedure of solving the (nonlinear) algebraic systems resulting from the equation that translates the condition of existence of a Darboux polynomial into feasible steps (procedures that require less memory consumption). We also present a brief performance analysis of the algorithms developed.

Submitted 20 April, 2020; originally announced April 2020.

Journal ref: Journal of Differential Equations Volume 300, 5 November 2021, Pages 356-385

arXiv:2004.01608 [pdf, other]

Learning 2-opt Heuristics for the Traveling Salesman Problem via Deep Reinforcement Learning

Authors: Paulo R. de O. da Costa, Jason Rhuggenaath, Yingqian Zhang, Alp Akcay

Abstract: Recent works using deep learning to solve the Traveling Salesman Problem (TSP) have focused on learning construction heuristics. Such approaches find TSP solutions of good quality but require additional procedures such as beam search and sampling to improve solutions and achieve state-of-the-art performance. However, few studies have focused on improvement heuristics, where a given solution is improved until reaching a near-optimal one. In this work, we propose to learn a local search heuristic based on 2-opt operators via deep reinforcement learning. We propose a policy gradient algorithm to learn a stochastic policy that selects 2-opt operations given a current solution. Moreover, we introduce a policy neural network that leverages a pointing attention mechanism, which unlike previous works, can be easily extended to more general k-opt moves. Our results show that the learned policies can improve even over random initial solutions and approach near-optimal solutions at a faster rate than previous state-of-the-art deep learning methods.

Submitted 14 September, 2020; v1 submitted 3 April, 2020; originally announced April 2020.

Comments: To appear in Proceedings Machine Learning Research - ACML 2020

arXiv:2002.11213 [pdf, other]

Speech2Phone: A Novel and Efficient Method for Training Speaker Recognition Models

Authors: Edresson Casanova, Arnaldo Candido Junior, Christopher Shulby, Frederico Santos de Oliveira, Lucas Rafael Stefanel Gris, Hamilton Pereira da Silva, Sandra Maria Aluisio, Moacir Antonelli Ponti

Abstract: In this paper we present an efficient method for training models for speaker recognition using small or under-resourced datasets. This method requires less data than other SOTA (State-Of-The-Art) methods, e.g. the Angular Prototypical and GE2E loss functions, while achieving similar results to those methods. This is done using the knowledge of the reconstruction of a phoneme in the speaker's voice. For this purpose, a new dataset was built, composed of 40 male speakers, who read sentences in Portuguese, totaling approximately 3h. We compare the three best architectures trained using our method to select the best one, which is the one with a shallow architecture. Then, we compared this model with the SOTA method for the speaker recognition task: the Fast ResNet-34 trained with approximately 2,000 hours, using the loss functions Angular Prototypical and GE2E. Three experiments were carried out with datasets in different languages. Among these three experiments, our model achieved the second best result in two experiments and the best result in one of them. This highlights the importance of our method, which proved to be a great competitor to SOTA speaker recognition models, with 500x less data and a simpler approach.

Submitted 18 June, 2021; v1 submitted 25 February, 2020; originally announced February 2020.

Comments: Submitted to BRACIS

arXiv:2002.07481 [pdf, other]

Quantitative Evaluation of Time-Dependent Multidimensional Projection Techniques

Authors: E. F. Vernier, R. Garcia, I. P. da Silva, J. L. D. Comba, A. C. Telea

Abstract: Dimensionality reduction methods are an essential tool for multidimensional data analysis, and many interesting processes can be studied as time-dependent multivariate datasets. There are, however, few studies and proposals that leverage on the concise power of expression of projections in the context of dynamic/temporal data. In this paper, we aim at providing an approach to assess projection techniques for dynamic data and understand the relationship between visual quality and stability. Our approach relies on an experimental setup that consists of existing techniques designed for time-dependent data and new variations of static methods. To support the evaluation of these techniques, we provide a collection of datasets that has a wide variety of traits that encode dynamic patterns, as well as a set of spatial and temporal stability metrics that assess the quality of the layouts. We present an evaluation of 11 methods, 10 datasets, and 12 quality metrics, and elect the best-suited methods for projecting time-dependent multivariate data, exploring the design choices and characteristics of each method. All our results are documented and made available in a public repository to allow reproducibility of results.

Submitted 18 February, 2020; originally announced February 2020.

arXiv:2001.10071 [pdf]

doi 10.1186/s13326-022-00269-1

SemClinBr -- a multi institutional and multi specialty semantically annotated corpus for Portuguese clinical NLP tasks

Authors: Lucas Emanuel Silva e Oliveira, Ana Carolina Peters, Adalniza Moura Pucca da Silva, Caroline P. Gebeluca, Yohan Bonescki Gumiel, Lilian Mie Mukai Cintho, Deborah Ribeiro Carvalho, Sadid A. Hasan, Claudia Maria Cabral Moro

Abstract: The high volume of research focusing on extracting patient's information from electronic health records (EHR) has led to an increase in the demand for annotated corpora, which are a very valuable resource for both the development and evaluation of natural language processing (NLP) algorithms. The absence of a multi-purpose clinical corpus outside the scope of the English language, especially in Brazilian Portuguese, is glaring and severely impacts scientific progress in the biomedical NLP field. In this study, we developed a semantically annotated corpus using clinical texts from multiple medical specialties, document types, and institutions. We present the following: (1) a survey listing common aspects and lessons learned from previous research, (2) a fine-grained annotation schema which could be replicated and guide other annotation initiatives, (3) a web-based annotation tool focusing on an annotation suggestion feature, and (4) both intrinsic and extrinsic evaluation of the annotations. The result of this work is the SemClinBr, a corpus that has 1,000 clinical notes, labeled with 65,117 entities and 11,263 relations, and can support a variety of clinical NLP tasks and boost the EHR's secondary use for the Portuguese language.

Submitted 27 January, 2020; originally announced January 2020.

arXiv:2001.09896 [pdf, other]

doi 10.1007/978-981-33-6977-1

Semantic Sensitive TF-IDF to Determine Word Relevance in Documents

Authors: Amir Jalilifard, Vinicius F. Caridá, Alex F. Mansano, Rogers S. Cristo, Felipe Penhorate C. da Fonseca

Abstract: Keyword extraction has received increasing attention as an important research topic that can lead to advancements in diverse applications such as document context categorization, text indexing and document classification. In this paper we propose STF-IDF, a novel semantic method based on TF-IDF, for scoring word importance of informal documents in a corpus. A set of nearly four million documents from health-care social media was collected and used to train a semantic model and obtain word embeddings. Then, the features of the semantic space were utilized to rearrange the original TF-IDF scores through an iterative solution so as to improve the moderate performance of this algorithm on informal texts. After testing the proposed method with 200 randomly chosen documents, our method managed to decrease the TF-IDF mean error rate by about 50%, reaching a mean error of 13.7%, as opposed to 27.2% for the original TF-IDF.

Submitted 25 January, 2021; v1 submitted 5 January, 2020; originally announced January 2020.

Comments: 11 pages, 2 figures, 22 references

arXiv:2001.08966 [pdf, other]

Design optimisation of a multi-mode wave energy converter

Authors: Nataliia Y. Sergiienko, Mehdi Neshat, Leandro S. P. da Silva, Bradley Alexander, Markus Wagner

Abstract: A wave energy converter (WEC) similar to the CETO system developed by Carnegie Clean Energy is considered for design optimisation. This WEC is able to absorb power from heave, surge and pitch motion modes, making the optimisation problem nontrivial. The WEC dynamics is simulated using the spectral-domain model taking into account hydrodynamic forces, viscous drag, and power take-off forces. The de… ▽ More A wave energy converter (WEC) similar to the CETO system developed by Carnegie Clean Energy is considered for design optimisation. This WEC is able to absorb power from heave, surge and pitch motion modes, making the optimisation problem nontrivial. The WEC dynamics is simulated using the spectral-domain model taking into account hydrodynamic forces, viscous drag, and power take-off forces. The design parameters for optimisation include the buoy radius, buoy height, tether inclination angles, and control variables (dam** and stiffness). The WEC design is optimised for the wave climate at Albany test site in Western Australia considering unidirectional irregular waves. Two objective functions are considered: (i) maximisation of the annual average power output, and (ii) minimisation of the levelised cost of energy (LCoE) for a given sea site. The LCoE calculation is approximated as a ratio of the produced energy to the significant mass of the system that includes the mass of the buoy and anchor system. Six different heuristic optimisation methods are applied in order to evaluate and compare the performance of the best known evolutionary algorithms, a swarm intelligence technique and a numerical optimisation approach. The results demonstrate that if we are interested in maximising energy production without taking into account the cost of manufacturing such a system, the buoy should be built as large as possible (20 m radius and 30 m height). 
However, if we want the system that produces cheap energy, then the radius of the buoy should be approximately 11-14 m while the height should be as low as possible. These results coincide with the overall design that Carnegie Clean Energy has selected for its CETO 6 multi-moored unit. However, it should be noted that this study is not informed by them, so this can be seen as an independent validation of the design choices.

Submitted 24 January, 2020; originally announced January 2020.

arXiv:2001.04449 [pdf, other]

doi 10.1088/2058-9565/ab7559

A quantum-classical cloud platform optimized for variational hybrid algorithms

Authors: Peter J. Karalekas, Nikolas A. Tezak, Eric C. Peterson, Colm A. Ryan, Marcus P. da Silva, Robert S. Smith

Abstract: In order to support near-term applications of quantum computing, a new compute paradigm has emerged--the quantum-classical cloud--in which quantum computers (QPUs) work in tandem with classical computers (CPUs) via a shared cloud infrastructure. In this work, we enumerate the architectural requirements of a quantum-classical cloud platform, and present a framework for benchmarking its runtime performance. In addition, we walk through two platform-level enhancements, parametric compilation and active qubit reset, that specifically optimize a quantum-classical architecture to support variational hybrid algorithms (VHAs), the most promising applications of near-term quantum hardware. Finally, we show that integrating these two features into the Rigetti Quantum Cloud Services (QCS) platform results in considerable improvements to the latencies that govern algorithm runtime.

Submitted 30 May, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

Comments: 21 pages, 8 figures; updated references to match published version

Journal ref: Quantum Sci. Technol. 5 024003 (2020)
