Showing 1–50 of 80 results for author: Dao, P

Searching in archive cs.
  1. arXiv:2406.08269  [pdf, other]

    cs.FL cs.AI cs.LG

    Analyzing constrained LLM through PDFA-learning

    Authors: Matías Carrasco, Franz Mayr, Sergio Yovine, Johny Kidd, Martín Iturbide, Juan Pedro da Silva, Alejo Garat

    Abstract: We define a congruence that copes with null next-symbol probabilities that arise when the output of a language model is constrained by some means during text generation. We develop an algorithm for efficiently learning the quotient with respect to this congruence and evaluate it on case studies for analyzing statistical properties of LLM.

    Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Workshop Paper

  2. arXiv:2402.18511  [pdf]

    cs.RO

    Leveraging Compliant Tactile Perception for Haptic Blind Surface Reconstruction

    Authors: Laurent Yves Emile Ramos Cheret, Vinicius Prado da Fonseca, Thiago Eustaquio Alves de Oliveira

    Abstract: Non-flat surfaces pose difficulties for robots operating in unstructured environments. Reconstructions of uneven surfaces may only be partially possible due to non-compliant end-effectors and limitations on vision systems such as transparency, reflections, and occlusions. This study achieves blind surface reconstruction by harnessing the robotic manipulator's kinematic data and a compliant tactile…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 7 pages, 9 figures, 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)

  3. arXiv:2401.06790  [pdf, other]

    cs.CL cs.AI

    Using Zero-shot Prompting in the Automatic Creation and Expansion of Topic Taxonomies for Tagging Retail Banking Transactions

    Authors: Daniel de S. Moraes, Pedro T. C. Santos, Polyana B. da Costa, Matheus A. S. Pinto, Ivan de J. P. Pinto, Álvaro M. G. da Veiga, Sergio Colcher, Antonio J. G. Busson, Rafael H. Rocha, Rennan Gaio, Rafael Miceli, Gabriela Tourinho, Marcos Rabaioli, Leandro Santos, Fellipe Marques, David Favaro

    Abstract: This work presents an unsupervised method for automatically constructing and expanding topic taxonomies using instruction-based fine-tuned LLMs (Large Language Models). We apply topic modeling and keyword extraction techniques to create initial topic taxonomies and LLMs to post-process the resulting terms and create a hierarchy. To expand an existing taxonomy with new terms, we use zero-shot promp…

    Submitted 11 February, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  4. arXiv:2401.06601  [pdf, other]

    cs.CR cs.DB

    A proposal to increase data utility on Global Differential Privacy data based on data use predictions

    Authors: Henry C. Nunes, Marlon P. da Silva, Charles V. Neu, Avelino F. Zorzo

    Abstract: This paper presents ongoing research focused on improving the utility of data protected by Global Differential Privacy (DP) in the scenario of summary statistics. Our approach is based on predictions on how an analyst will use statistics released under DP protection, so that a developer can optimise data utility on further usage of the data in the privacy budget allocation. This novel approach can…

    Submitted 12 January, 2024; originally announced January 2024.

  5. arXiv:2401.01200  [pdf, other]

    cs.CV cs.AI

    Skin cancer diagnosis using NIR spectroscopy data of skin lesions in vivo using machine learning algorithms

    Authors: Flavio P. Loss, Pedro H. da Cunha, Matheus B. Rocha, Madson Poltronieri Zanoni, Leandro M. de Lima, Isadora Tavares Nascimento, Isabella Rezende, Tania R. P. Canuto, Luciana de Paula Vieira, Renan Rossoni, Maria C. S. Santos, Patricia Lyra Frasson, Wanderson Romão, Paulo R. Filgueiras, Renato A. Krohling

    Abstract: Skin lesions are classified as benign or malignant. Among the malignant, melanoma is a very aggressive cancer and the major cause of deaths. So, early diagnosis of skin cancer is highly desirable. In the last few years, there has been a growing interest in computer aided diagnostic (CAD) using most image and clinical data of the lesion. These sources of information present limitations due to their inability…

    Submitted 2 January, 2024; originally announced January 2024.

  6. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2310.14553  [pdf, other]

    cs.RO cs.AI cs.MA

    Denoising Opponents Position in Partial Observation Environment

    Authors: Aref Sayareh, Aria Sardari, Vahid Khoddami, Nader Zare, Vinicius Prado da Fonseca, Amilcar Soares

    Abstract: The RoboCup competitions hold various leagues, and the Soccer Simulation 2D League is a major one among them. A Soccer Simulation 2D (SS2D) match involves two teams, including 11 players and a coach for each team, competing against each other. The players can only communicate with the Soccer Simulation Server during the game. Several code bases are released publicly to simplify team development. So rese…

    Submitted 23 October, 2023; originally announced October 2023.

  8. arXiv:2310.09709  [pdf, other]

    cs.CV cs.AI cs.LG

    New Advances in Body Composition Assessment with ShapedNet: A Single Image Deep Regression Approach

    Authors: Navar Medeiros M. Nascimento, Pedro Cavalcante de Sousa Junior, Pedro Yuri Rodrigues Nunes, Suane Pires Pinheiro da Silva, Luiz Lannes Loureiro, Victor Zaban Bittencourt, Valden Luis Matos Capistrano Junior, Pedro Pedrosa Rebouças Filho

    Abstract: We introduce a novel technique called ShapedNet to enhance body composition assessment. This method employs a deep neural network capable of estimating Body Fat Percentage (BFP), performing individual identification, and enabling localization using a single photograph. The accuracy of ShapedNet is validated through comprehensive comparisons against the gold standard method, Dual-Energy X-ray Absor…

    Submitted 14 October, 2023; originally announced October 2023.

    Comments: Preprinted version in October 2023. The paper is under consideration at Pattern Recognition Letters

  9. arXiv:2307.15208  [pdf, other]

    eess.IV cs.CV

    Generative AI for Medical Imaging: extending the MONAI Framework

    Authors: Walter H. L. Pinaya, Mark S. Graham, Eric Kerfoot, Petru-Daniel Tudosiu, Jessica Dafflon, Virginia Fernandez, Pedro Sanchez, Julia Wolleb, Pedro F. da Costa, Ashay Patel, Hyungjin Chung, Can Zhao, Wei Peng, Zelong Liu, Xueyan Mei, Oeslle Lucena, Jong Chul Ye, Sotirios A. Tsaftaris, Prerna Dogra, Andrew Feng, Marc Modat, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

    Abstract: Recent advances in generative AI have brought incredible breakthroughs in several areas, including medical imaging. These generative models have tremendous potential not only to help safely share medical data via synthetic datasets but also to perform an array of diverse applications, such as anomaly detection, image-to-image translation, denoising, and MRI reconstruction. However, due to the comp…

    Submitted 27 July, 2023; originally announced July 2023.

  10. arXiv:2307.10018  [pdf, other]

    cs.RO cs.AI

    RobôCIn Small Size League Extended Team Description Paper for RoboCup 2023

    Authors: Aline Lima de Oliveira, Cauê Addae da Silva Gomes, Cecília Virginia Santos da Silva, Charles Matheus de Sousa Alves, Danilo Andrade Martins de Souza, Driele Pires Ferreira Araújo Xavier, Edgleyson Pereira da Silva, Felipe Bezerra Martins, Lucas Henrique Cavalcanti Santos, Lucas Dias Maciel, Matheus Paixão Gumercindo dos Santos, Matheus Lafayette Vasconcelos, Matheus Vinícius Teotonio do Nascimento Andrade, João Guilherme Oliveira Carvalho de Melo, João Pedro Souza Pereira de Moura, José Ronald da Silva, José Victor Silva Cruz, Pedro Henrique Santana de Morais, Pedro Paulo Salman de Oliveira, Riei Joaquim Matos Rodrigues, Roberto Costa Fernandes, Ryan Vinicius Santos Morais, Tamara Mayara Ramos Teobaldo, Washington Igor dos Santos Silva, Edna Natividade Silva Barros

    Abstract: RobôCIn has participated in RoboCup Small Size League since 2019, won its first world title in 2022 (Division B), and is currently a three-times Latin-American champion. This paper presents our improvements to defend the Small Size League (SSL) division B title in RoboCup 2023 in Bordeaux, France. This paper aims to share some of the academic research that our team developed over the past year. Ou…

    Submitted 19 July, 2023; originally announced July 2023.

  11. arXiv:2306.00766  [pdf, other]

    cs.CR

    Impact of using a privacy model on smart buildings data for CO2 prediction

    Authors: Marlon P. da Silva, Henry C. Nunes, Charles V. Neu, Luana T. Thomas, Avelino F. Zorzo, Charles Morisset

    Abstract: There is a constant trade-off between the utility of the data collected and processed by the many systems forming the Internet of Things (IoT) revolution and the privacy concerns of the users living in the spaces hosting these sensors. Privacy models, such as the SITA (Spatial, Identity, Temporal, and Activity) model, can help address this trade-off. In this paper, we focus on the problem of…

    Submitted 1 June, 2023; originally announced June 2023.

  12. arXiv:2304.09064  [pdf, other]

    cs.HC cs.AI

    LLM-based Interaction for Content Generation: A Case Study on the Perception of Employees in an IT department

    Authors: Alexandre Agossah, Frédérique Krupa, Matthieu Perreira Da Silva, Patrick Le Callet

    Abstract: In the past years, AI has seen many advances in the field of NLP. This has led to the emergence of LLMs, such as the now famous GPT-3.5, which revolutionise the way humans can access or generate content. Current studies on LLM-based generative tools are mainly interested in the performance of such tools in generating relevant content (code, text or image). However, ethical concerns related to the…

    Submitted 18 April, 2023; originally announced April 2023.

    Comments: 14 pages (bibliography included), 6 figures, preprint submitted to Work-In-Progress session of ACM IMX'23 Interactive Media Experience

    ACM Class: I.2.7; J.7

  13. arXiv:2212.10913  [pdf]

    cs.CR cs.LG

    Ensemble learning techniques for intrusion detection system in the context of cybersecurity

    Authors: Andricson Abeline Moreira, Carlos A. C. Tojeiro, Carlos J. Reis, Gustavo Henrique Massaro, Igor Andrade Brito, Kelton A. P. da Costa

    Abstract: Recently, there has been an interest in improving the resources available in Intrusion Detection System (IDS) techniques. In this sense, several studies related to cybersecurity show that the environment invasions and information kidnapping are increasingly recurrent and complex. The criticality of the business involving operations in an environment using computing resources does not allow the vul…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: in Portuguese language. CIACA - Conferencia Ibero-Americana Computação Aplicada 2022 Proceedings

  14. Extractive Text Summarization Using Generalized Additive Models with Interactions for Sentence Selection

    Authors: Vinícius Camargo da Silva, João Paulo Papa, Kelton Augusto Pontara da Costa

    Abstract: Automatic Text Summarization (ATS) is becoming relevant with the growth of textual data; however, with the popularization of public large-scale datasets, some recent machine learning approaches have focused on dense models and architectures that, despite producing notable results, usually turn out in models difficult to interpret. Given the challenge behind interpretable learning-based text summar…

    Submitted 20 December, 2022; originally announced December 2022.

  15. arXiv:2212.04984  [pdf, other]

    cs.LG cs.AI

    Transformer-based normative modelling for anomaly detection of early schizophrenia

    Authors: Pedro F Da Costa, Jessica Dafflon, Sergio Leonardo Mendes, João Ricardo Sato, M. Jorge Cardoso, Robert Leech, Emily JH Jones, Walter H. L. Pinaya

    Abstract: Despite the impact of psychiatric disorders on clinical health, early-stage diagnosis remains a challenge. Machine learning studies have shown that classifiers tend to be overly narrow in the diagnosis prediction task. The overlap between conditions leads to high heterogeneity among participants that is not adequately captured by classification models. To address this issue, normative approaches h…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: 10 pages, 2 figures, 2 tables, presented at NeurIPS22@PAI4MH

  16. arXiv:2211.14372  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Interpretability Analysis of Deep Models for COVID-19 Detection

    Authors: Daniel Peixoto Pinto da Silva, Edresson Casanova, Lucas Rafael Stefanel Gris, Arnaldo Candido Junior, Marcelo Finger, Flaviane Svartman, Beatriz Raposo, Marcus Vinícius Moreira Martins, Sandra Maria Aluísio, Larissa Cristina Berti, João Paulo Teixeira

    Abstract: During the outbreak of COVID-19 pandemic, several research areas joined efforts to mitigate the damages caused by SARS-CoV-2. In this paper we present an interpretability analysis of a convolutional neural network based model for COVID-19 detection in audios. We investigate which features are important for model decision process, investigating spectrograms, F0, F0 standard deviation, sex and age.…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: 14 pages, 4 figures

  17. arXiv:2209.07162  [pdf, other]

    eess.IV cs.CV q-bio.QM

    Brain Imaging Generation with Latent Diffusion Models

    Authors: Walter H. L. Pinaya, Petru-Daniel Tudosiu, Jessica Dafflon, Pedro F da Costa, Virginia Fernandez, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

    Abstract: Deep neural networks have brought remarkable breakthroughs in medical image analysis. However, due to their data-hungry nature, the modest dataset sizes in medical imaging projects might be hindering their full potential. Generating synthetic data provides a promising alternative, allowing to complement training datasets and conducting medical image research at a larger scale. Diffusion models rec…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: 10 pages, 3 figures, Accepted in the Deep Generative Models workshop @ MICCAI 2022

  18. arXiv:2208.01712  [pdf, other]

    cs.LG cs.CL cs.IR stat.ML

    No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling

    Authors: Marília Costa Rosendo Silva, Felipe Alves Siqueira, João Pedro Mantovani Tarrega, João Vitor Pataca Beinotti, Augusto Sousa Nunes, Miguel de Mattos Gardini, Vinícius Adolfo Pereira da Silva, Nádia Félix Felipe da Silva, André Carlos Ponce de Leon Ferreira de Carvalho

    Abstract: Extracting knowledge from unlabeled texts using machine learning algorithms can be complex. Document categorization and information retrieval are two applications that may benefit from unsupervised learning (e.g., text clustering and topic modeling), including exploratory data analysis. However, the unsupervised learning paradigm poses reproducibility issues. The initialization can lead to variabi…

    Submitted 2 August, 2022; originally announced August 2022.

    ACM Class: I.2; I.2.7; I.5.3

  19. arXiv:2207.06366  [pdf, other]

    cs.CL cs.LG

    N-Grammer: Augmenting Transformers with latent n-grams

    Authors: Aurko Roy, Rohan Anil, Guangda Lai, Benjamin Lee, Jeffrey Zhao, Shuyuan Zhang, Shibo Wang, Ye Zhang, Shen Wu, Rigel Swavely, Tao, Yu, Phuong Dao, Christopher Fifty, Zhifeng Chen, Yonghui Wu

    Abstract: Transformer models have recently emerged as one of the foundational models in natural language processing, and as a byproduct, there is significant recent interest and investment in scaling these models. However, the training and inference costs of these large Transformer language models are prohibitive, thus necessitating more research in identifying more efficient variants. In this work, we prop…

    Submitted 13 July, 2022; originally announced July 2022.

    Comments: 8 pages, 2 figures

  20. arXiv:2206.03461  [pdf, other]

    cs.CV eess.IV q-bio.QM

    Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models

    Authors: Walter H. L. Pinaya, Mark S. Graham, Robert Gray, Pedro F Da Costa, Petru-Daniel Tudosiu, Paul Wright, Yee H. Mah, Andrew D. MacKinnon, James T. Teo, Rolf Jager, David Werring, Geraint Rees, Parashkev Nachev, Sebastien Ourselin, M. Jorge Cardoso

    Abstract: Deep generative models have emerged as promising tools for detecting arbitrary anomalies in data, dispensing with the necessity for manual labelling. Recently, autoregressive transformers have achieved state-of-the-art performance for anomaly detection in medical imaging. Nonetheless, these models still have some intrinsic weaknesses, such as requiring images to be modelled as 1D sequences, the ac…

    Submitted 7 June, 2022; originally announced June 2022.

  21. arXiv:2205.09185  [pdf, other]

    physics.ins-det cs.LG hep-ex nucl-ex physics.comp-ph

    AI-assisted Optimization of the ECCE Tracking System at the Electron Ion Collider

    Authors: C. Fanelli, Z. Papandreou, K. Suresh, J. K. Adkins, Y. Akiba, A. Albataineh, M. Amaryan, I. C. Arsene, C. Ayerbe Gayoso, J. Bae, X. Bai, M. D. Baker, M. Bashkanov, R. Bellwied, F. Benmokhtar, V. Berdnikov, J. C. Bernauer, F. Bock, W. Boeglin, M. Borysova, E. Brash, P. Brindza, W. J. Briscoe, M. Brooks, S. Bueltmann, et al. (258 additional authors not shown)

    Abstract: The Electron-Ion Collider (EIC) is a cutting-edge accelerator facility that will study the nature of the "glue" that binds the building blocks of the visible matter in the universe. The proposed experiment will be realized at Brookhaven National Laboratory in approximately 10 years from now, with detector design and R&D currently ongoing. Notably, EIC is one of the first large-scale facilities to…

    Submitted 19 May, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

    Comments: 16 pages, 18 figures, 2 appendices, 3 tables

  22. Rectangular mesh contour generation algorithm for finite differences calculus

    Authors: Pedro Zaffalon da Silva, Neyva Maria Lopes Romeiro, Iury Pereira de Souza, Paulo Laerte Natti, Eliandro Rodrigues Cirilo

    Abstract: In this work, a 2D contour generation algorithm is proposed for irregular regions. The contour of the physical domain is approximated by mesh segments using the known coordinates of the contour. For this purpose, the algorithm uses a repeating structure that analyzes the known irregular contour coordinates to approximate the physical domain contour by mesh segments. To this end, the algorithm calc…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: In Portuguese language

    Journal ref: In: Coleção desafios das engenharias: Engenharia de computação. Capítulo 4, 1ed. Ponta Grossa: Atena Editora, 2021, p. 38-52

  23. arXiv:2203.02728  [pdf, other]

    cs.CV

    An End-to-End Approach for Seam Carving Detection using Deep Neural Networks

    Authors: Thierry P. Moreira, Marcos Cleison S. Santana, Leandro A. Passos, João Paulo Papa, Kelton Augusto P. da Costa

    Abstract: Seam carving is a computational method capable of resizing images for both reduction and expansion based on its content, instead of the image geometry. Although the technique is mostly employed to deal with redundant information, i.e., regions composed of pixels with similar intensity, it can also be used for tampering images by inserting or removing relevant objects. Therefore, detecting such a p…

    Submitted 5 March, 2022; originally announced March 2022.

  24. A Review of Deep Learning-based Approaches for Deepfake Content Detection

    Authors: Leandro A. Passos, Danilo Jodas, Kelton A. P. da Costa, Luis A. Souza Júnior, Douglas Rodrigues, Javier Del Ser, David Camacho, João Paulo Papa

    Abstract: Recent advancements in deep learning generative models have raised concerns as they can create highly convincing counterfeit images and videos. This poses a threat to people's integrity and can lead to social instability. To address this issue, there is a pressing need to develop new computational models that can efficiently detect forged content and alert users to potential image and video manipu…

    Submitted 15 February, 2024; v1 submitted 12 February, 2022; originally announced February 2022.

  25. arXiv:2201.10453  [pdf, other]

    cs.AI

    The First AI4TSP Competition: Learning to Solve Stochastic Routing Problems

    Authors: Laurens Bliek, Paulo da Costa, Reza Refaei Afshar, Yingqian Zhang, Tom Catshoek, Daniël Vos, Sicco Verwer, Fynn Schmitt-Ulms, André Hottung, Tapan Shah, Meinolf Sellmann, Kevin Tierney, Carl Perreault-Lafleur, Caroline Leboeuf, Federico Bobbio, Justine Pepin, Warley Almeida Silva, Ricardo Gama, Hugo L. Fernandes, Martin Zaefferer, Manuel López-Ibáñez, Ekhine Irurozki

    Abstract: This paper reports on the first international competition on AI for the traveling salesman problem (TSP) at the International Joint Conference on Artificial Intelligence 2021 (IJCAI-21). The TSP is one of the classical combinatorial optimization problems, with many variants inspired by real-world applications. This first competition asked the participants to develop algorithms to solve a time-depe…

    Submitted 25 January, 2022; originally announced January 2022.

    Comments: 21 pages

    MSC Class: 68T05

  26. arXiv:2110.15731  [pdf, other]

    cs.CL cs.SD eess.AS

    CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese

    Authors: Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Frederico Santos de Oliveira, Lucas Oliveira, Ricardo Corso Fernandes Junior, Daniel Peixoto Pinto da Silva, Fernando Gorgulho Fayet, Bruno Baldissera Carlotto, Lucas Rafael Stefanel Gris, Sandra Maria Aluísio

    Abstract: Automatic Speech Recognition (ASR) is a complex and challenging task. In recent years, there have been significant advances in the area. In particular, for the Brazilian Portuguese (BP) language, there were about 376 hours publicly available for the ASR task until the second half of 2020. With the release of new datasets in early 2021, this number increased to 574 hours. The existing resources, however,…

    Submitted 18 November, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: This paper is under consideration at Language Resources and Evaluation (LREV)

  27. arXiv:2109.08051  [pdf, ps, other]

    stat.ML cs.LG

    Frame by frame completion probability of an NFL pass

    Authors: Gustavo Pompeu da Silva, Rafael de Andrade Moral

    Abstract: American football is an increasingly popular sport, with a growing audience in many countries in the world. The most watched American football league in the world is the United States' National Football League (NFL), where every offensive play can be either a run or a pass, and in this work we focus on passes. Many factors can affect the probability of pass completion, such as receiver separation…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: 26 pages, 13 figures, 5 tables

  28. arXiv:2109.02148  [pdf, other]

    eess.SP cs.IT

    Adaptive Turbo Equalization for Nonlinearity Compensation in WDM Systems

    Authors: Edson Porto da Silva, Metodi Plamenov Yankov

    Abstract: In this paper, the performance of adaptive turbo equalization for nonlinearity compensation (NLC) is investigated. A turbo equalization scheme is proposed where a recursive least-squares (RLS) algorithm is used as an adaptive channel estimator to track the time-varying intersymbol interference (ISI) coefficients associated with inter-channel nonlinear interference (NLI) model. The estimated channe…

    Submitted 5 September, 2021; originally announced September 2021.

  29. arXiv:2108.13202  [pdf, other]

    cs.DC cs.AI

    PTRAIL -- A python package for parallel trajectory data preprocessing

    Authors: Salman Haidri, Yaksh J. Haranwala, Vania Bogorny, Chiara Renso, Vinicius Prado da Fonseca, Amilcar Soares

    Abstract: Trajectory data represent a trace of an object that changes its position in space over time. This kind of data is complex to handle and analyze, since it is generally produced in huge quantities, often prone to errors generated by the geolocation device, human mishandling, or area coverage limitation. Therefore, there is a need for software specifically tailored to preprocess trajectory data. In t…

    Submitted 26 August, 2021; originally announced August 2021.

  30. arXiv:2108.12214  [pdf, other]

    cs.DC cs.PF

    Machine Learning for Performance Prediction of Spark Cloud Applications

    Authors: Alexandre Maros, Fabricio Murai, Ana Paula Couto da Silva, Jussara M. Almeida, Marco Lattuada, Eugenio Gianniti, Marjan Hosseini, Danilo Ardagna

    Abstract: Big data applications and analytics are employed in many sectors for a variety of goals: improving customers satisfaction, predicting market behavior or improving processes in public health. These applications consist of complex software stacks that are often run on cloud systems. Predicting execution times is important for estimating the cost of cloud services and for effectively managing the und…

    Submitted 27 August, 2021; originally announced August 2021.

    Comments: Published in 2019 IEEE 12th International Conference on Cloud Computing (CLOUD)

    ACM Class: B.8.2; I.2

  31. arXiv:2108.09778  [pdf]

    cs.CY

    "Sharing Wisdoms from the East": Developing a Native Theory of ICT4D Using Grounded Theory Methodology (GTM) -- Experience from Timor-Leste

    Authors: Abel Pires da Silva

    Abstract: There have been repeated calls made for theory-building studies in ICT4D research to solidify the existence of this research field. However, theory-building studies are not yet common, even though ICT4D as a research domain is a promising venue to develop native and indigenous theories. To this end, this paper outlines a theory-building study in ICT4D, based on the author's experience in developin…

    Submitted 22 August, 2021; originally announced August 2021.

    Comments: In proceedings of the 1st Virtual Conference on Implications of Information and Digital Technologies for Development, 2021

  32. arXiv:2107.13589  [pdf, other]

    quant-ph cs.IT

    Improved quantum error correction using soft information

    Authors: Christopher A. Pattison, Michael E. Beverland, Marcus P. da Silva, Nicolas Delfosse

    Abstract: The typical model for measurement noise in quantum error correction is to randomly flip the binary measurement outcome. In experiments, measurements yield much richer information - e.g., continuous current values, discrete photon counts - which is then mapped into binary outcomes by discarding some of this information. In this work, we consider methods to incorporate all of this richer information…

    Submitted 28 July, 2021; originally announced July 2021.

    Comments: 27 pages, 13 figures

  33. arXiv:2107.04702  [pdf]

    cs.NE

    Um Metodo para Busca Automatica de Redes Neurais Artificiais

    Authors: Anderson P. da Silva, Teresa B. Ludermir, Leandro M. Almeida

    Abstract: This paper describes a method that automatically searches Artificial Neural Networks using Cellular Genetic Algorithms. The main difference of this method for a common genetic algorithm is the use of a cellular automaton capable of providing the location for individuals, reducing the possibility of local minima in search space. This method employs an evolutionary search for simultaneous choices of…

    Submitted 9 July, 2021; originally announced July 2021.

    Comments: 13 pages, in Portuguese, 4 figures, 2 tables

  34. arXiv:2105.15119  [pdf, other]

    math.OC cs.LG

    Policies for the Dynamic Traveling Maintainer Problem with Alerts

    Authors: Paulo da Costa, Peter Verleijsdonk, Simon Voorberg, Alp Akcay, Stella Kapodistria, Willem van Jaarsveld, Yingqian Zhang

    Abstract: Downtime of industrial assets such as wind turbines and medical imaging devices comes at a sharp cost. To avoid such downtime costs, companies seek to initiate maintenance just before failure. Unfortunately, this is challenging for the following two reasons: On the one hand, because asset failures are notoriously difficult to predict, even in the presence of real-time monitoring devices which sign…

    Submitted 20 May, 2022; v1 submitted 31 May, 2021; originally announced May 2021.

  35. The Effects of Spectral Dimensionality Reduction on Hyperspectral Pixel Classification: A Case Study

    Authors: Kiran Mantripragada, Phuong D. Dao, Yuhong He, Faisal Z. Qureshi

    Abstract: This paper presents a systematic study of the effects of hyperspectral pixel dimensionality reduction on the pixel classification task. We use five dimensionality reduction methods -- PCA, KPCA, ICA, AE, and DAE -- to compress 301-dimensional hyperspectral pixels. Compressed pixels are subsequently used to perform pixel classifications. Pixel classification accuracies together with compression met…

    Submitted 27 January, 2022; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: 15 pages

  36. arXiv:2012.09575  [pdf, other]

    cs.LG cs.AI q-bio.GN

    Task Uncertainty Loss Reduce Negative Transfer in Asymmetric Multi-task Feature Learning

    Authors: Rafael Peres da Silva, Chayaporn Suphavilai, Niranjan Nagarajan

    Abstract: Multi-task learning (MTL) is frequently used in settings where a target task has to be learnt based on limited training data, but knowledge can be leveraged from related auxiliary tasks. While MTL can improve task performance overall relative to single-task learning (STL), these improvements can hide negative transfer (NT), where STL may deliver better performance for many individual tasks. Asymme…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: Accepted in AAAI 2021 Student Abstract and Poster Program

  37. arXiv:2007.09989  [pdf, other]

    cs.LG cs.HC stat.ML

    Bayesian optimization for automatic design of face stimuli

    Authors: Pedro F. da Costa, Romy Lorenz, Ricardo Pio Monti, Emily Jones, Robert Leech

    Abstract: Investigating the cognitive and neural mechanisms involved with face processing is a fundamental task in modern neuroscience and psychology. To date, the majority of such studies have focused on the use of pre-selected stimuli. The absence of personalized stimuli presents a serious limitation as it fails to account for how each individual face processing system is tuned to cultural embeddings or h…

    Submitted 20 July, 2020; originally announced July 2020.

    Comments: Accepted at ICML2020 workshop track

  38. arXiv:2006.15401  [pdf, other]

    cs.SI physics.soc-ph

    You Shall not Pass: Avoiding Spurious Paths in Shortest-Path Based Centralities in Multidimensional Complex Networks

    Authors: Klaus Wehmuth, Artur Ziviani, Leonardo Chinelate Costa, Ana Paula Couto da Silva, Alex Borges Vieira

    Abstract: In complex network analysis, centralities based on shortest paths, such as betweenness and closeness, are widely used. More recently, many complex systems are being represented by time-varying, multilayer, and time-varying multilayer networks, i.e. multidimensional (or high order) networks. Nevertheless, it is well-known that the aggregation process may create spurious paths on the aggregated view…

    Submitted 19 August, 2020; v1 submitted 27 June, 2020; originally announced June 2020.

    Comments: 17 pages, 6 figures

  39. arXiv:2005.14650  [pdf, ps, other]

    cs.PL

    WhylSon: Proving your Michelson Smart Contracts in Why3

    Authors: Luís Pedro Arrojado da Horta, João Santos Reis, Mário Pereira, Simão Melo de Sousa

    Abstract: This paper introduces WhylSon, a deductive verification tool for smart contracts written in Michelson, which is the low-level language of the Tezos blockchain. WhylSon accepts a formally specified Michelson contract and automatically translates it to an equivalent program written in WhyML, the programming and specification language of the Why3 framework. Smart contract instructions are mapped into…

    Submitted 29 May, 2020; originally announced May 2020.

  40. arXiv:2005.07473  [pdf, other]

    cs.LG cs.CL cs.SI stat.ML

    Predicting User Emotional Tone in Mental Disorder Online Communities

    Authors: Bárbara Silveira, Henrique S. Silva, Fabricio Murai, Ana Paula Couto da Silva

    Abstract: In recent years, Online Social Networks have become an important medium for people who suffer from mental disorders to share moments of hardship, and receive emotional and informational support. In this work, we analyze how discussions in Reddit communities related to mental disorders can help improve the health conditions of their users. Using the emotional tone of users' writing as a proxy for e…

    Submitted 27 July, 2021; v1 submitted 15 May, 2020; originally announced May 2020.

    Comments: 8 pages, 3 figures, 3 tables

    ACM Class: J.3; I.2.7

    Journal ref: Future Generation Computer Systems, Volume 125, 2021, Pages 641-651, ISSN 0167-739X

  41. arXiv:2005.01291  [pdf, other]

    cs.HC cs.AI cs.LG

    Human Strategic Steering Improves Performance of Interactive Optimization

    Authors: Fabio Colella, Pedram Daee, Jussi Jokinen, Antti Oulasvirta, Samuel Kaski

    Abstract: A central concern in an interactive intelligent system is optimization of its actions, to be maximally helpful to its human user. In recommender systems for instance, the action is to choose what to recommend, and the optimization task is to recommend items the user prefers. The optimization is done based on earlier user's feedback (e.g. "likes" and "dislikes"), and the algorithms assume the feedb…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: 10 pages, 5 figures, The paper is published in the proceedings of UMAP 2020. Codes available at https://github.com/fcole90/interactive_bayesian_optimisation

  42. arXiv:2004.09397  [pdf, other]

    cs.LG stat.ML

    Multi-label Stream Classification with Self-Organizing Maps

    Authors: Ricardo Cerri, Joel David Costa Júnior, Elaine Ribeiro de Faria Paiva, João Manuel Portela da Gama

    Abstract: Several learning algorithms have been proposed for offline multi-label classification. However, applications in areas such as traffic monitoring, social networks, and sensors produce data continuously, the so called data streams, posing challenges to batch multi-label learning. With the lack of stationarity in the distribution of data streams, new algorithms are needed to online adapt to such chan…

    Submitted 20 April, 2020; originally announced April 2020.

    Comments: 7 pages, 14 figures

    ACM Class: I.2.6

  43. An Efficient Method for Computing Liouvillian First Integrals of Planar Polynomial Vector Fields

    Authors: L. G. S. Duarte, L. A. C. P. da Mota

    Abstract: Here we present an efficient method to compute Darboux polynomials for polynomial vector fields in the plane. This approach is restricted to polynomial vector fields presenting a Liouvillian first integral (or, equivalently, to rational first order differential equations (rational 1ODEs) presenting a Liouvillian general solution). The key to obtaining this method was to separate the procedure of s…

    Submitted 20 April, 2020; originally announced April 2020.

    Journal ref: Journal of Differential Equations Volume 300, 5 November 2021, Pages 356-385

  44. arXiv:2004.01608  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning 2-opt Heuristics for the Traveling Salesman Problem via Deep Reinforcement Learning

    Authors: Paulo R. de O. da Costa, Jason Rhuggenaath, Yingqian Zhang, Alp Akcay

    Abstract: Recent works using deep learning to solve the Traveling Salesman Problem (TSP) have focused on learning construction heuristics. Such approaches find TSP solutions of good quality but require additional procedures such as beam search and sampling to improve solutions and achieve state-of-the-art performance. However, few studies have focused on improvement heuristics, where a given solution is imp…

    Submitted 14 September, 2020; v1 submitted 3 April, 2020; originally announced April 2020.

    Comments: To appear in Proceedings Machine Learning Research - ACML 2020

  45. arXiv:2002.11213  [pdf, other]

    cs.CL cs.SD eess.AS

    Speech2Phone: A Novel and Efficient Method for Training Speaker Recognition Models

    Authors: Edresson Casanova, Arnaldo Candido Junior, Christopher Shulby, Frederico Santos de Oliveira, Lucas Rafael Stefanel Gris, Hamilton Pereira da Silva, Sandra Maria Aluisio, Moacir Antonelli Ponti

    Abstract: In this paper we present an efficient method for training models for speaker recognition using small or under-resourced datasets. This method requires less data than other SOTA (State-Of-The-Art) methods, e.g. the Angular Prototypical and GE2E loss functions, while achieving similar results to those methods. This is done using the knowledge of the reconstruction of a phoneme in the speaker's voice…

    Submitted 18 June, 2021; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: Submitted to BRACIS

  46. arXiv:2002.07481  [pdf, other]

    cs.GR

    Quantitative Evaluation of Time-Dependent Multidimensional Projection Techniques

    Authors: E. F. Vernier, R. Garcia, I. P. da Silva, J. L. D. Comba, A. C. Telea

    Abstract: Dimensionality reduction methods are an essential tool for multidimensional data analysis, and many interesting processes can be studied as time-dependent multivariate datasets. There are, however, few studies and proposals that leverage on the concise power of expression of projections in the context of dynamic/temporal data. In this paper, we aim at providing an approach to assess projection tec…

    Submitted 18 February, 2020; originally announced February 2020.

  47. SemClinBr -- a multi institutional and multi specialty semantically annotated corpus for Portuguese clinical NLP tasks

    Authors: Lucas Emanuel Silva e Oliveira, Ana Carolina Peters, Adalniza Moura Pucca da Silva, Caroline P. Gebeluca, Yohan Bonescki Gumiel, Lilian Mie Mukai Cintho, Deborah Ribeiro Carvalho, Sadid A. Hasan, Claudia Maria Cabral Moro

    Abstract: The high volume of research focusing on extracting patient's information from electronic health records (EHR) has led to an increase in the demand for annotated corpora, which are a very valuable resource for both the development and evaluation of natural language processing (NLP) algorithms. The absence of a multi-purpose clinical corpus outside the scope of the English language, especially in Br…

    Submitted 27 January, 2020; originally announced January 2020.

  48. arXiv:2001.09896  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    Semantic Sensitive TF-IDF to Determine Word Relevance in Documents

    Authors: Amir Jalilifard, Vinicius F. Caridá, Alex F. Mansano, Rogers S. Cristo, Felipe Penhorate C. da Fonseca

    Abstract: Keyword extraction has received an increasing attention as an important research topic which can lead to have advancements in diverse applications such as document context categorization, text indexing and document classification. In this paper we propose STF-IDF, a novel semantic method based on TF-IDF, for scoring word importance of informal documents in a corpus. A set of nearly four million do…

    Submitted 25 January, 2021; v1 submitted 5 January, 2020; originally announced January 2020.

    Comments: 11 pages, 2 figures, 22 references

  49. arXiv:2001.08966  [pdf, other]

    cs.NE

    Design optimisation of a multi-mode wave energy converter

    Authors: Nataliia Y. Sergiienko, Mehdi Neshat, Leandro S. P. da Silva, Bradley Alexander, Markus Wagner

    Abstract: A wave energy converter (WEC) similar to the CETO system developed by Carnegie Clean Energy is considered for design optimisation. This WEC is able to absorb power from heave, surge and pitch motion modes, making the optimisation problem nontrivial. The WEC dynamics is simulated using the spectral-domain model taking into account hydrodynamic forces, viscous drag, and power take-off forces. The de…

    Submitted 24 January, 2020; originally announced January 2020.

  50. A quantum-classical cloud platform optimized for variational hybrid algorithms

    Authors: Peter J. Karalekas, Nikolas A. Tezak, Eric C. Peterson, Colm A. Ryan, Marcus P. da Silva, Robert S. Smith

    Abstract: In order to support near-term applications of quantum computing, a new compute paradigm has emerged--the quantum-classical cloud--in which quantum computers (QPUs) work in tandem with classical computers (CPUs) via a shared cloud infrastructure. In this work, we enumerate the architectural requirements of a quantum-classical cloud platform, and present a framework for benchmarking its runtime perf…

    Submitted 30 May, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

    Comments: 21 pages, 8 figures; updated references to match published version

    Journal ref: Quantum Sci. Technol. 5 024003 (2020)