Search | arXiv e-print repository

arXiv:2403.03832 [pdf]

Your device may know you better than you know yourself -- continuous authentication on novel dataset using machine learning

Authors: Pedro Gomes do Nascimento, Pidge Witiak, Tucker MacCallum, Zachary Winterfeldt, Rushit Dave

Abstract: This research aims to further understanding in the field of continuous authentication using behavioral biometrics. We are contributing a novel dataset that encompasses the gesture data of 15 users playing Minecraft with a Samsung Tablet, each for a duration of 15 minutes. Utilizing this dataset, we employed machine learning (ML) binary classifiers, being Random Forest (RF), K-Nearest Neighbors (KN… ▽ More This research aims to further understanding in the field of continuous authentication using behavioral biometrics. We are contributing a novel dataset that encompasses the gesture data of 15 users playing Minecraft with a Samsung Tablet, each for a duration of 15 minutes. Utilizing this dataset, we employed machine learning (ML) binary classifiers, being Random Forest (RF), K-Nearest Neighbors (KNN), and Support Vector Classifier (SVC), to determine the authenticity of specific user actions. Our most robust model was SVC, which achieved an average accuracy of approximately 90%, demonstrating that touch dynamics can effectively distinguish users. However, further studies are needed to make it viable option for authentication systems △ Less

Submitted 6 March, 2024; originally announced March 2024.

arXiv:2401.15024 [pdf, other]

SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Authors: Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari do Nascimento, Torsten Hoefler, James Hensman

Abstract: Large language models have become the cornerstone of natural language processing, but their use comes with substantial costs in terms of compute and memory resources. Sparsification provides a solution to alleviate these resource constraints, and recent works have shown that trained models can be sparsified post-hoc. Existing sparsification techniques face challenges as they need additional data s… ▽ More Large language models have become the cornerstone of natural language processing, but their use comes with substantial costs in terms of compute and memory resources. Sparsification provides a solution to alleviate these resource constraints, and recent works have shown that trained models can be sparsified post-hoc. Existing sparsification techniques face challenges as they need additional data structures and offer constrained speedup with current hardware. In this paper we present SliceGPT, a new post-training sparsification scheme which replaces each weight matrix with a smaller (dense) matrix, reducing the embedding dimension of the network. Through extensive experimentation, we show that SliceGPT can remove up to 25% of the model parameters (including embeddings) for LLAMA2-70B, OPT 66B and Phi-2 models while maintaining 99%, 99% and 90% zero-shot task performance of the dense model respectively. Our sliced models run on fewer GPUs and run faster without any additional code optimization: on 24GB consumer GPUs we reduce the total compute for inference on LLAMA2-70B to 64% of that of the dense model; on 40GB A100 GPUs we reduce it to 66%. We offer a new insight, computational invariance in transformer networks, which enables SliceGPT and we hope it will inspire and enable future avenues to reduce memory and computation demands for pre-trained models. Code is available at: https://github.com/microsoft/TransformerCompression △ Less

Submitted 9 February, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

Comments: 22 pages, 8 figures, accepted at ICLR24

arXiv:2311.08081 [pdf, other]

Evolutionary-enhanced quantum supervised learning model

Authors: Anton Simen Albino, Rodrigo Bloot, Otto M. Pires, Erick G. S. Nascimento

Abstract: Quantum supervised learning, utilizing variational circuits, stands out as a promising technology for NISQ devices due to its efficiency in hardware resource utilization during the creation of quantum feature maps and the implementation of hardware-efficient ansatz with trainable parameters. Despite these advantages, the training of quantum models encounters challenges, notably the barren plateau… ▽ More Quantum supervised learning, utilizing variational circuits, stands out as a promising technology for NISQ devices due to its efficiency in hardware resource utilization during the creation of quantum feature maps and the implementation of hardware-efficient ansatz with trainable parameters. Despite these advantages, the training of quantum models encounters challenges, notably the barren plateau phenomenon, leading to stagnation in learning during optimization iterations. This study proposes an innovative approach: an evolutionary-enhanced ansatz-free supervised learning model. In contrast to parametrized circuits, our model employs circuits with variable topology that evolves through an elitist method, mitigating the barren plateau issue. Additionally, we introduce a novel concept, the superposition of multi-hot encodings, facilitating the treatment of multi-classification problems. Our framework successfully avoids barren plateaus, resulting in enhanced model accuracy. Comparative analysis with variational quantum classifiers from the technology's state-of-the-art reveal a substantial improvement in training efficiency and precision. Furthermore, we conduct tests on a challenging dataset class, traditionally problematic for conventional kernel machines, demonstrating a potential alternative path for achieving quantum advantage in supervised learning for NISQ era. △ Less

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2304.07310 [pdf]

doi 10.1093/jamia/ocad170

Federated and distributed learning applications for electronic health records and structured medical data: A sco** review

Authors: Siqi Li, Pinyan Liu, Gustavo G. Nascimento, Xinru Wang, Fabio Renato Manzolli Leite, Bibhas Chakraborty, Chuan Hong, Yilin Ning, Feng Xie, Zhen Ling Teo, Daniel Shu Wei Ting, Hamed Haddadi, Marcus Eng Hock Ong, Marco Aurélio Peres, Nan Liu

Abstract: Federated learning (FL) has gained popularity in clinical research in recent years to facilitate privacy-preserving collaboration. Structured data, one of the most prevalent forms of clinical data, has experienced significant growth in volume concurrently, notably with the widespread adoption of electronic health records in clinical practice. This review examines FL applications on structured medi… ▽ More Federated learning (FL) has gained popularity in clinical research in recent years to facilitate privacy-preserving collaboration. Structured data, one of the most prevalent forms of clinical data, has experienced significant growth in volume concurrently, notably with the widespread adoption of electronic health records in clinical practice. This review examines FL applications on structured medical data, identifies contemporary limitations and discusses potential innovations. We searched five databases, SCOPUS, MEDLINE, Web of Science, Embase, and CINAHL, to identify articles that applied FL to structured medical data and reported results following the PRISMA guidelines. Each selected publication was evaluated from three primary perspectives, including data quality, modeling strategies, and FL frameworks. Out of the 1160 papers screened, 34 met the inclusion criteria, with each article consisting of one or more studies that used FL to handle structured clinical/medical data. Of these, 24 utilized data acquired from electronic health records, with clinical predictions and association studies being the most common clinical research tasks that FL was applied to. Only one article exclusively explored the vertical FL setting, while the remaining 33 explored the horizontal FL setting, with only 14 discussing comparisons between single-site (local) and FL (global) analysis. The existing FL applications on structured medical data lack sufficient evaluations of clinically meaningful benefits, particularly when compared to single-site analyses. Therefore, it is crucial for future FL applications to prioritize clinical motivations and develop designs and methodologies that can effectively support and aid clinical practice and research. △ Less

Submitted 14 April, 2023; originally announced April 2023.

arXiv:2303.07131 [pdf, other]

Evolutionary quantum feature selection

Authors: Anton S. Albino, Otto M. Pires, Mauro Q. Nooblath, Erick G. S. Nascimento

Abstract: Effective feature selection is essential for enhancing the performance of artificial intelligence models. It involves identifying feature combinations that optimize a given metric, but this is a challenging task due to the problem's exponential time complexity. In this study, we present an innovative heuristic called Evolutionary Quantum Feature Selection (EQFS) that employs the Quantum Circuit Ev… ▽ More Effective feature selection is essential for enhancing the performance of artificial intelligence models. It involves identifying feature combinations that optimize a given metric, but this is a challenging task due to the problem's exponential time complexity. In this study, we present an innovative heuristic called Evolutionary Quantum Feature Selection (EQFS) that employs the Quantum Circuit Evolution (QCE) algorithm. Our approach harnesses the unique capabilities of QCE, which utilizes shallow depth circuits to generate sparse probability distributions. Our computational experiments demonstrate that EQFS can identify good feature combinations with quadratic scaling in the number of features. To evaluate EQFS's performance, we counted the number of times a given classical model assesses the cost function for a specific metric, as a function of the number of generations. △ Less

Submitted 13 March, 2023; originally announced March 2023.

arXiv:2204.07182 [pdf, other]

Analysing similarities between legal court documents using natural language processing approaches based on Transformers

Authors: Raphael Souza de Oliveira, Erick Giovani Sperandio Nascimento

Abstract: Recent advances in Artificial Intelligence (AI) have leveraged promising results in solving complex problems in the area of Natural Language Processing (NLP), being an important tool to help in the expeditious resolution of judicial proceedings in the legal area. In this context, this work targets the problem of detecting the degree of similarity between judicial documents that can be achieved in… ▽ More Recent advances in Artificial Intelligence (AI) have leveraged promising results in solving complex problems in the area of Natural Language Processing (NLP), being an important tool to help in the expeditious resolution of judicial proceedings in the legal area. In this context, this work targets the problem of detecting the degree of similarity between judicial documents that can be achieved in the inference group, by applying six NLP techniques based on the transformers architecture to a case study of legal proceedings in the Brazilian judicial system. The NLP transformer-based models, namely BERT, GPT-2 and RoBERTa, were pre-trained using a general purpose corpora of the Brazilian Portuguese language, and then were fine-tuned and specialised for the legal sector using 210,000 legal proceedings. Vector representations of each legal document were calculated based on their embeddings, which were used to cluster the lawsuits, calculating the quality of each model based on the cosine of the distance between the elements of the group to its centroid. We noticed that models based on transformers presented better performance when compared to previous traditional NLP techniques, with the RoBERTa model specialised for the Brazilian Portuguese language presenting the best results. This methodology can be also applied to other case studies for different languages, making it possible to advance in the current state of the art in the area of NLP applied to the legal sector. △ Less

Submitted 11 May, 2023; v1 submitted 14 April, 2022; originally announced April 2022.

arXiv:2204.03725 [pdf, other]

T4PdM: a Deep Neural Network based on the Transformer Architecture for Fault Diagnosis of Rotating Machinery

Authors: Erick Giovani Sperandio Nascimento, Julian Santana Liang, Ilan Sousa Figueiredo, Lilian Lefol Nani Guarieiro

Abstract: Deep learning and big data algorithms have become widely used in industrial applications to optimize several tasks in many complex systems. Particularly, deep learning model for diagnosing and prognosing machinery health has leveraged predictive maintenance (PdM) to be more accurate and reliable in decision making, in this way avoiding unnecessary interventions, machinery accidents, and environmen… ▽ More Deep learning and big data algorithms have become widely used in industrial applications to optimize several tasks in many complex systems. Particularly, deep learning model for diagnosing and prognosing machinery health has leveraged predictive maintenance (PdM) to be more accurate and reliable in decision making, in this way avoiding unnecessary interventions, machinery accidents, and environment catastrophes. Recently, Transformer Neural Networks have gained notoriety and have been increasingly the favorite choice for Natural Language Processing (NLP) tasks. Thus, given their recent major achievements in NLP, this paper proposes the development of an automatic fault classifier model for predictive maintenance based on a modified version of the Transformer architecture, namely T4PdM, to identify multiple types of faults in rotating machinery. Experimental results are developed and presented for the MaFaulDa and CWRU databases. T4PdM was able to achieve an overall accuracy of 99.98% and 98% for both datasets, respectively. In addition, the performance of the proposed model is compared to other previously published works. It has demonstrated the superiority of the model in detecting and classifying faults in rotating industrial machinery. Therefore, the proposed Transformer-based model can improve the performance of machinery fault analysis and diagnostic processes and leverage companies to a new era of the Industry 4.0. In addition, this methodology can be adapted to any other task of time series classification. △ Less

Submitted 7 April, 2022; originally announced April 2022.

arXiv:2202.08913 [pdf, other]

doi 10.3390/electronics11091473

Machine learning models and facial regions videos for estimating heart rate: a review on Patents, Datasets and Literature

Authors: Tiago Palma Pagano, Lucas Lemos Ortega, Victor Rocha Santos, Yasmin da Silva Bonfim, José Vinícius Dantas Paranhos, Paulo Henrique Miranda Sá, Lian Filipe Santana Nascimento, Ingrid Winkler, Erick Giovani Sperandio Nascimento

Abstract: Estimating heart rate is important for monitoring users in various situations. Estimates based on facial videos are increasingly being researched because it makes it possible to monitor cardiac information in a non-invasive way and because the devices are simpler, requiring only cameras that capture the user's face. From these videos of the user's face, machine learning is able to estimate heart r… ▽ More Estimating heart rate is important for monitoring users in various situations. Estimates based on facial videos are increasingly being researched because it makes it possible to monitor cardiac information in a non-invasive way and because the devices are simpler, requiring only cameras that capture the user's face. From these videos of the user's face, machine learning is able to estimate heart rate. This study investigates the benefits and challenges of using machine learning models to estimate heart rate from facial videos, through patents, datasets, and articles review. We searched Derwent Innovation, IEEE Xplore, Scopus, and Web of Science knowledge bases and identified 7 patent filings, 11 datasets, and 20 articles on heart rate, photoplethysmography, or electrocardiogram data. In terms of patents, we note the advantages of inventions related to heart rate estimation, as described by the authors. In terms of datasets, we discovered that most of them are for academic purposes and with different signs and annotations that allow coverage for subjects other than heartbeat estimation. In terms of articles, we discovered techniques, such as extracting regions of interest for heart rate reading and using Video Magnification for small motion extraction, and models such as EVM-CNN and VGG-16, that extract the observed individual's heart rate, the best regions of interest for signal extraction and ways to process them. △ Less

Submitted 17 February, 2022; originally announced February 2022.

arXiv:2202.08176 [pdf, other]

Bias and unfairness in machine learning models: a systematic literature review

Authors: Tiago Palma Pagano, Rafael Bessa Loureiro, Fernanda Vitória Nascimento Lisboa, Gustavo Oliveira Ramos Cruz, Rodrigo Matos Peixoto, Guilherme Aragão de Sousa Guimarães, Lucas Lisboa dos Santos, Maira Matos Araujo, Marco Cruz, Ewerton Lopes Silva de Oliveira, Ingrid Winkler, Erick Giovani Sperandio Nascimento

Abstract: One of the difficulties of artificial intelligence is to ensure that model decisions are fair and free of bias. In research, datasets, metrics, techniques, and tools are applied to detect and mitigate algorithmic unfairness and bias. This study aims to examine existing knowledge on bias and unfairness in Machine Learning models, identifying mitigation methods, fairness metrics, and supporting tool… ▽ More One of the difficulties of artificial intelligence is to ensure that model decisions are fair and free of bias. In research, datasets, metrics, techniques, and tools are applied to detect and mitigate algorithmic unfairness and bias. This study aims to examine existing knowledge on bias and unfairness in Machine Learning models, identifying mitigation methods, fairness metrics, and supporting tools. A Systematic Literature Review found 40 eligible articles published between 2017 and 2022 in the Scopus, IEEE Xplore, Web of Science, and Google Scholar knowledge bases. The results show numerous bias and unfairness detection and mitigation approaches for ML technologies, with clearly defined metrics in the literature, and varied metrics can be highlighted. We recommend further research to define the techniques and metrics that should be employed in each case to standardize and ensure the impartiality of the machine learning model, thus, allowing the most appropriate metric to detect bias and unfairness in a given context. △ Less

Submitted 3 November, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

arXiv:2007.07743 [pdf, other]

Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes

Authors: Marcelo Gennari do Nascimento, Theo W. Costain, Victor Adrian Prisacariu

Abstract: We propose a novel method for neural network quantization that casts the neural architecture search problem as one of hyperparameter search to find non-uniform bit distributions throughout the layers of a CNN. We perform the search assuming a Multi-Task Gaussian Processes prior, which splits the problem to multiple tasks, each corresponding to different number of training epochs, and explore the s… ▽ More We propose a novel method for neural network quantization that casts the neural architecture search problem as one of hyperparameter search to find non-uniform bit distributions throughout the layers of a CNN. We perform the search assuming a Multi-Task Gaussian Processes prior, which splits the problem to multiple tasks, each corresponding to different number of training epochs, and explore the space by sampling those configurations that yield maximum information. We then show that with significantly lower precision in the last layers we achieve a minimal loss of accuracy with appreciable memory savings. We test our findings on the CIFAR10 and ImageNet datasets using the VGG, ResNet and GoogLeNet architectures. △ Less

Submitted 20 July, 2020; v1 submitted 15 July, 2020; originally announced July 2020.

Comments: Accepted for publication at ECCV 2020. Code availiable at https://code.active.vision . Updated for typo

arXiv:1901.05512 [pdf, other]

Fleet Prognosis with Physics-informed Recurrent Neural Networks

Authors: Renato Giorgiani Nascimento, Felipe A. C. Viana

Abstract: Services and warranties of large fleets of engineering assets is a very profitable business. The success of companies in that area is often related to predictive maintenance driven by advanced analytics. Therefore, accurate modeling, as a way to understand how the complex interactions between operating conditions and component capability define useful life, is key for services profitability. Unfor… ▽ More Services and warranties of large fleets of engineering assets is a very profitable business. The success of companies in that area is often related to predictive maintenance driven by advanced analytics. Therefore, accurate modeling, as a way to understand how the complex interactions between operating conditions and component capability define useful life, is key for services profitability. Unfortunately, building prognosis models for large fleets is a daunting task as factors such as duty cycle variation, harsh environments, inadequate maintenance, and problems with mass production can lead to large discrepancies between designed and observed useful lives. This paper introduces a novel physics-informed neural network approach to prognosis by extending recurrent neural networks to cumulative damage models. We propose a new recurrent neural network cell designed to merge physics-informed and data-driven layers. With that, engineers and scientists have the chance to use physics-informed layers to model parts that are well understood (e.g., fatigue crack growth) and use data-driven layers to model parts that are poorly characterized (e.g., internal loads). A simple numerical experiment is used to present the main features of the proposed physics-informed recurrent neural network for damage accumulation. The test problem consist of predicting fatigue crack length for a synthetic fleet of airplanes subject to different mission mixes. The model is trained using full observation inputs (far-field loads) and very limited observation of outputs (crack length at inspection for only a portion of the fleet). The results demonstrate that our proposed hybrid physics-informed recurrent neural network is able to accurately model fatigue crack growth even when the observed distribution of crack length does not match with the (unobservable) fleet distribution. △ Less

Submitted 16 January, 2019; originally announced January 2019.

Comments: Data and codes (including our implementation for both the multi-layer perceptron, the stress intensity and Paris law layers, the cumulative damage cell, as well as python driver scripts) used in this manuscript are publicly available on GitHub at https://github.com/PML-UCF/pinn. The data and code are released under the MIT License

arXiv:1708.07555 [pdf, other]

A Robust Indoor Scene Recognition Method based on Sparse Representation

Authors: Guilherme Nascimento, Camila Laranjeira, Vinicius Braz, Anisio Lacerda, Erickson R. Nascimento

Abstract: In this paper, we present a robust method for scene recognition, which leverages Convolutional Neural Networks (CNNs) features and Sparse Coding setting by creating a new representation of indoor scenes. Although CNNs highly benefited the fields of computer vision and pattern recognition, convolutional layers adjust weights on a global-approach, which might lead to losing important local details s… ▽ More In this paper, we present a robust method for scene recognition, which leverages Convolutional Neural Networks (CNNs) features and Sparse Coding setting by creating a new representation of indoor scenes. Although CNNs highly benefited the fields of computer vision and pattern recognition, convolutional layers adjust weights on a global-approach, which might lead to losing important local details such as objects and small structures. Our proposed scene representation relies on both: global features that mostly refers to environment's structure, and local features that are sparsely combined to capture characteristics of common objects of a given scene. This new representation is based on fragments of the scene and leverages features extracted by CNNs. The experimental evaluation shows that the resulting representation outperforms previous scene recognition methods on Scene15 and MIT67 datasets, and performs competitively on SUN397, while being highly robust to perturbations in the input image such as noise and occlusion. △ Less

Submitted 24 August, 2017; originally announced August 2017.

Comments: CIARP 2017. To appear

Showing 1–12 of 12 results for author: Nascimento, G