-
PORTULAN ExtraGLUE Datasets and Models: Kick-starting a Benchmark for the Neural Processing of Portuguese
Authors:
Tomás Osório,
Bernardo Leite,
Henrique Lopes Cardoso,
Luís Gomes,
João Rodrigues,
Rodrigo Santos,
António Branco
Abstract:
Leveraging research on the neural modelling of Portuguese, we contribute a collection of datasets for an array of language processing tasks and a corresponding collection of fine-tuned neural language models on these downstream tasks. To align with mainstream benchmarks in the literature, originally developed in English, and to kick start their Portuguese counterparts, the datasets were machine-tr…
▽ More
Leveraging research on the neural modelling of Portuguese, we contribute a collection of datasets for an array of language processing tasks and a corresponding collection of fine-tuned neural language models on these downstream tasks. To align with mainstream benchmarks in the literature, originally developed in English, and to kick start their Portuguese counterparts, the datasets were machine-translated from English with a state-of-the-art translation engine. The resulting PORTULAN ExtraGLUE benchmark is a basis for research on Portuguese whose improvement can be pursued in future work. Similarly, the respective fine-tuned neural language models, developed with a low-rank adaptation approach, are made available as baselines that can stimulate future work on the neural processing of Portuguese. All datasets and models have been developed and are made available for two variants of Portuguese: European and Brazilian.
△ Less
Submitted 8 May, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
A Benchmark for Data Management in Microservices
Authors:
Rodrigo Laigner,
Zhexiang Zhang,
Yijian Liu,
Leonardo Freitas Gomes,
Yongluan Zhou
Abstract:
Microservice architectures emerged as a popular architecture for designing scalable distributed applications. Although microservices have been extensively employed in industry settings for over a decade, there is little understanding of the data management challenges that arise in these applications. As a result, it is difficult to advance data system technologies for supporting microservice appli…
▽ More
Microservice architectures emerged as a popular architecture for designing scalable distributed applications. Although microservices have been extensively employed in industry settings for over a decade, there is little understanding of the data management challenges that arise in these applications. As a result, it is difficult to advance data system technologies for supporting microservice applications. To fill this gap, we present Online Marketplace, a microservice benchmark that incorporates core data management challenges that existing benchmarks have not sufficiently addressed. These challenges include transaction processing, query processing, event processing, constraint enforcement, and data replication. We have defined criteria for various data management issues to enable proper comparison across data systems and platforms.
After specifying the benchmark, we present the challenges we faced in creating workloads that accurately reflect the dynamic state of the microservices. We also discuss issues that we encountered when implementing Online Marketplace in state-of-the-art data platforms and meeting the criteria. Our evaluation demonstrates that the benchmark is a valuable tool for testing important properties sought by microservice practitioners. As a result, our proposed benchmark will facilitate the design of future data systems to meet the expectations of microservice practitioners.
△ Less
Submitted 19 May, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
Fostering the Ecosystem of Open Neural Encoders for Portuguese with Albertina PT* Family
Authors:
Rodrigo Santos,
João Rodrigues,
Luís Gomes,
João Silva,
António Branco,
Henrique Lopes Cardoso,
Tomás Freitas Osório,
Bernardo Leite
Abstract:
To foster the neural encoding of Portuguese, this paper contributes foundation encoder models that represent an expansion of the still very scarce ecosystem of large language models specifically developed for this language that are fully open, in the sense that they are open source and openly distributed for free under an open license for any purpose, thus including research and commercial usages.…
▽ More
To foster the neural encoding of Portuguese, this paper contributes foundation encoder models that represent an expansion of the still very scarce ecosystem of large language models specifically developed for this language that are fully open, in the sense that they are open source and openly distributed for free under an open license for any purpose, thus including research and commercial usages. Like most languages other than English, Portuguese is low-resourced in terms of these foundational language resources, there being the inaugural 900 million parameter Albertina and 335 million Bertimbau. Taking this couple of models as an inaugural set, we present the extension of the ecosystem of state-of-the-art open encoders for Portuguese with a larger, top performance-driven model with 1.5 billion parameters, and a smaller, efficiency-driven model with 100 million parameters. While achieving this primary goal, further results that are relevant for this ecosystem were obtained as well, namely new datasets for Portuguese based on the SuperGLUE benchmark, which we also distribute openly.
△ Less
Submitted 5 March, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
Advancing Generative AI for Portuguese with Open Decoder Gervásio PT*
Authors:
Rodrigo Santos,
João Silva,
Luís Gomes,
João Rodrigues,
António Branco
Abstract:
To advance the neural decoding of Portuguese, in this paper we present a fully open Transformer-based, instruction-tuned decoder model that sets a new state of the art in this respect. To develop this decoder, which we named Gervásio PT*, a strong LLaMA~2 7B model was used as a starting point, and its further improvement through additional training was done over language resources that include new…
▽ More
To advance the neural decoding of Portuguese, in this paper we present a fully open Transformer-based, instruction-tuned decoder model that sets a new state of the art in this respect. To develop this decoder, which we named Gervásio PT*, a strong LLaMA~2 7B model was used as a starting point, and its further improvement through additional training was done over language resources that include new instruction data sets of Portuguese prepared for this purpose, which are also contributed in this paper. All versions of Gervásio are open source and distributed for free under an open license, including for either research or commercial usage, and can be run on consumer-grade hardware, thus seeking to contribute to the advancement of research and innovation in language technology for Portuguese.
△ Less
Submitted 5 March, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Ultra-low power sensor devices for monitoring physical activity and respiratory frequency in farmed fish
Authors:
Juan Antonio Martos-Sitcha,
Javier Sosa,
Dailos Ramos-Valido,
Francisco Javier Bravo,
Cristina Carmona-Duarte,
Henrique Leonel Gomes,
Josep A. Calduch-Giner,
Enric Cabruja,
Aurelio Vega,
Miguel Angel Ferrer,
Manuel Lozano,
Juan Antonio Montiel-Nelson,
Juan Manuel Afonso,
Jaume Perez-Sanchez
Abstract:
Integration of technological solutions aims to improve accuracy, precision and repeatability in farming operations, and biosensor devices are increasingly used for understanding basic biology during livestock production. The aim of this study was to design and validate a miniaturized tri-axial accelerometer for non-invasive monitoring of farmed fish with re-programmable schedule protocols.The devi…
▽ More
Integration of technological solutions aims to improve accuracy, precision and repeatability in farming operations, and biosensor devices are increasingly used for understanding basic biology during livestock production. The aim of this study was to design and validate a miniaturized tri-axial accelerometer for non-invasive monitoring of farmed fish with re-programmable schedule protocols.The device was attached to the operculum of gilthead sea bream and European sea bass juveniles for monitoring their physical activity by measurements of movement accelerations in x and y-axes, while records of operculum beats served as a measurement of respiratory frequency. Data post-processing of exercised fish in swimming test chambers revealed an exponential increase of fish accelerations with the increase of fish speed from 1 body-length to 4 body-lengths per second, while a close relationship between oxygen consumption and opercular frequency was consistently found.The usefulness of low computational load for data pre-processing with on-board algorithms was verified from low to submaximal exercise, increasing this procedure the autonomy of the system up to 6 h of data recording with different programmable schedules. Visual observations regarding tissue damage, feeding behavior and circulating levels of stress markers did not reveal at short term a negative impact of device tagging. Reduced plasma levels of triglycerides revealed a transient inhibition of feed intake in small fish, but this disturbance was not detected in larger fish. All this considered together is the proof of concept that miniaturized devices are suitable for non-invasive and reliable metabolic phenoty** of farmed fish to improve their overall performance and welfare. Further work is underway for improving the attachment procedure and the full device packaging.
△ Less
Submitted 30 January, 2024;
originally announced January 2024.
-
Solving the Discretised Multiphase Flow Equations with Interface Capturing on Structured Grids Using Machine Learning Libraries
Authors:
Boyang Chen,
Claire E. Heaney,
Jefferson L. M. A. Gomes,
Omar K. Matar,
Christopher C. Pain
Abstract:
This paper solves the discretised multiphase flow equations using tools and methods from machine-learning libraries. The idea comes from the observation that convolutional layers can be used to express a discretisation as a neural network whose weights are determined by the numerical method, rather than by training, and hence, we refer to this approach as Neural Networks for PDEs (NN4PDEs). To sol…
▽ More
This paper solves the discretised multiphase flow equations using tools and methods from machine-learning libraries. The idea comes from the observation that convolutional layers can be used to express a discretisation as a neural network whose weights are determined by the numerical method, rather than by training, and hence, we refer to this approach as Neural Networks for PDEs (NN4PDEs). To solve the discretised multiphase flow equations, a multigrid solver is implemented through a convolutional neural network with a U-Net architecture. Immiscible two-phase flow is modelled by the 3D incompressible Navier-Stokes equations with surface tension and advection of a volume fraction field, which describes the interface between the fluids. A new compressive algebraic volume-of-fluids method is introduced, based on a residual formulation using Petrov-Galerkin for accuracy and designed with NN4PDEs in mind. High-order finite-element based schemes are chosen to model a collapsing water column and a rising bubble. Results compare well with experimental data and other numerical results from the literature, demonstrating that, for the first time, finite element discretisations of multiphase flows can be solved using an approach based on (untrained) convolutional neural networks. A benefit of expressing numerical discretisations as neural networks is that the code can run, without modification, on CPUs, GPUs or the latest accelerators designed especially to run AI codes.
△ Less
Submitted 3 March, 2024; v1 submitted 12 January, 2024;
originally announced January 2024.
-
Advancing Neural Encoding of Portuguese with Transformer Albertina PT-*
Authors:
João Rodrigues,
Luís Gomes,
João Silva,
António Branco,
Rodrigo Santos,
Henrique Lopes Cardoso,
Tomás Osório
Abstract:
To advance the neural encoding of Portuguese (PT), and a fortiori the technological preparation of this language for the digital age, we developed a Transformer-based foundation model that sets a new state of the art in this respect for two of its variants, namely European Portuguese from Portugal (PT-PT) and American Portuguese from Brazil (PT-BR).
To develop this encoder, which we named Albert…
▽ More
To advance the neural encoding of Portuguese (PT), and a fortiori the technological preparation of this language for the digital age, we developed a Transformer-based foundation model that sets a new state of the art in this respect for two of its variants, namely European Portuguese from Portugal (PT-PT) and American Portuguese from Brazil (PT-BR).
To develop this encoder, which we named Albertina PT-*, a strong model was used as a starting point, DeBERTa, and its pre-training was done over data sets of Portuguese, namely over data sets we gathered for PT-PT and PT-BR, and over the brWaC corpus for PT-BR. The performance of Albertina and competing models was assessed by evaluating them on prominent downstream language processing tasks adapted for Portuguese.
Both Albertina PT-PT and PT-BR versions are distributed free of charge and under the most permissive license possible and can be run on consumer-grade hardware, thus seeking to contribute to the advancement of research and innovation in language technology for Portuguese.
△ Less
Submitted 20 June, 2023; v1 submitted 11 May, 2023;
originally announced May 2023.
-
Predicting IMDb Rating of TV Series with Deep Learning: The Case of Arrow
Authors:
Anna Luiza Gomes,
Getúlio Vianna,
Tatiana Escovedo,
Marcos Kalinowski
Abstract:
Context: The number of TV series offered nowadays is very high. Due to its large amount, many series are canceled due to a lack of originality that generates a low audience.
Problem: Having a decision support system that can show why some shows are a huge success or not would facilitate the choices of renewing or starting a show.
Solution: We studied the case of the series Arrow broadcasted by…
▽ More
Context: The number of TV series offered nowadays is very high. Due to its large amount, many series are canceled due to a lack of originality that generates a low audience.
Problem: Having a decision support system that can show why some shows are a huge success or not would facilitate the choices of renewing or starting a show.
Solution: We studied the case of the series Arrow broadcasted by CW Network and used descriptive and predictive modeling techniques to predict the IMDb rating. We assumed that the theme of the episode would affect its evaluation by users, so the dataset is composed only by the director of the episode, the number of reviews that episode got, the percentual of each theme extracted by the Latent Dirichlet Allocation (LDA) model of an episode, the number of viewers from Wikipedia and the rating from IMDb. The LDA model is a generative probabilistic model of a collection of documents made up of words.
Method: In this prescriptive research, the case study method was used, and its results were analyzed using a quantitative approach.
Summary of Results: With the features of each episode, the model that performed the best to predict the rating was Catboost due to a similar mean squared error of the KNN model but a better standard deviation during the test phase. It was possible to predict IMDb ratings with an acceptable root mean squared error of 0.55.
△ Less
Submitted 28 August, 2022;
originally announced August 2022.
-
Data Integrity Verification in Network Slicing using Oracles and Smart Contracts
Authors:
Joao Paulo de Brito Goncalves,
Gustavo Alochio,
Rodolfo da Silva Villaca,
Roberta Lima Gomes
Abstract:
The fifth-generation (5G) wireless networks are expected to provide various services compared to the 4G and previous generations of networks. The Quality of Service requirements can be quite different in terms of latency, bandwidth, reliability, and availability. 5G technology allows the fragmentation of the network into small pieces, known as network slices. This network slicing is done by specif…
▽ More
The fifth-generation (5G) wireless networks are expected to provide various services compared to the 4G and previous generations of networks. The Quality of Service requirements can be quite different in terms of latency, bandwidth, reliability, and availability. 5G technology allows the fragmentation of the network into small pieces, known as network slices. This network slicing is done by specific tools and the configuration must be protected from attacks that may be performed by malicious users. Thus in this paper, a solution to protect and prevent these failures from happening is addressed. For this solution to be carried out, a study was conducted on the Blockchain technology, as well as the use of Oracles in order to implement an integrity verification system, a system capable of assuring 5G network slices configuration integrity through a complete architecture involving Blockchain, Smart Contracts and Oracles.
△ Less
Submitted 28 July, 2022;
originally announced July 2022.
-
Virtual Reality Applications in Software Engineering Education: A Systematic Review
Authors:
Gustavo Vargas de Andrade,
André Luiz Cordeiro Gomes,
Felipe Rohr Hoinoski,
Marília Guterres Ferreira,
Pablo Schoeffel,
Adilson Vahldick
Abstract:
Requirement Engineering (RE) is a Software Engineering (SE) process of defining, documenting, and maintaining the requirements from a problem. It is one of the most complex processes of SE because it addresses the relation between customer and developer. RE learning may be abstract and complex for most students because many of them cannot visualize the subject directly applied. Through the advance…
▽ More
Requirement Engineering (RE) is a Software Engineering (SE) process of defining, documenting, and maintaining the requirements from a problem. It is one of the most complex processes of SE because it addresses the relation between customer and developer. RE learning may be abstract and complex for most students because many of them cannot visualize the subject directly applied. Through the advancement of technology, Virtual Reality (VR) hardware is becoming increasingly more accessible, and it is not rare to use it in education. Little research and systematic studies explain the integration between SE and VR, and even less between RE and VR. Hence, this systematic review proposes to select and present studies that relate the use of VR applications to teach SE and RE concepts. We selected nine studies to include in this review. Despite the lack of articles addressing the topic, the results from this study showed that the use of VR technologies for learning SE is still very seminal. The projects based essentially on visualization. There are lack of tasks to build modeling artifacts, and also interaction with stakeholders and other software engineers. Learning tasks and the monitoring of students' progress by teachers also need to be considered.
△ Less
Submitted 25 April, 2022;
originally announced April 2022.
-
Creating and Reenacting Controllable 3D Humans with Differentiable Rendering
Authors:
Thiago L. Gomes,
Thiago M. Coutinho,
Rafael Azevedo,
Renato Martins,
Erickson R. Nascimento
Abstract:
This paper proposes a new end-to-end neural rendering architecture to transfer appearance and reenact human actors. Our method leverages a carefully designed graph convolutional network (GCN) to model the human body manifold structure, jointly with differentiable rendering, to synthesize new videos of people in different contexts from where they were initially recorded. Unlike recent appearance tr…
▽ More
This paper proposes a new end-to-end neural rendering architecture to transfer appearance and reenact human actors. Our method leverages a carefully designed graph convolutional network (GCN) to model the human body manifold structure, jointly with differentiable rendering, to synthesize new videos of people in different contexts from where they were initially recorded. Unlike recent appearance transferring methods, our approach can reconstruct a fully controllable 3D texture-mapped model of a person, while taking into account the manifold structure from body shape and texture appearance in the view synthesis. Specifically, our approach models mesh deformations with a three-stage GCN trained in a self-supervised manner on rendered silhouettes of the human body. It also infers texture appearance with a convolutional network in the texture domain, which is trained in an adversarial regime to reconstruct human texture from rendered images of actors in different poses. Experiments on different videos show that our method successfully infers specific body deformations and avoid creating texture artifacts while achieving the best values for appearance in terms of Structural Similarity (SSIM), Learned Perceptual Image Patch Similarity (LPIPS), Mean Squared Error (MSE), and Fréchet Video Distance (FVD). By taking advantages of both differentiable rendering and the 3D parametric model, our method is fully controllable, which allows controlling the human synthesis from both pose and rendering parameters. The source code is available at https://www.verlab.dcc.ufmg.br/retargeting-motion/wacv2022.
△ Less
Submitted 22 October, 2021;
originally announced October 2021.
-
A Survey of Static Formal Methods for Building Dependable Industrial Automation Systems
Authors:
Roopak Sinha,
Sandeep Patil,
Luis Gomes,
Valeriy Vyatkin
Abstract:
Industrial automation systems (IAS) need to be highly dependable; they should not merely function as expected but also do so in a reliable, safe, and secure manner. Formal methods are mathematical techniques that can greatly aid in develo** dependable systems and can be used across all phases of the system development life cycle (SDLC), including requirements engineering, system design and imple…
▽ More
Industrial automation systems (IAS) need to be highly dependable; they should not merely function as expected but also do so in a reliable, safe, and secure manner. Formal methods are mathematical techniques that can greatly aid in develo** dependable systems and can be used across all phases of the system development life cycle (SDLC), including requirements engineering, system design and implementation, verification and validation (testing), maintenance, and even documentation. This state-of-the-art survey reports existing formal approaches for creating more dependable IAS, focusing on static formal methods that are used before a system is completely implemented. We categorize surveyed works based on the phases of the SDLC, allowing us to identify research gaps and promising future directions for each phase.
△ Less
Submitted 12 August, 2021;
originally announced August 2021.
-
A Shape-Aware Retargeting Approach to Transfer Human Motion and Appearance in Monocular Videos
Authors:
Thiago L. Gomes,
Renato Martins,
João Ferreira,
Rafael Azevedo,
Guilherme Torres,
Erickson R. Nascimento
Abstract:
Transferring human motion and appearance between videos of human actors remains one of the key challenges in Computer Vision. Despite the advances from recent image-to-image translation approaches, there are several transferring contexts where most end-to-end learning-based retargeting methods still perform poorly. Transferring human appearance from one actor to another is only ensured when a stri…
▽ More
Transferring human motion and appearance between videos of human actors remains one of the key challenges in Computer Vision. Despite the advances from recent image-to-image translation approaches, there are several transferring contexts where most end-to-end learning-based retargeting methods still perform poorly. Transferring human appearance from one actor to another is only ensured when a strict setup has been complied, which is generally built considering their training regime's specificities. In this work, we propose a shape-aware approach based on a hybrid image-based rendering technique that exhibits competitive visual retargeting quality compared to state-of-the-art neural rendering approaches. The formulation leverages the user body shape into the retargeting while considering physical constraints of the motion in 3D and the 2D image domain. We also present a new video retargeting benchmark dataset composed of different videos with annotated human motions to evaluate the task of synthesizing people's videos, which can be used as a common base to improve tracking the progress in the field. The dataset and its evaluation protocols are designed to evaluate retargeting methods in more general and challenging conditions. Our method is validated in several experiments, comprising publicly available videos of actors with different shapes, motion types, and camera setups. The dataset and retargeting code are publicly available to the community at: https://www.verlab.dcc.ufmg.br/retargeting-motion.
△ Less
Submitted 28 April, 2021; v1 submitted 29 March, 2021;
originally announced March 2021.
-
Learning to dance: A graph convolutional adversarial network to generate realistic dance motions from audio
Authors:
João P. Ferreira,
Thiago M. Coutinho,
Thiago L. Gomes,
José F. Neto,
Rafael Azevedo,
Renato Martins,
Erickson R. Nascimento
Abstract:
Synthesizing human motion through learning techniques is becoming an increasingly popular approach to alleviating the requirement of new data capture to produce animations. Learning to move naturally from music, i.e., to dance, is one of the more complex motions humans often perform effortlessly. Each dance movement is unique, yet such movements maintain the core characteristics of the dance style…
▽ More
Synthesizing human motion through learning techniques is becoming an increasingly popular approach to alleviating the requirement of new data capture to produce animations. Learning to move naturally from music, i.e., to dance, is one of the more complex motions humans often perform effortlessly. Each dance movement is unique, yet such movements maintain the core characteristics of the dance style. Most approaches addressing this problem with classical convolutional and recursive neural models undergo training and variability issues due to the non-Euclidean geometry of the motion manifold structure.In this paper, we design a novel method based on graph convolutional networks to tackle the problem of automatic dance generation from audio information. Our method uses an adversarial learning scheme conditioned on the input music audios to create natural motions preserving the key movements of different music styles. We evaluate our method with three quantitative metrics of generative methods and a user study. The results suggest that the proposed GCN model outperforms the state-of-the-art dance generation method conditioned on music in different experiments. Moreover, our graph-convolutional approach is simpler, easier to be trained, and capable of generating more realistic motion styles regarding qualitative and different quantitative metrics. It also presented a visual movement perceptual quality comparable to real motion data.
△ Less
Submitted 30 November, 2020; v1 submitted 25 November, 2020;
originally announced November 2020.
-
Do As I Do: Transferring Human Motion and Appearance between Monocular Videos with Spatial and Temporal Constraints
Authors:
Thiago L. Gomes,
Renato Martins,
João Ferreira,
Erickson R. Nascimento
Abstract:
Creating plausible virtual actors from images of real actors remains one of the key challenges in computer vision and computer graphics. Marker-less human motion estimation and shape modeling from images in the wild bring this challenge to the fore. Although the recent advances on view synthesis and image-to-image translation, currently available formulations are limited to transfer solely style a…
▽ More
Creating plausible virtual actors from images of real actors remains one of the key challenges in computer vision and computer graphics. Marker-less human motion estimation and shape modeling from images in the wild bring this challenge to the fore. Although the recent advances on view synthesis and image-to-image translation, currently available formulations are limited to transfer solely style and do not take into account the character's motion and shape, which are by nature intermingled to produce plausible human forms. In this paper, we propose a unifying formulation for transferring appearance and retargeting human motion from monocular videos that regards all these aspects. Our method synthesizes new videos of people in a different context where they were initially recorded. Differently from recent appearance transferring methods, our approach takes into account body shape, appearance, and motion constraints. The evaluation is performed with several experiments using publicly available real videos containing hard conditions. Our method is able to transfer both human motion and appearance outperforming state-of-the-art methods, while preserving specific features of the motion that must be maintained (e.g., feet touching the floor, hands touching a particular object) and holding the best visual quality and appearance metrics such as Structural Similarity (SSIM) and Learned Perceptual Image Patch Similarity (LPIPS).
△ Less
Submitted 21 January, 2020; v1 submitted 8 January, 2020;
originally announced January 2020.
-
Generalising KAT to verify weighted computations
Authors:
Leandro Gomes,
Alexandre Madeira,
Luís Soares Barbosa
Abstract:
Kleene algebra with tests (KAT) was introduced as an algebraic structure to model and reason about classic imperative programs, i.e. sequences of discrete transitions guarded by Boolean tests. This paper introduces two generalisations of this structure able to express programs as weighted transitions and tests with outcomes in non necessarily bivalent truth spaces: graded Kleene algebra with tests…
▽ More
Kleene algebra with tests (KAT) was introduced as an algebraic structure to model and reason about classic imperative programs, i.e. sequences of discrete transitions guarded by Boolean tests. This paper introduces two generalisations of this structure able to express programs as weighted transitions and tests with outcomes in non necessarily bivalent truth spaces: graded Kleene algebra with tests (GKAT) and a variant where tests are also idempotent (I-GKAT). On this context, and in analogy to Kozen's encoding of Propositional Hoare Logic (PHL) in KAT [22], we discuss the encoding of a graded PHL in I-GKAT and of its while-free fragment in GKAT. Moreover, to establish semantics for these structures four new algebras are defined: FSET(T), FREL(K,T) and FLANG(K,T) over complete residuated lattices K and T, and M(n,A) over a GKAT or I-GKAT A. As a final exercise, the paper discusses some program equivalence proofs in a graded context.
△ Less
Submitted 4 November, 2019;
originally announced November 2019.
-
On the construction of multi-valued concurrent dynamic logic
Authors:
Leandro Gomes
Abstract:
Dynamic logic is a powerful framework for reasoning about imperative programs. An extension with a concurrent operator [18] was introduced to formalise programs running in parallel. In other direction, other authors proposed a systematic method for generating multi-valued propositional dynamic logics to reason about weighted programs [14]. This paper presents the first step of combining these two…
▽ More
Dynamic logic is a powerful framework for reasoning about imperative programs. An extension with a concurrent operator [18] was introduced to formalise programs running in parallel. In other direction, other authors proposed a systematic method for generating multi-valued propositional dynamic logics to reason about weighted programs [14]. This paper presents the first step of combining these two frameworks to introduce uncertainty in concurrent computations. In the developed framework, a weight is assigned to each branch of the parallel execution, resulting in a (possible) asymmetric parallelism, inherent to fuzzy programming paradigm [21, 2]. By adopting such an approach, a family of logics is obtained, called multi-valued concurrent propositional dynamic logics (CGDL(A)), parametric on an action lattice A specifying a notion of "weight" assigned to program execution. Additionally, the validity of some axioms of CPDL is discussed in the new family of generated logics.
△ Less
Submitted 1 November, 2019;
originally announced November 2019.
-
Using Near Infrared Spectroscopy and Machine Learning to diagnose Systemic Sclerosis
Authors:
Joelle Feijó de França,
Hugo Abreu Mendes,
Lucas Gallindo Costa,
Andrea Tavares Dantas,
Angela Luzia Branco Pinto Duarte,
Anderson Stevens Leônidas Gomes,
Emery Cleiton Cabral Correia Lins
Abstract:
The motivation of this work is the use of non-invasive and low cost techniques to obtain a faster and more accurate diagnosis of systemic sclerosis (SSc), rheumatic, autoimmune, chronic and rare disease. The technique in question is Near Infrared Spectroscopy (NIRS). Spectra were acquired from three different regions of hand's volunteers. Machine learning algorithms are used to classify and search…
▽ More
The motivation of this work is the use of non-invasive and low cost techniques to obtain a faster and more accurate diagnosis of systemic sclerosis (SSc), rheumatic, autoimmune, chronic and rare disease. The technique in question is Near Infrared Spectroscopy (NIRS). Spectra were acquired from three different regions of hand's volunteers. Machine learning algorithms are used to classify and search for the best optical wavelength. The results demonstrate that it is easy to obtain wavelength bands more important for the diagnosis. We use the algorithm RFECV and SVC. The results suggests that the most important wavelength band is at 1270 nm, referring to the luminescence of Singlet Oxygen. The results indicates that the Proximal Interphalangeal Joints region returns better accuracy's scores. Optical spectrometers can be found at low prices and can be easily used in clinical evaluations, while the algorithms used are completely diffused on open source platforms.
△ Less
Submitted 16 August, 2019;
originally announced August 2019.
-
Sentiment Aggregate Functions for Political Opinion Polling using Microblog Streams
Authors:
Pedro Saleiro,
Luís Gomes,
Carlos Soares
Abstract:
The automatic content analysis of mass media in the social sciences has become necessary and possible with the raise of social media and computational power. One particularly promising avenue of research concerns the use of sentiment analysis in microblog streams. However, one of the main challenges consists in aggregating sentiment polarity in a timely fashion that can be fed to the prediction me…
▽ More
The automatic content analysis of mass media in the social sciences has become necessary and possible with the raise of social media and computational power. One particularly promising avenue of research concerns the use of sentiment analysis in microblog streams. However, one of the main challenges consists in aggregating sentiment polarity in a timely fashion that can be fed to the prediction method. We investigated a large set of sentiment aggregate functions and performed a regression analysis using political opinion polls as gold standard. Our dataset contains nearly 233 000 tweets, classified according to their polarity (positive, negative or neutral), regarding the five main Portuguese political leaders during the Portuguese bailout (2011-2014). Results show that different sentiment aggregate functions exhibit different feature importance over time while the error keeps almost unchanged.
△ Less
Submitted 16 June, 2016;
originally announced June 2016.
-
Improving Spam Detection Based on Structural Similarity
Authors:
Luiz H. Gomes,
Fernando D. O. Castro,
Rodrigo B. Almeida,
Luis M. A. Bettencourt,
Virgilio A. F. Almeida,
Jussara M. Almeida
Abstract:
We propose a new detection algorithm that uses structural relationships between senders and recipients of email as the basis for the identification of spam messages. Users and receivers are represented as vectors in their reciprocal spaces. A measure of similarity between vectors is constructed and used to group users into clusters. Knowledge of their classification as past senders/receivers of…
▽ More
We propose a new detection algorithm that uses structural relationships between senders and recipients of email as the basis for the identification of spam messages. Users and receivers are represented as vectors in their reciprocal spaces. A measure of similarity between vectors is constructed and used to group users into clusters. Knowledge of their classification as past senders/receivers of spam or legitimate mail, comming from an auxiliary detection algorithm, is then used to label these clusters probabilistically. This knowledge comes from an auxiliary algorithm. The measure of similarity between the sender and receiver sets of a new message to the center vector of clusters is then used to asses the possibility of that message being legitimate or spam. We show that the proposed algorithm is able to correct part of the false positives (legitimate messages classified as spam) using a testbed of one week smtp log.
△ Less
Submitted 5 April, 2005;
originally announced April 2005.