-
The Presence and the State-of-Practice of Software Architects in the Brazilian Industry -- A Survey
Authors:
Valdemar Vicente Graciano Neto,
Diana Lorena Santos,
Andrey Gonçalves França,
Rafael Z. Frantz,
Edson de Oliveira-Jr,
Ahmad Mohsin,
Mohamad Kassab
Abstract:
Context: Software architecture intensely impacts the software quality. Therefore, the professional assigned to carry out the design, maintenance and evolution of architectures needs to have certain knowledge and skills in order not to compromise the resulting application. Objective: The aim of this work is to understand the characteristics of the companies regarding the presence or absence of software architects in Brazil. Method: This work uses the Survey research as a means to collect evidence from professionals with the software architect profile, besides descriptive statistics and thematic analysis to analyze the results. Results: The study collected data from 105 professionals distributed in 24 Brazilian states. Results reveal that (i) not all companies have a software architect, (ii) in some cases, other professionals perform the activities of a software architect and (iii) there are companies that, even having a software architecture professional, have other roles also performing the duties of such a professional. Conclusions: Professionals hired as software architects have higher salaries than those hired in other roles that carry out such activity, although many of those other professionals still have duties that are typical of software architects.
Submitted 1 March, 2024;
originally announced March 2024.
-
Emissions Reporting Maturity Model: supporting cities to leverage emissions-related processes through performance indicators and artificial intelligence
Authors:
Victor de A. Xavier,
Felipe M. G. França,
Priscila M. V. Lima
Abstract:
Climate change and global warming have been trending topics worldwide since the Eco-92 conference. However, little progress has been made in reducing greenhouse gases (GHGs). The problems and challenges related to emissions are complex and require a concerted and comprehensive effort to address them. Emissions reporting is a critical component of GHG reduction policy and is therefore the focus of this work. The main goal of this work is two-fold: (i) to propose an emissions reporting evaluation model to leverage the overall quality of emissions reporting and (ii) to use artificial intelligence (AI) to support the initiatives that improve emissions reporting. Thus, this work presents an Emissions Reporting Maturity Model (ERMM) for examining, clustering, and analysing data from emissions reporting initiatives to help cities deal with climate change and global warming challenges. The Performance Indicator Development Process (PIDP) proposed in this work provides ways to leverage the quality of the available data necessary for the execution of the evaluations identified by the ERMM. Hence, the PIDP supports the preparation of the data from emissions-related databases, the classification of the data according to similarities highlighted by different clustering techniques, and the identification of performance indicator candidates, which are strengthened by a qualitative analysis of selected data samples. Thus, the main goal of the ERMM is to evaluate and classify cities regarding their emissions reporting processes, pointing out the drawbacks and challenges faced by other cities from different contexts, and ultimately to help them leverage the underlying emissions-related processes and emissions mitigation initiatives.
Submitted 8 December, 2023;
originally announced January 2024.
-
ULEEN: A Novel Architecture for Ultra Low-Energy Edge Neural Networks
Authors:
Zachary Susskind,
Aman Arora,
Igor D. S. Miranda,
Alan T. L. Bacellar,
Luis A. Q. Villon,
Rafael F. Katopodis,
Leandro S. de Araujo,
Diego L. C. Dutra,
Priscila M. V. Lima,
Felipe M. G. Franca,
Mauricio Breternitz Jr.,
Lizy K. John
Abstract:
The deployment of AI models on low-power, real-time edge devices requires accelerators for which energy, latency, and area are all first-order concerns. There are many approaches to enabling deep neural networks (DNNs) in this domain, including pruning, quantization, compression, and binary neural networks (BNNs), but with the emergence of the "extreme edge", there is now a demand for even more efficient models. In order to meet the constraints of ultra-low-energy devices, we propose ULEEN, a model architecture based on weightless neural networks. Weightless neural networks (WNNs) are a class of neural model which use table lookups, not arithmetic, to perform computation. The elimination of energy-intensive arithmetic operations makes WNNs theoretically well suited for edge inference; however, they have historically suffered from poor accuracy and excessive memory usage. ULEEN incorporates algorithmic improvements and a novel training strategy inspired by BNNs to make significant strides in improving accuracy and reducing model size. We compare FPGA and ASIC implementations of an inference accelerator for ULEEN against edge-optimized DNN and BNN devices. On a Xilinx Zynq Z-7045 FPGA, we demonstrate classification on the MNIST dataset at 14.3 million inferences per second (13 million inferences/Joule) with 0.21 $μ$s latency and 96.2% accuracy, while Xilinx FINN achieves 12.3 million inferences per second (1.69 million inferences/Joule) with 0.31 $μ$s latency and 95.83% accuracy. In a 45nm ASIC, we achieve 5.1 million inferences/Joule and 38.5 million inferences/second at 98.46% accuracy, while a quantized Bit Fusion model achieves 9230 inferences/Joule and 19,100 inferences/second at 99.35% accuracy. In our search for ever more efficient edge devices, ULEEN shows that WNNs are deserving of consideration.
Submitted 20 April, 2023;
originally announced April 2023.
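The table-lookup inference that the ULEEN abstract attributes to weightless neural networks can be illustrated with a minimal WiSARD-style sketch. This is not the ULEEN architecture or its training strategy; the class and function names (`Discriminator`, `make_classifier`, `predict`) and the toy 8-bit inputs are hypothetical and chosen only for illustration.

```python
import random


class Discriminator:
    """RAM nodes for one class; each RAM is a lookup table keyed by a bit tuple."""

    def __init__(self, mapping, tuple_size):
        self.mapping = mapping          # pseudo-random input bit order (assumed shared by all classes)
        self.tuple_size = tuple_size    # number of input bits addressing each RAM
        self.rams = [set() for _ in range(len(mapping) // tuple_size)]

    def _addresses(self, bits):
        # Reorder the input bits, then cut them into one address per RAM.
        shuffled = [bits[i] for i in self.mapping]
        for r in range(len(self.rams)):
            start = r * self.tuple_size
            yield r, tuple(shuffled[start:start + self.tuple_size])

    def train(self, bits):
        # Training only writes table entries; no weights or gradients are involved.
        for r, addr in self._addresses(bits):
            self.rams[r].add(addr)

    def score(self, bits):
        # Inference is one lookup per RAM; the score counts the RAMs that respond.
        return sum(addr in self.rams[r] for r, addr in self._addresses(bits))


def make_classifier(n_bits, tuple_size, labels, seed=0):
    rng = random.Random(seed)
    mapping = list(range(n_bits))
    rng.shuffle(mapping)
    return {label: Discriminator(mapping, tuple_size) for label in labels}


def predict(classifier, bits):
    # The class whose discriminator has the most responding RAMs wins.
    return max(classifier, key=lambda label: classifier[label].score(bits))


clf = make_classifier(n_bits=8, tuple_size=2, labels=["left", "right"])
clf["left"].train([1, 1, 1, 1, 0, 0, 0, 0])
clf["right"].train([0, 0, 0, 0, 1, 1, 1, 1])
noisy_left = [1, 1, 1, 0, 0, 0, 0, 0]   # one bit flipped relative to the "left" pattern
print(predict(clf, noisy_left))
```

Storing every seen address in a set is the simplest possible table; the memory growth this causes is the kind of cost the abstract says WNNs have historically suffered from.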
-
Geometric Methods for Sampling, Optimisation, Inference and Adaptive Agents
Authors:
Alessandro Barp,
Lancelot Da Costa,
Guilherme França,
Karl Friston,
Mark Girolami,
Michael I. Jordan,
Grigorios A. Pavliotis
Abstract:
In this chapter, we identify fundamental geometric structures that underlie the problems of sampling, optimisation, inference and adaptive decision-making. Based on this identification, we derive algorithms that exploit these geometric structures to solve these problems efficiently. We show that a wide range of geometric theories emerge naturally in these fields, ranging from measure-preserving processes and information divergences to Poisson geometry and geometric integration. Specifically, we explain how (i) leveraging the symplectic geometry of Hamiltonian systems enables us to construct (accelerated) sampling and optimisation methods, (ii) the theory of Hilbertian subspaces and Stein operators provides a general methodology to obtain robust estimators, and (iii) preserving the information geometry of decision-making yields adaptive agents that perform active inference. Throughout, we emphasise the rich connections between these fields; e.g., inference draws on sampling and optimisation, and adaptive decision-making assesses decisions by inferring their counterfactual consequences. Our exposition provides a conceptual overview of underlying ideas, rather than a technical discussion, which can be found in the references herein.
Submitted 25 July, 2022; v1 submitted 20 March, 2022;
originally announced March 2022.
-
Weightless Neural Networks for Efficient Edge Inference
Authors:
Zachary Susskind,
Aman Arora,
Igor Dantas Dos Santos Miranda,
Luis Armando Quintanilla Villon,
Rafael Fontella Katopodis,
Leandro Santiago de Araujo,
Diego Leonel Cadette Dutra,
Priscila Machado Vieira Lima,
Felipe Maia Galvao Franca,
Mauricio Breternitz Jr.,
Lizy K. John
Abstract:
Weightless Neural Networks (WNNs) are a class of machine learning models that use table lookups to perform inference. This is in contrast with Deep Neural Networks (DNNs), which use multiply-accumulate operations. State-of-the-art WNN architectures have a fraction of the implementation cost of DNNs, but still lag behind them on accuracy for common image recognition tasks. Additionally, many existing WNN architectures suffer from high memory requirements. In this paper, we propose a novel WNN architecture, BTHOWeN, with key algorithmic and architectural improvements over prior work, namely counting Bloom filters, hardware-friendly hashing, and Gaussian-based nonlinear thermometer encodings to improve model accuracy and reduce area and energy consumption. BTHOWeN targets the large and growing edge computing sector by providing superior latency and energy efficiency to comparable quantized DNNs. Compared to state-of-the-art WNNs across nine classification datasets, BTHOWeN on average reduces error by more than 40% and model size by more than 50%. We then demonstrate the viability of the BTHOWeN architecture by presenting an FPGA-based accelerator, and compare its latency and resource usage against similarly accurate quantized DNN accelerators, including Multi-Layer Perceptron (MLP) and convolutional models. The proposed BTHOWeN models consume almost 80% less energy than the MLP models, with nearly 85% reduction in latency. In our quest for efficient ML on the edge, WNNs are clearly deserving of additional attention.
Submitted 2 March, 2022;
originally announced March 2022.
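The abstract above mentions Gaussian-based nonlinear thermometer encodings without defining them. The sketch below shows one plausible reading, offered as an assumption rather than as the BTHOWeN method: thresholds are placed at equal-probability quantiles of a Gaussian fitted to a feature, instead of at evenly spaced values. The function names and the example mean, standard deviation and bit count are hypothetical.

```python
from statistics import NormalDist


def linear_thresholds(lo, hi, n_bits):
    """Evenly spaced thresholds strictly between lo and hi (classic thermometer code)."""
    step = (hi - lo) / (n_bits + 1)
    return [lo + step * (i + 1) for i in range(n_bits)]


def gaussian_thresholds(mean, std, n_bits):
    """Assumed variant: thresholds at equal-probability quantiles of N(mean, std)."""
    dist = NormalDist(mean, std)
    return [dist.inv_cdf((i + 1) / (n_bits + 1)) for i in range(n_bits)]


def thermometer(x, thresholds):
    """Unary code: bit i is 1 when x exceeds the i-th threshold."""
    return [1 if x > t else 0 for t in thresholds]


# Three-bit encodings of a few values of a feature assumed to follow N(0, 1).
for value in (-1.0, 0.5, 2.0):
    print(value, thermometer(value, gaussian_thresholds(0.0, 1.0, 3)))
```

With quantile-based thresholds, each bit splits the training data into roughly balanced halves when the feature really is close to Gaussian, which is one reason such a spacing might be preferred over a linear one.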
-
What is the Best Grid-Map for Self-Driving Cars Localization? An Evaluation under Diverse Types of Illumination, Traffic, and Environment
Authors:
Filipe Mutz,
Thiago Oliveira-Santos,
Avelino Forechi,
Karin S. Komati,
Claudine Badue,
Felipe M. G. França,
Alberto F. De Souza
Abstract:
The localization of self-driving cars is needed for several tasks such as keeping maps updated, tracking objects, and planning. Localization algorithms often take advantage of maps for estimating the car pose. Since maintaining and using several maps is computationally expensive, it is important to analyze which type of map is more adequate for each application. In this work, we provide data for such analysis by comparing the accuracy of a particle filter localization when using occupancy, reflectivity, color, or semantic grid maps. To the best of our knowledge, such evaluation is missing in the literature. For building semantic and color grid maps, point clouds from a Light Detection and Ranging (LiDAR) sensor are fused with images captured by a front-facing camera. Semantic information is extracted from images with a deep neural network. Experiments are performed in varied environments, under diverse conditions of illumination and traffic. Results show that occupancy grid maps lead to more accurate localization, followed by reflectivity grid maps. In most scenarios, the localization with semantic grid maps kept the position tracking without catastrophic losses, but with errors 2 to 3 times larger than with the previous ones. Color grid maps led to inaccurate and unstable localization even using a robust metric, the entropy correlation coefficient, for comparing online data and the map.
Submitted 19 September, 2020;
originally announced September 2020.
-
Distributed Optimization, Averaging via ADMM, and Network Topology
Authors:
Guilherme França,
José Bento
Abstract:
There has been an increasing necessity for scalable optimization methods, especially due to the explosion in the size of datasets and model complexity in modern machine learning applications. Scalable solvers often distribute the computation over a network of processing units. For simple algorithms such as gradient descent, the dependency of the convergence time on the topology of this network is well-known. However, for more involved algorithms such as the Alternating Direction Method of Multipliers (ADMM) much less is known. At the heart of many distributed optimization algorithms there exists a gossip subroutine which averages local information over the network, and whose efficiency is crucial for the overall performance of the method. In this paper we review recent research in this area and, with the goal of isolating such a communication exchange behaviour, we compare different algorithms when applied to a canonical distributed averaging consensus problem. We also show interesting connections between ADMM and lifted Markov chains, besides providing an explicit characterization of its convergence and optimal parameter tuning in terms of spectral properties of the network. Finally, we empirically study the connection between network topology and convergence rates for different algorithms on a real-world problem of sensor localization.
Submitted 5 September, 2020;
originally announced September 2020.
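The distributed averaging consensus problem named in the abstract above can be sketched with the simplest gossip-style scheme, in which every node repeatedly moves toward its neighbours' values. This is plain linear consensus, not ADMM and not the paper's algorithms; the ring topology, the weight of 1/3 and the function names are assumptions made for illustration.

```python
def ring(n):
    """Neighbour lists for an undirected ring of n nodes."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def gossip_average(values, neighbors, weight, steps):
    """Synchronous linear consensus: x_i <- x_i + weight * sum_j (x_j - x_i).

    Each node only reads the values held by its direct neighbours.
    On a symmetric graph the update preserves the network-wide sum,
    so when it converges every node holds the global mean.
    """
    x = list(values)
    for _ in range(steps):
        x = [xi + weight * sum(x[j] - xi for j in neighbors[i])
             for i, xi in enumerate(x)]
    return x


initial = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]           # one local value per node
result = gossip_average(initial, ring(6), weight=1 / 3, steps=100)
print(result)                                       # every entry approaches the mean, 2.5
```

The number of steps needed depends on the spectrum of the graph, which is the dependence on network topology that the abstract refers to.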
-
wisardpkg -- A library for WiSARD-based models
Authors:
Aluizio S. Lima Filho,
Gabriel P. Guarisa,
Leopoldo A. D. Lusquino Filho,
Luiz F. R. Oliveira,
Felipe M. G. Franca,
Priscila M. V. Lima
Abstract:
To facilitate the development of code using WiSARD-based models, LabZero developed wisardpkg, a C++/Python machine learning library. The library is an open-source package hosted on GitHub under the MIT license.
Submitted 2 May, 2020;
originally announced May 2020.
-
Exploring the Equivalence between Dynamic Dataflow Model and Gamma - General Abstract Model for Multiset mAnipulation
Authors:
Rui R. Mello Junior,
Leandro S. Araujo,
Tiago A. O. Alves,
Leandro A. J. Marzulo,
Gabriel A. L. Paillard,
Felipe M. G. França
Abstract:
With the growing search for computational models in which parallelism is expressed naturally, some paradigms arise as options for the next generation of computers. In this context, dynamic Dataflow and Gamma (General Abstract Model for Multiset mAnipulation) emerge as interesting choices of computational model. In the dynamic Dataflow model, operations are performed as soon as their associated operands are available, without relying on a Program Counter to dictate the execution order of instructions. The Gamma paradigm is based on a parallel multiset rewriting scheme. It provides a non-deterministic execution model inspired by an abstract chemical machine metaphor, where operations are formulated as reactions that occur freely among matching elements belonging to the multiset. In this work, equivalence relations between the dynamic Dataflow and Gamma paradigms are exposed and explored, while methods to convert from the Dataflow to the Gamma paradigm and vice versa are provided. It is shown that vertices and edges of a dynamic Dataflow graph can correspond, respectively, to reactions and multiset elements in the Gamma paradigm. Implementation aspects of execution environments that could be mutually beneficial to both models are also discussed. This work gives the scientific community the possibility of taking advantage of both parallel programming models, offering added versatility to researchers and developers. Finally, it is important to state that, to the best of our knowledge, the similarity relations between the dynamic Dataflow and Gamma models presented here have not been reported in any previous work.
Submitted 1 November, 2018;
originally announced November 2018.
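The multiset rewriting scheme that the abstract above attributes to Gamma can be sketched as a loop that keeps applying a reaction to any matching pair of elements until none matches. This is a sequential toy interpreter with a fixed search order, whereas Gamma itself is non-deterministic and parallel; `gamma_run` and the two example reactions are hypothetical names and choices, not taken from the paper.

```python
from collections import Counter
from itertools import permutations


def gamma_run(multiset, condition, action):
    """Apply one binary reaction until no ordered pair satisfies its condition.

    condition(x, y) says whether the pair may react; action(x, y) returns the
    elements that replace the pair in the multiset.
    """
    pool = list(multiset)
    while True:
        for i, j in permutations(range(len(pool)), 2):
            if condition(pool[i], pool[j]):
                products = list(action(pool[i], pool[j]))
                # Remove the two reactants and add the reaction products.
                pool = [v for k, v in enumerate(pool) if k not in (i, j)] + products
                break
        else:
            # No pair can react: the multiset is stable, so the program ends.
            return Counter(pool)


# Maximum: any pair (x, y) with x >= y reacts and leaves only x.
print(gamma_run([3, 1, 4, 1, 5], lambda x, y: x >= y, lambda x, y: [x]))

# Sum: any pair reacts and leaves x + y.
print(gamma_run([1, 2, 3, 4], lambda x, y: True, lambda x, y: [x + y]))
```

In the correspondence stated in the abstract, a reaction such as the ones above plays the role of a Dataflow vertex, and the elements it consumes and produces play the role of values travelling along edges.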
-
Fractal and Multifractal Properties of Electrographic Recordings of Human Brain Activity: Toward Its Use as a Signal Feature for Machine Learning in Clinical Applications
Authors:
Lucas G. S. França,
José G. V. Miranda,
Marco Leite,
Niraj K. Sharma,
Matthew C. Walker,
Louis Lemieux,
Yujiang Wang
Abstract:
The brain is a system operating on multiple time scales, and characterisation of dynamics across time scales remains a challenge. One framework to study such dynamics is that of fractal geometry. However, currently there exists no established method for the study of brain dynamics using fractal geometry, due to the many challenges in the conceptual and technical understanding of the methods. We aim to highlight some of the practical challenges of applying fractal geometry to brain dynamics and propose solutions to enable its wider use in neuroscience. Using intracranially recorded EEG and simulated data, we compared monofractal and multifractal methods with regard to their sensitivity to signal variance. We found that both correlate closely with signal variance, thus not offering new information about the signal. However, after applying an epoch-wise standardisation procedure to the signal, we found that multifractal measures could offer non-redundant information compared to signal variance, power and other established EEG signal measures. We also compared different multifractal estimation methods and found that the Chhabra-Jensen algorithm performed best. Finally, we investigated the impact of sampling frequency and epoch length on multifractal properties. Using epileptic seizures as an example event in the EEG, we show that there may be an optimal time scale for detecting temporal changes in multifractal properties around seizures. The practical issues we highlighted and our suggested solutions should help in developing a robust method for the application of fractal geometry in EEG signals. Our analyses and observations also aid the theoretical understanding of the multifractal properties of the brain and might provide grounds for new discoveries in the study of brain signals. These could be crucial for the understanding of neurological function and for the development of new treatments.
Submitted 11 December, 2018; v1 submitted 11 June, 2018;
originally announced June 2018.
-
An Explicit Convergence Rate for Nesterov's Method from SDP
Authors:
Sam Safavi,
Bikash Joshi,
Guilherme França,
José Bento
Abstract:
The framework of Integral Quadratic Constraints (IQC) introduced by Lessard et al. (2014) reduces the computation of upper bounds on the convergence rate of several optimization algorithms to semi-definite programming (SDP). In particular, this technique was applied to Nesterov's accelerated method (NAM). For quadratic functions, this SDP was explicitly solved leading to a new bound on the convergence rate of NAM, and for arbitrary strongly convex functions it was shown numerically that IQC can improve bounds from Nesterov (2004). Unfortunately, an explicit analytic solution to the SDP was not provided. In this paper, we provide such an analytical solution, obtaining a new general and explicit upper bound on the convergence rate of NAM, which we further optimize over its parameters. To the best of our knowledge, this is the best, and explicit, upper bound on the convergence rate of NAM for strongly convex functions.
Submitted 13 January, 2018;
originally announced January 2018.
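For readers unfamiliar with Nesterov's accelerated method (NAM), whose convergence rate the abstract above bounds, the iteration itself is short. The sketch below uses the textbook constant step size 1/L and momentum (sqrt(kappa) - 1)/(sqrt(kappa) + 1) for an m-strongly convex, L-smooth function; these parameter choices and the quadratic test function are assumptions for illustration, not the optimized parameters derived in the paper.

```python
def nesterov(grad, x0, L, m, steps):
    """Nesterov's accelerated method with constant step size and momentum.

    x_{k+1} = y_k - alpha * grad(y_k)
    y_{k+1} = x_{k+1} + beta * (x_{k+1} - x_k)
    """
    kappa = L / m                                   # condition number
    alpha = 1.0 / L                                 # assumed step size
    beta = (kappa ** 0.5 - 1.0) / (kappa ** 0.5 + 1.0)   # assumed momentum
    x_prev = list(x0)
    y = list(x0)
    for _ in range(steps):
        g = grad(y)
        x = [yi - alpha * gi for yi, gi in zip(y, g)]          # gradient step at y
        y = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]  # extrapolation
        x_prev = x
    return x_prev


# Hypothetical test problem: f(x) = 0.5 * (m * x1^2 + L * x2^2), minimised at the origin.
m_const, L_const = 1.0, 100.0


def grad(x):
    return [m_const * x[0], L_const * x[1]]


solution = nesterov(grad, [1.0, 1.0], L=L_const, m=m_const, steps=300)
print(solution)
```

On this quadratic the slow coordinate contracts at a rate governed by 1 - 1/sqrt(kappa) = 0.9 per iteration, which is the kind of quantity the SDP-based bounds in the abstract are about.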
-
Decanting the Contribution of Instruction Types and Loop Structures in the Reuse of Traces
Authors:
Andrey M. Coppieters,
Sheila de Oliveira,
Felipe M. G. França,
Maurício L. Pilla,
Amarildo T. da Costa
Abstract:
Reuse has been proposed as a microarchitecture-level mechanism to reduce the number of executed instructions, collapsing dependencies and freeing resources for other instructions. Previous works have used reuse domains such as memory accesses, integer or non-floating-point instructions, based on the reusability rate. However, these works have not studied the specific contribution of reusing different subsets of instructions for performance. In this work, we analysed the sensitivity of trace reuse to instruction subsets, comparing their efficiency to their complementary subsets. We also studied the amount of reuse that can be extracted from loops. Our experiments show that disabling trace reuse outside loops does not harm performance but reduces the number of accesses to the reuse table by 12%. Our experiments with reuse subsets show that most of the speedup can be retained even when not reusing all types of instructions previously found in the reuse domain.
Submitted 17 November, 2017;
originally announced November 2017.
-
Kernel k-Groups via Hartigan's Method
Authors:
Guilherme França,
Maria L. Rizzo,
Joshua T. Vogelstein
Abstract:
Energy statistics was proposed by Székely in the 1980s, inspired by Newton's gravitational potential in classical mechanics, and it provides a model-free hypothesis test for equality of distributions. In its original form, energy statistics was formulated in Euclidean spaces. More recently, it was generalized to metric spaces of negative type. In this paper, we consider a formulation for the clustering problem using a weighted version of energy statistics in spaces of negative type. We show that this approach leads to a quadratically constrained quadratic program in the associated kernel space, establishing connections with graph partitioning problems and kernel methods in machine learning. To find local solutions of such an optimization problem, we propose kernel k-groups, which is an extension of Hartigan's method to kernel spaces. Kernel k-groups is cheaper than spectral clustering and has the same computational cost as kernel k-means (which is based on Lloyd's heuristic) but our numerical results show an improved performance, especially in higher dimensions. Moreover, we verify the efficiency of kernel k-groups in community detection in sparse stochastic block models which has fascinating applications in several areas of science.
Submitted 11 June, 2020; v1 submitted 26 October, 2017;
originally announced October 2017.
-
Markov Chain Lifting and Distributed ADMM
Authors:
Guilherme França,
José Bento
Abstract:
The time to converge to the steady state of a finite Markov chain can be greatly reduced by a lifting operation, which creates a new Markov chain on an expanded state space. For a class of quadratic objectives, we show an analogous behavior where a distributed ADMM algorithm can be seen as a lifting of the gradient descent algorithm. This provides deep insight into its faster convergence rate under optimal parameter tuning. We conjecture that this gain is always present, as opposed to the lifting of a Markov chain, which sometimes only provides a marginal speedup.
Submitted 10 March, 2017;
originally announced March 2017.
-
RAdNet-VE: An Interest-Centric Mobile Ad Hoc Network for Vehicular Environments
Authors:
F. B. Gonçalves,
F. M. G. França,
C. L. Amorim
Abstract:
In this study, we propose a variation of the RAdNet for vehicular environments (RAdNet-VE). The proposed scheme extends the message header, mechanism for registering interest, and message forwarding mechanism of RAdNet. To obtain results, we performed simulation experiments involving two use scenarios and communication protocols developed from the Veins framework. Based on results obtained from these experiments, we compare the performance of RAdNet-VE against that of RAdNet, a basic content-centric network (CCN) using reactive data routing (CCN$_R$), and a basic CCN using proactive data routing (CCN$_P$). These CCNs provide non-cacheable data services. Moreover, the communication radio standards adopted in scenarios 1 and 2 were IEEE 802.11n and IEEE 802.11p, respectively. The results show that the performance of RAdNet-VE was superior to that of RAdNet, CCN$_R$ and CCN$_P$. In this sense, the RAdNet-VE protocol (RVEP) presented low communication latencies among nodes of just 20.4 ms (scenario 1) and 2.87 ms (scenario 2). Our protocol also presented high data delivery rates, i.e., 83.05% (scenario 1) and 88.05% (scenario 2). Based on these and other results presented in this study, we argue that RAdNet-VE is a feasible alternative to CCNs as an information-centric network (ICN) model for VANETs, because the RVEP satisfies all of the necessary communication requirements.
Submitted 6 January, 2017; v1 submitted 2 April, 2016;
originally announced April 2016.
-
Free Instrument for Movement Measure
Authors:
Norberto Peña,
Bruno Cecílio Credidio,
Lorena Peixoto Nogueira Rodriguez Martinez Salles Corrêa,
Lucas Gabriel Souza França,
Marcelo do Vale Cunha,
Marcos Cavalcanti de Sousa,
João Paulo Bomfim Cruz Vieira,
José Garcia Vivas Miranda
Abstract:
This paper presents the validation of a computational tool that serves to obtain continuous measurements of moving objects. The software uses techniques of computer vision, pattern recognition and optical flow to enable tracking of objects in videos, generating data on trajectory, velocity, acceleration and angular movement. The program was applied to track the ball of a simple pendulum. The validation methodology compares the values measured by the program with the theoretical values expected according to the model of a simple pendulum. The experiment is appropriate to the method because it was built within the limits of the linear harmonic oscillator and energy losses due to friction were minimized, making it as close to ideal as possible. The results indicate that the tool is sensitive and accurate. Deviations of less than a millimeter over the extent of the trajectory support the applicability of the software in physics, whether in research or in teaching.
Submitted 29 June, 2013;
originally announced July 2013.
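The validation described in the abstract above compares tracked values with the simple-pendulum model in the linear (small-angle) regime. A minimal sketch of that reference model and of a deviation check follows; the pendulum length, initial angle and the synthetic "measured" samples are hypothetical placeholders, not data from the paper.

```python
import math

G = 9.80665  # standard gravity in m/s^2


def period(length_m):
    """Small-angle period of a simple pendulum: T = 2*pi*sqrt(L/g)."""
    return 2.0 * math.pi * math.sqrt(length_m / G)


def angle(theta0, length_m, t):
    """Small-angle solution theta(t) = theta0 * cos(sqrt(g/L) * t), released from rest."""
    return theta0 * math.cos(math.sqrt(G / length_m) * t)


def max_deviation(samples, theta0, length_m):
    """Largest absolute gap between measured (t, theta) samples and the model."""
    return max(abs(theta - angle(theta0, length_m, t)) for t, theta in samples)


# Synthetic stand-in for tracker output: the model itself sampled every 0.1 s.
length, theta0 = 1.0, 0.05
samples = [(0.1 * k, angle(theta0, length, 0.1 * k)) for k in range(20)]
print(period(length), max_deviation(samples, theta0, length))
```

Replacing `samples` with tracked angles from a video would give the kind of measured-versus-theoretical comparison the abstract describes.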
-
Couillard: Parallel Programming via Coarse-Grained Data-Flow Compilation
Authors:
Leandro A. J. Marzulo,
Tiago A. O. Alves,
Felipe M. G. França,
Vítor Santos Costa
Abstract:
Data-flow is a natural approach to parallelism. However, describing dependencies and control between fine-grained data-flow tasks can be complex and present unwanted overheads. TALM (TALM is an Architecture and Language for Multi-threading) introduces a user-defined coarse-grained parallel data-flow model, where programmers identify code blocks, called super-instructions, to be run in parallel and connect them in a data-flow graph. TALM has been implemented as a hybrid Von Neumann/data-flow execution system: the Trebuchet. We have observed that TALM's usefulness largely depends on how programmers specify and connect super-instructions. Thus, we present Couillard, a full compiler that creates, based on an annotated C program, a data-flow graph and C code corresponding to each super-instruction. We show that our toolchain allows one to benefit from data-flow execution and explore sophisticated parallel programming techniques with little effort. To evaluate our system we have executed a set of real applications on a large multi-core machine. Comparison with popular parallel programming methods shows competitive speedups, while providing an easier parallel programming approach.
Submitted 22 September, 2011;
originally announced September 2011.
-
Transactional WaveCache: Towards Speculative and Out-of-Order DataFlow Execution of Memory Operations
Authors:
Leandro A. J. Marzulo,
Felipe M. G. França,
Vítor Santos Costa
Abstract:
The WaveScalar is the first DataFlow Architecture that can efficiently provide the sequential memory semantics required by imperative languages. This work presents an alternative memory ordering mechanism for this architecture, the Transactional WaveCache. Our mechanism maintains the execution order of memory operations within blocks of code, called Waves, but adds the ability to speculatively execute, out-of-order, operations from different Waves. This ordering mechanism is inspired by progress in supporting Transactional Memories. Waves are considered as atomic regions and executed as nested transactions. If a Wave has finished the execution of all its memory operations, it can be committed as soon as the previous Waves are committed. If a hazard is detected in a speculative Wave, all the following Waves (children) are aborted and re-executed. We evaluate the WaveCache on a set of artificial benchmarks. If the benchmark does not access memory often, we could achieve speedups of around 90%. Speedups of 33.1% and 24% were observed on more memory-intensive applications, and slowdowns of up to 16% arise if memory bandwidth is a bottleneck. For an application full of WAW, WAR and RAW hazards, a speedup of 139.7% was verified.
Submitted 7 December, 2007;
originally announced December 2007.