-
The Impact of Position Errors on Crowd Simulation
Authors:
Lei Zhang,
Diego Lai,
Andriy V. Miranskyy
Abstract:
In large crowd events, there is always a risk that a stampede will occur. Such an accident may cause injuries or even death. Approaches for controlling crowd flows and predicting dangerous congestion spots would help on-site authorities manage the crowd and prevent fatal accidents. One of the most popular approaches is real-time crowd simulation based on position data from personal Global Positioning System (GPS) devices. However, the accuracy of spatial data varies across GPS devices, and it is also affected by the environment in which an event takes place. In this paper, we assess the effect of position errors on stampede prediction. We propose an Automatic Real-time dEtection of Stampedes (ARES) method to predict stampedes at large events. We implement three different stampede assessment methods in the Menge framework and incorporate position errors. Our analysis suggests that the probability of a simulated stampede changes significantly as the magnitude of position errors increases, and that this effect cannot be eliminated entirely with classic techniques, such as the Kalman filter. Thus, it is our position that novel stampede assessment methods should be developed, focusing on the detection of position noise and the elimination of its effect.
Submitted 25 October, 2018;
originally announced October 2018.
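The abstract above names the Kalman filter as a classic technique for reducing position noise. As a rough illustration only, the sketch below shows a textbook scalar Kalman filter applied to one coordinate. The random-walk state model, the function name `kalman_1d`, and all parameter names are assumptions made for this example; the listing does not describe the formulation the authors used.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter over a sequence of noisy position readings.

    Assumes a random-walk state model (hypothetical choice for this sketch):
    the true position is expected to stay put between readings, with
    uncertainty growing by `process_var` at each step.
    """
    x, p = x0, p0          # current estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the estimate is carried over, its variance grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Example: repeated readings of the same coordinate pull the estimate
# from the prior (0.0) toward the measured value.
smoothed = kalman_1d([5.0] * 200, process_var=0.0, meas_var=1.0)
```

A larger `meas_var` makes the filter trust each reading less, so the estimate reacts more slowly; a larger `process_var` has the opposite effect.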
-
Rediscovery Datasets: Connecting Duplicate Reports
Authors:
Mefta Sadat,
Ayse Basar Bener,
Andriy V. Miranskyy
Abstract:
The same defect can be rediscovered by multiple clients, causing unplanned outages and leading to reduced customer satisfaction. In the case of popular open source software, a high volume of defects is reported on a regular basis. A large number of these reports are actually duplicates / rediscoveries of each other. Researchers have analyzed the factors related to the content of duplicate defect reports in the past. However, some of the other potentially important factors, such as the inter-relationships among duplicate defect reports, are not readily available in defect tracking systems such as Bugzilla. This information may speed up bug fixing, enable efficient triaging, improve customer profiles, etc.
In this paper, we present three defect rediscovery datasets mined from Bugzilla. The datasets capture data for three groups of open source software projects: Apache, Eclipse, and KDE. The datasets contain information about approximately 914 thousand defect reports over a period of 18 years (1999-2017) to capture the inter-relationships among duplicate defects. We believe that sharing these data with the community will help researchers and practitioners to better understand the nature of defect rediscovery and enhance the analysis of defect reports.
Submitted 18 March, 2017;
originally announced March 2017.
-
Building Usage Profiles Using Deep Neural Nets
Authors:
Domenic Curro,
Konstantinos G. Derpanis,
Andriy V. Miranskyy
Abstract:
To improve software quality, one needs to build test scenarios resembling the usage of a software product in the field. This task is rendered challenging when a product's customer base is large and diverse. In this scenario, existing profiling approaches, such as operational profiling, are difficult to apply. In this work, we consider publicly available video tutorials of a product to profile usage. Our goal is to construct an automatic approach to extract information about user actions from instructional videos. To achieve this goal, we use a Deep Convolutional Neural Network (DCNN) to recognize user actions. Our pilot study shows that a DCNN trained to recognize user actions in video can classify five different actions in a collection of 236 publicly available Microsoft Word tutorial videos (published on YouTube). In our empirical evaluation we report a mean average precision of 94.42% across all actions. This study demonstrates the efficacy of DCNN-based methods for extracting software usage information from videos. Moreover, this approach may aid in other software engineering activities that require information about customer usage of a product.
Submitted 23 February, 2017;
originally announced February 2017.
-
Database Engines: Evolution of Greenness
Authors:
Andriy V. Miranskyy,
Zainab Al-zanbouri,
David Godwin,
Ayse Basar Bener
Abstract:
Context: Information Technology consumes up to 10% of the world's electricity generation, contributing to CO2 emissions and high energy costs. Data centers, particularly databases, use up to 23% of this energy. Therefore, building an energy-efficient (green) database engine could reduce energy consumption and CO2 emissions.
Goal: To understand the factors driving databases' energy consumption and execution time throughout their evolution.
Method: We conducted an empirical case study of energy consumption by two MySQL database engines, InnoDB and MyISAM, across 40 releases. We examined the relationships of four software metrics to energy consumption and execution time to determine which metrics reflect the greenness and performance of a database.
Results: Our analysis shows that database engines' energy consumption and execution time increase as databases evolve. Moreover, the Lines of Code metric is correlated moderately to strongly with energy consumption and execution time in 88% of cases.
Conclusions: Our findings provide insights to both practitioners and researchers. Database administrators may use them to select a fast, green release of the MySQL database engine. MySQL database-engine developers may use the software metric to assess products' greenness and performance. Researchers may use our findings to further develop new hypotheses or build models to predict greenness and performance of databases.
Submitted 9 January, 2017;
originally announced January 2017.
-
Metrics of Risk Associated with Defects Rediscovery
Authors:
Andriy V. Miranskyy,
Matthew Davison,
Mark Reesor
Abstract:
Software defects rediscovered by a large number of customers affect various stakeholders and may: 1) hint at gaps in a software manufacturer's Quality Assurance (QA) processes, 2) lead to an overload of a software manufacturer's support and maintenance teams, and 3) consume customers' resources, leading to a loss of reputation and a decrease in sales.
Quantifying risk associated with the rediscovery of defects can help all of these stakeholders. In this chapter we present a set of metrics needed to quantify the risks. The metrics are designed to help: 1) the QA team to assess their processes; 2) the support and maintenance teams to allocate their resources; and 3) the customers to assess the risk associated with using the software product. The paper includes a validation case study which applies the risk metrics to industrial data. To calculate the metrics we use mathematical instruments like the heavy-tailed Kappa distribution and the G/M/k queuing model.
Submitted 20 July, 2011;
originally announced July 2011.
-
Using entropy measures for comparison of software traces
Authors:
A. V. Miranskyy,
M. Davison,
M. Reesor,
S. S. Murtaza
Abstract:
The analysis of execution paths (also known as software traces) collected from a given software product can help in a number of areas including software testing, software maintenance and program comprehension. The lack of a scalable matching algorithm operating on detailed execution paths motivates the search for an alternative solution.
This paper proposes the use of word entropies for the classification of software traces. Using a well-studied defective software as an example, we investigate the application of both Shannon and extended entropies (Landsberg-Vedral, Rényi and Tsallis) to the classification of traces related to various software defects. Our study shows that using entropy measures for comparisons gives an efficient and scalable method for comparing traces. The three extended entropies, with parameters chosen to emphasize rare events, all perform similarly and are superior to the Shannon entropy.
Submitted 25 April, 2012; v1 submitted 26 October, 2010;
originally announced October 2010.
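The entropies named in the abstract above have standard definitions, which can be sketched as follows. This is a minimal illustration, assuming a trace is a list of tokens (for example, function names) and that "word" probabilities are relative token frequencies. The tokenization, the helper names, and the choice of `q` are assumptions for this example and are not taken from the paper.

```python
import math
from collections import Counter


def probabilities(trace):
    """Relative frequency of each distinct token in a trace."""
    counts = Counter(trace)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def _power_sum(p, q):
    """Sum of p_i ** q, shared by the three extended entropies."""
    return sum(pi ** q for pi in p)


def shannon(p):
    return -sum(pi * math.log(pi) for pi in p)


def renyi(p, q):
    # Defined for q != 1; tends to the Shannon entropy as q -> 1.
    return math.log(_power_sum(p, q)) / (1.0 - q)


def tsallis(p, q):
    # Defined for q != 1.
    return (1.0 - _power_sum(p, q)) / (q - 1.0)


def landsberg_vedral(p, q):
    # Defined for q != 1; equals tsallis(p, q) / sum(p_i ** q).
    return (1.0 - 1.0 / _power_sum(p, q)) / (1.0 - q)


# Hypothetical trace: a sequence of function names from one execution.
trace = ["open", "read", "read", "parse", "read", "close"]
p = probabilities(trace)
profile = (shannon(p), renyi(p, 0.5), tsallis(p, 0.5), landsberg_vedral(p, 0.5))
```

In these definitions, values of `q` below 1 give relatively more weight to low-probability tokens, which is one way to emphasize rare events. Two traces can then be compared through their entropy values rather than by matching the full execution paths.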