-
Urban mapping in Dar es Salaam using AJIVE
Authors:
Rachel J. Carrington,
Ian L. Dryden,
Madeleine Ellis,
James O. Goulding,
Simon P. Preston,
David J. Sirl
Abstract:
Mapping deprivation in urban areas is important, for example for identifying areas of greatest need and planning interventions. Traditional ways of obtaining deprivation estimates are based on either census or household survey data, which in many areas are unavailable or difficult to collect. However, there has been a huge rise in the amount of new, non-traditional forms of data, such as satellite imagery and cell-phone call-record data, which may contain information useful for identifying deprivation. We use Angle-Based Joint and Individual Variation Explained (AJIVE) to jointly model satellite imagery data, cell-phone data, and survey data for the city of Dar es Salaam, Tanzania. We first identify interpretable low-dimensional structure from the imagery and cell-phone data, and find that we can use these to identify deprivation. We then consider what is gained from further incorporating the more traditional and costly survey data. We also introduce a scalar measure of deprivation as a response variable to be predicted, and consider various approaches to multiview regression, including using AJIVE scores as predictors.
Submitted 13 March, 2024;
originally announced March 2024.
-
A Detailed Historical and Statistical Analysis of the Influence of Hardware Artifacts on SPEC Integer Benchmark Performance
Authors:
Yueyao Wang,
Samuel Furman,
Nicolas Hardy,
Margaret Ellis,
Godmar Back,
Yili Hong,
Kirk Cameron
Abstract:
The Standard Performance Evaluation Corporation (SPEC) CPU benchmark has been widely used as a measure of computing performance for decades. SPEC is an industry-standardized, CPU-intensive benchmark suite, and the collective data provide a proxy for the history of worldwide CPU and system performance. Past efforts have not provided or enabled answers to questions such as: How has the SPEC benchmark suite evolved empirically over time, and which micro-architecture artifacts have had the most influence on performance? Have any micro-benchmarks within the suite had undue influence on the results and comparisons among the codes? Can the answers to these questions provide insights into the future of computer system performance? To answer these questions, we detail our historical and statistical analysis of the effect of specific hardware artifacts (clock frequencies, core counts, etc.) on the performance of the SPEC benchmarks since 1995. We discuss in detail several methods to normalize across benchmark evolutions. We perform both isolated and collective sensitivity analyses for various hardware artifacts, and we identify one benchmark (libquantum) that had somewhat undue influence on performance outcomes. We also present the use of SPEC data to predict future performance.
Submitted 29 January, 2024;
originally announced January 2024.
-
The Bayesian Spatial Bradley--Terry Model: Urban Deprivation Modeling in Tanzania
Authors:
R. G. Seymour,
D. Sirl,
S. Preston,
I. L. Dryden,
M. J. A. Ellis,
B. Perrat,
J. Goulding
Abstract:
Identifying the most deprived regions of any country or city is key if policy makers are to design successful interventions. However, locating areas with the greatest need is often surprisingly challenging in developing countries. Due to the logistical challenges of traditional household surveying, official statistics can be slow to be updated; estimates that exist can be coarse, a consequence of prohibitive costs and poor infrastructures; and mass urbanisation can render manually surveyed figures rapidly out-of-date. Comparative judgement models, such as the Bradley--Terry model, offer a promising solution. Leveraging local knowledge, elicited via comparisons of different areas' affluence, such models can both simplify logistics and circumvent biases inherent to household surveys. Yet widespread adoption remains limited, due to the large amount of data existing approaches still require. We address this via development of a novel Bayesian Spatial Bradley--Terry model, which substantially decreases the amount of data comparisons required for effective inference. This model integrates a network representation of the city or country, along with assumptions of spatial smoothness that allow deprivation in one area to be informed by neighbouring areas. We demonstrate the practical effectiveness of this method, through a novel comparative judgement data set collected in Dar es Salaam, Tanzania.
Submitted 28 October, 2021; v1 submitted 27 October, 2020;
originally announced October 2020.
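For readers unfamiliar with comparative judgement models, the following is a minimal sketch of the standard (non-spatial) Bradley--Terry likelihood only. It does not implement the Bayesian spatial prior described in the abstract; the function names, the per-area parameters, and the toy comparison data are hypothetical and chosen purely for illustration.

```python
import math
from typing import List, Sequence, Tuple


def bt_win_probability(lambda_i: float, lambda_j: float) -> float:
    """Standard Bradley-Terry probability that area i is judged more
    affluent than area j, given real-valued parameters lambda_i, lambda_j."""
    return 1.0 / (1.0 + math.exp(-(lambda_i - lambda_j)))


def bt_log_likelihood(params: Sequence[float],
                      comparisons: List[Tuple[int, int]]) -> float:
    """Log-likelihood of observed comparisons.

    Each comparison is a (winner, loser) pair of area indices
    (a hypothetical encoding, not the paper's data format).
    """
    return sum(
        math.log(bt_win_probability(params[winner], params[loser]))
        for winner, loser in comparisons
    )


# Toy data: three areas, four pairwise judgements (illustrative only).
toy_params = [1.0, 0.0, -1.0]
toy_comparisons = [(0, 1), (0, 2), (1, 2), (0, 1)]
toy_ll = bt_log_likelihood(toy_params, toy_comparisons)
```

Only the difference between two parameters matters in this model, so equal parameters give a win probability of one half.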
-
On Design of Problem Token Questions in Quality of Experience Surveys
Authors:
Jayant Gupchup,
Ebrahim Beyrami,
Martin Ellis,
Yasaman Hosseinkashi,
Sam Johnson,
Ross Cutler
Abstract:
User surveys for Quality of Experience (QoE) are a critical source of information. In addition to the common "star rating" used to estimate Mean Opinion Score (MOS), more detailed survey questions (problem tokens) about specific areas provide valuable insight into the factors impacting QoE. This paper explores two aspects of the problem token questionnaire design. First, we study the bias introduced by fixed question order, and second, we study the challenge of selecting a subset of questions to keep the token set small. Based on 900,000 calls gathered using a randomized controlled experiment from a live system, we find that the order bias can be significantly reduced by randomizing the display order of tokens. The difference in response rate varies based on token position and display design. It is worth noting that the users respond to the randomized-order variant at levels that are comparable to the fixed-order variant. The effective selection of a subset of token questions is achieved by extracting tokens that provide the highest information gain over user ratings. This selection is known to be in the class of NP-hard problems. We apply a well-known greedy submodular maximization method on our dataset to capture 94% of the information using just 30% of the questions.
Submitted 18 August, 2018;
originally announced August 2018.
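The greedy selection step mentioned in the abstract can be illustrated with a generic sketch. It is not the authors' implementation: the objective here is a simple set-coverage function standing in for information gain over user ratings, and the token names and call IDs are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Set


def greedy_select(candidates: List[str],
                  objective: Callable[[FrozenSet[str]], float],
                  budget: int) -> List[str]:
    """Greedily pick up to `budget` items, each time adding the candidate
    with the largest marginal gain in `objective`."""
    selected: List[str] = []
    remaining = list(candidates)
    for _ in range(min(budget, len(remaining))):
        current = objective(frozenset(selected))
        best = max(
            remaining,
            key=lambda c: objective(frozenset(selected + [c])) - current,
        )
        gain = objective(frozenset(selected + [best])) - current
        if gain <= 0:
            break  # no remaining candidate improves the objective
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: which calls (by ID) each problem token flags.
token_to_calls: Dict[str, Set[int]] = {
    "audio": {1, 2, 3},
    "video": {3, 4},
    "echo": {1, 2},
}


def coverage(tokens: FrozenSet[str]) -> float:
    """Number of distinct calls flagged by at least one chosen token."""
    covered: Set[int] = set()
    for token in tokens:
        covered |= token_to_calls[token]
    return float(len(covered))


chosen = greedy_select(list(token_to_calls), coverage, budget=2)
```

Coverage is monotone and submodular, which is the setting in which this greedy rule carries the classical (1 - 1/e) approximation guarantee.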