Showing 1–8 of 8 results for author: Mattos, D I

  1. arXiv:2207.00222  [pdf, other]

    cs.SE

    Bayesian causal inference in automotive software engineering and online evaluation

    Authors: Yuchu Liu, David Issa Mattos, Jan Bosch, Helena Holmström Olsson, Jonn Lantz

    Abstract: Randomised field experiments, such as A/B testing, have long been the gold standard for evaluating software changes. In the automotive domain, running randomised field experiments is not always desired, possible, or even ethical. In the face of such limitations, we develop a framework BOAT (Bayesian causal modelling for ObservAtional Testing), utilising observational studies in combination with B…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: In submission

  2. arXiv:2204.08743  [pdf, other]

    cs.SE

    On the Use of Causal Graphical Models for Designing Experiments in the Automotive Domain

    Authors: David Issa Mattos, Yuchu Liu

    Abstract: Randomized field experiments are the gold standard for evaluating the impact of software changes on customers. In the online domain, randomization has been the main tool to ensure exchangeability. However, due to the different deployment conditions and the high dependence on the surrounding environment, designing experiments for automotive software needs to consider a higher number of restricted v…

    Submitted 25 April, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: In submission

  3. Bayesian propensity score matching in automotive embedded software engineering

    Authors: Yuchu Liu, David Issa Mattos, Jan Bosch, Helena Holmström Olsson, Jonn Lantz

    Abstract: Randomised field experiments, such as A/B testing, have long been the gold standard for evaluating the value that new software brings to customers. However, running randomised field experiments is not always desired, possible or even ethical in the development of automotive embedded software. In the face of such restrictions, we propose the use of the Bayesian propensity score matching technique f…

    Submitted 26 September, 2021; originally announced September 2021.

    Comments: To appear at the 28th Asia-Pacific Software Engineering Conference (APSEC 2021)

  4. Size matters? Or not: A/B testing with limited sample in automotive embedded software

    Authors: Yuchu Liu, David Issa Mattos, Jan Bosch, Helena Holmström Olsson, Jonn Lantz

    Abstract: A/B testing is gaining attention in the automotive sector as a promising tool to measure causal effects from software changes. Different from the web-facing businesses, where A/B testing has been well-established, the automotive domain often suffers from limited eligible users to participate in online experiments. To address this shortcoming, we present a method for designing balanced control and…

    Submitted 10 November, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

    Comments: In proceedings of the 2021 47th Euromicro Conference on Software Engineering and Advanced Applications (SEAA)

  5. arXiv:2104.07381  [pdf, other]

    cs.NE cs.DS

    On the Assessment of Benchmark Suites for Algorithm Comparison

    Authors: David Issa Mattos, Lucas Ruud, Jan Bosch, Helena Holmström Olsson

    Abstract: Benchmark suites, i.e. a collection of benchmark functions, are widely used in the comparison of black-box optimization algorithms. Over the years, research has identified many desired qualities for benchmark suites, such as diverse topology, different difficulties, scalability, representativeness of real-world problems among others. However, while the topology characteristics have been subjected…

    Submitted 15 April, 2021; originally announced April 2021.

    Comments: In submission

  6. arXiv:2101.11227  [pdf, other]

    stat.ME cs.MS

    Bayesian Paired-Comparison with the bpcs Package

    Authors: David Issa Mattos, Érika Martins Silva Ramos

    Abstract: This article introduces the bpcs R package (Bayesian Paired Comparison in Stan) and the statistical models implemented in the package. This package aims to facilitate the use of Bayesian models for paired comparison data in behavioral research. Bayesian analysis of paired comparison data allows parameter estimation even in conditions where the maximum likelihood does not exist, allows easy extensi…

    Submitted 20 September, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

    Comments: Accepted for publication in the Journal of Behavior Research Methods (https://www.springer.com/journal/13428)

  7. arXiv:2010.03783  [pdf, other]

    stat.ME

    Statistical Models for the Analysis of Optimization Algorithms with Benchmark Functions

    Authors: David Issa Mattos, Jan Bosch, Helena Holmström Olsson

    Abstract: Frequentist statistical methods, such as hypothesis testing, are standard practice in papers that provide benchmark comparisons. Unfortunately, these methods have often been misused, e.g., without testing for their statistical test assumptions or without controlling for family-wise errors in multiple group comparisons, among several other problems. Bayesian Data Analysis (BDA) addresses many of th…

    Submitted 15 May, 2021; v1 submitted 8 October, 2020; originally announced October 2020.

    Journal ref: IEEE Transactions on Evolutionary Computation (DOI:10.1109/TEVC.2021.3081167)

  8. arXiv:1910.03878  [pdf, other]

    cs.SE

    Engineering for a Science-Centric Experimentation Platform

    Authors: Nikos Diamantopoulos, Jeffrey Wong, David Issa Mattos, Ilias Gerostathopoulos, Matthew Wardrop, Tobias Mao, Colin McFarland

    Abstract: Netflix is an internet entertainment service that routinely employs experimentation to guide strategy around product innovations. As Netflix grew, it had the opportunity to explore increasingly specialized improvements to its service, which generated demand for deeper analyses supported by richer metrics and powered by more diverse statistical methodologies. To facilitate this, and more fully harn…

    Submitted 9 October, 2019; originally announced October 2019.

    Comments: 10 pages