
Showing 1–7 of 7 results for author: Tanaka, E

  1. arXiv:2311.09705  [pdf, other]

    stat.CO stat.OT

    edibble: An R package to encapsulate elements of experimental designs for better planning, management and workflow

    Authors: Emi Tanaka

    Abstract: I present an R package called edibble that facilitates the design of experiments by encapsulating elements of the experiment in a series of composable functions. This package is an interpretation of "the grammar of experimental designs" by Tanaka (2023) in the R programming language. The main features of the edibble package are demonstrated, illustrating how it can be used to create a wide array o…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 32 pages

  2. arXiv:2308.05964  [pdf, other]

    stat.AP

    A Plot is Worth a Thousand Tests: Assessing Residual Diagnostics with the Lineup Protocol

    Authors: Weihao Li, Dianne Cook, Emi Tanaka, Susan VanderPlas

    Abstract: Regression experts consistently recommend plotting residuals for model diagnosis, despite the availability of many numerical hypothesis test procedures designed to use residuals to assess problems with a model fit. Here we provide evidence for why this is good advice using data from a visual inference experiment. We show how conventional tests are too sensitive, which means that too often the conc…

    Submitted 24 March, 2024; v1 submitted 11 August, 2023; originally announced August 2023.

  3. arXiv:2307.11593  [pdf, other]

    cs.OH q-bio.QM stat.ME

    Towards a unified language in experimental designs propagated by a software framework

    Authors: Emi Tanaka

    Abstract: Experiments require human decisions in the design process, which in turn are reformulated and summarized as inputs into a system (computational or otherwise) to generate the experimental design. I leverage this system to promote a language of experimental designs by proposing a novel computational framework, called "the grammar of experimental designs", to specify experimental designs based on an…

    Submitted 24 July, 2023; v1 submitted 11 July, 2023; originally announced July 2023.

  4. arXiv:2206.07532  [pdf, other]

    stat.OT stat.CO

    Current state and prospects of R-packages for the design of experiments

    Authors: Emi Tanaka, Dewi Amaliah

    Abstract: Re-running an experiment is generally costly and, in some cases, impossible due to limited resources; therefore, the design of an experiment plays a critical role in increasing the quality of experimental data. In this paper, we describe the current state of R-packages for the design of experiments through an exploratory data analysis of package downloads, package metadata, and a comparison of cha…

    Submitted 13 December, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: 14 pages, 8 figures, 1 supplementary material

  5. arXiv:2205.06417  [pdf, other]

    stat.OT

    A Journey from Wild to Textbook Data to Reproducibly Refresh the Wages Data from the National Longitudinal Survey of Youth Database

    Authors: Dewi Amaliah, Dianne Cook, Emi Tanaka, Kate Hyde, Nicholas Tierney

    Abstract: Textbook data is essential for teaching statistics and data science methods because they are clean, allowing the instructor to focus on methodology. Ideally textbook data sets are refreshed regularly, especially when they are subsets taken from an on-going data collection. It is also important to use contemporary data for teaching, to imbue the sense that the methodology is relevant today. This pa…

    Submitted 12 May, 2022; originally announced May 2022.

  6. Symbolic Formulae for Linear Mixed Models

    Authors: Emi Tanaka, Francis K. C. Hui

    Abstract: A statistical model is a mathematical representation of an often simplified or idealised data-generating process. In this paper, we focus on a particular type of statistical model, called linear mixed models (LMMs), that is widely used in many disciplines, e.g. agriculture, ecology, econometrics, psychology. Mixed models, also commonly known as multi-level, nested, hierarchical or panel data models…

    Submitted 19 November, 2019; originally announced November 2019.

  7. arXiv:1807.07268  [pdf, other]

    q-bio.QM stat.AP

    Simple robust genomic prediction and outlier detection for a multi-environmental field trial

    Authors: Emi Tanaka

    Abstract: The aim of plant breeding trials is often to identify germplasms that are well adapted to target environments. These germplasms are identified through genomic prediction from the analysis of multi-environmental field trial (MET) using linear mixed models. The occurrence of outliers in MET is common and known to adversely impact accuracy of genomic prediction yet the detection of outliers, and subse…

    Submitted 19 July, 2018; originally announced July 2018.