Showing 1–10 of 10 results for author: Miller, S

Searching in archive stat.
  1. arXiv:2310.01184  [pdf, ps, other]

    stat.AP

    Applications of Improvements to the Pythagorean Won-Loss Expectation in Optimizing Rosters

    Authors: Alexander F. Almeida, Kevin Dayaratna, Steven J. Miller, Andrew K. Yang

    Abstract: Bill James' Pythagorean formula has for decades done an excellent job estimating a baseball team's winning percentage from very little data: if the average runs scored and allowed are denoted respectively by ${\rm RS}$ and ${\rm RA}$, there is some $\gamma$ such that the winning percentage is approximately ${\rm RS}^\gamma / ({\rm RS}^\gamma + {\rm RA}^\gamma)$. One important consequence is to determine the value of di…

    Submitted 20 February, 2024; v1 submitted 2 October, 2023; originally announced October 2023.
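
As an illustration of the formula quoted in the abstract above, here is a minimal Python sketch. The function name `pythagorean_win_pct`, the default exponent of 2 (James' original choice), and the sample run totals are assumptions made for this example; none of them are taken from the paper.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float, gamma: float = 2.0) -> float:
    """Estimate winning percentage as RS^gamma / (RS^gamma + RA^gamma).

    gamma defaults to 2.0 here only as an illustrative assumption.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    rs_term = runs_scored ** gamma
    ra_term = runs_allowed ** gamma
    return rs_term / (rs_term + ra_term)


if __name__ == "__main__":
    # Hypothetical averages: 5 runs scored and 4 runs allowed per game.
    print(round(pythagorean_win_pct(5.0, 4.0), 4))
```

With equal runs scored and allowed, the estimate is 0.5 for any exponent, which is a quick sanity check on an implementation.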

  2. arXiv:2101.08162  [pdf, ps, other]

    stat.OT math.ST

    Lessons from the German Tank Problem

    Authors: George Clark, Alex Gonye, Steven J Miller

    Abstract: During World War II the German army used tanks to devastating advantage. The Allies needed accurate estimates of their tank production and deployment. They used two approaches to find these values: spies, and statistics. This note describes the statistical approach. Assuming the tanks are labeled consecutively starting at 1, if we observe $k$ serial numbers from an unknown number $N$ of tanks, wit…

    Submitted 21 January, 2021; v1 submitted 19 January, 2021; originally announced January 2021.

    Comments: Version 2.1, 17 pages, 9 figures, to appear in the Mathematical Intelligencer, fixed two typos

    MSC Class: 62J05; 60C05 (primary); 05A10 (secondary)

  3. arXiv:1909.00306  [pdf, other]

    stat.AP stat.ML

    Categorical Co-Frequency Analysis: Clustering Diagnosis Codes to Predict Hospital Readmissions

    Authors: Hallee E. Wong, Brianna C. Heggeseth, Steven J. Miller

    Abstract: Accurately predicting patients' risk of 30-day hospital readmission would enable hospitals to efficiently allocate resource-intensive interventions. We develop a new method, Categorical Co-Frequency Analysis (CoFA), for clustering diagnosis codes from the International Classification of Diseases (ICD) according to the similarity in relationships between covariates and readmission risk. CoFA measur…

    Submitted 31 August, 2019; originally announced September 2019.

    Comments: 14 Pages

  4. arXiv:1904.09525  [pdf, other]

    eess.SP stat.AP

    Recovery of the fetal electrocardiogram for morphological analysis from two trans-abdominal channels via optimal shrinkage

    Authors: Pei-Chun Su, Stephen Miller, Salim Idriss, Piers Barker, Hau-Tieng Wu

    Abstract: We propose a novel algorithm to recover fetal electrocardiogram (ECG) for both the fetal heart rate analysis and morphological analysis of its waveform from two or three trans-abdominal maternal ECG channels. We design an algorithm based on the optimal-shrinkage and the nonlocal Euclidean median under the wave-shape manifold model. For the fetal heart rate analysis, the algorithm is evaluated on p…

    Submitted 8 August, 2019; v1 submitted 20 April, 2019; originally announced April 2019.

    Comments: 25 pages, 6 figures

  5. arXiv:1502.01682  [pdf, other]

    cs.CL cs.LG stat.ML

    Use of Modality and Negation in Semantically-Informed Syntactic MT

    Authors: Kathryn Baker, Michael Bloodgood, Bonnie J. Dorr, Chris Callison-Burch, Nathaniel W. Filardo, Christine Piatko, Lori Levin, Scott Miller

    Abstract: This paper describes the resource- and system-building efforts of an eight-week Johns Hopkins University Human Language Technology Center of Excellence Summer Camp for Applied Language Exploration (SCALE-2009) on Semantically-Informed Machine Translation (SIMT). We describe a new modality/negation (MN) annotation scheme, the creation of a (publicly available) MN lexicon, and two automated MN tagge…

    Submitted 5 February, 2015; originally announced February 2015.

    Comments: 28 pages, 13 figures, 2 tables; appeared in Computational Linguistics, 38(2):411-438, 2012

    ACM Class: I.2.7; I.2.6; I.5.1; I.5.4

    Journal ref: Computational Linguistics, 38(2):411-438, 2012

  6. arXiv:1409.7085  [pdf, other]

    cs.CL cs.LG stat.ML

    Semantically-Informed Syntactic Machine Translation: A Tree-Grafting Approach

    Authors: Kathryn Baker, Michael Bloodgood, Chris Callison-Burch, Bonnie J. Dorr, Nathaniel W. Filardo, Lori Levin, Scott Miller, Christine Piatko

    Abstract: We describe a unified and coherent syntactic framework for supporting a semantically-informed syntactic approach to statistical machine translation. Semantically enriched syntactic tags assigned to the target-language training texts improved translation quality. The resulting system significantly outperformed a linguistically naive baseline model (Hiero), and reached the highest scores yet reporte…

    Submitted 24 September, 2014; originally announced September 2014.

    Comments: 10 pages, 7 figures, 3 tables; appeared in Proceedings of the Ninth Conference of the Association for Machine Translation in the Americas (AMTA), October 2010

    ACM Class: I.2.7; I.2.6; I.5.1; I.5.4

    Journal ref: In Proceedings of the Ninth Conference of the Association for Machine Translation in the Americas (AMTA), Denver, Colorado, October 2010

  7. arXiv:1406.3402  [pdf, other]

    math.HO stat.AP

    Relieving and Readjusting Pythagoras

    Authors: Victor Luo, Steven J. Miller

    Abstract: Bill James invented the Pythagorean expectation in the late 70's to predict a baseball team's winning percentage knowing just their runs scored and allowed. His original formula estimates a winning percentage of ${\rm RS}^2/({\rm RS}^2+{\rm RA}^2)$, where ${\rm RS}$ stands for runs scored and ${\rm RA}$ for runs allowed; later versions found better agreement with data by replacing the exponent 2 w…

    Submitted 16 June, 2014; v1 submitted 12 June, 2014; originally announced June 2014.

    Comments: Version 1.1, 15 pages, 9 figures (correct some minor typos and two images)

    MSC Class: 46N30 (primary); 62F03; 62P99 (secondary)

  8. arXiv:1406.0758  [pdf, ps, other]

    math.HO stat.OT

    Pythagoras at the Bat

    Authors: Steven J. Miller, Taylor Corcoran, Jennifer Gossels, Victor Luo, Jaclyn Porfilio

    Abstract: The Pythagorean formula is one of the most popular ways to measure the true ability of a team. It is very easy to use, estimating a team's winning percentage from the runs they score and allow. This data is readily available on standings pages; no computationally intensive simulations are needed. Normally accurate to within a few games per season, it allows teams to determine how much a run is wor…

    Submitted 29 May, 2014; originally announced June 2014.

    Comments: Version 1.0, 25 pages, 6 images. This is an older version; a slightly updated version will appear in "Social Networks and the Economics of Sports", to be published by Springer-Verlag

  9. arXiv:1208.1725  [pdf]

    stat.AP

    The Pythagorean Won-Loss Formula and Hockey: A Statistical Justification for Using the Classic Baseball Formula as an Evaluative Tool in Hockey

    Authors: Kevin D. Dayaratna, Steven J. Miller

    Abstract: Originally devised for baseball, the Pythagorean Won-Loss formula estimates the percentage of games a team should have won at a particular point in a season. For decades, this formula had no mathematical justification. In 2006, Steven Miller provided a statistical derivation by making some heuristic assumptions about the distributions of runs scored and allowed by baseball teams. We make a similar…

    Submitted 18 October, 2013; v1 submitted 8 August, 2012; originally announced August 2012.

    Comments: 21 pages, 4 figures; Forthcoming in The Hockey Research Journal: A Publication of the Society for International Hockey Research, 2012/13

  10. arXiv:1205.4750  [pdf]

    stat.AP

    First Order Approximations of the Pythagorean Won-Loss Formula for Predicting MLB Teams' Winning Percentages

    Authors: Kevin D. Dayaratna, Steven J. Miller

    Abstract: We mathematically prove that an existing linear predictor of baseball teams' winning percentages (Jones and Tappin 2005) is simply just a first-order approximation to Bill James' Pythagorean Won-Loss formula and can thus be written in terms of the formula's well-known exponent. We estimate the linear model on twenty seasons of Major League Baseball data and are able to verify that the resulting co…

    Submitted 21 May, 2012; originally announced May 2012.

    Comments: 7 pages, 1 Table, Appendix with Alternative Proof; By the Numbers 21, 2012