Showing 1–8 of 8 results for author: Song, Y S

  1. arXiv:2109.15261  [pdf, other]

    stat.ME math.PR math.ST q-bio.QM

    A simple and flexible test of sample exchangeability with applications to statistical genomics

    Authors: Alan J. Aw, Jeffrey P. Spence, Yun S. Song

    Abstract: In scientific studies involving analyses of multivariate data, basic but important questions often arise for the researcher: Is the sample exchangeable, meaning that the joint distribution of the sample is invariant to the ordering of the units? Are the features independent of one another, or perhaps the features can be grouped so that the groups are mutually independent? In statistical genomics,…

    Submitted 30 August, 2023; v1 submitted 30 September, 2021; originally announced September 2021.

    Comments: 24 pages. Supplementary Information file (38 pages, contains mathematical proofs) is available at https://github.com/songlab-cal/flinty/

    MSC Class: 62G10; 62H15; 62P10

    ACM Class: G.3

  2. arXiv:2105.10590  [pdf, other]

    stat.ML cs.LG q-bio.BM q-bio.QM

    Parallelizing Contextual Bandits

    Authors: Jeffrey Chan, Aldo Pacchiano, Nilesh Tripuraneni, Yun S. Song, Peter Bartlett, Michael I. Jordan

    Abstract: Standard approaches to decision-making under uncertainty focus on sequential exploration of the space of decisions. However, \textit{simultaneously} proposing a batch of decisions, which leverages available resources for parallel experimentation, has the potential to rapidly accelerate exploration. We present a family of (parallel) contextual bandit algorithms applicable to problems with bounded e…

    Submitted 5 February, 2023; v1 submitted 21 May, 2021; originally announced May 2021.

  3. arXiv:1906.08230  [pdf, other]

    cs.LG q-bio.BM stat.ML

    Evaluating Protein Transfer Learning with TAPE

    Authors: Roshan Rao, Nicholas Bhattacharya, Neil Thomas, Yan Duan, Xi Chen, John Canny, Pieter Abbeel, Yun S. Song

    Abstract: Protein modeling is an increasingly popular area of machine learning research. Semi-supervised learning has emerged as an important paradigm in protein modeling due to the high cost of acquiring supervised protein labels, but the current literature is fragmented when it comes to datasets and standardized evaluation techniques. To facilitate progress in this field, we introduce the Tasks Assessing…

    Submitted 19 June, 2019; originally announced June 2019.

    Comments: 20 pages, 4 figures

  4. arXiv:1802.06153  [pdf, other]

    cs.LG q-bio.PE stat.ML

    A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks

    Authors: Jeffrey Chan, Valerio Perrone, Jeffrey P. Spence, Paul A. Jenkins, Sara Mathieson, Yun S. Song

    Abstract: An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. To achieve this, two inferential chal…

    Submitted 5 November, 2018; v1 submitted 16 February, 2018; originally announced February 2018.

    Comments: 9 pages, 8 figures

  5. arXiv:1612.03839  [pdf, other]

    cs.LG stat.ML

    Tensor Decompositions via Two-Mode Higher-Order SVD (HOSVD)

    Authors: Miaoyan Wang, Yun S. Song

    Abstract: Tensor decompositions have rich applications in statistics and machine learning, and developing efficient, accurate algorithms for the problem has received much attention recently. Here, we present a new method built on Kruskal's uniqueness theorem to decompose symmetric, nearly orthogonally decomposable tensors. Unlike the classical higher-order singular value decomposition which unfolds a tensor…

    Submitted 18 April, 2017; v1 submitted 12 December, 2016; originally announced December 2016.

    Comments: 33 pages, 5 figures

    Journal ref: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR, Vol. 54 (2017) 614-622

  6. arXiv:1310.1068  [pdf, ps, other]

    q-bio.PE math.FA stat.AP stat.ME

    A novel spectral method for inferring general diploid selection from time series genetic data

    Authors: Matthias Steinrücken, Anand Bhaskar, Yun S. Song

    Abstract: The increased availability of time series genetic variation data from experimental evolution studies and ancient DNA samples has created new opportunities to identify genomic regions under selective pressure and to estimate their associated fitness parameters. However, it is a challenging problem to compute the likelihood of nonneutral models for the population allele frequency dynamics, given the…

    Submitted 26 January, 2015; v1 submitted 3 October, 2013; originally announced October 2013.

    Comments: Published at http://dx.doi.org/10.1214/14-AOAS764 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS764

    Journal ref: Annals of Applied Statistics 2014, Vol. 8, No. 4, 2203-2222

  7. arXiv:1309.5056  [pdf, ps, other]

    q-bio.PE math.ST stat.AP

    Descartes' rule of signs and the identifiability of population demographic models from genomic variation data

    Authors: Anand Bhaskar, Yun S. Song

    Abstract: The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has be…

    Submitted 1 December, 2014; v1 submitted 19 September, 2013; originally announced September 2013.

    Comments: Published at http://dx.doi.org/10.1214/14-AOS1264 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1264

    Journal ref: Annals of Statistics 2014, Vol. 42, No. 6, 2469-2493

  8. The influence of relatives on the efficiency and error rate of familial searching

    Authors: Rori V. Rohlfs, Erin Murphy, Yun S. Song, Montgomery Slatkin

    Abstract: We investigate the consequences of adopting the criteria used by the state of California, as described by Myers et al. (2011), for conducting familial searches. We carried out a simulation study of randomly generated profiles of related and unrelated individuals with 13-locus CODIS genotypes and YFiler Y-chromosome haplotypes, on which the Myers protocol for relative identification was carried out…

    Submitted 14 August, 2013; v1 submitted 10 April, 2013; originally announced April 2013.

    Comments: Main text: 19 pages, 4 tables, 2 figures; supplemental text: 2 pages, 5 tables; all together as a single file

    Journal ref: PLoS ONE 8(8): e70495 (2013)