Showing 1–41 of 41 results for author: Pachter, L

  1. arXiv:2405.12998  [pdf]

    q-bio.OT physics.bio-ph

    The miscalibration of the honeybee odometer

    Authors: Laura Luebbert, Lior Pachter

    Abstract: We examine a series of articles on honeybee odometry and navigation published between 1996 and 2010, and find inconsistencies in results, duplicated figures, indications of data manipulation, and incorrect calculations. This suggests that redoing the experiments in question is warranted.

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 16 pages

  2. arXiv:2312.06114  [pdf, other]

    physics.bio-ph q-bio.QM

    The virial theorem and the Price equation

    Authors: Steinunn Liorsdóttir, Lior Pachter

    Abstract: We observe that the time averaged continuous Price equation is identical to the positive momentum virial theorem, and we discuss the applications and implications of this connection.

    Submitted 10 December, 2023; originally announced December 2023.

    Comments: 8 pages

  3. arXiv:2308.15518  [pdf, other]

    astro-ph.IM astro-ph.EP physics.pop-ph

    Data-Driven Approaches to Searches for the Technosignatures of Advanced Civilizations

    Authors: T. Joseph W. Lazio, S. G. Djorgovski, Andrew Howard, Curt Cutler, Sofia Z. Sheikh, Stefano Cavuoti, Denise Herzing, Kiri Wagstaff, Jason T. Wright, Vishal Gajjar, Kevin Hand, Umaa Rebbapragada, Bruce Allen, Erica Cartmill, Jacob Foster, Dawn Gelino, Matthew J. Graham, Giuseppe Longo, Ashish A. Mahabal, Lior Pachter, Vikram Ravi, Gerald Sussman

    Abstract: Humanity has wondered whether we are alone for millennia. The discovery of life elsewhere in the Universe, particularly intelligent life, would have profound effects, comparable to those of recognizing that the Earth is not the center of the Universe and that humans evolved from previous species. There has been rapid growth in the fields of extrasolar planets and data-driven astronomy. In a relati…

    Submitted 29 August, 2023; originally announced August 2023.

    Comments: Final Report prepared for the W. M. Keck Institute for Space Studies (KISS), http://kiss.caltech.edu/workshops/technosignatures/technosignatures.html ; eds. Lazio, Djorgovski, Howard, & Cutler; The study leads gratefully acknowledge the outstanding support of Michele Judd, KISS Executive Director, and her dedicated staff, who made the study experience invigorating and enormously productive

  4. arXiv:2204.00960  [pdf]

    q-bio.OT

    A decade of molecular cell atlases

    Authors: Lior Pachter

    Abstract: The recent opinion article "A decade of molecular cell atlases" by Stephen Quake narrates the incredible single-cell genomics technology advances that have taken place over the last decade, and how they have translated to increasingly resolved cell atlases. However, the sequence of events described is inaccurate and contains several omissions and errors. The errors are corrected in this note.

    Submitted 2 April, 2022; originally announced April 2022.

  5. arXiv:2103.10992  [pdf, other]

    q-bio.SC q-bio.MN q-bio.QM

    Analytic solution of chemical master equations involving gene switching. I: Representation theory and diagrammatic approach to exact solution

    Authors: John J. Vastola, Gennady Gorin, Lior Pachter, William R. Holmes

    Abstract: The chemical master equation (CME), which describes the discrete and stochastic molecule number dynamics associated with biological processes like transcription, is difficult to solve analytically. It is particularly hard to solve for models involving bursting/gene switching, a biological feature that tends to produce heavy-tailed single cell RNA counts distributions. In this paper, we present a n…

    Submitted 19 March, 2021; originally announced March 2021.

    Comments: 108 pages, 12 figures

  6. arXiv:2003.12919  [pdf, other]

    stat.ME q-bio.MN q-bio.QM

    Special Function Methods for Bursty Models of Transcription

    Authors: Gennady Gorin, Lior Pachter

    Abstract: We explore a Markov model used in the analysis of gene expression, involving the bursty production of pre-mRNA, its conversion to mature mRNA, and its consequent degradation. We demonstrate that the integration used to compute the solution of the stochastic system can be approximated by the evaluation of special functions. Furthermore, the form of the special function solution generalizes to a bro…

    Submitted 28 March, 2020; originally announced March 2020.

    Comments: Body: 15 pages, 2 figures, 2 tables. Supplement: 10 pages, 1 figure

    Journal ref: Phys. Rev. E 102, 022409 (2020)

  7. arXiv:1706.06995  [pdf]

    stat.AP

    A latent variable model for survival time prediction with censoring and diverse covariates

    Authors: Shannon R. McCurdy, Annette Molinaro, Lior Pachter

    Abstract: Fulfilling the promise of precision medicine requires accurately and precisely classifying disease states. For cancer, this includes prediction of survival time from a surfeit of covariates. Such data presents an opportunity for improved prediction, but also a challenge due to high dimensionality. Furthermore, disease populations can be heterogeneous. Integrative modeling is sensible, as the under…

    Submitted 21 June, 2017; originally announced June 2017.

  8. arXiv:1601.03334  [pdf, other]

    q-bio.QM stat.ME

    Estimating intrinsic and extrinsic noise from single-cell gene expression measurements

    Authors: Audrey Fu, Lior Pachter

    Abstract: Gene expression is stochastic and displays variation ("noise") both within and between cells. Intracellular (intrinsic) variance can be distinguished from extracellular (extrinsic) variance by applying the law of total variance to data from two-reporter assays that probe expression of identical gene pairs in single cells. We examine established formulas for the estimation of intrinsic and extrinsi…

    Submitted 13 January, 2016; originally announced January 2016.

  9. arXiv:1510.07371  [pdf, other]

    q-bio.QM q-bio.GN

    Pseudoalignment for metagenomic read assignment

    Authors: Lorian Schaeffer, Harold Pimentel, Nicolas Bray, Páll Melsted, Lior Pachter

    Abstract: We explore connections between metagenomic read assignment and the quantification of transcripts from RNA-Seq data. In particular, we show that the recent idea of pseudoalignment introduced in the RNA-Seq context is suitable in the metagenomics setting. When coupled with the Expectation-Maximization (EM) algorithm, reads can be assigned far more accurately and quickly than is currently possible wi…

    Submitted 1 December, 2015; v1 submitted 26 October, 2015; originally announced October 2015.

    Comments: Replaced accidentally duplicated figure with correct version; fixed some issues with figure generation and labeling; fixed problem with some missing genomes from database; added link to GitHub repo containing analysis code; included assessment of aggregate sensitivity and precision; clarified assessment metrics used

  10. arXiv:1510.00696  [pdf, ps, other]

    q-bio.GN

    Keep Me Around: Intron Retention Detection and Analysis

    Authors: Harold Pimentel, John G. Conboy, Lior Pachter

    Abstract: We present a tool, keep me around (kma), a suite of Python scripts and an R package that finds retained introns in RNA-Seq experiments and incorporates biological replicates to reduce the number of false positives when detecting retention events. kma uses the results of existing quantification tools that probabilistically assign multi-mapping reads, thus interfacing easily with transcript quantifi…

    Submitted 2 October, 2015; originally announced October 2015.

  11. arXiv:1505.02710  [pdf]

    q-bio.QM cs.CE cs.DS q-bio.GN

    Near-optimal RNA-Seq quantification

    Authors: Nicolas Bray, Harold Pimentel, Páll Melsted, Lior Pachter

    Abstract: We present a novel approach to RNA-Seq quantification that is near optimal in speed and accuracy. Software implementing the approach, called kallisto, can be used to analyze 30 million unaligned paired-end RNA-Seq reads in less than 5 minutes on a standard laptop computer while providing results as accurate as those of the best existing tools. This removes a major computational bottleneck in RNA-S…

    Submitted 15 May, 2015; v1 submitted 11 May, 2015; originally announced May 2015.

    Comments: - Added some results (paralog analysis, allele specific expression analysis, alignment comparison, accuracy analysis with TPMs) - Switched bootstrap analysis to human sample from SEQC-MAQCIII - Provided link to a snakefile that allows for reproducibility of all results and figures in the paper

  12. arXiv:1412.3800  [pdf]

    q-bio.QM q-bio.BM

    Identifying RNA contacts from SHAPE-MaP by partial correlation analysis

    Authors: Akshay Tambe, Jennifer Doudna, Lior Pachter

    Abstract: In a recent paper Siegfried et al. published a new sequence-based structural RNA assay that utilizes mutational profiling to detect base pairing (MaP). Output from MaP provides information about both pairing (via reactivities) and contact (via correlations). Reactivities can be coupled to partition function folding models for structural inference, while correlations can reveal pairs of sites that…

    Submitted 8 December, 2014; originally announced December 2014.

  13. arXiv:1212.3076  [pdf]

    q-bio.GN q-bio.QM

    Comment on "Evidence of Abundant and Purifying Selection in Humans for Recently Acquired Regulatory Functions"

    Authors: Nicolas Bray, Lior Pachter

    Abstract: Ward and Kellis (Reports, September 5 2012) identify regulatory regions in the human genome exhibiting lineage-specific constraint and estimate the extent of purifying selection. There is no statistical rationale for the examples they highlight, and their estimates of the fraction of the genome under constraint are biased by arbitrary designations of completely constrained regions.

    Submitted 13 December, 2012; originally announced December 2012.

    Comments: This note was prepared for submission to Science as a Technical Comment in response to the paper "Evidence of Abundant and Purifying Selection in Humans for Recently Acquired Regulatory Functions" by Lucas Ward and Manolis Kellis

  14. arXiv:1109.5681   

    q-bio.GN

    Quantifying uniformity of mapped reads

    Authors: Valerie Hower, Richard Starfield, Adam Roberts, Lior Pachter

    Abstract: Summary: We describe a tool for quantifying the uniformity of mapped reads in high-throughput sequencing experiments. Our statistic directly measures the uniformity of both read position and fragment length, and we explain how to compute a p-value that can be used to quantify biases arising from experimental protocols and mapping procedures. Our method is useful for comparing different protocols i…

    Submitted 17 July, 2012; v1 submitted 26 September, 2011; originally announced September 2011.

    Comments: withdrawing based on the journal's policy

  15. arXiv:1106.5061  [pdf, other]

    q-bio.QM stat.AP

    RNA structure characterization from chemical mapping experiments

    Authors: Sharon Aviran, Julius B. Lucks, Lior Pachter

    Abstract: Despite great interest in solving RNA secondary structures due to their impact on function, it remains an open problem to determine structure from sequence. Among experimental approaches, a promising candidate is the "chemical modification strategy", which involves application of chemicals to RNA that are sensitive to structure and that result in modifications that can be assayed via sequencing te…

    Submitted 29 June, 2011; v1 submitted 24 June, 2011; originally announced June 2011.

    Comments: 8 pages, 3 figures

  16. arXiv:1104.3889  [pdf, other]

    q-bio.GN stat.ME

    Models for transcript quantification from RNA-Seq

    Authors: Lior Pachter

    Abstract: RNA-Seq is rapidly becoming the standard technology for transcriptome analysis. Fundamental to many of the applications of RNA-Seq is the quantification problem, which is the accurate measurement of relative transcript abundances from the sequenced reads. We focus on this problem, and review many recently published models that are used to estimate the relative abundances. In addition to describing…

    Submitted 12 May, 2011; v1 submitted 19 April, 2011; originally announced April 2011.

  17. arXiv:1103.2384  [pdf, other]

    math.CO

    Affine and Projective Tree Metric Theorems

    Authors: Aaron Kleinman, Matan Harel, Lior Pachter

    Abstract: The tree metric theorem provides a combinatorial four point condition that characterizes dissimilarity maps derived from pairwise compatible split systems. A similar (but weaker) four point condition characterizes dissimilarity maps derived from circular split systems (Kalmanson metrics). The tree metric theorem was first discovered in the context of phylogenetics and forms the basis of many tree…

    Submitted 20 October, 2011; v1 submitted 11 March, 2011; originally announced March 2011.

  18. arXiv:1005.0793  [pdf, other]

    q-bio.GN

    Shape-based peak identification for ChIP-Seq

    Authors: Valerie Hower, Steven N. Evans, Lior Pachter

    Abstract: We present a new algorithm for the identification of bound regions from ChIP-seq experiments. Our method for identifying statistically significant peaks from read coverage is inspired by the notion of persistence in topological data analysis and provides a non-parametric approach that is robust to noise in experiments. Specifically, our method reduces the peak calling problem to the study of tree-…

    Submitted 5 May, 2010; originally announced May 2010.

    Comments: 12 pages, 6 figures

  19. arXiv:1004.5587  [pdf, other]

    q-bio.GN math.PR stat.AP

    Coverage statistics for sequence census methods

    Authors: Steven N. Evans, Valerie Hower, Lior Pachter

    Abstract: Background: We study the statistical properties of fragment coverage in genome sequencing experiments. In an extension of the classic Lander-Waterman model, we consider the effect of the length distribution of fragments. We also introduce the notion of the shape of a coverage function, which can be used to detect aberrations in coverage. The probability theory underlying these problems is essentia…

    Submitted 30 April, 2010; originally announced April 2010.

    Comments: 10 pages, 4 figures

  20. arXiv:0805.1026  [pdf, ps, other]

    math.CO

    Selecting universities: personal preference and rankings

    Authors: Peter Huggins, Lior Pachter

    Abstract: Polyhedral geometry can be used to quantitatively assess the dependence of rankings on personal preference, and provides a tool for both students and universities to assess US News and World Report rankings.

    Submitted 7 May, 2008; originally announced May 2008.

  21. arXiv:0802.2395  [pdf, ps, other]

    math.CO math.ST

    Combinatorics of least squares trees

    Authors: Radu Mihaescu, Lior Pachter

    Abstract: A recurring theme in the least squares approach to phylogenetics has been the discovery of elegant combinatorial formulas for the least squares estimates of edge lengths. These formulas have proved useful for the development of efficient algorithms, and have also been important for understanding connections among popular phylogeny algorithms. For example, the selection criterion of the neighbor-…

    Submitted 17 February, 2008; originally announced February 2008.

  22. arXiv:0710.5142  [pdf, other]

    q-bio.QM q-bio.PE

    On the optimality of the neighbor-joining algorithm

    Authors: Kord Eickmeyer, Peter Huggins, Lior Pachter, Ruriko Yoshida

    Abstract: The popular neighbor-joining (NJ) algorithm used in phylogenetics is a greedy algorithm for finding the balanced minimum evolution (BME) tree associated to a dissimilarity map. From this point of view, NJ is "optimal" when the algorithm outputs the tree which minimizes the balanced minimum evolution criterion. We use the fact that the NJ tree topology and the BME tree topology are determined b…

    Submitted 26 October, 2007; originally announced October 2007.

  23. Viral population estimation using pyrosequencing

    Authors: Nicholas Eriksson, Lior Pachter, Yumi Mitsuya, Soo-Yon Rhee, Chunlin Wang, Baback Gharizadeh, Mostafa Ronaghi, Robert W. Shafer, Niko Beerenwinkel

    Abstract: The diversity of virus populations within single infected hosts presents a major difficulty for the natural immune response as well as for vaccine design and antiviral drug therapy. Recently developed pyrophosphate based sequencing technologies (pyrosequencing) can be used for quantifying this diversity by ultra-deep sequencing of virus samples. We present computational methods for the analysis…

    Submitted 21 January, 2008; v1 submitted 1 July, 2007; originally announced July 2007.

    Comments: 23 pages, 13 figures

  24. arXiv:q-bio/0702049  [pdf, ps, other]

    q-bio.QM stat.AP

    The Cyclohedron Test for Finding Periodic Genes in Time Course Expression Studies

    Authors: Jason Morton, Lior Pachter, Anne Shiu, Bernd Sturmfels

    Abstract: The problem of finding periodically expressed genes from time course microarray experiments is at the center of numerous efforts to identify the molecular components of biological clocks. We present a new approach to this problem based on the cyclohedron test, which is a rank test inspired by recent advances in algebraic combinatorics. The test has the advantage of being robust to measurement er…

    Submitted 22 May, 2007; v1 submitted 23 February, 2007; originally announced February 2007.

    Comments: Revision consists of reorganization and further statistical discussion; 19 pages, 4 figures

  25. arXiv:math/0702564  [pdf, ps, other]

    math.CO math.ST

    Convex Rank Tests and Semigraphoids

    Authors: Jason Morton, Lior Pachter, Anne Shiu, Bernd Sturmfels, Oliver Wienand

    Abstract: Convex rank tests are partitions of the symmetric group which have desirable geometric properties. The statistical tests defined by such partitions involve counting all permutations in the equivalence classes. Each class consists of the linear extensions of a partially ordered set specified by data. Our methods refine existing rank tests of non-parametric statistics, such as the sign test and th…

    Submitted 16 February, 2008; v1 submitted 20 February, 2007; originally announced February 2007.

  26. arXiv:math/0702515  [pdf, ps, other]

    math.CO q-bio.QM

    The Neighbor-Net Algorithm

    Authors: Dan Levy, Lior Pachter

    Abstract: The neighbor-joining algorithm is a popular phylogenetics method for constructing trees from dissimilarity maps. The neighbor-net algorithm is an extension of the neighbor-joining algorithm and is used for constructing split networks. We begin by describing the output of neighbor-net in terms of the tessellation of $\bar{\mathcal{M}}_{0}^n(\mathbb{R})$ by associahedra. This highlights the fact that neig…

    Submitted 12 May, 2008; v1 submitted 17 February, 2007; originally announced February 2007.

  27. arXiv:q-bio/0612046  [pdf, ps, other]

    q-bio.GN q-bio.QM

    An introduction to reconstructing ancestral genomes

    Authors: Lior Pachter

    Abstract: Recent advances in high-throughput genomics technologies have resulted in the sequencing of large numbers of (near) complete genomes. These genome sequences are being mined for important functional elements, such as genes. They are also being compared and contrasted in order to identify other functional sequences, such as those involved in the regulation of genes. In cases where DNA sequences fr…

    Submitted 25 December, 2006; originally announced December 2006.

    Comments: Expanded lecture notes from the AMS short course on modeling and simulation of biological networks held in San Antonio, TX January 2006. To appear in the Proceedings of Symposia in Applied Mathematics, AMS Short Course Subseries

  28. arXiv:q-bio/0611032  [pdf, ps, other]

    q-bio.PE math.ST q-bio.QM

    Towards the Human Genotope

    Authors: Peter Huggins, Lior Pachter, Bernd Sturmfels

    Abstract: The human genotope is the convex hull of all allele frequency vectors that can be obtained from the genotypes present in the human population. In this paper we take a few initial steps towards a description of this object, which may be fundamental for future population based genetics studies. Here we use data from the HapMap Project, restricted to two ENCODE regions, to study a subpolytope of th…

    Submitted 25 December, 2006; v1 submitted 9 November, 2006; originally announced November 2006.

  29. arXiv:math/0605173  [pdf, ps, other]

    math.ST math.CO

    Geometry of rank tests

    Authors: Jason Morton, Lior Pachter, Anne Shiu, Bernd Sturmfels, Oliver Wienand

    Abstract: We study partitions of the symmetric group which have desirable geometric properties. The statistical tests defined by such partitions involve counting all permutations in the equivalence classes. These permutations are the linear extensions of partially ordered sets specified by the data. Our methods refine rank tests of non-parametric statistics, such as the sign test and the runs test, and ar…

    Submitted 20 July, 2006; v1 submitted 6 May, 2006; originally announced May 2006.

    Comments: 8 pages, 4 figures. See also http://bio.math.berkeley.edu/ranktests/. v2: Expanded proofs, revised after reviewer comments

  30. arXiv:q-bio/0603034  [pdf, ps, other]

    q-bio.PE math.CO

    Epistasis and Shapes of Fitness Landscapes

    Authors: Niko Beerenwinkel, Lior Pachter, Bernd Sturmfels

    Abstract: The relationship between the shape of a fitness landscape and the underlying gene interactions, or epistasis, has been extensively studied in the two-locus case. Gene interactions among multiple loci are usually reduced to two-way interactions. We present a geometric theory of shapes of fitness landscapes for multiple loci. A central concept is the genotope, which is the convex hull of all possi…

    Submitted 14 April, 2006; v1 submitted 29 March, 2006; originally announced March 2006.

    Comments: 31 pages, 7 figures; typos removed, Example 3.10 added

  31. arXiv:cs/0602041  [pdf, ps, other]

    cs.DS cs.DM

    Why neighbor-joining works

    Authors: Radu Mihaescu, Dan Levy, Lior Pachter

    Abstract: We show that the neighbor-joining algorithm is a robust quartet method for constructing trees from distances. This leads to a new performance guarantee that contains Atteson's optimal radius bound as a special case and explains many cases where neighbor-joining is successful even when Atteson's criterion is not satisfied. We also provide a proof for Atteson's conjecture on the optimal edge radiu…

    Submitted 17 June, 2007; v1 submitted 10 February, 2006; originally announced February 2006.

    Comments: Revision 2

    ACM Class: F.2.0

  32. arXiv:q-bio/0512008  [pdf, ps, other]

    q-bio.GN math.CO q-bio.QM

    Parametric Alignment of Drosophila Genomes

    Authors: Colin Dewey, Peter Huggins, Kevin Woods, Bernd Sturmfels, Lior Pachter

    Abstract: The classic algorithms of Needleman-Wunsch and Smith-Waterman find a maximum a posteriori probability alignment for a pair hidden Markov model (PHMM). In order to process large genomes that have undergone complex genome rearrangements, almost all existing whole genome alignment methods apply fast heuristics to divide genomes into small pieces which are suitable for Needleman-Wunsch alignment.…

    Submitted 2 December, 2005; originally announced December 2005.

    Comments: 19 pages, 3 figures

  33. arXiv:q-bio/0510052  [pdf, ps, other]

    q-bio.QM math.ST

    Alignment Metric Accuracy

    Authors: Ariel S. Schwartz, Eugene W. Myers, Lior Pachter

    Abstract: We propose a metric for the space of multiple sequence alignments that can be used to compare two alignments to each other. In the case where one of the alignments is a reference alignment, the resulting accuracy measure improves upon previous approaches, and provides a balanced assessment of the fidelity of both matches and gaps. Furthermore, in the case where a reference alignment is not avail…

    Submitted 27 October, 2005; originally announced October 2005.

  34. arXiv:q-bio/0508001  [pdf, ps, other]

    q-bio.QM math.CO

    Neighbor joining with phylogenetic diversity estimates

    Authors: Dan Levy, Ruriko Yoshida, Lior Pachter

    Abstract: The Neighbor-Joining algorithm is a recursive procedure for reconstructing trees that is based on a transformation of pairwise distances between leaves. We present a generalization of the neighbor-joining transformation, which uses estimates of phylogenetic diversity rather than pairwise distances in the tree. This leads to an improved neighbor-joining algorithm whose total running time is still…

    Submitted 30 July, 2005; originally announced August 2005.

  35. arXiv:q-bio/0412012  [pdf, ps, other]

    q-bio.GN q-bio.QM

    Subtree power analysis finds optimal species for comparative genomics

    Authors: Jon D. McAuliffe, Michael I. Jordan, Lior Pachter

    Abstract: Sequence comparison across multiple organisms aids in the detection of regions under selection. However, resource limitations require a prioritization of genomes to be sequenced. This prioritization should be grounded in two considerations: the lineal scope encompassing the biological phenomena of interest, and the optimal species within that scope for detecting functional elements. We introduce…

    Submitted 6 December, 2004; originally announced December 2004.

    Comments: 16 pages, 3 figures, 3 tables

    Report number: UCB-Stat-TR-677

  36. arXiv:q-bio/0410008  [pdf]

    q-bio.GN

    Needed for completion of the human genome: hypothesis driven experiments and biologically realistic mathematical models

    Authors: Roderic Guigo, Ewan Birney, Michael Brent, Emmanouil Dermitzakis, Lior Pachter, Hugues Roest Crollius, Victor Solovyev, Michael Q. Zhang

    Abstract: With the sponsorship of "Fundacio La Caixa" we met in Barcelona, November 21st and 22nd, to analyze the reasons why, after the completion of the human genome sequence, the identification of all protein coding genes and their variants remains a distant goal. Here we report on our discussions and summarize some of the major challenges that need to be overcome in order to complete the human gene cat…

    Submitted 6 October, 2004; originally announced October 2004.

    Comments: Report and discussion resulting from the "Fundacio La Caixa" gene finding meeting held November 21 and 22 2003 in Barcelona

  37. arXiv:math/0409132  [pdf, ps, other]

    math.ST q-bio.QM

    The Mathematics of Phylogenomics

    Authors: Lior Pachter, Bernd Sturmfels

    Abstract: The grand challenges in biology today are being shaped by powerful high-throughput technologies that have revealed the genomes of many organisms, global expression patterns of genes and detailed information about variation within populations. We are therefore able to ask, for the first time, fundamental questions about the evolution of genomes, the structure of genes and their regulation, and th…

    Submitted 27 September, 2005; v1 submitted 8 September, 2004; originally announced September 2004.

    Comments: 41 pages, 4 figures

    MSC Class: 92D20 (Primary) 62-02 (Secondary)

  38. arXiv:q-bio/0401033  [pdf, ps, other]

    q-bio.GN cs.LG math.ST

    Parametric Inference for Biological Sequence Analysis

    Authors: Lior Pachter, Bernd Sturmfels

    Abstract: One of the major successes in computational biology has been the unification, using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied towards these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm,…

    Submitted 25 January, 2004; originally announced January 2004.

    Comments: 15 pages, 4 figures. See also companion paper "Tropical Geometry of Statistical Models" (q-bio.QM/0311009)

  39. arXiv:q-bio/0311018  [pdf, ps, other]

    q-bio.GN

    MAVID: Constrained ancestral alignment of multiple sequences

    Authors: Nicolas Bray, Lior Pachter

    Abstract: We describe a new global multiple alignment program capable of aligning a large number of genomic regions. Our progressive alignment approach incorporates the following ideas: maximum-likelihood inference of ancestral sequences, automatic guide-tree construction, protein based anchoring of ab-initio gene predictions, and constraints derived from a global homology map of the sequences. We have im…

    Submitted 13 November, 2003; originally announced November 2003.

  40. arXiv:q-bio/0311009  [pdf, ps, other]

    q-bio.QM math.AG q-bio.GN

    Tropical Geometry of Statistical Models

    Authors: Lior Pachter, Bernd Sturmfels

    Abstract: This paper presents a unified mathematical framework for inference in graphical models, building on the observation that graphical models are algebraic varieties. From this geometric viewpoint, observations generated from a model are coordinates of a point in the variety, and the sum-product algorithm is an efficient tool for evaluating specific coordinates. The question addressed here is how…

    Submitted 25 January, 2004; v1 submitted 8 November, 2003; originally announced November 2003.

    Comments: 14 pages, 3 figures. Major revision. Applications now in companion paper, "Parametric Inference for Biological Sequence Analysis"

  41. arXiv:math/0311156  [pdf, ps, other]

    math.CO q-bio.PE

    Reconstructing Trees from Subtree Weights

    Authors: Lior Pachter, David E Speyer

    Abstract: The tree-metric theorem provides a necessary and sufficient condition for a dissimilarity matrix to be a tree metric, and has served as the foundation for numerous distance-based reconstruction methods in phylogenetics. Our main result is an extension of the tree-metric theorem to more general dissimilarity maps. In particular, we show that a tree with n leaves is reconstructible from the weight…

    Submitted 10 November, 2003; originally announced November 2003.