-
The minimal hitting set generation problem: algorithms and computation
Authors:
Andrew Gainer-Dewar,
Paola Vera-Licona
Abstract:
Finding inclusion-minimal "hitting sets" for a given collection of sets is a fundamental combinatorial problem with applications in domains as diverse as Boolean algebra, computational biology, and data mining. Much of the algorithmic literature focuses on the problem of *recognizing* the collection of minimal hitting sets; however, in many of the applications, it is more important to *generate* these hitting sets. We survey twenty algorithms from across a variety of domains, considering their history, classification, useful features, and computational performance on a variety of synthetic and real-world inputs. We also provide a suite of implementations of these algorithms with a ready-to-use, platform-agnostic interface based on Docker containers and the AlgoRun framework, so that interested computational scientists can easily perform similar tests with inputs from their own research areas on their own computers or through a convenient Web interface.
Submitted 5 January, 2016;
originally announced January 2016.
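To make the notion of an inclusion-minimal hitting set in the abstract above concrete, here is a minimal brute-force sketch. It is not one of the twenty surveyed algorithms and is only practical for tiny inputs; the function name `minimal_hitting_sets` is hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def minimal_hitting_sets(family):
    """Enumerate every inclusion-minimal hitting set of a family of sets.

    Illustrative brute force only (exponential in the universe size);
    this is an assumption-laden sketch, not an algorithm from the survey.
    """
    family = [frozenset(s) for s in family]
    universe = sorted(set().union(*family)) if family else []
    found = []
    # Candidates are tried in order of increasing size, so a candidate that
    # contains an already-found hitting set cannot be minimal and is skipped.
    for size in range(len(universe) + 1):
        for candidate in combinations(universe, size):
            cand = frozenset(candidate)
            if any(prev <= cand for prev in found):
                continue
            # A hitting set intersects every set in the family.
            if all(cand & s for s in family):
                found.append(cand)
    return found


if __name__ == "__main__":
    for hs in minimal_hitting_sets([{1, 2}, {2, 3}, {3, 4}]):
        print(sorted(hs))
```

For the family {1, 2}, {2, 3}, {3, 4}, no single element hits all three sets, and the minimal hitting sets are {1, 3}, {2, 3}, and {2, 4}.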
-
TIGRESS: Trustful Inference of Gene REgulation using Stability Selection
Authors:
Anne-Claire Haury,
Fantine Mordelet,
Paola Vera-Licona,
Jean-Philippe Vert
Abstract:
Inferring the structure of gene regulatory networks (GRN) from gene expression data has many applications, from the elucidation of complex biological processes to the identification of potential drug targets. It is however a notoriously difficult problem, for which the many existing methods reach limited accuracy. In this paper, we formulate GRN inference as a sparse regression problem and investigate the performance of a popular feature selection method, least angle regression (LARS) combined with stability selection. We introduce a novel, robust and accurate scoring technique for stability selection, which improves the performance of feature selection with LARS. The resulting method, which we call TIGRESS (Trustful Inference of Gene REgulation using Stability Selection), was ranked among the top methods in the DREAM5 gene network reconstruction challenge. We investigate in depth the influence of the various parameters of the method and show that a fine parameter tuning can lead to significant improvements and state-of-the-art performance for GRN inference. TIGRESS reaches state-of-the-art performance on benchmark data. This study confirms the potential of feature selection techniques for GRN inference. Code and data are available on http://cbio.ensmp.fr/~ahaury. Running TIGRESS online is possible on GenePattern: http://www.broadinstitute.org/cancer/software/genepattern/.
Submitted 6 May, 2012;
originally announced May 2012.
-
Reverse Engineering of Molecular Networks from a Common Combinatorial Approach
Authors:
Bhaskar DasGupta,
Paola Vera-Licona,
Eduardo Sontag
Abstract:
The understanding of molecular cell biology requires insight into the structure and dynamics of networks that are made up of thousands of interacting molecules of DNA, RNA, proteins, metabolites, and other components. One of the central goals of systems biology is the unraveling of the as yet poorly characterized complex web of interactions among these components. This work is made harder by the fact that new species and interactions are continuously discovered in experimental work, necessitating the development of adaptive and fast algorithms for network construction and updating. Thus, the "reverse-engineering" of networks from data has emerged as one of the central concerns of systems biology research.
A variety of reverse-engineering methods have been developed, based on tools from statistics, machine learning, and other mathematical domains. In order to effectively use these methods, it is essential to develop an understanding of the fundamental characteristics of these algorithms. With that in mind, this chapter is dedicated to the reverse-engineering of biological systems.
Specifically, we focus our attention on a particular class of methods for reverse-engineering, namely those that rely algorithmically upon the so-called "hitting-set" problem, which is a classical combinatorial and computer science problem. Each of these methods utilizes a different algorithm in order to obtain an exact or an approximate solution of the hitting set problem. We will explore the ultimate impact that the alternative algorithms have on the inference of published in silico biological networks.
Submitted 23 February, 2011;
originally announced February 2011.
-
Parameter estimation for Boolean models of biological networks
Authors:
Elena Dimitrova,
Luis David Garcia-Puente,
Franziska Hinkelmann,
Abdul S. Jarrah,
Reinhard Laubenbacher,
Brandilyn Stigler,
Michael Stillman,
Paola Vera-Licona
Abstract:
Boolean networks have long been used as models of molecular networks and play an increasingly important role in systems biology. This paper describes a software package, Polynome, offered as a web service, that helps users construct Boolean network models based on experimental data and biological input. The key feature is a discrete analog of parameter estimation for continuous models. With only experimental data as input, the software can be used as a tool for reverse-engineering of Boolean network models from experimental time course data.
Submitted 20 August, 2009;
originally announced August 2009.
-
Inference of ecological interaction networks
Authors:
Paola Vera-Licona,
Reinhard Laubenbacher
Abstract:
The inference of the interactions between organisms in an ecosystem from observational data is an important problem in ecology. This paper presents a mathematical inference method, originally developed for the inference of biochemical networks in molecular biology, adapted for the inference of networks of ecological interactions. The method is applied to a network of invertebrate families (taxa) in a rice field.
Submitted 9 May, 2008;
originally announced May 2008.