-
Learning interpretable causal networks from very large datasets, application to 400,000 medical records of breast cancer patients
Authors:
Marcel da Câmara Ribeiro-Dantas,
Honghao Li,
Vincent Cabeli,
Louise Dupuis,
Franck Simon,
Liza Hettal,
Anne-Sophie Hamy,
Hervé Isambert
Abstract:
Discovering causal effects is at the core of scientific investigation but remains challenging when only observational data is available. In practice, causal networks are difficult to learn and interpret, and limited to relatively small datasets. We report a more reliable and scalable causal discovery method (iMIIC), based on a general mutual information supremum principle, which greatly improves the precision of inferred causal relations while distinguishing genuine causes from putative and latent causal effects. We showcase iMIIC on synthetic and real-life healthcare data from 396,179 breast cancer patients from the US Surveillance, Epidemiology, and End Results program. More than 90% of predicted causal effects appear correct, while the remaining unexpected direct and indirect causal effects can be interpreted in terms of diagnostic procedures, therapeutic timing, patient preference or socio-economic disparity. iMIIC's unique capabilities open up new avenues to discover reliable and interpretable causal networks across a range of research fields.
Submitted 11 March, 2023;
originally announced March 2023.
-
Localized Joule heating produced by ion current focusing through micron-size holes
Authors:
V. Viasnoff,
U. Bockelmann,
A. Meller,
H. Isambert,
L. Laufer,
Y. Tsori
Abstract:
We provide an experimental demonstration that the focusing of ionic currents in a micron-size hole connecting two chambers can produce local temperature increases of up to $100^\circ$C with gradients as large as $1$ K $μ$m$^{-1}$. We find good agreement between the measured temperature profiles and a finite-element-based numerical calculation. We show how the thermal gradients can be used to measure the full melting profile of DNA duplexes within a region of 40 $μ$m. The possibility of producing even larger gradients using sub-micron pores is discussed.
Submitted 27 April, 2010;
originally announced April 2010.
-
A comparative evolutionary study of transcription networks
Authors:
A. L. Sellerio,
B. Bassetti,
H. Isambert,
M. Cosentino Lagomarsino
Abstract:
We present a comparative analysis of large-scale topological and evolutionary properties of transcription networks in three species, the two distant bacteria E. coli and B. subtilis, and the yeast S. cerevisiae. The study focuses on the global aspects of feedback and hierarchy in transcriptional regulatory pathways. While confirming that gene duplication has a significant impact on the shaping of all the analyzed transcription networks, our results point to distinct trends between the bacteria, where time constraints in the transcription of downstream genes might be important in shaping the hierarchical structure of the network, and yeast, which seems able to sustain a higher wiring complexity, including more feedback, intricate hierarchy, and the combinatorial use of heterodimers made of duplicate transcription factors.
Submitted 15 May, 2008;
originally announced May 2008.
-
Hierarchy and Feedback in the Evolution of the E. coli Transcription Network
Authors:
M. Cosentino Lagomarsino,
P. Jona,
B. Bassetti,
H. Isambert
Abstract:
The E. coli transcription network has an essentially feedforward structure, with, however, abundant feedback at the level of self-regulations. Here, we investigate how these properties emerged during evolution. An assessment of the role of gene duplication based on protein domain architecture shows that (i) transcriptional autoregulators have mostly arisen through duplication, while (ii) the expected feedback loops stemming from their initial cross-regulation are strongly selected against. This requires a divergent coevolution of the transcription factor DNA-binding sites and their respective DNA cis-regulatory regions. Moreover, we find that the network tends to grow by expansion of the existing hierarchical layers of computation, rather than by addition of new layers. We also argue that rewiring of regulatory links due to mutation/selection of novel transcription factor/DNA binding interactions appears not to significantly affect the network's global hierarchy, and that horizontally transferred genes are mainly added at the bottom, as new target nodes. These findings highlight the important evolutionary roles of both duplication and selective deletion of crosstalk between autoregulators in the emergence of the hierarchical transcription network of E. coli.
Submitted 24 January, 2007;
originally announced January 2007.
-
Encoding folding paths of RNA switches
Authors:
A. Xayaphoummine,
V. Viasnoff,
S. Harlepp,
H. Isambert
Abstract:
RNA co-transcriptional folding has long been suspected to play an active role in helping proper native folding of ribozymes and structured regulatory motifs in mRNA untranslated regions. Yet, the underlying mechanisms and coding requirements for efficient co-transcriptional folding remain unclear. Traditional approaches have intrinsic limitations in dissecting RNA folding paths, as they rely on sequence mutations or circular permutations that typically perturb both RNA folding paths and equilibrium structures. Here, we show that exploiting sequence symmetries instead of mutations can circumvent this problem by essentially decoupling folding paths from equilibrium structures of designed RNA sequences. Using bistable RNA switches with symmetrical helices conserved under sequence reversal, we demonstrate experimentally that native and transiently formed helices can guide efficient co-transcriptional folding into either long-lived structure of these RNA switches. Their folding path is controlled by the order of helix nucleations and subsequent exchanges during transcription, and may also be redirected by transient antisense interactions. Hence, transient intra- and intermolecular base pair interactions can effectively regulate the folding of nascent RNA molecules into different native structures, provided limited coding requirements, as discussed from an information theory perspective. This constitutive coupling between RNA synthesis and RNA folding regulation may have enabled the early emergence of autonomous RNA-based regulation networks.
Submitted 22 November, 2006;
originally announced November 2006.
-
Asymptotic Evolution of Protein-Protein Interaction Networks for General Duplication-Divergence Models
Authors:
Kirill Evlampiev,
Herve Isambert
Abstract:
Genomic duplication-divergence events, which are the primary source of new protein functions, occur stochastically at a wide range of genomic scales, from single gene to whole genome duplications. Clearly, this fundamental evolutionary process must have largely conditioned the emerging structure of protein-protein interaction (PPI) networks, which control many cellular activities. We propose and asymptotically solve a general duplication-divergence model of PPI network evolution based on the statistical selection of duplication-derived interactions. We also introduce a conservation index that formally defines the statistical evolutionary conservation of PPI networks. Distinct conditions on microscopic parameters are then shown to control global conservation and topology of emerging PPI networks. In particular, conserved, non-dense networks, which are the only ones of potential biological relevance, are also shown to be necessarily scale-free.
Submitted 22 November, 2006;
originally announced November 2006.
-
Evolution of Protein Interaction Networks by Whole Genome Duplication and Domain Shuffling
Authors:
K. Evlampiev,
H. Isambert
Abstract:
Successive whole genome duplications have recently been firmly established in all major eukaryote kingdoms. It is not clear, however, how such a dramatic evolutionary process has contributed to shaping the large-scale topology of protein-protein interaction (PPI) networks. We propose and analytically solve a generic model of PPI network evolution under successive whole genome duplications. This demonstrates that the observed scale-free degree distributions and conserved multi-protein complexes may have concomitantly arisen from i) intrinsic exponential dynamics of PPI network evolution and ii) asymmetric divergence of gene duplicates. This requirement of asymmetric divergence is in fact "spontaneously" fulfilled at the level of protein-binding domains. In addition, domain shuffling of multi-domain proteins is shown to provide a powerful combinatorial source of PPI network innovation, while preserving essential structures of the underlying single-domain interaction network. Finally, large-scale features of PPI networks reflecting the "combinatorial logic" behind direct and indirect protein interactions are well reproduced numerically with only two adjusted parameters of clear biological significance.
Submitted 25 June, 2006;
originally announced June 2006.
-
Prediction and statistics of pseudoknots in RNA structures using exactly clustered stochastic simulations
Authors:
A. Xayaphoummine,
T. Bucher,
F. Thalmann,
H. Isambert
Abstract:
Ab initio RNA secondary structure predictions have long dismissed helices interior to loops, so-called pseudoknots, despite their structural importance. Here, we report that many pseudoknots can be predicted through long-time-scale RNA folding simulations, which follow the stochastic closing and opening of individual RNA helices. The numerical efficacy of these stochastic simulations relies on an O(n^2) clustering algorithm which computes time averages over a continuously updated set of n reference structures. Applying this exact stochastic clustering approach, we typically obtain a 5- to 100-fold simulation speed-up for RNA sequences up to 400 bases, while the effective acceleration can be as high as 100,000-fold for short multistable molecules (<150 bases). We performed extensive folding statistics on random and natural RNA sequences, and found that pseudoknots are unevenly distributed amongst RNA structures and account for up to 30% of base pairs in G+C rich RNA sequences (Online RNA folding kinetics server including pseudoknots: http://kinefold.u-strasbg.fr/ ).
Submitted 29 September, 2003;
originally announced September 2003.
-
Probing complex RNA structures by mechanical force
Authors:
S. Harlepp,
T. Marchal,
J. Robert,
J-F. Leger,
A. Xayaphoummine,
H. Isambert,
D. Chatenay
Abstract:
RNA secondary structures of increasing complexity are probed by combining single molecule stretching experiments and stochastic unfolding/refolding simulations. We find that force-induced unfolding pathways cannot usually be interpreted by solely invoking successive openings of native helices. Indeed, typical force-extension responses of complex RNA molecules are largely shaped by stretching-induced, long-lived intermediates including non-native helices. This is first shown for a set of generic structural motifs found in larger RNA structures, and then for Escherichia coli's 1540-base long 16S ribosomal RNA, which exhibits a surprisingly well-structured and reproducible unfolding pathway under mechanical stretching. Using out-of-equilibrium stochastic simulations, we demonstrate that these experimental results reflect the slow relaxation of RNA structural rearrangements. Hence, micromanipulations of single RNA molecules probe both their native structures and long-lived intermediates, so-called "kinetic traps", thereby capturing, at the single-molecule level, the hallmark of RNA folding/unfolding dynamics.
Submitted 15 September, 2003;
originally announced September 2003.
-
Undulation instability of lipid membranes under an electric field
Authors:
Pierre Sens,
Hervé Isambert
Abstract:
The influence of an electric field on a poorly conductive membrane such as a lipid bilayer is studied theoretically. The unbalanced electric stress created by an ionic current across a non-perfectly flat membrane gives rise to a destabilizing surface energy enhancing undulations. The deformation of a membrane attached to a frame and the subsequent force on the frame are derived, and the electrohydrodynamic instability of a free-floating membrane is also studied. We find a most unstable mode of undulation, of wavelength in the $μ$m range, connected to the crossover between membrane- and solvent-dominated dissipations.
Submitted 29 June, 2001;
originally announced June 2001.