-
Automated rendering of multi-stranded DNA complexes with pseudoknots
Authors:
Malgorzata Nowicka,
Vinay K. Gautam,
Pekka Orponen
Abstract:
We present a general method for rendering representations of multi-stranded DNA complexes from textual descriptions into 2D diagrams. The complexes can be arbitrarily pseudoknotted, and if a planar rendering is possible, the method will determine one in time which is essentially linear in the size of the textual description. (That is, except for a final stochastic fine-tuning step.) If a planar rendering is not possible, the method will compute a visually pleasing approximate rendering in quadratic time. Examples of diagrams produced by the method are presented in the paper.
Submitted 11 August, 2023;
originally announced August 2023.
-
Cotranscriptional kinetic folding of RNA secondary structures including pseudoknots
Authors:
Vo Hong Thanh,
Dani Korpela,
Pekka Orponen
Abstract:
Computational prediction of RNA structures is an important problem in computational structural biology. Studies of RNA structure formation often assume that the process starts from a fully synthesized sequence. Experimental evidence, however, has shown that RNA folds concurrently with its elongation. We investigate RNA secondary structure formation, including pseudoknots, that takes into account the cotranscriptional effects. We propose a single-nucleotide resolution kinetic model of the folding process of RNA molecules, where the polymerase-driven elongation of an RNA strand by a new nucleotide is included as a primitive operation, together with a stochastic simulation method that implements this folding concurrently with the transcriptional synthesis. Numerical case studies show that our cotranscriptional RNA folding model can predict the formation of conformations that are favored in actual biological systems. Our new computational tool can thus provide quantitative predictions and offer useful insights into the kinetics of RNA folding.
Submitted 17 March, 2021; v1 submitted 10 July, 2020;
originally announced July 2020.
-
Search Methods for Tile Sets in Patterned DNA Self-Assembly
Authors:
Mika Göös,
Tuomo Lempiäinen,
Eugen Czeizler,
Pekka Orponen
Abstract:
The Pattern self-Assembly Tile set Synthesis (PATS) problem, which arises in the theory of structured DNA self-assembly, is to determine a set of coloured tiles that, starting from a bordering seed structure, self-assembles to a given rectangular colour pattern. The task of finding minimum-size tile sets is known to be NP-hard. We explore several complete and incomplete search techniques for finding minimal, or at least small, tile sets and also assess the reliability of the solutions obtained according to the kinetic Tile Assembly Model.
Submitted 22 December, 2014;
originally announced December 2014.
-
Witness of unsatisfiability for a random 3-satisfiability formula
Authors:
Lu-Lu Wu,
Hai-Jun Zhou,
Mikko Alava,
Erik Aurell,
Pekka Orponen
Abstract:
The random 3-satisfiability (3-SAT) problem is in the unsatisfiable (UNSAT) phase when the clause density $α$ exceeds a critical value $α_s \approx 4.267$. However, rigorously proving the unsatisfiability of a given large 3-SAT instance is extremely difficult. In this paper we apply the mean-field theory of statistical physics to the unsatisfiability problem, and show that a specific type of UNSAT witness (Feige-Kim-Ofek witnesses) can in principle be constructed when the clause density $α > 19$. We then construct Feige-Kim-Ofek witnesses for single 3-SAT instances through a simple random sampling algorithm and a focused local search algorithm. The random sampling algorithm works only when $α$ scales at least linearly with the number of variables $N$, but the focused local search algorithm works for clause density $α > c N^{b}$ with $b \approx 0.59$ and prefactor $c \approx 8$. The exponent $b$ can be further decreased by enlarging the single parameter $S$ of the focused local search algorithm.
Submitted 10 March, 2013;
originally announced March 2013.
-
Synthesizing Minimal Tile Sets for Patterned DNA Self-Assembly
Authors:
Mika Göös,
Pekka Orponen
Abstract:
The Pattern self-Assembly Tile set Synthesis (PATS) problem is to determine a set of coloured tiles that self-assemble to implement a given rectangular colour pattern. We give an exhaustive branch-and-bound algorithm to find tile sets of minimum cardinality for the PATS problem. Our algorithm makes use of a search tree in the lattice of partitions of the ambient rectangular grid, and an efficient bounding function to prune this search tree. Empirical data on the performance of the algorithm shows that it compares favourably to previously presented heuristic solutions to the problem.
Submitted 14 August, 2010; v1 submitted 16 November, 2009;
originally announced November 2009.
-
Locally computable approximations for spectral clustering and absorption times of random walks
Authors:
Pekka Orponen,
Satu Elisa Schaeffer,
Vanesa Avalos Gaytán
Abstract:
We address the problem of determining a natural local neighbourhood or "cluster" associated to a given seed vertex in an undirected graph. We formulate the task in terms of absorption times of random walks from other vertices to the vertex of interest, and observe that these times are well approximated by the components of the principal eigenvector of the corresponding fundamental matrix of the graph's adjacency matrix. We further present a locally computable gradient-descent method to estimate this Dirichlet-Fiedler vector, based on minimising the respective Rayleigh quotient. Experimental evaluation shows that the approximations behave well and yield well-defined local clusters.
Submitted 22 October, 2008;
originally announced October 2008.
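The absorption times mentioned in this abstract can be illustrated with a small exact computation. The sketch below is not the locally computable gradient-descent method of the paper; it simply solves the standard linear system h(v) = 1 + (1/deg v) Σ h(u) over the neighbours u of v, with h(seed) = 0, for a simple random walk on a connected undirected graph. The function name `absorption_times` and the adjacency-list input format are assumptions made for this illustration.

```python
from fractions import Fraction


def absorption_times(adj, seed):
    """Expected number of steps for a simple random walk to first reach `seed`.

    `adj` maps each vertex to a list of its neighbours (hypothetical input
    format). The graph is assumed to be connected and undirected.
    """
    nodes = [v for v in adj if v != seed]
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)

    # Augmented matrix for (I - P) h = 1, where P is the transition matrix
    # restricted to the non-seed vertices.
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for v in nodes:
        i = idx[v]
        A[i][i] += 1
        A[i][n] = Fraction(1)
        for u in adj[v]:
            if u != seed:
                A[i][idx[u]] -= Fraction(1, len(adj[v]))

    # Gauss-Jordan elimination with exact rational arithmetic.
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [x / pv for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]

    times = {v: A[idx[v]][n] for v in nodes}
    times[seed] = Fraction(0)
    return times


# Path graph 0 - 1 - 2 with seed vertex 0.
times = absorption_times({0: [1], 1: [0, 2], 2: [1]}, seed=0)
# times[1] == 3 and times[2] == 4
```

Vertices with small absorption times are the ones a walk leaves quickly towards the seed, which is the intuition behind using these times to delimit a local cluster. The exact solve above touches the whole graph, so it only serves as a reference for small examples.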
-
Circumspect descent prevails in solving random constraint satisfaction problems
Authors:
Mikko Alava,
John Ardelius,
Erik Aurell,
Petteri Kaski,
Supriya Krishnamurthy,
Pekka Orponen,
Sakari Seitz
Abstract:
We study the performance of stochastic local search algorithms for random instances of the $K$-satisfiability ($K$-SAT) problem. We introduce a new stochastic local search algorithm, ChainSAT, which moves in the energy landscape of a problem instance by never going upwards in energy. ChainSAT is a focused algorithm in the sense that it considers only variables occurring in unsatisfied clauses. We show by extensive numerical investigations that ChainSAT and other focused algorithms solve large $K$-SAT instances almost surely in linear time, up to high clause-to-variable ratios $α$; for example, for $K=4$ we observe linear-time performance well beyond the recently postulated clustering and condensation transitions in the solution space. The performance of ChainSAT is a surprise given that by design the algorithm gets trapped into the first local energy minimum it encounters, yet no such minima are encountered. We also study the geometry of the solution space as accessed by stochastic local search algorithms.
Submitted 30 November, 2007;
originally announced November 2007.
-
Focused Local Search for Random 3-Satisfiability
Authors:
Sakari Seitz,
Mikko Alava,
Pekka Orponen
Abstract:
A local search algorithm solving an NP-complete optimisation problem can be viewed as a stochastic process moving in an "energy landscape" towards eventually finding an optimal solution. For the random 3-satisfiability problem, the heuristic of focusing the local moves on the presently unsatisfied clauses is known to be very effective: the time to solution has been observed to grow only linearly in the number of variables, for a given clauses-to-variables ratio $α$ sufficiently far below the critical satisfiability threshold $α_c \approx 4.27$. We present numerical results on the behaviour of three focused local search algorithms for this problem, considering in particular the characteristics of a focused variant of the simple Metropolis dynamics. We estimate the optimal value for the "temperature" parameter $η$ for this algorithm, such that its linear-time regime extends as close to $α_c$ as possible. Similar parameter optimisation is performed also for the well-known WalkSAT algorithm and for the less studied, but very well performing Focused Record-to-Record Travel method. We observe that with an appropriate choice of parameters, the linear time regime for each of these algorithms seems to extend well into ratios $α > 4.2$, much further than has so far been generally assumed. We discuss the statistics of solution times for the algorithms, relate their performance to the process of "whitening", and present some conjectures on the shape of their computational phase diagrams.
Submitted 28 January, 2005;
originally announced January 2005.
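The idea of a focused Metropolis dynamics described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the acceptance rule (accept a flip that raises the number of unsatisfied clauses by ΔE with probability η^ΔE), the default `eta=0.4`, and all function names are assumptions chosen for the example. The unsatisfied-clause list is also recomputed from scratch at every step, which is simple but far from linear time.

```python
import random


def is_satisfied(clause, assign):
    """A clause is a tuple of non-zero ints: v means x_v, -v means not x_v."""
    return any(assign[abs(lit)] == (lit > 0) for lit in clause)


def count_unsat(clauses, assign):
    return sum(1 for c in clauses if not is_satisfied(c, assign))


def focused_metropolis(clauses, n_vars, eta=0.4, max_flips=100_000, seed=0):
    """Return a satisfying assignment (list indexed from 1) or None."""
    rng = random.Random(seed)
    assign = [False] + [rng.random() < 0.5 for _ in range(n_vars)]

    # For each variable, the indices of the clauses it occurs in.
    occurs = {v: set() for v in range(1, n_vars + 1)}
    for i, clause in enumerate(clauses):
        for lit in clause:
            occurs[abs(lit)].add(i)

    for _ in range(max_flips):
        unsat = [c for c in clauses if not is_satisfied(c, assign)]
        if not unsat:
            return assign
        # Focusing: only variables in a currently unsatisfied clause may flip.
        var = abs(rng.choice(rng.choice(unsat)))
        touched = [clauses[i] for i in occurs[var]]
        before = count_unsat(touched, assign)
        assign[var] = not assign[var]
        delta = count_unsat(touched, assign) - before
        # Metropolis-style acceptance (assumed form): always accept if the
        # energy does not increase, otherwise accept with probability eta**delta.
        if delta > 0 and rng.random() >= eta ** delta:
            assign[var] = not assign[var]  # reject: undo the flip
    return assign if count_unsat(clauses, assign) == 0 else None


# Tiny satisfiable 3-SAT instance on three variables.
clauses = [(1, 2, 3), (-1, 2, 3), (1, -2, 3), (1, 2, -3)]
solution = focused_metropolis(clauses, n_vars=3)
```

A practical implementation would maintain the set of unsatisfied clauses incrementally, so that each flip costs time proportional to the number of clauses containing the flipped variable.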
-
Efficient Algorithms for Sampling and Clustering of Large Nonuniform Networks
Authors:
Pekka Orponen,
Satu Elisa Schaeffer
Abstract:
We propose efficient algorithms for two key tasks in the analysis of large nonuniform networks: uniform node sampling and cluster detection. Our sampling technique is based on augmenting a simple, but slowly mixing uniform MCMC sampler with a regular random walk in order to speed up its convergence; however, the combined MCMC chain is then only sampled when it is in its "uniform sampling" mode. Our clustering algorithm determines the relevant neighbourhood of a given node u in the network by first estimating the Fiedler vector of a Dirichlet matrix with u fixed at zero potential, and then finding the neighbourhood of u that yields a minimal weighted Cheeger ratio, where the edge weights are determined by differences in the estimated node potentials. Both of our algorithms are based on local computations, i.e. operations on the full adjacency matrix of the network are not used. The algorithms are evaluated experimentally using three types of nonuniform networks: Dorogovtsev-Goltsev-Mendes "pseudofractal graphs", scientific collaboration networks, and randomised "caveman graphs".
Submitted 2 June, 2004;
originally announced June 2004.