-
When does deep learning fail and how to tackle it? A critical analysis on polymer sequence-property surrogate models
Authors:
Himanshu,
Tarak K Patra
Abstract:
Deep learning models are gaining popularity and potency in predicting polymer properties. These models can be built using pre-existing data and are useful for the rapid prediction of polymer properties. However, the performance of a deep learning model is intricately connected to its topology and the volume of training data. There is no facile protocol available to select a deep learning architecture, and there is a lack of a large volume of homogeneous sequence-property data of polymers. These two factors are the primary bottlenecks for the efficient development of deep learning models. Here we assess the severity of these factors and propose new algorithms to address them. We show that a linear layer-by-layer expansion of a neural network can help in identifying the best neural network topology for a given problem. Moreover, we map the discrete sequence space of a polymer to a continuous one-dimensional latent space using a machine learning pipeline to identify minimal data points for building a universal deep learning model. We implement these approaches for three representative cases of building sequence-property surrogate models, viz., the single-molecule radius of gyration of a copolymer, adhesive free energy of a copolymer, and copolymer compatibilizer, demonstrating the generality of the proposed strategies. This work establishes efficient methods for building universal deep learning models with minimal data and hyperparameters for predicting sequence-defined properties of polymers.
Submitted 12 October, 2022;
originally announced October 2022.
-
TeKnowbase: Towards Construction of a Knowledge-base of Technical Concepts
Authors:
Prajna Upadhyay,
Tanuma Patra,
Ashwini Purkar,
Maya Ramanath
Abstract:
In this paper, we describe the construction of TeKnowbase, a knowledge-base of technical concepts in computer science. Our main information sources are technical websites such as Webopedia and Techtarget as well as Wikipedia and online textbooks. We divide the knowledge-base construction problem into two parts -- the acquisition of entities and the extraction of relationships among these entities. Our knowledge-base consists of approximately 100,000 triples. We conducted an evaluation on a sample of triples and report an accuracy of a little over 90%. We additionally conducted classification experiments on StackOverflow data with features from TeKnowbase and achieved improved classification accuracy.
Submitted 15 December, 2016;
originally announced December 2016.
-
Threshold Policy for Route Discovery Initiation in Mobile Ad hoc Networks
Authors:
Tapas Kumar Patra,
Joy Kuri
Abstract:
Achieving optimal transmission throughput in multi-hop wireless data networks is a fundamental but hard problem. The situation is aggravated when nodes are mobile. Further, multi-rate systems make the analysis of throughput more complicated. In a mobile scenario, links may break or be created as nodes move within communication range. 'Route Discovery', which is to find the optimal route and transmission schedule, is an important issue. Route discovery entails some cost, so one would not like to initiate discovery too often. On the other hand, not discovering reasonably often entails the risk of being stuck with a suboptimal route and/or schedule, which hurts end-to-end throughput. The implementation of the routing decision problem in a one-dimensional mobile ad hoc network as a Markov decision process problem is already discussed in the paper [1]. A heuristic based on a threshold policy is discussed in the same paper without giving a way to find the threshold. In this paper, we suggest a rule for setting the threshold, given the parameters of the system. We also point out that our results remain valid in a slightly different mobility model; this model is a first step towards an 'open' network in which existing relay nodes can leave and/or new relay nodes can join the network.
Submitted 23 September, 2010;
originally announced September 2010.
-
Locating phase transitions in computationally hard problems
Authors:
B. Ashok,
T. K. Patra
Abstract:
We discuss how phase-transitions may be detected in computationally hard problems in the context of Anytime Algorithms. Treating the computational time, value and utility functions involved in the search results in analogy with quantities in statistical physics, we indicate how the onset of a computationally hard regime can be detected and the transit to higher quality solutions be quantified by an appropriate response function. The existence of a dynamical critical exponent is shown, enabling one to predict the onset of critical slowing down, rather than finding it after the event, in the specific case of a Travelling Salesman Problem. This can be used as a means of improving efficiency and speed in searches, and avoiding needless computations.
Submitted 11 May, 2010;
originally announced May 2010.