-
Biased Hypothesis Formation From Projection Pursuit
Authors:
John Patterson,
Chris Avery,
Tyler Grear,
Donald J. Jacobs
Abstract:
The effect of bias on hypothesis formation is characterized for an automated data-driven projection pursuit neural network to extract and select features for binary classification of data streams. This intelligent exploratory process partitions a complete vector state space into disjoint subspaces to create working hypotheses quantified by similarities and differences observed between two groups of labeled data streams. Data streams are typically time sequenced, and may exhibit complex spatio-temporal patterns. For example, given atomic trajectories from molecular dynamics simulation, the machine's task is to quantify dynamical mechanisms that promote function by comparing protein mutants, some known to function while others are nonfunctional. Utilizing synthetic two-dimensional molecules that mimic the dynamics of functional and nonfunctional proteins, biases are identified and controlled in both the machine learning model and selected training data under different contexts. The refinement of a working hypothesis converges to a statistically robust multivariate perception of the data based on a context-dependent perspective. Including diverse perspectives during data exploration enhances interpretability of the multivariate characterization of similarities and differences.
Submitted 3 January, 2022;
originally announced January 2022.
-
Distribution of volume, microvoid percolation, and packing density in globular proteins
Authors:
Jenny Farmer,
Sheridan B. Green,
Donald J. Jacobs
Abstract:
A fast and accurate grid-based method with low memory requirement is presented to calculate volume characteristics in molecular systems. The distribution of volume and packing density is characterized in globular proteins, where void space is decomposed into microvoid volume and cavities based on a spherical test probe with variable radius. A scan over test probe radius is mapped onto a site percolation problem for microvoid volume. Finite-size scaling is applied to determine critical exponents, which are found to be consistent with connectivity percolation exponents in three dimensions. Disparate results in the literature regarding packing density in the core of a protein compared to on its surface, and with respect to protein size, are elucidated in terms of microvoid volume within a unified implicit-solvent model. By parameterizing the model to match the results of explicit-solvent models that agree with experimental data, we verify that packing density within globular proteins is spatially uniform and independent of protein size.
Submitted 19 October, 2018;
originally announced October 2018.
-
Spectral energy distribution and radio halo of NGC 253 at low radio frequencies
Authors:
A. D. Kapinska,
L. Staveley-Smith,
R. Crocker,
G. R. Meurer,
S. Bhandari,
N. Hurley-Walker,
A. R. Offringa,
D. J. Hanish,
N. Seymour,
R. D. Ekers,
M. E. Bell,
J. R. Callingham,
K. S. Dwarakanath,
B. -Q. For,
B. M. Gaensler,
P. J. Hancock,
L. Hindson,
M. Johnston-Hollitt,
E. Lenc,
B. McKinley,
J. Morgan,
P. Procopio,
R. B. Wayth,
C. Wu,
Q. Zheng
, et al. (45 additional authors not shown)
Abstract:
We present new radio continuum observations of NGC253 from the Murchison Widefield Array at frequencies between 76 and 227 MHz. We model the broadband radio spectral energy distribution for the total flux density of NGC253 between 76 MHz and 11 GHz. The spectrum is best described as a sum of central starburst and extended emission. The central component, corresponding to the inner 500 pc of the starburst region of the galaxy, is best modelled as an internally free-free absorbed synchrotron plasma, with a turnover frequency around 230 MHz. The extended emission component of the NGC253 spectrum is best described as synchrotron emission that flattens at low radio frequencies. We find that 34% of the extended emission (outside the central starburst region) at 1 GHz becomes partially absorbed at low radio frequencies. Most of this flattening occurs in the western region of the SE halo, and may be indicative of synchrotron self-absorption of shock re-accelerated electrons or an intrinsic low-energy cutoff of the electron distribution. Furthermore, we detect the large-scale synchrotron radio halo of NGC253 in our radio images. At 154-231 MHz the halo displays the well-known X-shaped/horn-like structure, and extends out to ~8 kpc in the z-direction (from the major axis).
Submitted 19 February, 2017; v1 submitted 8 February, 2017;
originally announced February 2017.
-
Delay Spectrum with Phase-Tracking Arrays: Extracting the HI power spectrum from the Epoch of Reionization
Authors:
Sourabh Paul,
Shiv K. Sethi,
Miguel F. Morales,
K. S. Dwarkanath,
N. Udaya Shankar,
Ravi Subrahmanyan,
N. Barry,
A. P. Beardsley,
Judd D. Bowman,
F. Briggs,
P. Carroll,
A. de Oliveira-Costa,
Joshua S. Dillon,
A. Ewall-Wice,
L. Feng,
L. J. Greenhill,
B. M. Gaensler,
B. J. Hazelton,
J. N. Hewitt,
N. Hurley-Walker,
D. J. Jacobs,
Han-Seek Kim,
P. Kittiwisit,
E. Lenc,
J. Line
, et al. (29 additional authors not shown)
Abstract:
The detection of redshifted 21 cm emission from the epoch of reionization (EoR) is a challenging task owing to strong foregrounds that dominate the signal. In this paper, we propose a general method, based on the delay spectrum approach, for extracting HI power spectra from tracking observations made with an imaging radio interferometer (Delay Spectrum with Imaging Arrays (DSIA)). Our method is based on modelling the HI signal taking into account the impact of wide-field effects such as the $w$-term, which are then used as appropriate weights in cross-correlating the measured visibilities. Our method is applicable to any radio interferometer that tracks a phase center and could be utilized for arrays such as MWA, LOFAR, GMRT, PAPER and HERA. In the literature, the delay spectrum approach has been implemented for near-redundant baselines using drift scan observations. In this paper we explore the scheme for non-redundant tracking arrays, and this is the first application of delay spectrum methodology to such data to extract the HI signal. We analyze 3 hours of MWA tracking data on the EoR1 field. We present both 2-dimensional ($k_\parallel,k_\perp$) and 1-dimensional ($k$) power spectra from the analysis. Our results are in agreement with the findings of other pipelines developed to analyze the MWA EoR data.
Submitted 22 October, 2016;
originally announced October 2016.
-
Nonparametric Maximum Entropy Probability Density Estimation
Authors:
Jenny Farmer,
Donald J. Jacobs
Abstract:
Given a sample of independent and identically distributed random variables, a novel nonparametric maximum entropy method is presented to estimate the underlying continuous univariate probability density function (pdf). Estimates are found by maximizing a log-likelihood function based on single order statistics after transforming through a sequence of trial cumulative distribution functions that iteratively improve using a Monte Carlo random search method. Improvement is quantified by assessing the random variables against the statistical properties of sampled uniform random data. Quality is determined using an empirically derived scoring function that is scaled to be sample size invariant. The scoring function identifies atypical fluctuations, for which threshold values are set to define objective criteria that prevent under-fitting as trial iterations continue to improve the model pdf and that stop the iteration cycle before over-fitting occurs. No prior knowledge about the data is required. An ensemble of pdf models is used to reflect uncertainties due to statistical fluctuations in random samples, and the quality of the estimates is visualized using scaled residual quantile plots that show deviations from size-invariant statistics. These considerations result in a tractable method that holistically employs key principles of random variables and their statistical properties, combined with orthogonal basis functions and data-driven adaptive algorithms. Benchmark tests show that the pdf estimates readily converge to the true pdf as sample size increases. Robust results are demonstrated on several test probability densities that include cases with discontinuities, multi-resolution scales, heavy tails and singularities in the pdf, suggesting a generally applicable approach for statistical inference.
Submitted 28 June, 2016;
originally announced June 2016.
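The idea of assessing transformed data against uniform order statistics, described in the abstract above, can be sketched as follows. This is only a minimal illustration and not the authors' implementation: the function name `scaled_quantile_residuals` and the `trial_cdf` argument are hypothetical, and the sqrt(n + 2) scaling is an assumed choice. It relies on the standard result that the k-th order statistic of n uniform samples has mean k/(n + 1) and variance mu(1 - mu)/(n + 2) with mu = k/(n + 1), so the scaled residual has a variance that does not depend on n.

```python
import math


def scaled_quantile_residuals(samples, trial_cdf):
    """Compare samples, mapped through a trial CDF, with uniform order statistics.

    `trial_cdf` is a hypothetical callable returning values in [0, 1].
    If the trial CDF matches the true distribution, the transformed values
    behave like sorted Uniform(0, 1) draws, whose k-th order statistic has
    mean k / (n + 1).
    """
    u = sorted(trial_cdf(x) for x in samples)
    n = len(u)
    residuals = []
    for k, u_k in enumerate(u, start=1):
        mu_k = k / (n + 1)  # expected position of the k-th order statistic
        # Assumed scaling: sqrt(n + 2) removes the sample-size dependence
        # of the residual's variance.
        residuals.append(math.sqrt(n + 2) * (u_k - mu_k))
    return residuals
```

Residuals that stay close to zero across all k suggest the trial CDF is consistent with the sample, while systematic excursions point to a poor fit.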
-
Network rigidity at finite temperature: Relationships between thermodynamic stability, the non-additivity of entropy and cooperativity in molecular systems
Authors:
Donald J. Jacobs,
S. Dallakyan,
G. G. Wood,
A. Heckathorne
Abstract:
A statistical mechanical distance constraint model (DCM) is presented that explicitly accounts for network rigidity among constraints present within a system. Constraints are characterized by local microscopic free energy functions. Topological re-arrangements of thermally fluctuating constraints are permitted. The partition function is obtained by combining microscopic free energies of individual constraints using network rigidity as an underlying long-range mechanical interaction -- giving a quantitative explanation for the non-additivity in component entropies exhibited in molecular systems. Two exactly solved 2-dimensional toy models representing flexible molecules that can undergo conformational change are presented to elucidate concepts, and to outline a DCM calculation scheme applicable to many types of physical systems. It is proposed that network rigidity plays a central role in balancing the energetic and entropic contributions to the free energy of bio-polymers, such as proteins. As a demonstration, the distance constraint model is solved exactly for the alpha-helix to coil transition in homogeneous peptides. Temperature and size independent model parameters are fitted to Monte Carlo simulation data, which includes peptides of length 10 for gas phase, and lengths 10, 15, 20 and 30 in water. The DCM is compared to the Lifson-Roig model. It is found that network rigidity provides a mechanism for cooperativity in molecular structures including their ability to spontaneously self-organize. In particular, the formation of a characteristic topological arrangement of constraints is associated with the most probable microstates changing under different thermodynamic conditions.
Submitted 8 September, 2003;
originally announced September 2003.
-
Floppy modes and the free energy: Rigidity and connectivity percolation on Bethe Lattices
Authors:
P. M. Duxbury,
D. J. Jacobs,
M. F. Thorpe,
Cristian F. Moukarzel
Abstract:
We show that the negative of the number of floppy modes behaves as a free energy for both connectivity and rigidity percolation, and we illustrate this result using Bethe lattices. The rigidity transition on Bethe lattices is found to be first order at a bond concentration close to that predicted by Maxwell constraint counting. We calculate the probability of a bond being on the infinite cluster and also on the overconstrained part of the infinite cluster, and show how a specific heat can be defined as the second derivative of the free energy. We demonstrate that the Bethe lattice solution is equivalent to that of the random bond model, where points are joined randomly (with equal probability at all length scales) to have a given coordination, and then subsequently bonds are randomly removed.
Submitted 5 July, 1998;
originally announced July 1998.
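For readers unfamiliar with the Maxwell constraint counting mentioned in the abstract above, a rough sketch follows. It assumes the standard mean-field estimate for a network in which each site carries `d` degrees of freedom, has coordination `z`, and each bond is present with probability `p` and removes one degree of freedom. The function names are hypothetical, and this is the simple counting estimate only, not the Bethe lattice solution of the paper.

```python
def floppy_modes_per_site(p, z, d):
    """Maxwell estimate of the number of floppy modes per site.

    Each site contributes d degrees of freedom; each site has on average
    z * p / 2 bonds (every bond is shared by two sites), and each bond is
    counted as one independent constraint. The estimate is clipped at zero.
    """
    return max(0.0, d - z * p / 2.0)


def maxwell_threshold(z, d):
    """Bond concentration at which the Maxwell estimate reaches zero."""
    return 2.0 * d / z


# Example: a triangular lattice (z = 6) in two dimensions (d = 2).
p_star = maxwell_threshold(z=6, d=2)
```

Because the counting treats every bond as an independent constraint, it ignores redundant bonds, so the threshold it gives is only an approximation to the true rigidity transition.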