-
Quantifying Distances Between Clusters with Elliptical or Non-Elliptical Shapes
Authors:
Meredith L. Wallace,
Lisa McTeague,
Jessica L. Graves,
Nicholas Kissel,
Cristina Tortora,
Bradley Wheeler,
Satish Iyengar
Abstract:
Finite mixture models that allow for a broad range of potentially non-elliptical cluster distributions are an emerging methodological field. Such methods allow for the shape of the clusters to match the natural heterogeneity of the data, rather than forcing a series of elliptical clusters. These methods are highly relevant for clustering continuous non-normal data - a common occurrence with objective data that are now routinely captured in health research. However, interpreting and comparing such models - especially with regard to whether they produce meaningful clusters that are reasonably well separated - is non-trivial. We summarize several measures that can succinctly quantify the multivariate distance between two clusters, regardless of the cluster distribution, and suggest practical computational tools. Through a simulation study, we evaluate these measures across three scenarios that allow for clusters to differ in mean, scale, and rotation. We then demonstrate our approaches using physiological responses to emotional imagery captured as part of the Transdiagnostic Anxiety Study, a large-scale study of anxiety disorder spectrum patients and control participants. Finally, we synthesize findings to provide guidance on how to use distance measures in clustering applications.
Submitted 22 June, 2022;
originally announced June 2022.
-
PHAT Stellar Cluster Survey. II. Andromeda Project Cluster Catalog
Authors:
L. Clifton Johnson,
Anil C. Seth,
Julianne J. Dalcanton,
Matthew L. Wallace,
Robert J. Simpson,
Chris J. Lintott,
Amit Kapadia,
Evan D. Skillman,
Nelson Caldwell,
Morgan Fouesneau,
Daniel R. Weisz,
Benjamin F. Williams,
Lori C. Beerman,
Dimitrios A. Gouliermis,
Ata Sarajedini
Abstract:
We construct a stellar cluster catalog for the Panchromatic Hubble Andromeda Treasury (PHAT) survey using image classifications collected from the Andromeda Project citizen science website. We identify 2,753 clusters and 2,270 background galaxies within ~0.5 deg$^2$ of PHAT imaging searched, or ~400 kpc$^2$ in deprojected area at the distance of the Andromeda galaxy (M31). These identifications result from 1.82 million classifications of ~20,000 individual images (totaling ~7 gigapixels) by tens of thousands of volunteers. We show that our crowd-sourced approach, which collects >80 classifications per image, provides a robust, repeatable method of cluster identification. The high spatial resolution Hubble Space Telescope images resolve individual stars in each cluster and are instrumental in the factor of ~6 increase in the number of clusters known within the survey footprint. We measure integrated photometry in six filter passbands, ranging from the near-UV to the near-IR. PHAT clusters span a range of ~8 magnitudes in F475W (g-band) luminosity, equivalent to ~4 decades in cluster mass. We perform catalog completeness analysis using >3000 synthetic cluster simulations to determine robust detection limits and demonstrate that the catalog is 50% complete down to ~500 solar masses for ages <100 Myr. We include catalogs of clusters, background galaxies, remaining unselected candidates, and synthetic cluster simulations, making all information publicly available to the community. The catalog published here serves as the definitive base data product for PHAT cluster science, providing a census of star clusters in an L$^*$ spiral galaxy with unmatched sensitivity and quality.
Submitted 20 January, 2015;
originally announced January 2015.
-
A small world of citations? The influence of collaboration networks on citation practices
Authors:
Matthew L. Wallace,
Vincent Larivière,
Yves Gingras
Abstract:
This paper examines the proximity of authors to those they cite using degrees of separation in a co-author network, essentially using collaboration networks to expand on the notion of self-citations. While the proportion of direct self-citations (including co-authors of both citing and cited papers) is relatively constant in time and across specialties in the natural sciences (10% of citations) and the social sciences (20%), the same cannot be said for citations to authors who are members of the co-author network. Differences between fields and trends over time lie not only in the degree of co-authorship which defines the large-scale topology of the collaboration network, but also in the referencing practices within a given discipline, computed by defining a propensity to cite at a given distance within the collaboration network. Overall, there is little tendency to cite those nearby in the collaboration network, excluding direct self-citations. By analyzing these social references, we characterize the social capital of local collaboration networks in terms of the knowledge production within scientific fields. These results have implications for the long-standing debate over biases common to most types of citation analysis, and for understanding citation practices across scientific disciplines over the past 50 years. In addition, our findings have important practical implications for the availability of 'arm's length' expert reviewers of grant applications and manuscripts.
Submitted 27 July, 2011;
originally announced July 2011.
-
Modeling a Century of Citation Distributions
Authors:
Matthew L. Wallace,
Vincent Larivière,
Yves Gingras
Abstract:
Changes in citation distributions over 100 years can reveal much about the evolution of the scientific communities or disciplines. The prevalence of uncited papers or of highly-cited papers, with respect to the bulk of publications, provides important clues as to the dynamics of scientific research. Using 25 million papers and 600 million references from the Web of Science over the 1900-2006 period, this paper proposes a simple model based on a random selection process to explain the "uncitedness" phenomenon and its decline in recent years. We show that the proportion of uncited papers is a function of 1) the number of articles published in a given year (the competing papers) and 2) the number of articles subsequently published (the citing papers) and the number of references they contain. Using uncitedness as a departure point, we demonstrate the utility of the stretched-exponential function and a form of the Tsallis function to fit complete citation distributions over the 20th century. As opposed to simple power-law fits, for instance, both these approaches are shown to be empirically well-grounded and robust enough to better understand citation dynamics at the aggregate level. Based on an expansion of these models, on our new understanding of uncitedness and on our large dataset, we are able to provide clear quantitative evidence and provisional explanations for an important shift in citation practices around 1960, unmatched in the 20th century. We also propose a revision of the "citation classic" category as a set of articles which is clearly distinguishable from the rest of the field.
Submitted 8 October, 2008;
originally announced October 2008.
-
Why it has become more difficult to predict Nobel Prize winners: a bibliometric analysis of Nominees and Winners of the Chemistry and Physics Prizes (1901-2007)
Authors:
Yves Gingras,
Matthew L. Wallace
Abstract:
We propose a comprehensive bibliometric study of the profile of Nobel prizewinners in chemistry and physics from 1901 to 2007, based on citation data available over the same period. The data allows us to observe the evolution of the profiles of winners in the years leading up to (and following) nominations and awarding of the Nobel Prize. The degree centrality and citation rankings in these fields confirm that the Prize is awarded at the peak of the winners' careers, despite a brief Halo Effect observable in the years following the attribution of the Prize. Changes in the size and organization of the two fields result in a rapid decline of predictive power of bibliometric data over the century. This can be explained not only by the growing size and fragmentation of the two disciplines, but also, at least in the case of physics, by an implicit hierarchy in the most legitimate topics within the discipline, as well as among the scientists selected for the Prize. Furthermore, the lack of readily-identifiable dominant contemporary physicists suggests that there are few new paradigm shifts within the field, as perceived by the scientific community as a whole.
Submitted 18 August, 2008;
originally announced August 2008.
-
A new approach for detecting scientific specialties from raw cocitation networks
Authors:
Matthew L. Wallace,
Yves Gingras,
Russell Duhon
Abstract:
We use a technique recently developed by Blondel et al. (2008) in order to detect scientific specialties from author cocitation networks. This algorithm has distinct advantages over most of the previous methods used to obtain cocitation "clusters", since it avoids the use of similarity measures, relies entirely on the topology of the weighted network and can be applied to relatively large networks. Most importantly, it requires no subjective interpretation of the cocitation data or of the communities found. Using two examples, we show that the resulting specialties are the smallest coherent "group" of researchers (within a hierarchy of cluster sizes) and can thus be identified unambiguously. Furthermore, we confirm that these communities are indeed representative of what we know about the structure of a given scientific discipline and that, as specialties, they can be accurately characterized by a few keywords (from the publication titles). We argue that this robust and efficient algorithm is particularly well-suited to cocitation networks, and that the results generated can be of great use to researchers studying various facets of the structure and evolution of science.
Submitted 30 July, 2008;
originally announced July 2008.
-
Shear-induced overaging in a polymer glass
Authors:
Matthew L. Wallace,
Bela Joos
Abstract:
A phenomenon recently coined as ``overaging'' implies a slowdown in the collective (slow) relaxation modes of a glass when a transient shear strain is imposed. We are able to reproduce this behavior in simulations of a supercooled polymer melt by imposing instantaneous shear deformations. The increases in relaxation times $Δτ_{1/2}$ rise rapidly with deformation, becoming exponential in the plastic regime. This ``overaging'' is distinct from standard aging. We find increases in pressure, bond-orientational order and in the average energy of the inherent structures ($<e_{IS}>$) of the system, all dependent on the size of the deformation. The observed change in behavior from elastic to plastic deformation suggests a link to the physics of the ``jammed state''.
Submitted 28 June, 2005;
originally announced June 2005.
-
The rigidity transition in polymer melts with van der Waals interactions
Authors:
Matthew L. Wallace,
Bela Joos,
Michael Plischke
Abstract:
We study the onset of rigidity near the glass transition (GT) in a short-chain polymer melt modelled by a bead-spring model, where all beads interact with Lennard-Jones potentials. The properties of the system are examined above and below the GT. In order to minimize high cooling-rate effects and computational times, equilibrium configurations are reached via isothermal compression. We monitor quantities such as the heat capacity C_P, the short-time diffusion constants D, the viscosity η, and the shear modulus; the time-dependent shear modulus G(t) is compared with the shear modulus μ obtained from an externally applied instantaneous shear. We give a detailed analysis of the effects of such shearing on the system, both locally and globally. It is found that the polymeric glass only displays long-time rigid behavior below a temperature T_1, where T_1<T_G. Furthermore, the linear and non-linear relaxation regimes under applied shear are discussed.
Submitted 21 July, 2004; v1 submitted 13 February, 2004;
originally announced February 2004.