-
Scientific productivity as a random walk
Authors:
Sam Zhang,
Nicholas LaBerge,
Samuel F. Way,
Daniel B. Larremore,
Aaron Clauset
Abstract:
The expectation that scientific productivity follows regular patterns over a career underpins many scholarly evaluations, including hiring, promotion and tenure, awards, and grant funding. However, recent studies of individual productivity patterns reveal a puzzle: on the one hand, the average number of papers published per year robustly follows the "canonical trajectory" of a rapid rise to an early peak followed by a gradual decline, but on the other hand, only about 20% of individual researchers' productivity follows this pattern. We resolve this puzzle by modeling scientific productivity as a parameterized random walk, showing that the canonical pattern can be explained as a decrease in the variance in changes to productivity in the early-to-mid career. By empirically characterizing the variable structure of 2,085 productivity trajectories of computer science faculty at 205 PhD-granting institutions, spanning 29,119 publications over 1980–2016, we (i) discover remarkably simple patterns in both early-career and year-to-year changes to productivity, and (ii) show that a random walk model of productivity both reproduces the canonical trajectory in the average productivity and captures much of the diversity of individual-level trajectories. These results highlight the fundamental role of a panoply of contingent factors in shaping individual scientific productivity, opening up new avenues for characterizing how systemic incentives and opportunities can be directed for aggregate effect.
Submitted 8 September, 2023;
originally announced September 2023.
-
Environmental Changes and the Dynamics of Musical Identity
Authors:
Samuel F. Way,
Santiago Gil,
Ian Anderson,
Aaron Clauset
Abstract:
Musical tastes reflect our unique values and experiences, our relationships with others, and the places where we live. But as each of these things changes, do our tastes also change to reflect the present, or remain fixed, reflecting our past? Here, we investigate how where a person lives shapes their musical preferences, using geographic relocation to construct quasi-natural experiments that measure short- and long-term effects. Analyzing comprehensive data on over 16 million users on Spotify, we show that relocation within the United States has only a small impact on individuals' tastes, which remain more similar to those of their past environments. We then show that the age gap between a person and the music they consume indicates that adolescence, and likely their environment during these years, shapes their lifelong musical tastes. Our results demonstrate the robustness of individuals' musical identity, and shed new light on the development of preferences.
Submitted 9 April, 2019;
originally announced April 2019.
-
Prestige drives epistemic inequality in the diffusion of scientific ideas
Authors:
Allison C. Morgan,
Dimitrios J. Economou,
Samuel F. Way,
Aaron Clauset
Abstract:
The spread of ideas in the scientific community is often viewed as a competition, in which good ideas spread further because of greater intrinsic fitness, and publication venue and citation counts correlate with importance and impact. However, relatively little is known about how structural factors influence the spread of ideas, and specifically how where an idea originates might influence how it spreads. Here, we investigate the role of faculty hiring networks, which embody the set of researcher transitions from doctoral to faculty institutions, in shaping the spread of ideas in computer science, and the importance of where in the network an idea originates. We consider comprehensive data on the hiring events of 5032 faculty at all 205 Ph.D.-granting departments of computer science in the U.S. and Canada, and on the timing and titles of 200,476 associated publications. Analyzing five popular research topics, we show empirically that faculty hiring can and does facilitate the spread of ideas in science. Having established such a mechanism, we then analyze its potential consequences using epidemic models to simulate the generic spread of research ideas and quantify the impact of where an idea originates on its long-term diffusion across the network. We find that research from prestigious institutions spreads more quickly and completely than work of similar quality originating from less prestigious institutions. Our analyses establish the theoretical trade-offs between university prestige and the quality of ideas necessary for efficient circulation. Our results establish faculty hiring as an underlying mechanism that drives the persistent epistemic advantage observed for elite institutions, and provide a theoretical lower bound for the impact of structural inequality in shaping the spread of ideas in science.
Submitted 22 October, 2018; v1 submitted 24 May, 2018;
originally announced May 2018.
-
Automatically assembling a full census of an academic field
Authors:
Allison C. Morgan,
Samuel F. Way,
Aaron Clauset
Abstract:
The composition of the scientific workforce shapes the direction of scientific research, directly through the selection of questions to investigate, and indirectly through its influence on the training of future scientists. In most fields, however, complete census information is difficult to obtain, complicating efforts to study workforce dynamics and the effects of policy. This is particularly true in computer science, which lacks a single, all-encompassing directory or professional organization. A full census of computer science would serve many purposes, not the least of which is a better understanding of the trends and causes of unequal representation in computing. Previous academic census efforts have relied on narrow or biased samples, or on professional society membership rolls. A full census can be constructed directly from online departmental faculty directories, but doing so by hand is prohibitively expensive and time-consuming. Here, we introduce a topical web crawler for automating the collection of faculty information from web-based department rosters, and demonstrate the resulting system on the 205 PhD-granting computer science departments in the U.S. and Canada. This method constructs a complete census of the field within a few minutes, and achieves over 99% precision and recall. We conclude by comparing the resulting 2017 census to a hand-curated 2011 census to quantify turnover and retention in computer science, in general and for female faculty in particular, demonstrating the types of analysis made possible by automated census construction.
Submitted 26 April, 2018; v1 submitted 8 April, 2018;
originally announced April 2018.
-
Network assembly of scientific communities of varying size and specificity
Authors:
Daniel T. Citron,
Samuel F. Way
Abstract:
How does the collaboration network of researchers coalesce around a scientific topic? What sort of social restructuring occurs as a new field develops? Previous empirical explorations of these questions have examined the evolution of co-authorship networks associated with several fields of science, each noting a characteristic shift in network structure as fields develop. Historically, however, such studies have tended to rely on manually annotated datasets and therefore only consider a handful of disciplines, calling into question the universality of the observed structural signature. To overcome this limitation and test the robustness of this phenomenon, we use a comprehensive dataset of over 189,000 scientific articles and develop a framework for partitioning articles and their authors into coherent, semantically related groups representing scientific fields of varying size and specificity. We then use the resulting population of fields to study the structure of evolving co-authorship networks. Consistent with earlier findings, we observe a global topological transition as the co-authorship networks coalesce from a disjointed aggregate into a dense giant connected component that dominates the network. We validate these results using a separate, complementary corpus of scientific articles, and, overall, we find that the previously reported characteristic structural evolution of a scientific field's associated co-authorship network is robust across a large number of scientific fields of varying size, scope, and specificity. Additionally, the framework developed in this study may be used in other scientometric contexts in order to extend studies to compare across a larger range of scientific disciplines.
Submitted 15 January, 2018;
originally announced January 2018.
-
The misleading narrative of the canonical faculty productivity trajectory
Authors:
Samuel F. Way,
Allison C. Morgan,
Aaron Clauset,
Daniel B. Larremore
Abstract:
A scientist may publish tens or hundreds of papers over a career, but these contributions are not evenly spaced in time. Sixty years of studies on career productivity patterns in a variety of fields suggest an intuitive and universal pattern: productivity tends to rise rapidly to an early peak and then gradually declines. Here, we test the universality of this conventional narrative by analyzing the structures of individual faculty productivity time series, constructed from over 200,000 publications and matched with hiring data for 2453 tenure-track faculty in all 205 Ph.D.-granting computer science departments in the U.S. and Canada. Unlike prior studies, which considered only some faculty or some institutions, or lacked common career reference points, here we combine a large bibliographic dataset with comprehensive information on career transitions that covers an entire field of study. We show that the conventional narrative confidently describes only one fifth of faculty, regardless of department prestige or researcher gender, and the remaining four fifths of faculty exhibit a rich diversity of productivity patterns. To explain this diversity, we introduce a simple model of productivity trajectories, and explore correlations between its parameters and researcher covariates, showing that departmental prestige predicts overall individual productivity and the timing of the transition from first- to last-author publications. These results demonstrate the unpredictability of productivity over time, and open the door for new efforts to understand how environmental and individual factors shape scientific productivity.
Submitted 17 October, 2017; v1 submitted 24 December, 2016;
originally announced December 2016.
-
Gender, Productivity, and Prestige in Computer Science Faculty Hiring Networks
Authors:
Samuel F. Way,
Daniel B. Larremore,
Aaron Clauset
Abstract:
Women are dramatically underrepresented in computer science at all levels in academia and account for just 15% of tenure-track faculty. Understanding the causes of this gender imbalance would inform both policies intended to rectify it and employment decisions by departments and individuals. Progress in this direction, however, is complicated by the complexity and decentralized nature of faculty hiring and the non-independence of hires. Using comprehensive data on both hiring outcomes and scholarly productivity for 2659 tenure-track faculty across 205 Ph.D.-granting departments in North America, we investigate the multi-dimensional nature of gender inequality in computer science faculty hiring through a network model of the hiring process. Overall, we find that hiring outcomes are most directly affected by (i) the relative prestige between hiring and placing institutions and (ii) the scholarly productivity of the candidates. After including these, and other features, the addition of gender did not significantly reduce modeling error. However, gender differences do exist, e.g., in scholarly productivity, postdoctoral training rates, and in career movements up the rankings of universities, suggesting that the effects of gender are indirectly incorporated into hiring decisions through gender's covariates. Furthermore, we find evidence that more highly ranked departments recruit female faculty at higher than expected rates, which appears to inhibit similar efforts by lower ranked departments. These findings illustrate the subtle nature of gender inequality in faculty hiring networks and provide new insights to the underrepresentation of women in computer science.
Submitted 2 February, 2016;
originally announced February 2016.
-
Assembling thefacebook: Using heterogeneity to understand online social network assembly
Authors:
Abigail Z. Jacobs,
Samuel F. Way,
Johan Ugander,
Aaron Clauset
Abstract:
Online social networks represent a popular and diverse class of social media systems. Despite this variety, each of these systems undergoes a general process of online social network assembly, which represents the complicated and heterogeneous changes that transform newly born systems into mature platforms. However, little is known about this process. For example, how much of a network's assembly is driven by simple growth? How does a network's structure change as it matures? How does network structure vary with adoption rates and user heterogeneity, and do these properties play different roles at different points in the assembly? We investigate these and other questions using a unique dataset of online connections among the roughly one million users at the first 100 colleges admitted to Facebook, captured just 20 months after its launch. We first show that different vintages and adoption rates across this population of networks reveal temporal dynamics of the assembly process, and that assembly is only loosely related to network growth. We then exploit natural experiments embedded in this dataset and complementary data obtained via Internet archaeology to show that different subnetworks matured at different rates toward similar end states. These results shed light on the processes and patterns of online social network assembly, and may facilitate more effective design for online social systems.
Submitted 31 May, 2015; v1 submitted 23 March, 2015;
originally announced March 2015.