-
The Memory of Science: Inflation, Myopia, and the Knowledge Network
Authors:
Raj K. Pan,
Alexander M. Petersen,
Fabio Pammolli,
Santo Fortunato
Abstract:
Science is a growing system, exhibiting ~4% annual growth in publications and ~1.8% annual growth in the number of references per publication. Combined these trends correspond to a 12-year doubling period in the total supply of references, thereby challenging traditional methods of evaluating scientific production, from researchers to institutions. Against this background, we analyzed a citation n…
▽ More
Science is a growing system, exhibiting ~4% annual growth in publications and ~1.8% annual growth in the number of references per publication. Combined these trends correspond to a 12-year doubling period in the total supply of references, thereby challenging traditional methods of evaluating scientific production, from researchers to institutions. Against this background, we analyzed a citation network comprised of 837 million references produced by 32.6 million publications over the period 1965-2012, allowing for a temporal analysis of the `attention economy' in science. Unlike previous studies, we analyzed the entire probability distribution of reference ages - the time difference between a citing and cited paper - thereby capturing previously overlooked trends. Over this half-century period we observe a narrowing range of attention - both classic and recent literature are being cited increasingly less, pointing to the important role of socio-technical processes. To better understand the impact of exponential growth on the underlying knowledge network we develop a network-based model, featuring the redirection of scientific attention via publications' reference lists, and validate the model against several empirical benchmarks. We then use the model to test the causal impact of real paradigm shifts, thereby providing guidance for science policy analysis. In particular, we show how perturbations to the growth rate of scientific output affects the reference age distribution and the functionality of the vast science citation network as an aid for the search & retrieval of knowledge. In order to account for the inflation of science, our study points to the need for a systemic overhaul of the counting methods used to evaluate citation impact - especially in the case of evaluating science careers, which can span several decades and thus several doubling periods.
△ Less
Submitted 19 July, 2016;
originally announced July 2016.
-
Attention decay in science
Authors:
Pietro Della Briotta Parolo,
Raj Kumar Pan,
Rumi Ghosh,
Bernardo A. Huberman,
Kimmo Kaski,
Santo Fortunato
Abstract:
The exponential growth in the number of scientific papers makes it increasingly difficult for researchers to keep track of all the publications relevant to their work. Consequently, the attention that can be devoted to individual papers, measured by their citation counts, is bound to decay rapidly. In this work we make a thorough study of the life-cycle of papers in different disciplines. Typicall…
▽ More
The exponential growth in the number of scientific papers makes it increasingly difficult for researchers to keep track of all the publications relevant to their work. Consequently, the attention that can be devoted to individual papers, measured by their citation counts, is bound to decay rapidly. In this work we make a thorough study of the life-cycle of papers in different disciplines. Typically, the citation rate of a paper increases up to a few years after its publication, reaches a peak and then decreases rapidly. This decay can be described by an exponential or a power law behavior, as in ultradiffusive processes, with exponential fitting better than power law for the majority of cases. The decay is also becoming faster over the years, signaling that nowadays papers are forgotten more quickly. However, when time is counted in terms of the number of published papers, the rate of decay of citations is fairly independent of the period considered. This indicates that the attention of scholars depends on the number of published items, and not on real time.
△ Less
Submitted 23 November, 2015; v1 submitted 6 March, 2015;
originally announced March 2015.
-
The Nobel Prize delay
Authors:
Francesco Becattini,
Arnab Chatterjee,
Santo Fortunato,
Marija Mitrović,
Raj Kumar Pan,
Pietro Della Briotta Parolo
Abstract:
The time lag between the publication of a Nobel discovery and the conferment of the prize has been rapidly increasing for all disciplines, especially for Physics. Does this mean that fundamental science is running out of groundbreaking discoveries?
The time lag between the publication of a Nobel discovery and the conferment of the prize has been rapidly increasing for all disciplines, especially for Physics. Does this mean that fundamental science is running out of groundbreaking discoveries?
△ Less
Submitted 28 May, 2014;
originally announced May 2014.
-
Effects of temporal correlations on cascades: Threshold models on temporal networks
Authors:
Ville-Pekka Backlund,
Jari Saramäki,
Raj Kumar Pan
Abstract:
A person's decision to adopt an idea or product is often driven by the decisions of peers, mediated through a network of social ties. A common way of modeling adoption dynamics is to use threshold models, where a node may become an adopter given a high enough rate of contacts with adopted neighbors. We study the dynamics of threshold models that take both the network topology and the timings of co…
▽ More
A person's decision to adopt an idea or product is often driven by the decisions of peers, mediated through a network of social ties. A common way of modeling adoption dynamics is to use threshold models, where a node may become an adopter given a high enough rate of contacts with adopted neighbors. We study the dynamics of threshold models that take both the network topology and the timings of contacts into account, using empirical contact sequences as substrates. The models are designed such that adoption is driven by the number of contacts with different adopted neighbors within a chosen time. We find that while some networks support cascades leading to network-level adoption, some do not: the propagation of adoption depends on several factors from the frequency of contacts to burstiness and timing correlations of contact sequences. More specifically, burstiness is seen to suppress cascades sizes when compared to randomised contact timings, while timing correlations between contacts on adjacent links facilitate cascades.
△ Less
Submitted 27 June, 2014; v1 submitted 5 March, 2014;
originally announced March 2014.
-
Author Impact Factor: tracking the dynamics of individual scientific impact
Authors:
Raj Kumar Pan,
Santo Fortunato
Abstract:
The impact factor (IF) of scientific journals has acquired a major role in the evaluations of the output of scholars, departments and whole institutions. Typically papers appearing in journals with large values of the IF receive a high weight in such evaluations. However, at the end of the day one is interested in assessing the impact of individuals, rather than papers. Here we introduce Author Im…
▽ More
The impact factor (IF) of scientific journals has acquired a major role in the evaluations of the output of scholars, departments and whole institutions. Typically papers appearing in journals with large values of the IF receive a high weight in such evaluations. However, at the end of the day one is interested in assessing the impact of individuals, rather than papers. Here we introduce Author Impact Factor (AIF), which is the extension of the IF to authors. The AIF of an author A in year $t$ is the average number of citations given by papers published in year $t$ to papers published by A in a period of $Δt$ years before year $t$. Due to its intrinsic dynamic character, AIF is capable to capture trends and variations of the impact of the scientific output of scholars in time, unlike the $h$-index, which is a growing measure taking into account the whole career path.
△ Less
Submitted 12 May, 2014; v1 submitted 9 December, 2013;
originally announced December 2013.
-
On the Predictability of Future Impact in Science
Authors:
Orion Penner,
Raj Kumar Pan,
Alexander M. Petersen,
Kimmo Kaski,
Santo Fortunato
Abstract:
Correctly assessing a scientist's past research impact and potential for future impact is key in recruitment decisions and other evaluation processes. While a candidate's future impact is the main concern for these decisions, most measures only quantify the impact of previous work. Recently, it has been argued that linear regression models are capable of predicting a scientist's future impact. By…
▽ More
Correctly assessing a scientist's past research impact and potential for future impact is key in recruitment decisions and other evaluation processes. While a candidate's future impact is the main concern for these decisions, most measures only quantify the impact of previous work. Recently, it has been argued that linear regression models are capable of predicting a scientist's future impact. By applying that future impact model to 762 careers drawn from three disciplines: physics, biology, and mathematics, we identify a number of subtle, but critical, flaws in current models. Specifically, cumulative non-decreasing measures like the h-index contain intrinsic autocorrelation, resulting in significant overestimation of their "predictive power". Moreover, the predictive power of these models depend heavily upon scientists' career age, producing least accurate estimates for young researchers. Our results place in doubt the suitability of such models, and indicate further investigation is required before they can be used in recruiting decisions.
△ Less
Submitted 29 October, 2013; v1 submitted 1 June, 2013;
originally announced June 2013.
-
The case for caution in predicting scientists' future impact
Authors:
Orion Penner,
Raj K. Pan,
Alexander M. Petersen,
Santo Fortunato
Abstract:
We stress-test the career predictability model proposed by Acuna et al. [Nature 489, 201-202 2012] by applying their model to a longitudinal career data set of 100 Assistant professors in physics, two from each of the top 50 physics departments in the US. The Acuna model claims to predict h(t+Δt), a scientist's h-index Δt years into the future, using a linear combination of 5 cumulative career mea…
▽ More
We stress-test the career predictability model proposed by Acuna et al. [Nature 489, 201-202 2012] by applying their model to a longitudinal career data set of 100 Assistant professors in physics, two from each of the top 50 physics departments in the US. The Acuna model claims to predict h(t+Δt), a scientist's h-index Δt years into the future, using a linear combination of 5 cumulative career measures taken at career age t. Here we investigate how the "predictability" depends on the aggregation of career data across multiple age cohorts. We confirm that the Acuna model does a respectable job of predicting h(t+Δt) up to roughly 6 years into the future when aggregating all age cohorts together. However, when calculated using subsets of specific age cohorts (e.g. using data for only t=3), we find that the model's predictive power significantly decreases, especially when applied to early career years. For young careers, the model does a much worse job of predicting future impact, and hence, exposes a serious limitation. The limitation is particularly concerning as early career decisions make up a significant portion, if not the majority, of cases where quantitative approaches are likely to be applied.
△ Less
Submitted 2 April, 2013;
originally announced April 2013.
-
Reputation and Impact in Academic Careers
Authors:
Alexander M. Petersen,
Santo Fortunato,
Raj K. Pan,
Kimmo Kaski,
Orion Penner,
Armando Rungi,
Massimo Riccaboni,
H. Eugene Stanley,
Fabio Pammolli
Abstract:
Reputation is an important social construct in science, which enables informed quality assessments of both publications and careers of scientists in the absence of complete systemic information. However, the relation between reputation and career growth of an individual remains poorly understood, despite recent proliferation of quantitative research evaluation methods. Here we develop an original…
▽ More
Reputation is an important social construct in science, which enables informed quality assessments of both publications and careers of scientists in the absence of complete systemic information. However, the relation between reputation and career growth of an individual remains poorly understood, despite recent proliferation of quantitative research evaluation methods. Here we develop an original framework for measuring how a publication's citation rate $Δc$ depends on the reputation of its central author $i$, in addition to its net citation count $c$. To estimate the strength of the reputation effect, we perform a longitudinal analysis on the careers of 450 highly-cited scientists, using the total citations $C_{i}$ of each scientist as his/her reputation measure. We find a citation crossover $c_{\times}$ which distinguishes the strength of the reputation effect. For publications with $c < c_{\times}$, the author's reputation is found to dominate the annual citation rate. Hence, a new publication may gain a significant early advantage corresponding to roughly a 66% increase in the citation rate for each tenfold increase in $C_{i}$. However, the reputation effect becomes negligible for highly cited publications meaning that for $c\geq c_{\times}$ the citation rate measures scientific impact more transparently. In addition we have developed a stochastic reputation model, which is found to reproduce numerous statistical observations for real careers, thus providing insight into the microscopic mechanisms underlying cumulative advantage in science.
△ Less
Submitted 7 October, 2014; v1 submitted 28 March, 2013;
originally announced March 2013.
-
World citation and collaboration networks: uncovering the role of geography in science
Authors:
Raj Kumar Pan,
Kimmo Kaski,
Santo Fortunato
Abstract:
Modern information and communication technologies, especially the Internet, have diminished the role of spatial distances and territorial boundaries on the access and transmissibility of information. This has enabled scientists for closer collaboration and internationalization. Nevertheless, geography remains an important factor affecting the dynamics of science. Here we present a systematic analy…
▽ More
Modern information and communication technologies, especially the Internet, have diminished the role of spatial distances and territorial boundaries on the access and transmissibility of information. This has enabled scientists for closer collaboration and internationalization. Nevertheless, geography remains an important factor affecting the dynamics of science. Here we present a systematic analysis of citation and collaboration networks between cities and countries, by assigning papers to the geographic locations of their authors' affiliations. The citation flows as well as the collaboration strengths between cities decrease with the distance between them and follow gravity laws. In addition, the total research impact of a country grows linearly with the amount of national funding for research & development. However, the average impact reveals a peculiar threshold effect: the scientific output of a country may reach an impact larger than the world average only if the country invests more than about 100,000 USD per researcher annually.
△ Less
Submitted 17 December, 2012; v1 submitted 4 September, 2012;
originally announced September 2012.
-
The evolution of interdisciplinarity in physics research
Authors:
Raj Kumar Pan,
Sitabhra Sinha,
Kimmo Kaski,
Jari Saramäki
Abstract:
Science, being a social enterprise, is subject to fragmentation into groups that focus on specialized areas or topics. Often new advances occur through cross-fertilization of ideas between sub-fields that otherwise have little overlap as they study dissimilar phenomena using different techniques. Thus to explore the nature and dynamics of scientific progress one needs to consider the large-scale o…
▽ More
Science, being a social enterprise, is subject to fragmentation into groups that focus on specialized areas or topics. Often new advances occur through cross-fertilization of ideas between sub-fields that otherwise have little overlap as they study dissimilar phenomena using different techniques. Thus to explore the nature and dynamics of scientific progress one needs to consider the large-scale organization and interactions between different subject areas. Here, we study the relationships between the sub-fields of Physics using the Physics and Astronomy Classification Scheme (PACS) codes employed for self-categorization of articles published over the past 25 years (1985-2009). We observe a clear trend towards increasing interactions between the different sub-fields. The network of sub-fields also exhibits core-periphery organization, the nucleus being dominated by Condensed Matter and General Physics. However, over time Interdisciplinary Physics is steadily increasing its share in the network core, reflecting a shift in the overall trend of Physics research.
△ Less
Submitted 16 August, 2012; v1 submitted 1 June, 2012;
originally announced June 2012.
-
Multiscale Analysis of Spreading in a Large Communication Network
Authors:
Mikko Kivelä,
Raj Kumar Pan,
Kimmo Kaski,
János Kertész,
Jari Saramäki,
Márton Karsai
Abstract:
In temporal networks, both the topology of the underlying network and the timings of interaction events can be crucial in determining how some dynamic process mediated by the network unfolds. We have explored the limiting case of the speed of spreading in the SI model, set up such that an event between an infectious and susceptible individual always transmits the infection. The speed of this proce…
▽ More
In temporal networks, both the topology of the underlying network and the timings of interaction events can be crucial in determining how some dynamic process mediated by the network unfolds. We have explored the limiting case of the speed of spreading in the SI model, set up such that an event between an infectious and susceptible individual always transmits the infection. The speed of this process sets an upper bound for the speed of any dynamic process that is mediated through the interaction events of the network. With the help of temporal networks derived from large scale time-stamped data on mobile phone calls, we extend earlier results that point out the slowing-down effects of burstiness and temporal inhomogeneities. In such networks, links are not permanently active, but dynamic processes are mediated by recurrent events taking place on the links at specific points in time. We perform a multi-scale analysis and pinpoint the importance of the timings of event sequences on individual links, their correlations with neighboring sequences, and the temporal pathways taken by the network-scale spreading process. This is achieved by studying empirically and analytically different characteristic relay times of links, relevant to the respective scales, and a set of temporal reference models that allow for removing selected time-domain correlations one by one.
△ Less
Submitted 19 December, 2011;
originally announced December 2011.
-
The strength of strong ties in scientific collaboration networks
Authors:
Raj Kumar Pan,
Jari Saramäki
Abstract:
Network topology and its relationship to tie strengths may hinder or enhance the spreading of information in social networks. We study the correlations between tie strengths and topology in networks of scientific collaboration, and show that these are very different from ordinary social networks. For the latter, it has earlier been shown that strong ties are associated with dense network neighborh…
▽ More
Network topology and its relationship to tie strengths may hinder or enhance the spreading of information in social networks. We study the correlations between tie strengths and topology in networks of scientific collaboration, and show that these are very different from ordinary social networks. For the latter, it has earlier been shown that strong ties are associated with dense network neighborhoods, while weaker ties act as bridges between these. Because of this, weak links act as bottlenecks for the diffusion of information. We show that on the contrary, in co-authorship networks dense local neighborhoods mainly consist of weak links, whereas strong links are more important for overall connectivity. The important role of strong links is further highlighted in simulations of information spreading, where their topological position is seen to dramatically speed up spreading dynamics. Thus, in contrast to ordinary social networks, weight-topology correlations enhance the flow of information across scientific collaboration networks.
△ Less
Submitted 11 January, 2012; v1 submitted 26 June, 2011;
originally announced June 2011.
-
Emergence of Bursts and Communities in Evolving Weighted Networks
Authors:
Hang-Hyun Jo,
Raj Kumar Pan,
Kimmo Kaski
Abstract:
Understanding the patterns of human dynamics and social interaction, and the way they lead to the formation of an organized and functional society are important issues especially for techno-social development. Addressing these issues of social networks has recently become possible through large scale data analysis of e.g. mobile phone call records, which has revealed the existence of modular or co…
▽ More
Understanding the patterns of human dynamics and social interaction, and the way they lead to the formation of an organized and functional society are important issues especially for techno-social development. Addressing these issues of social networks has recently become possible through large scale data analysis of e.g. mobile phone call records, which has revealed the existence of modular or community structure with many links between nodes of the same community and relatively few links between nodes of different communities. The weights of links, e.g. the number of calls between two users, and the network topology are found correlated such that intra-community links are stronger compared to the weak inter-community links. This is known as Granovetter's "The strength of weak ties" hypothesis. In addition to this inhomogeneous community structure, the temporal patterns of human dynamics turn out to be inhomogeneous or bursty, characterized by the heavy tailed distribution of inter-event time between two consecutive events. In this paper, we study how the community structure and the bursty dynamics emerge together in an evolving weighted network model. The principal mechanisms behind these patterns are social interaction by cyclic closure, i.e. links to friends of friends and the focal closure, i.e. links to individuals sharing similar attributes or interests, and human dynamics by task handling process. These three mechanisms have been implemented as a network model with local attachment, global attachment, and priority-based queuing processes. By comprehensive numerical simulations we show that the interplay of these mechanisms leads to the emergence of heavy tailed inter-event time distribution and the evolution of Granovetter-type community structure. Moreover, the numerical results are found to be in qualitative agreement with empirical results from mobile phone call dataset.
△ Less
Submitted 1 June, 2011;
originally announced June 2011.
-
Path lengths, correlations, and centrality in temporal networks
Authors:
Raj Kumar Pan,
Jari Saramäki
Abstract:
In temporal networks, where nodes interact via sequences of temporary events, information or resources can only flow through paths that follow the time-ordering of events. Such temporal paths play a crucial role in dynamic processes. However, since networks have so far been usually considered static or quasi-static, the properties of temporal paths are not yet well understood. Building on a defini…
▽ More
In temporal networks, where nodes interact via sequences of temporary events, information or resources can only flow through paths that follow the time-ordering of events. Such temporal paths play a crucial role in dynamic processes. However, since networks have so far been usually considered static or quasi-static, the properties of temporal paths are not yet well understood. Building on a definition and algorithmic implementation of the average temporal distance between nodes, we study temporal paths in empirical networks of human communication and air transport. Although temporal distances correlate with static graph distances, there is a large spread, and nodes that appear close from the static network view may be connected via slow paths or not at all. Differences between static and temporal properties are further highlighted in studies of the temporal closeness centrality. In addition, correlations and heterogeneities in the underlying event sequences affect temporal path lengths, increasing temporal distances in communication networks and decreasing them in the air transport network.
△ Less
Submitted 19 July, 2011; v1 submitted 31 January, 2011;
originally announced January 2011.
-
Using explosive percolation in analysis of real-world networks
Authors:
Raj Kumar Pan,
Mikko Kivelä,
Jari Saramäki,
Kimmo Kaski,
János Kertész
Abstract:
We apply a variant of the explosive percolation procedure to large real-world networks, and show with finite-size scaling that the university class, ordinary or explosive, of the resulting percolation transition depends on the structural properties of the network as well as the number of unoccupied links considered for comparison in our procedure. We observe that in our social networks, the percol…
▽ More
We apply a variant of the explosive percolation procedure to large real-world networks, and show with finite-size scaling that the university class, ordinary or explosive, of the resulting percolation transition depends on the structural properties of the network as well as the number of unoccupied links considered for comparison in our procedure. We observe that in our social networks, the percolation clusters close to the critical point are related to the community structure. This relationship is further highlighted by applying the procedure to model networks with pre-defined communities.
△ Less
Submitted 18 April, 2011; v1 submitted 15 October, 2010;
originally announced October 2010.
-
Small But Slow World: How Network Topology and Burstiness Slow Down Spreading
Authors:
M. Karsai,
M. Kivelä,
R. K. Pan,
K. Kaski,
J. Kertész,
A. -L. Barabási,
J. Saramäki
Abstract:
Communication networks show the small-world property of short paths, but the spreading dynamics in them turns out slow. We follow the time evolution of information propagation through communication networks by using the SI model with empirical data on contact sequences. We introduce null models where the sequences are randomly shuffled in different ways, enabling us to distinguish between the cont…
▽ More
Communication networks show the small-world property of short paths, but the spreading dynamics in them turns out slow. We follow the time evolution of information propagation through communication networks by using the SI model with empirical data on contact sequences. We introduce null models where the sequences are randomly shuffled in different ways, enabling us to distinguish between the contributions of different impeding effects. The slowing down of spreading is found to be caused mostly by weight-topology correlations and the bursty activity patterns of individuals.
△ Less
Submitted 22 August, 2010; v1 submitted 10 June, 2010;
originally announced June 2010.
-
Network analysis of a corpus of undeciphered Indus civilization inscriptions indicates syntactic organization
Authors:
Sitabhra Sinha,
Md Izhar Ashraf,
Raj Kumar Pan,
Bryan Kenneth Wells
Abstract:
Archaeological excavations in the sites of the Indus Valley civilization (2500-1900 BCE) in Pakistan and northwestern India have unearthed a large number of artifacts with inscriptions made up of hundreds of distinct signs. To date there is no generally accepted decipherment of these sign sequences and there have been suggestions that the signs could be non-linguistic. Here we apply complex networ…
▽ More
Archaeological excavations in the sites of the Indus Valley civilization (2500-1900 BCE) in Pakistan and northwestern India have unearthed a large number of artifacts with inscriptions made up of hundreds of distinct signs. To date there is no generally accepted decipherment of these sign sequences and there have been suggestions that the signs could be non-linguistic. Here we apply complex network analysis techniques to a database of available Indus inscriptions, with the aim of detecting patterns indicative of syntactic organization. Our results show the presence of patterns, e.g., recursive structures in the segmentation trees of the sequences, that suggest the existence of a grammar underlying these inscriptions.
△ Less
Submitted 27 May, 2010;
originally announced May 2010.