-
The disruption index suffers from citation inflation and is confounded by shifts in scholarly citation practice
Authors:
Alexander M. Petersen,
Felber Arroyave,
Fabio Pammolli
Abstract:
Measuring the rate of innovation in academia and industry is fundamental to monitoring the efficiency and competitiveness of the knowledge economy. To this end, a disruption index (CD) was recently developed and applied to publication and patent citation networks (Wu et al., Nature 2019; Park et al., Nature 2023). Here we show that CD systematically decreases over time due to secular growth in research and patent production, following two distinct mechanisms unrelated to innovation -- one behavioral and the other structural. Whereas the behavioral explanation reflects shifts associated with techno-social factors (e.g. self-citation practices), the structural explanation follows from `citation inflation' (CI), an inextricable feature of real citation networks attributable to increasing reference list lengths, which causes CD to systematically decrease. We demonstrate this causal link by way of mathematical deduction, computational simulation, multivariate regression, and quasi-experimental comparison of the disruptiveness of PNAS versus PNAS Plus articles, which differ only in their lengths. Accordingly, we analyze CD data available in the SciSciNet database and find that disruptiveness incrementally increased from 2005 to 2015, and that the negative relationship between disruption and team size is remarkably small in overall effect size, and shifts from negative to positive for team sizes of $\geq$ 8 coauthors.
Submitted 21 June, 2024;
originally announced June 2024.
-
The disruption index is biased by citation inflation
Authors:
Alexander M. Petersen,
Felber Arroyave,
Fabio Pammolli
Abstract:
A recent analysis of scientific publication and patent citation networks by Park et al. (Nature, 2023) suggests that publications and patents are becoming less disruptive over time. Here we show that the reported decrease in disruptiveness is an artifact of systematic shifts in the structure of citation networks unrelated to innovation system capacity. Instead, the decline is attributable to 'citation inflation', an unavoidable characteristic of real citation networks that manifests as a systematic time-dependent bias and renders cross-temporal analysis challenging. One driver of citation inflation is the ever-increasing length of reference lists over time, which in turn increases the density of links in citation networks and causes the disruption index to converge to 0. A second driver is attributable to shifts in the construction of reference lists, which are increasingly impacted by self-citations that increase the rate of triadic closure in citation networks and thus confound efforts to measure disruption, which is itself a measure of triadic closure. Combined, these two systematic shifts render the disruption index temporally biased and unsuitable for cross-temporal analysis. The impact of this systematic bias further stymies efforts to correlate disruption with other measures that are also time-dependent, such as team size and citation counts. In order to demonstrate this fundamental measurement problem, we present three complementary lines of critique (deductive, empirical and computational modeling), and also make available an ensemble of synthetic citation networks that can be used to test alternative citation-based indices for systematic bias.
Submitted 2 June, 2023;
originally announced June 2023.
-
Human Mobility in Response to COVID-19 in France, Italy and UK
Authors:
Alessandro Galeazzi,
Matteo Cinelli,
Giovanni Bonaccorsi,
Francesco Pierri,
Ana Lucia Schmidt,
Antonio Scala,
Fabio Pammolli,
Walter Quattrociocchi
Abstract:
The policies implemented to hinder the COVID-19 outbreak represent one of the largest critical events in history. Understanding this process is fundamental for crafting and tailoring post-disaster relief. In this work we perform a massive data analysis, through geolocalized data from 13M Facebook users, on how this stress affected mobility patterns in France, Italy and the UK. We find that the general reduction of the overall efficiency in the network of movements is accompanied by geographical fragmentation with a massive reduction of long-range connections. The impact, however, differs among nations according to their initial mobility structure. Indeed, we find that the mobility network after the lockdown is more concentrated in the case of France and the UK and more distributed in Italy. Such a process can be approximated through percolation to quantify the substantial impact of the lockdown.
Submitted 13 May, 2020;
originally announced May 2020.
-
Evidence of economic segregation from mobility lockdown during COVID-19 epidemic
Authors:
Giovanni Bonaccorsi,
Francesco Pierri,
Matteo Cinelli,
Francesco Porcelli,
Alessandro Galeazzi,
Andrea Flori,
Ana Lucia Schmidt,
Carlo Michele Valensise,
Antonio Scala,
Walter Quattrociocchi,
Fabio Pammolli
Abstract:
In response to the COVID-19 pandemic, national governments have applied lockdown restrictions to reduce the infection rate. We perform a massive analysis on near real-time Italian data provided by Facebook to investigate how lockdown strategies affect the economic conditions of individuals and local governments. We model the change in mobility as an exogenous shock similar to a natural disaster. We identify two ways through which mobility restrictions affect Italian citizens. First, we find that the impact of lockdown is stronger in municipalities with higher fiscal capacity. Second, we find a segregation effect, since mobility restrictions are stronger in municipalities for which inequality is higher and where individuals have lower income per capita.
Submitted 11 April, 2020;
originally announced April 2020.
-
The Memory of Science: Inflation, Myopia, and the Knowledge Network
Authors:
Raj K. Pan,
Alexander M. Petersen,
Fabio Pammolli,
Santo Fortunato
Abstract:
Science is a growing system, exhibiting ~4% annual growth in publications and ~1.8% annual growth in the number of references per publication. Combined, these trends correspond to a 12-year doubling period in the total supply of references, thereby challenging traditional methods of evaluating scientific production, from researchers to institutions. Against this background, we analyzed a citation network comprised of 837 million references produced by 32.6 million publications over the period 1965-2012, allowing for a temporal analysis of the `attention economy' in science. Unlike previous studies, we analyzed the entire probability distribution of reference ages - the time difference between a citing and cited paper - thereby capturing previously overlooked trends. Over this half-century period we observe a narrowing range of attention - both classic and recent literature are being cited increasingly less, pointing to the important role of socio-technical processes. To better understand the impact of exponential growth on the underlying knowledge network, we develop a network-based model, featuring the redirection of scientific attention via publications' reference lists, and validate the model against several empirical benchmarks. We then use the model to test the causal impact of real paradigm shifts, thereby providing guidance for science policy analysis. In particular, we show how perturbations to the growth rate of scientific output affect the reference age distribution and the functionality of the vast science citation network as an aid for the search & retrieval of knowledge. In order to account for the inflation of science, our study points to the need for a systemic overhaul of the counting methods used to evaluate citation impact - especially in the case of evaluating science careers, which can span several decades and thus several doubling periods.
Submitted 19 July, 2016;
originally announced July 2016.
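The 12-year doubling period quoted in the abstract above can be checked with a short calculation. The sketch below assumes that the total supply of references is the product of the number of publications and the number of references per publication, so that the two annual growth rates compound multiplicatively; the function name `doubling_time` is introduced here purely for illustration and does not come from the paper.

```python
import math


def doubling_time(*annual_growth_rates: float) -> float:
    """Years needed for a quantity to double when it is the product of
    several factors, each growing at its own constant annual rate.

    Assumption (not from the paper): growth is exactly exponential and
    the factors compound multiplicatively.
    """
    # Log-growth rates add when the underlying factors multiply.
    combined_log_growth = sum(math.log(1.0 + r) for r in annual_growth_rates)
    return math.log(2.0) / combined_log_growth


# Approximate rates stated in the abstract.
publications_growth = 0.04            # ~4% more publications per year
references_per_paper_growth = 0.018   # ~1.8% longer reference lists per year

years = doubling_time(publications_growth, references_per_paper_growth)
print(f"Doubling period of the total reference supply: {years:.1f} years")
```

Under these assumptions the result is roughly 12 years, consistent with the figure stated in the abstract; publication growth alone would give a noticeably longer doubling period.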
-
Disambiguation of Patent Inventors and Assignees Using High-Resolution Geolocation Data
Authors:
Greg Morrison,
Massimo Riccaboni,
Fabio Pammolli
Abstract:
Patent data represent a significant source of information on innovation and the evolution of technology through networks of citations, co-invention and co-assignment of new patents. A major obstacle to extracting useful information from this data is the problem of name disambiguation: linking alternate spellings of individuals or institutions to a single identifier to uniquely determine the parties involved in the creation of a technology. In this paper, we describe a new algorithm that uses high-resolution geolocation to disambiguate both inventors and assignees on more than 3.6 million patents found in the European Patent Office (EPO), under the Patent Cooperation Treaty (PCT), and in the US Patent and Trademark Office (USPTO). We show that our algorithm has both high precision and recall in comparison to a manual disambiguation of EPO assignee names in Boston and Paris, and show it performs well for a benchmark of USPTO inventor names that can be linked to a high-resolution address (but poorly for inventors that never provided a high-quality address). The most significant benefit of this work is the high-quality assignee disambiguation with worldwide coverage coupled with an inventor disambiguation that is competitive with other state-of-the-art approaches. To our knowledge this is the broadest and most accurate simultaneous disambiguation and cross-linking of the inventor and assignee names for a significant fraction of patents in these three major patent collections.
Submitted 13 December, 2015;
originally announced January 2016.
-
Network communities within and across borders
Authors:
Federica Cerina,
Alessandro Chessa,
Fabio Pammolli,
Massimo Riccaboni
Abstract:
We investigate the impact of borders on the topology of spatially embedded networks. Indeed, territorial subdivisions and geographical borders significantly hamper the geographical span of networks, thus playing a key role in the formation of network communities. This is especially important in scientific and technological policy-making, highlighting the interplay between the pressure for internationalization towards a global innovation system and the administrative borders imposed by national and regional institutions. In this study we introduce an outreach index to quantify the impact of borders on the community structure and apply it to the case of the European and US patent co-inventor networks. We find that (a) the US connectivity decays as a power of distance, whereas we observe a faster exponential decay for Europe; (b) European network communities essentially correspond to nations and contiguous regions, while US communities span multiple states across the whole country without any characteristic geographic scale. We confirm our findings by means of a set of simulations aimed at exploring the relationship between different patterns of cross-border community structures and the outreach index.
Submitted 3 April, 2014; v1 submitted 17 November, 2013;
originally announced November 2013.
-
Reputation and Impact in Academic Careers
Authors:
Alexander M. Petersen,
Santo Fortunato,
Raj K. Pan,
Kimmo Kaski,
Orion Penner,
Armando Rungi,
Massimo Riccaboni,
H. Eugene Stanley,
Fabio Pammolli
Abstract:
Reputation is an important social construct in science, which enables informed quality assessments of both publications and careers of scientists in the absence of complete systemic information. However, the relation between reputation and career growth of an individual remains poorly understood, despite recent proliferation of quantitative research evaluation methods. Here we develop an original framework for measuring how a publication's citation rate $Δc$ depends on the reputation of its central author $i$, in addition to its net citation count $c$. To estimate the strength of the reputation effect, we perform a longitudinal analysis on the careers of 450 highly-cited scientists, using the total citations $C_{i}$ of each scientist as his/her reputation measure. We find a citation crossover $c_{\times}$ which distinguishes the strength of the reputation effect. For publications with $c < c_{\times}$, the author's reputation is found to dominate the annual citation rate. Hence, a new publication may gain a significant early advantage corresponding to roughly a 66% increase in the citation rate for each tenfold increase in $C_{i}$. However, the reputation effect becomes negligible for highly cited publications meaning that for $c\geq c_{\times}$ the citation rate measures scientific impact more transparently. In addition we have developed a stochastic reputation model, which is found to reproduce numerous statistical observations for real careers, thus providing insight into the microscopic mechanisms underlying cumulative advantage in science.
Submitted 7 October, 2014; v1 submitted 28 March, 2013;
originally announced March 2013.
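The reputation effect quoted in the abstract above (roughly a 66% increase in citation rate for each tenfold increase in $C_{i}$) can be restated as a scaling exponent. The sketch below assumes a power-law relation in which the citation rate is proportional to $C_{i}^{\rho}$ for publications below the crossover $c_{\times}$; this functional form, the symbol `rho`, and both function names are illustrative assumptions rather than the authors' stated model.

```python
import math


def reputation_exponent(increase_per_tenfold: float) -> float:
    """Exponent rho such that a tenfold rise in reputation multiplies
    the citation rate by (1 + increase_per_tenfold).

    Assumption (not from the paper): citation rate ~ C_i ** rho.
    """
    return math.log10(1.0 + increase_per_tenfold)


def citation_rate_multiplier(reputation_ratio: float, rho: float) -> float:
    """Factor by which the citation rate changes when the author's total
    citations C_i change by the factor reputation_ratio."""
    return reputation_ratio ** rho


rho = reputation_exponent(0.66)  # 66% per tenfold increase, from the abstract
print(f"Implied exponent rho: {rho:.2f}")
print(f"Multiplier for a 100-fold reputation gap: "
      f"{citation_rate_multiplier(100.0, rho):.2f}")
```

Read this way, the effect is strongly sublinear: under the assumed form, two authors whose total citations differ by a factor of 100 would see early citation rates differ by a factor of about 1.66 squared, not 100.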
-
Is Europe Evolving Toward an Integrated Research Area?
Authors:
Alessandro Chessa,
Andrea Morescalchi,
Fabio Pammolli,
Orion Penner,
Alexander M. Petersen,
Massimo Riccaboni
Abstract:
An integrated European Research Area (ERA) is a critical component for a more competitive and open European R&D system. However, the impact of EU-specific integration policies aimed at overcoming innovation barriers associated with national borders is not well understood. Here we analyze 2.4 x 10^6 patent applications filed with the European Patent Office (EPO) over the 25-year period 1986-2010 along with a sample of 2.6 x 10^5 records from the ISI Web of Science to quantitatively measure the role of borders in international R&D collaboration and mobility. From these data we construct five different networks for each year analyzed: (i) the patent co-inventor network, (ii) the publication co-author network, (iii) the co-applicant patent network, (iv) the patent citation network, and (v) the patent mobility network. We use methods from network science and econometrics to perform a comparative analysis across time and between EU and non-EU countries to determine the "treatment effect" resulting from EU integration policies. Using non-EU countries as a control set, we provide quantitative evidence that, despite decades of efforts to build a European Research Area, there has been little integration above global trends in patenting and publication. This analysis provides concrete evidence that Europe remains a collection of national innovation systems.
Submitted 13 February, 2013;
originally announced February 2013.
-
Persistence and Uncertainty in the Academic Career
Authors:
Alexander M. Petersen,
Massimo Riccaboni,
H. Eugene Stanley,
Fabio Pammolli
Abstract:
Understanding how institutional changes within academia may affect the overall potential of science requires a better quantitative representation of how careers evolve over time. Since knowledge spillovers, cumulative advantage, competition, and collaboration are distinctive features of the academic profession, both the employment relationship and the procedures for assigning recognition and allocating funding should be designed to account for these factors. We study the annual production n_{i}(t) of a given scientist i by analyzing longitudinal career data for 200 leading scientists and 100 assistant professors from the physics community. We compare our results with 21,156 sports careers. Our empirical analysis of individual productivity dynamics shows that (i) there are increasing returns for the top individuals within the competitive cohort, and that (ii) the distribution of production growth is a leptokurtic "tent-shaped" distribution that is remarkably symmetric. Our methodology is general, and we speculate that similar features appear in other disciplines where academic publication is essential and collaboration is a key feature. We introduce a model of proportional growth which reproduces these two observations, and additionally accounts for the significantly right-skewed distributions of career longevity and achievement in science. Using this theoretical model, we show that short-term contracts can amplify the effects of competition and uncertainty making careers more vulnerable to early termination, not necessarily due to lack of individual talent and persistence, but because of random negative production shocks. We show that fluctuations in scientific production are quantitatively related to a scientist's collaboration radius and team efficiency.
Submitted 3 April, 2012;
originally announced April 2012.
-
The Evolution of Complex Networks: A New Framework
Authors:
Guido Caldarelli,
Alessandro Chessa,
Irene Crimaldi,
Fabio Pammolli
Abstract:
We introduce a new framework for the analysis of the dynamics of networks, based on randomly reinforced urn (RRU) processes, in which the weight of the edges is determined by a reinforcement mechanism. We rigorously explain the empirical evidence that in many real networks there is a subset of "dominant edges" that control a major share of the total weight of the network. Furthermore, we introduce a new statistical procedure to study the evolution of networks over time, assessing whether a given instance of the network is taken at its steady state or not. Our results are quite general, since they are not based on a particular probability distribution or functional form of the weights. We test our model in the context of the International Trade Network, showing the existence of a core of dominant links and determining its size.
Submitted 6 March, 2012;
originally announced March 2012.