arXiv:2406.06454 [pdf, other]

Which topics are best represented by science maps? An analysis of clustering effectiveness for citation and text similarity networks

Authors: Juan Pablo Bascur, Suzan Verberne, Nees Jan van Eck, Ludo Waltman

Abstract: A science map of topics is a visualization that shows topics identified algorithmically based on the bibliographic metadata of scientific publications. In practice not all topics are well represented in a science map. We analyzed how effectively different topics are represented in science maps created by clustering biomedical publications. To achieve this, we investigated which topic categories, obtained from MeSH terms, are better represented in science maps based on citation or text similarity networks. To evaluate the clustering effectiveness of topics, we determined the extent to which documents belonging to the same topic are grouped together in the same cluster. We found that the best and worst represented topic categories are the same for citation and text similarity networks. The best represented topic categories are diseases, psychology, anatomy, organisms and the techniques and equipment used for diagnostics and therapy, while the worst represented topic categories are natural science fields, geographical entities, information sciences and health care and occupations. Furthermore, for the diseases and organisms topic categories and for science maps with smaller clusters, we found that topics tend to be better represented in citation similarity networks than in text similarity networks.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2207.03299 [pdf]

Academic information retrieval using citation clusters: In-depth evaluation based on systematic reviews

Authors: Juan Pablo Bascur, Suzan Verberne, Nees Jan van Eck, Ludo Waltman

Abstract: The field of scientometrics has shown the power of citation-based clusters for literature analysis, yet this technique has barely been used for information retrieval tasks. This work evaluates the performance of citation-based clusters for information retrieval tasks. We simulated a search process using these clusters with a tree hierarchy of clusters and a cluster selection algorithm. We evaluated the task of finding the relevant documents for 25 systematic reviews. Our evaluation considered several trade-offs between recall and precision for the cluster selection, and we also replicated the Boolean queries self-reported by the systematic reviews to serve as a reference. We found that the search performance of citation-based clusters is highly variable and unpredictable, that it works best for users who prefer recall over precision at a ratio between 2 and 8, and that when used along with query-based search they complement each other, including finding new relevant documents.

Submitted 5 October, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

Comments: Final version
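
Note: One way to read "prefer recall over precision at a ratio between 2 and 8" is through the standard F-beta measure, where beta states how many times more important recall is than precision. The sketch below assumes that reading; the abstract does not give the formula the authors used, and the function name `f_beta` and the example values are illustrative only.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score: recall counts beta times as much as precision.

    Assumes beta > 0 and precision, recall in [0, 1].
    """
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing relevant is retrieved
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Two hypothetical cluster selections with precision and recall swapped.
high_recall = f_beta(precision=0.2, recall=0.8, beta=2)
high_precision = f_beta(precision=0.8, recall=0.2, beta=2)
print(high_recall, high_precision)
```

With beta = 2, the high-recall selection scores 0.5 and the high-precision one roughly 0.24, so a recall-oriented user would rank the first selection higher.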

arXiv:2202.11639 [pdf]

Funding Covid-19 research: Insights from an exploratory analysis using open data infrastructures

Authors: Alexis-Michel Mugabushaka, Nees Jan van Eck, Ludo Waltman

Abstract: To analyse the outcomes of the funding they provide, it is essential for funding agencies to be able to trace the publications resulting from their funding. We study the open availability of funding data in Crossref, focusing on funding data for publications that report research related to Covid-19. We also present a comparison with the funding data available in two proprietary bibliometric databases: Scopus and Web of Science. Our analysis reveals a limited coverage of funding data in Crossref. It also shows problems related to the quality of funding data, especially in Scopus. We offer recommendations for improving the open availability of funding data in Crossref.

Submitted 12 July, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

Comments: This updated version is based on a larger sample

arXiv:2107.14641 [pdf]

Investigating Disagreement in the Scientific Literature

Authors: Wout S. Lamers, Kevin Boyack, Vincent Larivière, Cassidy R. Sugimoto, Nees Jan van Eck, Ludo Waltman, Dakota Murray

Abstract: Disagreement is essential to scientific progress. However, the extent of disagreement in science, its evolution over time, and the fields in which it happens remain poorly understood. Leveraging a massive collection of English-language scientific texts, we develop a cue-phrase based approach to identify instances of disagreement citations across more than four million scientific articles. Using this method, we construct an indicator of disagreement across scientific fields over the 2000-2015 period. In contrast with black-box text classification methods, our framework is transparent and easily interpretable. We reveal a disciplinary spectrum of disagreement, with higher disagreement in the social sciences and lower disagreement in physics and mathematics. However, detailed disciplinary analysis demonstrates heterogeneity across sub-fields, revealing the importance of local disciplinary cultures and epistemic characteristics of disagreement. Paper-level analysis reveals notable episodes of disagreement in science, and illustrates how methodological artifacts can confound analyses of scientific texts. These findings contribute to a broader understanding of disagreement and establish a foundation for future research to understand key processes underlying scientific progress.

Submitted 27 October, 2021; v1 submitted 30 July, 2021; originally announced July 2021.

Comments: 49 pages, 10 figures

arXiv:2103.14558 [pdf]

doi 10.1007/s11192-020-03410-y

Collecting large-scale publication data at the level of individual researchers: A practical proposal for author name disambiguation

Authors: Ciriaco Andrea D'Angelo, Nees Jan van Eck

Abstract: The disambiguation of author names is an important and challenging task in bibliometrics. We propose an approach that relies on an external source of information for selecting and validating clusters of publications identified through an unsupervised author name disambiguation method. The application of the proposed approach to a random sample of Italian scholars shows encouraging results, with an overall precision, recall, and F-Measure of over 96%. The proposed approach can serve as a starting point for large-scale census of publication portfolios for bibliometric analyses at the level of individual researchers.

Submitted 26 March, 2021; originally announced March 2021.

Journal ref: Scientometrics, 123(2), 883-907 (2020)

arXiv:2005.10732 [pdf]

Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic

Authors: Martijn Visser, Nees Jan van Eck, Ludo Waltman

Abstract: We present a large-scale comparison of five multidisciplinary bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. The comparison considers scientific documents from the period 2008-2017 covered by these data sources. Scopus is compared in a pairwise manner with each of the other data sources. We first analyze differences between the data sources in the coverage of documents, focusing for instance on differences over time, differences per document type, and differences per discipline. We then study differences in the completeness and accuracy of citation links. Based on our analysis, we discuss strengths and weaknesses of the different data sources. We emphasize the importance of combining a comprehensive coverage of the scientific literature with a flexible set of filters for making selections of the literature.

Submitted 17 January, 2021; v1 submitted 21 May, 2020; originally announced May 2020.

arXiv:1906.07011 [pdf]

Accuracy of citation data in Web of Science and Scopus

Authors: Nees Jan van Eck, Ludo Waltman

Abstract: We present a large-scale analysis of the accuracy of citation data in the Web of Science and Scopus databases. The analysis is based on citations given in publications in Elsevier journals. We reveal significant data quality problems for both databases. Missing and incorrect references are important problems in Web of Science. Duplicate publications are a serious problem in Scopus.

Submitted 17 June, 2019; originally announced June 2019.

Comments: Paper published in the Proceedings of the 16th International Conference of the International Society for Scientometrics and Informetrics (pp. 1087-1092)

arXiv:1901.06815 [pdf]

A principled methodology for comparing relatedness measures for clustering publications

Authors: Ludo Waltman, Kevin W. Boyack, Giovanni Colavizza, Nees Jan van Eck

Abstract: There are many different relatedness measures, based for instance on citation relations or textual similarity, that can be used to cluster scientific publications. We propose a principled methodology for evaluating the accuracy of clustering solutions obtained using these relatedness measures. We formally show that the proposed methodology has an important consistency property. The empirical analyses that we present are based on publications in the fields of cell biology, condensed matter physics, and economics. Using the BM25 text-based relatedness measure as evaluation criterion, we find that bibliographic coupling relations yield more accurate clustering solutions than direct citation relations and co-citation relations. The so-called extended direct citation approach performs similarly to or slightly better than bibliographic coupling in terms of the accuracy of the resulting clustering solutions. The other way around, using a citation-based relatedness measure as evaluation criterion, BM25 turns out to yield more accurate clustering solutions than other text-based relatedness measures.

Submitted 14 August, 2019; v1 submitted 21 January, 2019; originally announced January 2019.

arXiv:1812.08259 [pdf, other]

doi 10.1098/rsos.190207

Intermediacy of publications

Authors: Lovro Šubelj, Ludo Waltman, Vincent Traag, Nees Jan van Eck

Abstract: Citation networks of scientific publications offer fundamental insights into the structure and development of scientific knowledge. We propose a new measure, called intermediacy, for tracing the historical development of scientific knowledge. Given two publications, an older and a more recent one, intermediacy identifies publications that seem to play a major role in the historical development from the older to the more recent publication. The identified publications are important in connecting the older and the more recent publication in the citation network. After providing a formal definition of intermediacy, we study its mathematical properties. We then present two empirical case studies, one tracing historical developments at the interface between the community detection literature and the scientometric literature and one examining the development of the literature on peer review. We show both conceptually and empirically how intermediacy differs from main path analysis, which is the most popular approach for tracing historical developments in citation networks. Main path analysis tends to favor longer paths over shorter ones, whereas intermediacy has the opposite tendency. Compared to main path analysis, we conclude that intermediacy offers a more principled approach for tracing the historical development of scientific knowledge.

Submitted 3 November, 2019; v1 submitted 19 December, 2018; originally announced December 2018.

Comments: 19 pages, 7 figures, 2 tables

Journal ref: R. Soc. Open Sci. 7(1), 190207 (2020)

arXiv:1810.08473 [pdf, other]

doi 10.1038/s41598-019-41695-z

From Louvain to Leiden: guaranteeing well-connected communities

Authors: Vincent Traag, Ludo Waltman, Nees Jan van Eck

Abstract: Community detection is often used to understand the structure of large and complex networks. One of the most popular algorithms for uncovering community structure is the so-called Louvain algorithm. We show that this algorithm has a major defect that largely went unnoticed until now: the Louvain algorithm may yield arbitrarily badly connected communities. In the worst case, communities may even be disconnected, especially when running the algorithm iteratively. In our experimental analysis, we observe that up to 25% of the communities are badly connected and up to 16% are disconnected. To address this problem, we introduce the Leiden algorithm. We prove that the Leiden algorithm yields communities that are guaranteed to be connected. In addition, we prove that, when the Leiden algorithm is applied iteratively, it converges to a partition in which all subsets of all communities are locally optimally assigned. Furthermore, by relying on a fast local move approach, the Leiden algorithm runs faster than the Louvain algorithm. We demonstrate the performance of the Leiden algorithm for several benchmark and real-world networks. We find that the Leiden algorithm is faster than the Louvain algorithm and uncovers better partitions, in addition to providing explicit guarantees.

Submitted 30 October, 2019; v1 submitted 19 October, 2018; originally announced October 2018.

Journal ref: Scientific Reports 9, 5233 (2019)
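
Note: The defect described in this abstract (communities that are internally disconnected) can be checked for any partition, whichever algorithm produced it. The sketch below is not the Leiden or Louvain algorithm; it only tests whether each community's induced subgraph is connected. The function name `disconnected_communities` and the toy network are hypothetical.

```python
from collections import defaultdict, deque


def disconnected_communities(edges, membership):
    """Return sorted labels of communities whose induced subgraph is not connected.

    edges: iterable of (u, v) pairs of an undirected network.
    membership: dict mapping every node to its community label.
    """
    members = defaultdict(set)
    for node, label in membership.items():
        members[label].add(node)

    # Keep only edges that stay inside a single community.
    inside = defaultdict(set)
    for u, v in edges:
        if membership[u] == membership[v]:
            inside[u].add(v)
            inside[v].add(u)

    bad = []
    for label, nodes in members.items():
        start = next(iter(nodes))
        seen = {start}
        queue = deque([start])
        while queue:  # breadth-first search restricted to the community
            current = queue.popleft()
            for neighbour in inside[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        if seen != nodes:
            bad.append(label)
    return sorted(bad)


# Toy network: node 4 reaches the rest of community "a" only through node 3,
# which belongs to community "b".
edges = [(1, 2), (2, 3), (3, 4), (3, 5)]
membership = {1: "a", 2: "a", 3: "b", 4: "a", 5: "b"}
print(disconnected_communities(edges, membership))  # ['a']
```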

arXiv:1804.03869 [pdf]

Analyzing the activities of visitors of the Leiden Ranking website

Authors: Nees Jan van Eck, Ludo Waltman

Abstract: To provide a better understanding of the way in which university rankings are used, we present a detailed analysis of the activities of visitors of a university ranking website. We use the website of the CWTS Leiden Ranking for this purpose. For instance, we study the countries from which visitors originate, the specific pages on the Leiden Ranking website that they visit, the countries or the universities that they find of special interest, and the indicators that they focus on. In addition, we discuss two experiments that were carried out on the Leiden Ranking website. Our analysis not only provides new insights into the use of university rankings, but also suggests possible ways in which these rankings can be improved.

Submitted 14 July, 2018; v1 submitted 11 April, 2018; originally announced April 2018.

arXiv:1801.09985 [pdf]

Field normalization of scientometric indicators

Authors: Ludo Waltman, Nees Jan van Eck

Abstract: When scientometric indicators are used to compare research units active in different scientific fields, there often is a need to make corrections for differences between fields, for instance differences in publication, collaboration, and citation practices. Field-normalized indicators aim to make such corrections. The design of these indicators is a significant challenge. We discuss the main issues in the design of field-normalized indicators, and we present an overview of different approaches that have been developed for dealing with the problem of field normalization. We also discuss how field-normalized indicators can be evaluated, and we consider the sensitivity of scientometric analyses to the choice of a field normalization approach.

Submitted 30 January, 2018; originally announced January 2018.

arXiv:1710.03094 [pdf]

Characterizing in-text citations in scientific articles: A large-scale analysis

Authors: Kevin W. Boyack, Nees Jan van Eck, Giovanni Colavizza, Ludo Waltman

Abstract: We report characteristics of in-text citations in over five million full text articles from two large databases - the PubMed Central Open Access subset and Elsevier journals - as functions of time, textual progression, and scientific field. The purpose of this study is to understand the characteristics of in-text citations in a detailed way prior to pursuing other studies focused on answering more substantive research questions. As such, we have analyzed in-text citations in several ways and report many findings here. Perhaps most significantly, we find that there are large field-level differences that are reflected in position within the text, citation interval (or reference age), and citation counts of references. In general, the fields of Biomedical and Health Sciences, Life and Earth Sciences, and Physical Sciences and Engineering have similar reference distributions, although they vary in their specifics. The two remaining fields, Mathematics and Computer Science and Social Science and Humanities, have different reference distributions from the other three fields and between themselves. We also show that in all fields the numbers of sentences, references, and in-text mentions per article have increased over time, and that there are field-level and temporal differences in the numbers of in-text mentions per reference. A final finding is that references mentioned only once tend to be much more highly cited than those mentioned multiple times.

Submitted 9 October, 2017; originally announced October 2017.

Comments: 22 pages

arXiv:1707.03076 [pdf]

The Closer the Better: Similarity of Publication Pairs at Different Co-Citation Levels

Authors: Giovanni Colavizza, Kevin W. Boyack, Nees Jan van Eck, Ludo Waltman

Abstract: We investigate the similarities of pairs of articles which are co-cited at the different co-citation levels of the journal, article, section, paragraph, sentence and bracket. Our results indicate that textual similarity, intellectual overlap (shared references), author overlap (shared authors), and proximity in publication time all rise monotonically as the co-citation level gets lower (from journal to bracket). While the main gain in similarity happens when moving from journal to article co-citation, all level changes entail an increase in similarity, especially section to paragraph and paragraph to sentence/bracket levels. We compare results from four journals over the years 2010-2015: Cell, the European Journal of Operational Research, Physics Letters B and Research Policy, with consistent general outcomes and some interesting differences. Our findings motivate the use of granular co-citation information as defined by meaningful units of text, with implications for, among others, the elaboration of maps of science and the retrieval of scholarly literature.

Submitted 27 October, 2017; v1 submitted 10 July, 2017; originally announced July 2017.

arXiv:1702.03411 [pdf]

Citation-based clustering of publications using CitNetExplorer and VOSviewer

Authors: Nees Jan van Eck, Ludo Waltman

Abstract: Clustering scientific publications is an important problem in bibliometric research. We demonstrate how two software tools, CitNetExplorer and VOSviewer, can be used to cluster publications and to analyze the resulting clustering solutions. CitNetExplorer is used to cluster a large set of publications in the field of astronomy and astrophysics. The publications are clustered based on direct citation relations. CitNetExplorer and VOSviewer are used together to analyze the resulting clustering solutions. Both tools use visualizations to support the analysis of the clustering solutions, with CitNetExplorer focusing on the analysis at the level of individual publications and VOSviewer focusing on the analysis at an aggregate level. The demonstration provided in this paper shows how a clustering of publications can be created and analyzed using freely available software tools. Using the approach presented in this paper, bibliometricians are able to carry out sophisticated cluster analyses without the need to have a deep knowledge of clustering techniques and without requiring advanced computer skills.

Submitted 11 February, 2017; originally announced February 2017.

Comments: 25 pages, 4 figures, 4 tables

arXiv:1607.02452 [pdf]

Constructing bibliometric networks: A comparison between full and fractional counting

Authors: Antonio Perianes-Rodriguez, Ludo Waltman, Nees Jan van Eck

Abstract: The analysis of bibliometric networks, such as co-authorship, bibliographic coupling, and co-citation networks, has received a considerable amount of attention. Much less attention has been paid to the construction of these networks. We point out that different approaches can be taken to construct a bibliometric network. Normally the full counting approach is used, but we propose an alternative fractional counting approach. The basic idea of the fractional counting approach is that each action, such as co-authoring or citing a publication, should have equal weight, regardless of for instance the number of authors, citations, or references of a publication. We present two empirical analyses in which the full and fractional counting approaches yield very different results. These analyses deal with co-authorship networks of universities and bibliographic coupling networks of journals. Based on theoretical considerations and on the empirical analyses, we conclude that for many purposes the fractional counting approach is preferable over the full counting one.

Submitted 24 October, 2016; v1 submitted 8 July, 2016; originally announced July 2016.
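
Note: The contrast between full and fractional counting in this abstract can be illustrated for co-authorship links. The sketch below assumes one particular fractional scheme, in which each author pair of a publication with n authors receives weight 1/(n - 1), so that every author contributes a total link strength of 1 per publication. The abstract does not state the formula, so treat this weighting, the function name `coauthorship_links`, and the toy data as assumptions.

```python
from collections import Counter
from itertools import combinations


def coauthorship_links(publications, fractional=False):
    """Aggregate co-authorship link weights over a list of author lists.

    Full counting: every co-author pair on a publication adds 1.
    Fractional counting (assumed scheme): every pair adds 1 / (n - 1),
    where n is the number of distinct authors on that publication.
    """
    links = Counter()
    for authors in publications:
        unique = sorted(set(authors))
        n = len(unique)
        if n < 2:
            continue  # single-author publications create no links
        weight = 1.0 / (n - 1) if fractional else 1.0
        for a, b in combinations(unique, 2):
            links[(a, b)] += weight
    return links


# Toy data: one two-author paper and one three-author paper.
publications = [["A", "B"], ["A", "B", "C"]]
print(coauthorship_links(publications))                   # full counting
print(coauthorship_links(publications, fractional=True))  # fractional counting
```

Under full counting the A-B link has weight 2; under the assumed fractional scheme it has weight 1.5, because the three-author paper contributes only 0.5 per pair.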

arXiv:1605.02378 [pdf]

doi 10.1016/j.joi.2015.12.008

The elephant in the room: The problem of quantifying productivity in evaluative scientometrics

Authors: Ludo Waltman, Nees Jan van Eck, Martijn Visser, Paul Wouters

Abstract: In a critical and provocative paper, Abramo and D'Angelo claim that commonly used scientometric indicators such as the mean normalized citation score (MNCS) are completely inappropriate as indicators of scientific performance. Abramo and D'Angelo argue that scientific performance should be quantified using indicators that take into account the productivity of a research unit. We provide a response to Abramo and D'Angelo, indicating where we believe they raise important issues, but also pointing out where we believe their claims to be too extreme.

Submitted 8 May, 2016; originally announced May 2016.

arXiv:1512.09023 [pdf, other]

doi 10.1371/journal.pone.0154404

Clustering scientific publications based on citation relations: A systematic comparison of different methods

Authors: Lovro Šubelj, Nees Jan van Eck, Ludo Waltman

Abstract: Clustering methods are applied regularly in the bibliometric literature to identify research areas or scientific fields. These methods are for instance used to group publications into clusters based on their relations in a citation network. In the network science literature, many clustering methods, often referred to as graph partitioning or community detection techniques, have been developed. Focusing on the problem of clustering the publications in a citation network, we present a systematic comparison of the performance of a large number of these clustering methods. Using a number of different citation networks, some of them relatively small and others very large, we extensively study the statistical properties of the results provided by different methods. In addition, we also carry out an expert-based assessment of the results produced by different methods. The expert-based assessment focuses on publications in the field of scientometrics. Our findings seem to indicate that there is a trade-off between different properties that may be considered desirable for a good clustering of publications. Overall, map equation methods appear to perform best in our analysis, suggesting that these methods deserve more attention from the bibliometric community.

Submitted 29 April, 2016; v1 submitted 30 December, 2015; originally announced December 2015.

Comments: 24 pages, 7 figures, 7 tables

Journal ref: PLoS ONE 11(4), e0154404 (2016)

arXiv:1507.03314 [pdf]

Evaluation of the citation matching algorithms of CWTS and iFQ in comparison to Web of Science

Authors: Marlies Olensky, Marion Schmidt, Nees Jan van Eck

Abstract: The results of bibliometric studies provided by bibliometric research groups, e.g. the Centre for Science and Technology Studies (CWTS) and the Institute for Research Information and Quality Assurance (iFQ), are often used in the process of research assessment. Their databases use Web of Science (WoS) citation data, which they match according to their own matching algorithms - in the case of CWTS for standard usage in their studies and in the case of iFQ on an experimental basis. Since the problem of non-matched citations in WoS persists because of inaccuracies in the references or inaccuracies introduced in the data extraction process, it is important to ascertain how well these inaccuracies are rectified in these citation matching algorithms. This paper evaluates the algorithms of CWTS and iFQ in comparison to WoS in a quantitative and a qualitative analysis. The analysis builds upon the methodology and the manually verified corpus of a previous study. The algorithm of CWTS performs best, closely followed by that of iFQ. The WoS algorithm still performs quite well (F1 score: 96.41 percent), but shows deficits in matching references containing inaccuracies. An additional problem is posed by incorrectly provided cited reference information in source articles by WoS.

Submitted 12 July, 2015; originally announced July 2015.

Comments: 28 pages, 7 tables, 5 figures. The paper is accepted for publication in the Journal of the Association for Information Science and Technology (JASIST)

arXiv:1501.04431 [pdf]

Field-normalized citation impact indicators and the choice of an appropriate counting method

Authors: Ludo Waltman, Nees Jan van Eck

Abstract: Bibliometric studies often rely on field-normalized citation impact indicators in order to make comparisons between scientific fields. We discuss the connection between field normalization and the choice of a counting method for handling publications with multiple co-authors. Our focus is on the choice between full counting and fractional counting. Based on an extensive theoretical and empirical analysis, we argue that properly field-normalized results cannot be obtained when full counting is used. Fractional counting does provide results that are properly field normalized. We therefore recommend the use of fractional counting in bibliometric studies that require field normalization, especially in studies at the level of countries and research organizations. We also compare different variants of fractional counting. In general, it seems best to use either the author-level or the address-level variant of fractional counting.

Submitted 19 January, 2015; originally announced January 2015.
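
To make the distinction in the abstract above concrete, here is a minimal sketch of full counting versus an author-level variant of fractional counting, aggregated by country. The data layout (one list of author countries per publication), the function names, and the sample records are assumptions made for illustration; they are not taken from the paper, which also covers field normalization and other fractional variants not shown here.

```python
from collections import defaultdict
from fractions import Fraction


def full_counts(publications):
    """Full counting: every country on a publication gets a credit of 1."""
    totals = defaultdict(int)
    for author_countries in publications:
        for country in set(author_countries):
            totals[country] += 1
    return dict(totals)


def fractional_counts(publications):
    """Author-level fractional counting: each author gets 1/n of a publication."""
    totals = defaultdict(Fraction)
    for author_countries in publications:
        share = Fraction(1, len(author_countries))
        for country in author_countries:
            totals[country] += share
    return dict(totals)


# Hypothetical sample: one three-author publication and one single-author publication.
publications = [["NL", "NL", "US"], ["US"]]

print(full_counts(publications))
print(fractional_counts(publications))
```

With full counting the credits sum to more than the number of publications, because a co-authored publication is counted once per country. With fractional counting the credits always sum to the number of publications.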

arXiv:1404.5322 [pdf]

CitNetExplorer: A new software tool for analyzing and visualizing citation networks

Authors: Nees Jan van Eck, Ludo Waltman

Abstract: We present CitNetExplorer, a new software tool for analyzing and visualizing citation networks of scientific publications. CitNetExplorer can for instance be used to study the development of a research field, to delineate the literature on a research topic, and to support literature reviewing. We first introduce the main concepts that need to be understood when working with CitNetExplorer. We then demonstrate CitNetExplorer by using the tool to analyze the scientometric literature and the literature on community detection in networks. Finally, we discuss some technical details on the construction, visualization, and analysis of citation networks in CitNetExplorer.

Submitted 21 April, 2014; originally announced April 2014.

arXiv:1308.6604 [pdf]

doi 10.1140/epjb/e2013-40829-0

A smart local moving algorithm for large-scale modularity-based community detection

Authors: Ludo Waltman, Nees Jan van Eck

Abstract: We introduce a new algorithm for modularity-based community detection in large networks. The algorithm, which we refer to as a smart local moving algorithm, takes advantage of a well-known local moving heuristic that is also used by other algorithms. Compared with these other algorithms, our proposed algorithm uses the local moving heuristic in a more sophisticated way. Based on an analysis of a diverse set of networks, we show that our smart local moving algorithm identifies community structures with higher modularity values than other algorithms for large-scale modularity optimization, among which the popular 'Louvain algorithm' introduced by Blondel et al. (2008). The computational efficiency of our algorithm makes it possible to perform community detection in networks with tens of millions of nodes and hundreds of millions of edges. Our smart local moving algorithm also performs well in small and medium-sized networks. In short computing times, it identifies community structures with modularity values equally high as, or almost as high as, the highest values reported in the literature, and sometimes even higher than the highest values found in the literature.

Submitted 29 August, 2013; originally announced August 2013.

arXiv:1301.4941 [pdf]

A systematic empirical comparison of different approaches for normalizing citation impact indicators

Authors: Ludo Waltman, Nees Jan van Eck

Abstract: We address the question how citation-based bibliometric indicators can best be normalized to ensure fair comparisons between publications from different scientific fields and different years. In a systematic large-scale empirical analysis, we compare a traditional normalization approach based on a field classification system with three source normalization approaches. We pay special attention to the selection of the publications included in the analysis. Publications in national scientific journals, popular scientific magazines, and trade magazines are not included. Unlike earlier studies, we use algorithmically constructed classification systems to evaluate the different normalization approaches. Our analysis shows that a source normalization approach based on the recently introduced idea of fractional citation counting does not perform well. Two other source normalization approaches generally outperform the classification-system-based normalization approach that we study. Our analysis therefore offers considerable support for the use of source-normalized bibliometric indicators.

Submitted 2 April, 2013; v1 submitted 21 January, 2013; originally announced January 2013.

arXiv:1301.4597 [pdf]

Counting publications and citations: Is more always better?

Authors: Ludo Waltman, Nees Jan van Eck, Paul Wouters

Abstract: Is more always better? We address this question in the context of bibliometric indices that aim to assess the scientific impact of individual researchers by counting their number of highly cited publications. We propose a simple model in which the number of citations of a publication depends not only on the scientific impact of the publication but also on other 'random' factors. Our model indicates that more need not always be better. It turns out that the most influential researchers may have a systematically lower performance, in terms of highly cited publications, than some of their less influential colleagues. The model also suggests an improved way of counting highly cited publications.

Submitted 2 April, 2013; v1 submitted 19 January, 2013; originally announced January 2013.

arXiv:1210.0442 [pdf]

doi 10.1371/journal.pone.0062395

Citation analysis may severely underestimate the impact of clinical research as compared to basic research

Authors: Nees Jan van Eck, Ludo Waltman, Anthony F. J. van Raan, Robert J. M. Klautz, Wilco C. Peul

Abstract: Background: Citation analysis has become an important tool for research performance assessment in the medical sciences. However, different areas of medical research may have considerably different citation practices, even within the same medical field. Because of this, it is unclear to what extent citation-based bibliometric indicators allow for valid comparisons between research units active in different areas of medical research. Methodology: A visualization methodology is introduced that reveals differences in citation practices between medical research areas. The methodology extracts terms from the titles and abstracts of a large collection of publications and uses these terms to visualize the structure of a medical field and to indicate how research areas within this field differ from each other in their average citation impact. Results: Visualizations are provided for 32 medical fields, defined based on journal subject categories in the Web of Science database. The analysis focuses on three fields. In each of these fields, there turn out to be large differences in citation practices between research areas. Low-impact research areas tend to focus on clinical intervention research, while high-impact research areas are often more oriented on basic and diagnostic research. Conclusions: Popular bibliometric indicators, such as the h-index and the impact factor, do not correct for differences in citation practices between medical fields. These indicators therefore cannot be used to make accurate between-field comparisons. More sophisticated bibliometric indicators do correct for field differences but still fail to take into account within-field heterogeneity in citation practices. As a consequence, the citation impact of clinical intervention research may be substantially underestimated in comparison with basic and diagnostic research.

Submitted 28 April, 2013; v1 submitted 1 October, 2012; originally announced October 2012.

arXiv:1209.0785 [pdf]

Some modifications to the SNIP journal impact indicator

Authors: Ludo Waltman, Nees Jan van Eck, Thed N. van Leeuwen, Martijn S. Visser

Abstract: The SNIP (source normalized impact per paper) indicator is an indicator of the citation impact of scientific journals. The indicator, introduced by Henk Moed in 2010, is included in Elsevier's Scopus database. The SNIP indicator uses a source normalized approach to correct for differences in citation practices between scientific fields. The strength of this approach is that it does not require a field classification system in which the boundaries of fields are explicitly defined. In this paper, a number of modifications that will be made to the SNIP indicator are explained, and the advantages of the resulting revised SNIP indicator are pointed out. It is argued that the original SNIP indicator has some counterintuitive properties, and it is shown mathematically that the revised SNIP indicator does not have these properties. Empirically, the differences between the original SNIP indicator and the revised one turn out to be relatively small, although some systematic differences can be observed. Relations with other source normalized indicators proposed in the literature are discussed as well.

Submitted 4 September, 2012; originally announced September 2012.

arXiv:1208.6122 [pdf]

Source normalized indicators of citation impact: An overview of different approaches and an empirical comparison

Authors: Ludo Waltman, Nees Jan van Eck

Abstract: Different scientific fields have different citation practices. Citation-based bibliometric indicators need to normalize for such differences between fields in order to allow for meaningful between-field comparisons of citation impact. Traditionally, normalization for field differences has usually been done based on a field classification system. In this approach, each publication belongs to one or more fields and the citation impact of a publication is calculated relative to the other publications in the same field. Recently, the idea of source normalization was introduced, which offers an alternative approach to normalize for field differences. In this approach, normalization is done by looking at the referencing behavior of citing publications or citing journals. In this paper, we provide an overview of a number of source normalization approaches and we empirically compare these approaches with a traditional normalization approach based on a field classification system. We also pay attention to the issue of the selection of the journals to be included in a normalization for field differences. Our analysis indicates a number of problems of the traditional classification-system-based normalization approach, suggesting that source normalization approaches may yield more accurate results.

Submitted 5 September, 2012; v1 submitted 30 August, 2012; originally announced August 2012.

arXiv:1203.4194 [pdf]

Research collaboration and the expanding science grid: Measuring globalization processes worldwide

Authors: Robert J. W. Tijssen, Ludo Waltman, Nees Jan van Eck

Abstract: This paper applies a new model and analytical tool to measure and study contemporary globalization processes in collaborative science - a world in which scientists, scholars, technicians and engineers interact within a 'grid' of interconnected research sites and collaboration networks. The building blocks of our metrics are the cities where scientific research is conducted, as mentioned in author addresses on research publications. The unit of analysis is the geographical distance between those cities. In our macro-level trend analysis, covering the years 2000-2010, we observe that research collaboration distances have been increasing, while the share of collaborative contacts with foreign cities has leveled off. Collaboration distances and growth rates differ significantly between countries and between fields of science. The application of a distance metrics to compare and track these processes opens avenues for further studies, both at the meso-level and at the micro-level, into how research collaboration patterns and trends are driving and shaping the connectivity fabric of world science.

Submitted 19 March, 2012; originally announced March 2012.

arXiv:1203.0532 [pdf]

A new methodology for constructing a publication-level classification system of science

Authors: Ludo Waltman, Nees Jan van Eck

Abstract: Classifying journals or publications into research areas is an essential element of many bibliometric analyses. Classification usually takes place at the level of journals, where the Web of Science subject categories are the most popular classification system. However, journal-level classification systems have two important limitations: They offer only a limited amount of detail, and they have difficulties with multidisciplinary journals. To avoid these limitations, we introduce a new methodology for constructing classification systems at the level of individual publications. In the proposed methodology, publications are clustered into research areas based on citation relations. The methodology is able to deal with very large numbers of publications. We present an application in which a classification system is produced that includes almost ten million publications. Based on an extensive analysis of this classification system, we discuss the strengths and the limitations of the proposed methodology. Important strengths are the transparency and relative simplicity of the methodology and its fairly modest computing and memory requirements. The main limitation of the methodology is its exclusive reliance on direct citation relations between publications. The accuracy of the methodology can probably be increased by also taking into account other types of relations, for instance based on bibliographic coupling.

Submitted 2 March, 2012; originally announced March 2012.

arXiv:1202.3941 [pdf]

The Leiden Ranking 2011/2012: Data collection, indicators, and interpretation

Authors: Ludo Waltman, Clara Calero-Medina, Joost Kosten, Ed C. M. Noyons, Robert J. W. Tijssen, Nees Jan van Eck, Thed N. van Leeuwen, Anthony F. J. van Raan, Martijn S. Visser, Paul Wouters

Abstract: The Leiden Ranking 2011/2012 is a ranking of universities based on bibliometric indicators of publication output, citation impact, and scientific collaboration. The ranking includes 500 major universities from 41 different countries. This paper provides an extensive discussion of the Leiden Ranking 2011/2012. The ranking is compared with other global university rankings, in particular the Academic Ranking of World Universities (commonly known as the Shanghai Ranking) and the Times Higher Education World University Rankings. Also, a detailed description is offered of the data collection methodology of the Leiden Ranking 2011/2012 and of the indicators used in the ranking. Various innovations in the Leiden Ranking 2011/2012 are presented. These innovations include (1) an indicator based on counting a university's highly cited publications, (2) indicators based on fractional rather than full counting of collaborative publications, (3) the possibility of excluding non-English language publications, and (4) the use of stability intervals. Finally, some comments are made on the interpretation of the ranking, and a number of limitations of the ranking are pointed out.

Submitted 17 February, 2012; originally announced February 2012.

arXiv:1109.2058 [pdf]

Text mining and visualization using VOSviewer

Authors: Nees Jan van Eck, Ludo Waltman

Abstract: VOSviewer is a computer program for creating, visualizing, and exploring bibliometric maps of science. In this report, the new text mining functionality of VOSviewer is presented. A number of examples are given of applications in which VOSviewer is used for analyzing large amounts of text data.

Submitted 9 September, 2011; originally announced September 2011.

arXiv:1108.3901 [pdf]

The inconsistency of the h-index

Authors: Ludo Waltman, Nees Jan van Eck

Abstract: The h-index is a popular bibliometric indicator for assessing individual scientists. We criticize the h-index from a theoretical point of view. We argue that for the purpose of measuring the overall scientific impact of a scientist (or some other unit of analysis) the h-index behaves in a counterintuitive way. In certain cases, the mechanism used by the h-index to aggregate publication and citation statistics into a single number leads to inconsistencies in the way in which scientists are ranked. Our conclusion is that the h-index cannot be considered an appropriate indicator of a scientist's overall scientific impact. Based on recent theoretical insights, we discuss what kind of indicators can be used as an alternative to the h-index. We pay special attention to the highly cited publications indicator. This indicator has a lot in common with the h-index, but unlike the h-index it does not produce inconsistent rankings.

Submitted 19 August, 2011; originally announced August 2011.
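
The kind of ranking inconsistency described in the abstract above can be sketched with a small example. The citation counts, the scientist labels, and the threshold of 10 citations are hypothetical values chosen for illustration; they are not taken from the paper. The h-index itself is computed by its standard definition: the largest h such that h publications have at least h citations each.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def highly_cited(citations, threshold=10):
    """Number of publications with at least `threshold` citations (assumed cutoff)."""
    return sum(1 for count in citations if count >= threshold)


# Hypothetical scientists: A has three modestly cited papers, B has two well-cited ones.
a_before, b_before = [3, 3, 3], [10, 10]

# Both achieve exactly the same improvement: two new papers with 10 citations each.
gain = [10, 10]
a_after, b_after = a_before + gain, b_before + gain

print(h_index(a_before), h_index(b_before))  # A ranks above B
print(h_index(a_after), h_index(b_after))    # B now ranks above A
print(highly_cited(a_after) - highly_cited(a_before),
      highly_cited(b_after) - highly_cited(b_before))
```

In this sketch the h-index ranking of A and B reverses even though both made an identical gain, while the count of highly cited publications rises by the same amount for both and so preserves their order.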

arXiv:1105.5316 [pdf]

On the correlation between bibliometric indicators and peer review: Reply to Opthof and Leydesdorff

Authors: Ludo Waltman, Nees Jan van Eck, Thed N. van Leeuwen, Martijn S. Visser, Anthony F. J. van Raan

Abstract: Opthof and Leydesdorff [arXiv:1102.2569] reanalyze data reported by Van Raan [arXiv:physics/0511206] and conclude that there is no significant correlation between on the one hand average citation scores measured using the CPP/FCSm indicator and on the other hand the quality judgment of peers. We point out that Opthof and Leydesdorff draw their conclusions based on a very limited amount of data. We also criticize the statistical methodology used by Opthof and Leydesdorff. Using a larger amount of data and a more appropriate statistical methodology, we do find a significant correlation between the CPP/FCSm indicator and peer judgment.

Submitted 26 May, 2011; originally announced May 2011.

arXiv:1105.3212 [pdf]

A recursive field-normalized bibliometric performance indicator: An application to the field of library and information science

Authors: Ludo Waltman, Erjia Yan, Nees Jan van Eck

Abstract: Two commonly used ideas in the development of citation-based research performance indicators are the idea of normalizing citation counts based on a field classification scheme and the idea of recursive citation weighing (like in PageRank-inspired indicators). We combine these two ideas in a single indicator, referred to as the recursive mean normalized citation score indicator, and we study the validity of this indicator. Our empirical analysis shows that the proposed indicator is highly sensitive to the field classification scheme that is used. The indicator also has a strong tendency to reinforce biases caused by the classification scheme. Based on these observations, we advise against the use of indicators in which the idea of normalization based on a field classification scheme and the idea of recursive citation weighing are combined.

Submitted 16 May, 2011; originally announced May 2011.

arXiv:1105.2934 [pdf]

Universality of citation distributions revisited

Authors: Ludo Waltman, Nees Jan van Eck, Anthony F. J. van Raan

Abstract: Radicchi, Fortunato, and Castellano [arXiv:0806.0974, PNAS 105(45), 17268] claim that, apart from a scaling factor, all fields of science are characterized by the same citation distribution. We present a large-scale validation study of this universality-of-citation-distributions claim. Our analysis shows that claiming citation distributions to be universal for all fields of science is not warranted. Although many fields indeed seem to have fairly similar citation distributions, there are quite some exceptions as well. We also briefly discuss the consequences of our findings for the measurement of scientific impact using citation-based bibliometric indicators.

Submitted 30 August, 2011; v1 submitted 15 May, 2011; originally announced May 2011.

arXiv:1103.3648 [pdf]

Globalisation of science in kilometres

Authors: Ludo Waltman, Robert J. W. Tijssen, Nees Jan van Eck

Abstract: The ongoing globalisation of science has undisputedly a major impact on how and where scientific research is being conducted nowadays. Yet, the big picture remains blurred. It is largely unknown where this process is heading, and at which rate. Which countries are leading or lagging? Many of its key features are difficult if not impossible to capture in measurements and comparative statistics. Our empirical study measures the extent and growth of scientific globalisation in terms of physical distances between co-authoring researchers. Our analysis, drawing on 21 million research publications across all countries and fields of science, reveals that contemporary science has globalised at a fairly steady rate during recent decades. The average collaboration distance per publication has increased from 334 kilometres in 1980 to 1553 in 2009. Despite significant differences in globalisation rates across countries and fields of science, we observe a pervasive process in motion, moving towards a truly interconnected global science system.

Submitted 13 May, 2011; v1 submitted 18 March, 2011; originally announced March 2011.

arXiv:1006.1032 [pdf, other]

A unified approach to mapping and clustering of bibliometric networks

Authors: Ludo Waltman, Nees Jan van Eck, Ed C. M. Noyons

Abstract: In the analysis of bibliometric networks, researchers often use mapping and clustering techniques in a combined fashion. Typically, however, mapping and clustering techniques that are used together rely on very different ideas and assumptions. We propose a unified approach to mapping and clustering of bibliometric networks. We show that the VOS mapping technique and a weighted and parameterized variant of modularity-based clustering can both be derived from the same underlying principle. We illustrate our proposed approach by producing a combined mapping and clustering of the most frequently cited publications that appeared in the field of information science in the period 1999-2008.

Submitted 5 June, 2010; originally announced June 2010.

arXiv:1004.1632 [pdf, other]

Towards a new crown indicator: An empirical analysis

Authors: Ludo Waltman, Nees Jan van Eck, Thed N. van Leeuwen, Martijn S. Visser, Anthony F. J. van Raan

Abstract: We present an empirical comparison between two normalization mechanisms for citation-based indicators of research performance. These mechanisms aim to normalize citation counts for the field and the year in which a publication was published. One mechanism is applied in the current so-called crown indicator of our institute. The other mechanism is applied in the new crown indicator that our institute is planning to adopt. We find that at high aggregation levels, such as at the level of large research institutions or at the level of countries, the differences between the two mechanisms are very small. At lower aggregation levels, such as at the level of research groups or at the level of journals, the differences between the two mechanisms are somewhat larger. We pay special attention to the way in which recent publications are handled. These publications typically have very low citation counts and should therefore be handled with special care.

Submitted 7 September, 2010; v1 submitted 9 April, 2010; originally announced April 2010.

arXiv:1003.2551 [pdf, other]

A comparison of two techniques for bibliometric mapping: Multidimensional scaling and VOS

Authors: Nees Jan van Eck, Ludo Waltman, Rommert Dekker, Jan van den Berg

Abstract: VOS is a new mapping technique that can serve as an alternative to the well-known technique of multidimensional scaling. We present an extensive comparison between the use of multidimensional scaling and the use of VOS for constructing bibliometric maps. In our theoretical analysis, we show the mathematical relation between the two techniques. In our experimental analysis, we use the techniques for constructing maps of authors, journals, and keywords. Two commonly used approaches to bibliometric mapping, both based on multidimensional scaling, turn out to produce maps that suffer from artifacts. Maps constructed using VOS turn out not to have this problem. We conclude that in general maps constructed using VOS provide a more satisfactory representation of a data set than maps constructed using well-known multidimensional scaling approaches.

Submitted 12 March, 2010; originally announced March 2010.

arXiv:1003.2198 [pdf, other]

The relation between Eigenfactor, audience factor, and influence weight

Authors: Ludo Waltman, Nees Jan van Eck

Abstract: We present a theoretical and empirical analysis of a number of bibliometric indicators of journal performance. We focus on three indicators in particular, namely the Eigenfactor indicator, the audience factor, and the influence weight indicator. Our main finding is that the last two indicators can be regarded as a kind of special cases of the first indicator. We also find that the three indicators can be nicely characterized in terms of two properties. We refer to these properties as the property of insensitivity to field differences and the property of insensitivity to insignificant journals. The empirical results that we present illustrate our theoretical findings. We also show empirically that the differences between various indicators of journal performance are quite substantial.

Submitted 10 March, 2010; originally announced March 2010.

arXiv:1003.2167 [pdf, other]

Towards a new crown indicator: Some theoretical considerations

Authors: Ludo Waltman, Nees Jan van Eck, Thed N. van Leeuwen, Martijn S. Visser, Anthony F. J. van Raan

Abstract: The crown indicator is a well-known bibliometric indicator of research performance developed by our institute. The indicator aims to normalize citation counts for differences among fields. We critically examine the theoretical basis of the normalization mechanism applied in the crown indicator. We also make a comparison with an alternative normalization mechanism. The alternative mechanism turns out to have more satisfactory properties than the mechanism applied in the crown indicator. In particular, the alternative mechanism has a so-called consistency property. The mechanism applied in the crown indicator lacks this important property. As a consequence of our findings, we are currently moving towards a new crown indicator, which relies on the alternative normalization mechanism.

Submitted 16 August, 2010; v1 submitted 10 March, 2010; originally announced March 2010.

arXiv:1003.2113 [pdf, other]

Rivals for the crown: Reply to Opthof and Leydesdorff

Authors: Anthony F. J. van Raan, Thed N. van Leeuwen, Martijn S. Visser, Nees Jan van Eck, Ludo Waltman

Abstract: We reply to the criticism of Opthof and Leydesdorff [arXiv:1002.2769] on the way in which our institute applies journal and field normalizations to citation counts. We point out why we believe most of the criticism is unjustified, but we also indicate where we think Opthof and Leydesdorff raise a valid point.

Submitted 10 March, 2010; originally announced March 2010.

Showing 1–42 of 42 results for author: van Eck, N J