
Network of scientific concepts: empirical analysis and modeling

Authors: Vasyl Palchykov, Mariana Krasnytska, Olesya Mryglod, Yurij Holovatch

Abstract: Concepts in a certain domain of science are linked via intrinsic connections reflecting the structure of knowledge. To get a qualitative insight and a quantitative description of this structure, we perform empirical analysis and modeling of the network of scientific concepts in the domain of physics. To this end we use a collection of manuscripts submitted to the e-print repository arXiv and the vocabulary of scientific concepts collected via the ScienceWISE.info platform and construct a network of scientific concepts based on their co-occurrences in publications. The resulting complex network possesses a number of specific features (high node density, disassortativity, structural correlations, skewed node degree distribution) that cannot be understood as a result of simple growth by several commonly used network models. We show that the model based on a simultaneous account of two factors, growth by blocks and preferential selection, gives an explanation of empirically observed properties of the concepts network.

Submitted 9 August, 2021; originally announced August 2021.

arXiv:2003.10289 [pdf, other]

Embedding technique and network analysis of scientific innovations emergence in an arXiv-based concept network

Authors: Serhii Brodiuk, Vasyl Palchykov, Yurij Holovatch

Abstract: Novelty is an inherent part of innovations and discoveries. Such processes may be considered as an appearance of new ideas or as an emergence of atypical connections between the existing ones. The importance of such connections suggests investigating innovations through a network or graph representation in the space of ideas. In such a representation, a graph node corresponds to the relevant concept (idea), whereas an edge between two nodes means that the corresponding concepts have been used in a common context. In this study we address the question of whether it is possible to identify the edges between existing concepts where the innovations may emerge. To this end, we use a well-documented scientific knowledge landscape of 1.2M arXiv.org manuscripts dated from April 2007 until September 2019. We extract relevant concepts for them using the ScienceWISE.info platform. Combining approaches developed in complex networks science and graph embedding, we discuss the predictability of edges (links) on the scientific knowledge landscape where the innovations may appear.

Submitted 23 March, 2020; originally announced March 2020.

Comments: 6 pages, 1 figure, submitted to IEEE Third International Conference Data Stream Mining & Processing (Dsmp2020)

arXiv:1806.04406 [pdf, other]

Bipartite graph analysis as an alternative to reveal clusterization in complex systems

Authors: Vasyl Palchykov, Yurij Holovatch

Abstract: We demonstrate how analysis of co-clustering in bipartite networks may be used as a bridge to connect, compare and complement clustering results about community structure in two different spaces: single-mode bipartite network projections. As a case study we consider scientific knowledge, which is represented as a complex bipartite network of articles and related concepts. Connecting clusters of articles and clusters of concepts via article-to-concept bipartite co-clustering, we demonstrate how concept features (e.g. subject classes) may be inferred from the article ones.

Submitted 12 June, 2018; originally announced June 2018.

Comments: 5 pages, 2 figures, submitted to IEEE Second International Conference Data Stream Mining & Processing (Dsmp2018)

Journal ref: IEEE Second International Conference on Data Stream Mining & Processing (Dsmp2018) IEEE Catalog Number: CFP18J13-POD, ISBN: 978-1-5386-2875-1 (2018) pp. 84-87

arXiv:1612.07636 [pdf, other]

ScienceWISE: Topic Modeling over Scientific Literature Networks

Authors: Andrea Martini, Artem Lutov, Valerio Gemmetto, Andrii Magalich, Alessio Cardillo, Alex Constantin, Vasyl Palchykov, Mourad Khayati, Philippe Cudré-Mauroux, Alexey Boyarsky, Oleg Ruchayskiy, Diego Garlaschelli, Paolo De Los Rios, Karl Aberer

Abstract: We provide an up-to-date view on the knowledge management system ScienceWISE (SW) and address issues related to the automatic assignment of articles to research topics. So far, SW has been proven to be an effective platform for managing large volumes of technical articles by means of ontological concept-based browsing. However, as the publication of research articles accelerates, the expressivity and the richness of the SW ontology turn into a double-edged sword: a more fine-grained characterization of articles is possible, but at the cost of introducing more spurious relations among them. In this context, the challenge of continuously recommending relevant articles to users lies in tackling a network partitioning problem, where nodes represent articles and co-occurring concepts create edges between them. In this paper, we discuss the three research directions we have taken for solving this issue: i) the identification of generic concepts to reinforce inter-article similarities; ii) the adoption of a bipartite network representation to improve scalability; iii) the design of a clustering algorithm to identify concepts for cross-disciplinary articles and obtain fine-grained topics for all articles.

Submitted 22 December, 2016; originally announced December 2016.

Comments: 6 pages; 5 figures

arXiv:1602.08451 [pdf, other]

doi 10.1140/epjds/s13688-016-0090-4

Ground truth? Concept-based communities versus the external classification of physics manuscripts

Authors: Vasyl Palchykov, Valerio Gemmetto, Alexey Boyarsky, Diego Garlaschelli

Abstract: Community detection techniques are widely used to infer hidden structures within interconnected systems. Despite demonstrating high accuracy on benchmarks, they reproduce the external classification for many real-world systems with a significant level of discrepancy. A widely accepted reason behind such an outcome is the unavoidable loss of non-topological information (such as node attributes) encountered when the original complex system is represented as a network. In this article we emphasize that the observed discrepancies may also be caused by a different reason: the external classification itself. To this end we use scientific publication data which i) exhibit a well-defined modular structure and ii) hold an expert-made classification of research articles. Having represented the articles and the extracted scientific concepts both as a bipartite network and as its unipartite projection, we applied modularity optimization to uncover the inner thematic structure. The resulting clusters are shown to partly reflect the author-made classification, although some significant discrepancies are observed. A detailed analysis of these discrepancies shows that they carry essential information about the system, mainly related to the use of similar techniques and methods across different (sub)disciplines, that is otherwise omitted when only the external classification is considered.

Submitted 6 February, 2016; originally announced February 2016.

Comments: 15 pages, 2 figures

Journal ref: EPJ Data Science 2016 5:28

arXiv:1602.04853 [pdf, other]

Complex Networks of Words in Fables

Authors: Yurij Holovatch, Vasyl Palchykov

Abstract: In this chapter we give an overview of the application of complex network theory to quantify some properties of language. Our study is based on two fables in Ukrainian, Mykyta the Fox and Abu-Kasym's slippers. It consists of two parts: the analysis of frequency-rank distributions of words and the application of complex-network theory. The first part shows that the text sizes are sufficiently large to observe statistical properties. This supports their selection for the analysis of typical properties of the language networks in the second part of the chapter. In describing language as a complex network, while words are usually associated with nodes, there is more variability in the choice of links and different representations result in different networks. Here, we examine a number of such representations of the language network and perform a comparative analysis of their characteristics. Our results suggest that, irrespective of link representation, the Ukrainian language network used in the selected fables is a strongly correlated, scale-free, small world. We discuss how such empirical approaches may help form a useful basis for a theoretical description of language evolution and how they may be used in analyses of other textual narratives.

Submitted 4 February, 2016; originally announced February 2016.

Comments: 16 pages, 4 figures and 2 tables. To appear in: "Maths Meets Myths: Complexity-science approaches to folktales, myths, sagas, and histories." Editors: R. Kenna, M. Mac Carron, P. Mac Carron. (Springer, 2016)

arXiv:1405.6009 [pdf, other]

doi 10.5488/CMP.17.33802

Transmission of cultural traits in layered ego-centric networks

Authors: Vasyl Palchykov, Kimmo Kaski, Janos Kertész

Abstract: Although a number of models have been developed to investigate the emergence of culture and evolutionary phases in social systems, one important aspect has not yet been sufficiently emphasized. This is the structure of the underlying network of social relations serving as channels in transmitting cultural traits, which is expected to play a crucial role in the evolutionary processes in social systems. In this paper we contribute to the understanding of the role of the network structure by developing a model based on a layered ego-centric network structure, inspired by the social brain hypothesis, to study the transmission of cultural traits and their evolution in social networks. For this model we first find analytical results in the spirit of the mean-field approximation and then validate them by comparison with the results of extensive numerical simulations.

Submitted 19 November, 2014; v1 submitted 23 May, 2014; originally announced May 2014.

Comments: 10 pages, 2 figures

Journal ref: Condens. Matter Phys., 2014, Vol. 17, No. 3, 33802

arXiv:1403.3785 [pdf, other]

doi 10.1088/1367-2630/16/8/083038

Statistically validated mobile communication networks: Evolution of motifs in European and Chinese data

Authors: Ming-Xia Li, Vasyl Palchykov, Zhi-Qiang Jiang, Kimmo Kaski, Janos Kertész, Salvatore Miccichè, Michele Tumminello, Wei-Xing Zhou, Rosario N. Mantegna

Abstract: Big data open up unprecedented opportunities to investigate complex systems, including society. In particular, communication data serve as major sources for computational social sciences but they have to be cleaned and filtered as they may contain spurious information due to recording errors as well as interactions, like commercial and marketing activities, not directly related to the social network. The network constructed from communication data can only be considered as a proxy for the network of social relationships. Here we apply a systematic method, based on multiple hypothesis testing, to statistically validate the links and then construct the corresponding Bonferroni network, generalized to the directed case. We study two large datasets of mobile phone records, one from Europe and the other from China. For both datasets we compare the raw data networks with the corresponding Bonferroni networks and point out significant differences in the structures and in the basic network measures. We show evidence that the Bonferroni network provides a better proxy for the network of social interactions than the original one. By using the filtered networks we investigated the statistics and temporal evolution of small directed 3-motifs and conclude that closed communication triads have a formation time-scale that is quite fast and typically intraday. We also find that open communication triads preferentially evolve to other open triads with a higher fraction of reciprocated calls. These stylized facts were observed for both datasets.

Submitted 15 March, 2014; originally announced March 2014.

Comments: 19 pages, 8 figures, 5 tables

Journal ref: New J. Phys. 16 (2014) 083038

arXiv:1201.5722 [pdf]

doi 10.1038/srep00370

Sex differences in intimate relationships

Authors: Vasyl Palchykov, Kimmo Kaski, János Kertész, Albert-László Barabási, Robin I. M. Dunbar

Abstract: Social networks have turned out to be of fundamental importance both for our understanding of human sociality and for the design of digital communication technology. However, social networks are themselves based on dyadic relationships and we have little understanding of the dynamics of close relationships and how these change over time. Evolutionary theory suggests that, even in monogamous mating systems, the pattern of investment in close relationships should vary across the lifespan when post-weaning investment plays an important role in maximising fitness. Mobile phone data sets provide us with a unique window into the structure of relationships and the way these change across the lifespan. We here use data from a large national mobile phone dataset to demonstrate striking sex differences in the pattern in the gender-bias of preferred relationships that reflect the way the reproductive investment strategies of the two sexes change across the lifespan: these differences mainly reflect women's shifting patterns of investment in reproduction and parental care. These results suggest that human social strategies may have more complex dynamics than we have tended to assume and a life-history perspective may be crucial for understanding them.

Submitted 25 April, 2012; v1 submitted 27 January, 2012; originally announced January 2012.

Comments: 5 pages, 3 figures, contains electronic supplementary material

Journal ref: Sci. Rep. 2, 370 (2012)

Showing 1–9 of 9 results for author: Palchykov, V