doi 10.1103/PhysRevE.108.044137

Two-step estimators of high dimensional correlation matrices

Authors: Andrés García-Medina, Salvatore Miccichè, Rosario N. Mantegna

Abstract: We investigate block diagonal and hierarchical nested stochastic multivariate Gaussian models by studying their sample cross-correlation matrix on high dimensions. By performing numerical simulations, we compare a filtered sample cross-correlation with the population cross-correlation matrices by using several rotationally invariant estimators (RIE) and hierarchical clustering estimators (HCE) under several loss functions. We show that at large but finite sample size, sample cross-correlation filtered by RIE estimators are often outperformed by HCE estimators for several of the loss functions. We also show that for block models and for hierarchically nested block models the best determination of the filtered sample cross-correlation is achieved by introducing two-step estimators combining state-of-the-art non-linear shrinkage models with hierarchical clustering estimators.

Submitted 10 October, 2023; v1 submitted 30 December, 2022; originally announced December 2022.

Comments: 14 pages, 6 figures, 6 tables

arXiv:2209.12712 [pdf, other]

Identifying maximal sets of significantly interacting nodes in higher-order networks

Authors: Federico Musciotto, Federico Battiston, Rosario N. Mantegna

Abstract: We introduce a method for the detection of Statistically Validated Simplices in higher-order networks. Statistically validated simplices represent the maximal sets of nodes of any size that consistently interact collectively and do not include co-interacting nodes that appear only occasionally. Using properly designed higher-order benchmarks, we show that our approach is highly effective in systems where the maximal sets are likely to be diluted into interactions of larger sizes that include occasional participants. By applying our method to two real world datasets, we also show how it allows to detect simplices whose nodes are characterized by significant levels of similarity, providing new insights on the generative processes of real world higher-order networks.

Submitted 26 September, 2022; originally announced September 2022.

Comments: presubmission version, 10 pages and 5 figures

arXiv:2111.07144 [pdf, other]

Quantifying the relationship between specialisation and reputation in an online platform

Authors: Giacomo Livan, Giuseppe Pappalardo, Rosario N. Mantegna

Abstract: Online platforms experience a tension between decentralisation and incentives to steer user behaviour, which are usually implemented through digital reputation systems. We provide a statistical characterisation of the user behaviour emerging from the interplay of such competing forces in Stack Overflow, a long-standing knowledge sharing platform. Over the 11 years covered by our analysis, we find that the platform's user base consistently self-organises into specialists and generalists, i.e., users who focus their activity on narrow and broad sets of topics, respectively. We relate the emergence of these behaviours to the platform's reputation system with a series of data-driven models, and find specialisation to be statistically associated with a higher ability to post the best answers to a question. Our findings are in stark contrast with observations made in top-down environments - such as firms and corporations - where generalist skills are consistently found to be more successful.

Submitted 13 November, 2021; originally announced November 2021.

arXiv:2103.16484 [pdf, other]

Detecting informative higher-order interactions in statistically validated hypergraphs

Authors: Federico Musciotto, Federico Battiston, Rosario N. Mantegna

Abstract: Recent empirical evidence has shown that in many real-world systems, successfully represented as networks, interactions are not limited to dyads, but often involve three or more agents at a time. These data are better described by hypergraphs, where hyperlinks encode higher-order interactions among a group of nodes. In spite of the large number of works on networks, highlighting informative hyperlinks in hypergraphs obtained from real world data is still an open problem. Here we propose an analytic approach to filter hypergraphs by identifying those hyperlinks that are over-expressed with respect to a random null hypothesis, and represent the most relevant higher-order connections. We apply our method to a class of synthetic benchmarks and to several datasets. For all cases, the method highlights hyperlinks that are more informative than those extracted with pairwise approaches. Our method provides a first way to obtain statistically validated hypergraphs, separating informative connections from redundant and noisy ones.

Submitted 31 March, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

Comments: pre-submission version, 10 pages and 3 figures + SI

arXiv:2007.07166 [pdf, other]

doi 10.1063/5.0004487

Dynamics of fintech terms in news and blogs and specialization of companies of the fintech industry

Authors: Fabio Ciulla, Rosario N. Mantegna

Abstract: We perform a large scale analysis of a list of fintech terms in (i) news and blogs in English language and (ii) professional descriptions of companies operating in many countries. The occurrence and co-occurrence of fintech terms and locutions shows a progressive evolution of the list of fintech terms in a compact and coherent set of terms used worldwide to describe fintech business activities. By using methods of complex networks that are specifically designed to deal with heterogeneous systems, our analysis of a large set of professional descriptions of companies shows that companies having fintech terms in their description present over-expressions of specific attributes of country, municipality, and economic sector. By using the approach of statistically validated networks, we detect geographical and economic over-expressions of a set of companies related to the multi-industry, geographically and economically distributed fintech movement.

Submitted 14 July, 2020; originally announced July 2020.

Comments: 11 pages, 4 figures, 4 tables

arXiv:1906.06908 [pdf, other]

doi 10.1016/j.physa.2022.126933

Nested partitions from hierarchical clustering statistical validation

Authors: Christian Bongiorno, Salvatore Miccichè, Rosario N. Mantegna

Abstract: We develop a greedy algorithm that is fast and scalable in the detection of a nested partition extracted from a dendrogram obtained from hierarchical clustering of a multivariate series. Our algorithm provides a $p$-value for each clade observed in the hierarchical tree. The $p$-value is obtained by computing a number of bootstrap replicas of the dissimilarity matrix and by performing a statistical test on each difference between the dissimilarity associated with a given clade and the dissimilarity of the clade of its parent node. We prove the efficacy of our algorithm with a set of benchmarks generated by using a hierarchical factor model. We compare the results obtained by our algorithm with those of Pvclust. Pvclust is a widely used algorithm developed with a global approach originally motivated by phylogenetic studies. In our numerical experiments we focus on the role of multiple hypothesis test correction and on the robustness of the algorithms to inaccuracy and errors of datasets. We also apply our algorithm to a reference empirical dataset. We verify that our algorithm is much faster than Pvclust algorithm and has a better scalability both in the number of elements and in the number of records of the investigated multivariate set. Our algorithm provides a hierarchically nested partition in much shorter time than currently widely used algorithms allowing to perform a statistically validated cluster analysis detection in very large systems.

Submitted 17 June, 2019; originally announced June 2019.

MSC Class: 92Dxx ACM Class: I.5.3

arXiv:1902.07074 [pdf, other]

doi 10.3254/190007

A primer on statistically validated networks

Authors: Salvatore Miccichè, Rosario Nunzio Mantegna

Abstract: In this contribution we discuss some approaches of network analysis providing information about single links or single nodes with respect to a null hypothesis taking into account the heterogeneity of the system empirically observed. With this approach, a selection of nodes and links is feasible when the null hypothesis is statistically rejected. We focus our discussion on approaches using (i) the so-called disparity filter and (ii) statistically validated network in bipartite networks. For both methods we discuss the importance of using multiple hypothesis test correction. Specific applications of statistically validated networks are discussed. We also discuss how statistically validated networks can be used to (i) pre-process large sets of data and (ii) detect cores of communities that are forming the most close-knit and stable subsets of clusters of nodes present in a complex system.

Submitted 19 February, 2019; originally announced February 2019.

Comments: 13 pages, 2 figures. Lecture notes for the International School of Physics "Enrico Fermi" Computational Social Science and Complex Systems 16-21 July 2018

Journal ref: Proceedings of the International School of Physics "Enrico Fermi", Volume 203: Computational Social Science and Complex Systems, (2019)

arXiv:1802.03395 [pdf, other]

doi 10.1016/j.physa.2018.08.020

Bootstrap validation of links of a minimum spanning tree

Authors: Federico Musciotto, Luca Marotta, Salvatore Miccichè, Rosario N. Mantegna

Abstract: We describe two different bootstrap methods applied to the detection of a minimum spanning tree obtained from a set of multivariate variables. We show that two different bootstrap procedures provide partly distinct information that can be highly informative about the investigated complex system. Our case study, based on the investigation of daily returns of a portfolio of stocks traded in the US equity markets, shows the degree of robustness and completeness of the information extracted with popular information filtering methods such as the minimum spanning tree and the planar maximally filtered graph. The first method performs a "row bootstrap" whereas the second method performs a "pair bootstrap". We show that the parallel use of the two methods is suggested especially for complex systems presenting both a nested hierarchical organization together with the presence of global feedback channels.

Submitted 9 February, 2018; originally announced February 2018.

Comments: 17 pages, 7 figures

Journal ref: Physica A, 512, 1032-1043, (2018)

arXiv:1802.01113 [pdf, other]

On the interplay between multiscaling and stocks dependence

Authors: R. J. Buonocore, G. Brandi, R. N. Mantegna, T. Di Matteo

Abstract: We find a nonlinear dependence between an indicator of the degree of multiscaling of log-price time series of a stock and the average correlation of the stock with respect to the other stocks traded in the same market. This result is a robust stylized fact holding for different financial markets. We investigate this result conditional on the stocks' capitalization and on the kurtosis of stocks' log-returns in order to search for possible confounding effects. We show that a linear dependence with the logarithm of the capitalization and the logarithm of kurtosis does not explain the observed stylized fact, which we interpret as being originated from a deeper relationship.

Submitted 31 March, 2019; v1 submitted 4 February, 2018; originally announced February 2018.

Comments: 19 pages, 8 figures, 9 tables

arXiv:1704.01524 [pdf, other]

doi 10.1103/PhysRevE.96.022321

Core of communities in bipartite networks

Authors: Christian Bongiorno, András London, Salvatore Miccichè, Rosario N. Mantegna

Abstract: We use the information present in a bipartite network to detect cores of communities of each set of the bipartite system. Cores of communities are found by investigating statistically validated projected networks obtained using information present in the bipartite network. Cores of communities are highly informative and robust with respect to the presence of errors or missing entries in the bipartite network. We assess the statistical robustness of cores by investigating an artificial benchmark network, the co-authorship network, and the actor-movie network. The accuracy and precision of the partition obtained with respect to the reference partition are measured in terms of the adjusted Rand index and of the adjusted Wallace index respectively. The detection of cores is highly precise although the accuracy of the methodology can be limited in some cases.

Submitted 6 March, 2017; originally announced April 2017.

Comments: 9 pages, 6 figures

Journal ref: Phys. Rev. E 96, 022321 (2017)

arXiv:1609.08030 [pdf, other]

doi 10.1371/journal.pone.0175036

An empirically grounded agent based model for modeling directs, conflict detection and resolution operations in Air Traffic Management

Authors: C. Bongiorno, S. Miccichè, Rosario N. Mantegna

Abstract: We present an agent based model of the Air Traffic Management socio-technical complex system that aims at modeling the interactions between aircraft and air traffic controllers at a tactical level. The core of the model is given by the conflict detection and resolution module and by the directs module. Directs are flight shortcuts that are given by air controllers to speed up the passage of an aircraft within a certain airspace and therefore to facilitate airline operations. Conflicts resolution between flight trajectories can arise during the en-route phase of each flight due to both not detailed flight trajectory planning or unforeseen events that perturb the planned flight plan. Our model performs a local conflict detection and resolution procedure. Once a flight trajectory has been made conflict-free, the model searches for possible improvements of the system efficiency by issuing directs. We give an example of model calibration based on real data. We then provide an illustration of the capability of our model in generating scenario simulations able to give insights about the air traffic management system. We show that the calibrated model is able to reproduce the existence of a geographical localization of air traffic controllers' operations. Finally, we use the model to investigate the relationship between directs and conflict resolutions (i) in the presence of perfect forecast ability of controllers, and (ii) in the presence of some degree of uncertainty in flight trajectory forecast.

Submitted 26 September, 2016; originally announced September 2016.

Comments: 18 pages, 2 tables, 12 figures

Journal ref: PLOS ONE, 12 (4), e0175036, (2017)

arXiv:1603.02859 [pdf, other]

doi 10.1016/j.jairtraman.2016.10.009

Statistical characterization of deviations from planned flight trajectories in air traffic management

Authors: C. Bongiorno, G. Gurtner, F. Lillo, R. N. Mantegna, S. Miccichè

Abstract: Understanding the relation between planned and realized flight trajectories and the determinants of flight deviations is of great importance in air traffic management. In this paper we perform an in depth investigation of the statistical properties of planned and realized air traffic on the German airspace during a 28-day period, corresponding to an AIRAC cycle. We find that realized trajectories are on average shorter than planned ones and this effect is stronger during night-time than daytime. Flights are more frequently deviated close to the departure airport and at a relatively large angle to destination. Moreover, the probability of a deviation is higher in low traffic phases. All these evidences indicate that deviations are mostly used by controllers to give directs to flights when traffic conditions allow it. Finally we introduce a new metric, termed difork, which is able to characterize navigation points according to the likelihood that a deviation occurs there. Difork allows to identify in a statistically rigorous way navigation point pairs where deviations are more (less) frequent than expected under a null hypothesis of randomness that takes into account the heterogeneity of the navigation points. Such pairs can therefore be seen as sources of flexibility (stability) of controllers traffic management while conjugating safety and efficiency.

Submitted 9 March, 2016; originally announced March 2016.

Comments: 16 pages; 11 figures; 3 tables

Journal ref: JATM, 58, 152-163, (2017)

arXiv:1511.06873 [pdf, other]

doi 10.1016/j.chaos.2016.02.027

Patterns of trading profiles at the Nordic Stock Exchange. A correlation-based approach

Authors: Federico Musciotto, Luca Marotta, Salvatore Miccichè, Jyrki Piilo, Rosario N. Mantegna

Abstract: We investigate the trading behavior of Finnish individual investors trading the stocks selected to compute the OMXH25 index in 2003 by tracking the individual daily investment decisions. We verify that the set of investors is a highly heterogeneous system under many aspects. We introduce a correlation based method that is able to detect a hierarchical structure of the trading profiles of heterogeneous individual investors. We verify that the detected hierarchical structure is highly overlapping with the cluster structure obtained with the approach of statistically validated networks when an appropriate threshold of the hierarchical trees is used. We also show that the combination of the correlation based method and of the statistically validated method provides a way to expand the information about the clusters of investors with similar trading profiles in a robust and reliable way.

Submitted 21 November, 2015; originally announced November 2015.

Comments: 25 pages, 8 figures

Journal ref: Chaos, Solitons and Fractals, 88, 267-278, (2016)

arXiv:1511.06870 [pdf, other]

doi 10.1140/epjds/s13688-016-0071-7

Backbone of credit relationships in the Japanese credit market

Authors: Luca Marotta, Salvatore Miccichè, Yoshi Fujiwara, Hiroshi Iyetomi, Hideaki Aoyama, Mauro Gallegati, Rosario N. Mantegna

Abstract: We detect the backbone of the weighted bipartite network of the Japanese credit market relationships. The backbone is detected by adapting a general method used in the investigation of weighted networks. With this approach we detect a backbone that is statistically validated against a null hypothesis of uniform diversification of loans for banks and firms. Our investigation is done year by year and it covers more than thirty years during the period from 1980 to 2011. We relate some of our findings with economic events that have characterized the Japanese credit market during the last years. The study of the time evolution of the backbone allows us to detect changes occurred in network size, fraction of credit explained, and attributes characterizing the banks and the firms present in the backbone.

Submitted 21 November, 2015; originally announced November 2015.

Comments: 14 pages, 8 figures

Journal ref: EPJ Data Science, 5 (10), 1-14, (2016)

arXiv:1412.3697 [pdf, ps, other]

Hybrid recommendation methods in complex networks

Authors: A. Fiasconaro, M. Tumminello, V. Nicosia, V. Latora, R. N. Mantegna

Abstract: We propose here two new recommendation methods, based on the appropriate normalization of already existing similarity measures, and on the convex combination of the recommendation scores derived from similarity between users and between objects. We validate the proposed measures on three relevant data sets, and we compare their performance with several recommendation systems recently proposed in the literature. We show that the proposed similarity measures allow to attain an improvement of performances of up to 20% with respect to existing non-parametric methods, and that the accuracy of a recommendation can vary widely from one specific bipartite network to another, which suggests that a careful choice of the most suitable method is highly relevant for an effective recommendation on a given system. Finally, we studied how an increasing presence of random links in the network affects the recommendation scores, and we found that one of the two recommendation algorithms introduced here can systematically outperform the others in noisy data sets.

Submitted 10 December, 2014; originally announced December 2014.

Comments: 9 pages, 6 figures, 2 tables

arXiv:1409.0789 [pdf, ps, other]

Sicily and the development of Econophysics: the pioneering work of Ettore Majorana and the Econophysics Workshop in Palermo

Authors: Rosario N. Mantegna

Abstract: Sicily has played an important role in the development of the new research area named "Econophysics". In fact some key ideas supporting this new hybrid discipline were originally formulated in a pioneering work of the Sicilian born physicist Ettore Majorana. The article he wrote was entitled "The value of statistical laws in physics and social sciences". I will discuss its origin and history that has been recently discovered in the study of Stefano Roncoroni. This recent study documents the true reasons and motivations that triggered the pioneering work of Majorana. It also shows that the description of this work provided by Edoardo Amaldi was shallow and misleading. In the second part of the talk I will recollect the first years of development of econophysics and in particular the role of the "International Workshop on Econophysics and Statistical Finance" held in Palermo on 28-30 September 1998 and the setting in 1999 of the "Observatory of Complex Systems" the research group on Econophysics of Palermo University and Istituto Nazionale di Fisica della Materia.

Submitted 1 September, 2014; originally announced September 2014.

Comments: 4 pages, Proceedings of the XXXIII Congress of the Italian Society for the History of Physics and Astronomy (SISFA 2013 Acireale September 4th-7th 2013)

arXiv:1407.5429 [pdf, other]

doi 10.1371/journal.pone.0123079

Bank-firm credit network in Japan. An analysis of a bipartite network

Authors: Luca Marotta, Salvatore Miccichè, Yoshi Fujiwara, Hiroshi Iyetomi, Hideaki Aoyama, Mauro Gallegati, Rosario N. Mantegna

Abstract: We present an analysis of the credit market of Japan. The analysis is performed by investigating the bipartite network of banks and firms which is obtained by setting a link between a bank and a firm when a credit relationship is present in a given time window. In our investigation we focus on a community detection algorithm which is identifying communities composed by both banks and firms. We show that the clusters obtained by directly working on the bipartite network carry information about the networked nature of the Japanese credit market. Our analysis is performed for each calendar year during the time period from 1980 to 2011. Specifically, we obtain communities of banks and networks for each of the 32 investigated years, and we introduce a method to track the time evolution of these communities on a statistical basis. We then characterize communities by detecting the simultaneous over-expression of attributes of firms and banks. Specifically, we consider as attributes the economic sector and the geographical location of firms and the type of banks. In our 32 year long analysis we detect a persistence of the over-expression of attributes of clusters of banks and firms together with a slow dynamics of changes from some specific attributes to new ones. Our empirical observations show that the credit market in Japan is a networked market where the type of banks, geographical location of firms and banks and economic sector of the firm play a role in shaping the credit relationships between banks and firms.

Submitted 21 July, 2014; originally announced July 2014.

Comments: 9 pages, 4 figures, 2 Tables

Journal ref: PLOS ONE, 10 (5), e0123079, (2015)

arXiv:1403.3785 [pdf, other]

doi 10.1088/1367-2630/16/8/083038

Statistically validated mobile communication networks: Evolution of motifs in European and Chinese data

Authors: Ming-Xia Li, Vasyl Palchykov, Zhi-Qiang Jiang, Kimmo Kaski, Janos Kertész, Salvatore Miccichè, Michele Tumminello, Wei-Xing Zhou, Rosario N. Mantegna

Abstract: Big data open up unprecedented opportunities to investigate complex systems including the society. In particular, communication data serve as major sources for computational social sciences but they have to be cleaned and filtered as they may contain spurious information due to recording errors as well as interactions, like commercial and marketing activities, not directly related to the social network. The network constructed from communication data can only be considered as a proxy for the network of social relationships. Here we apply a systematic method, based on multiple hypothesis testing, to statistically validate the links and then construct the corresponding Bonferroni network, generalized to the directed case. We study two large datasets of mobile phone records, one from Europe and the other from China. For both datasets we compare the raw data networks with the corresponding Bonferroni networks and point out significant differences in the structures and in the basic network measures. We show evidence that the Bonferroni network provides a better proxy for the network of social interactions than the original one. By using the filtered networks we investigated the statistics and temporal evolution of small directed 3-motifs and conclude that closed communication triads have a formation time-scale, which is quite fast and typically intraday. We also find that open communication triads preferentially evolve to other open triads with a higher fraction of reciprocated calls. These stylized facts were observed for both datasets.

Submitted 15 March, 2014; originally announced March 2014.

Comments: 19 pages, 8 figures, 5 tables

Journal ref: New J. Phys. 16 (2014) 083038
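
The multiple-hypothesis-testing step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that a p-value has already been computed for every candidate link under some null hypothesis (the abstract does not specify the test), and it keeps only the links that pass a Bonferroni-corrected threshold. The function name, the `alpha` default, and the example p-values are hypothetical and are not taken from the paper.

```python
def bonferroni_validated_links(p_values, alpha=0.01):
    """Return the links whose p-value passes a Bonferroni-corrected threshold.

    p_values: dict mapping a link (e.g. a (caller, callee) tuple) to its p-value.
    alpha:    univariate significance level (an assumed default).
    """
    n_tests = len(p_values)
    if n_tests == 0:
        return set()
    # Bonferroni correction: divide the significance level by the number of tests.
    threshold = alpha / n_tests
    return {link for link, p in p_values.items() if p < threshold}


# Hypothetical p-values for three directed links.
example = {("a", "b"): 0.001, ("b", "c"): 0.004, ("c", "a"): 0.2}
validated = bonferroni_validated_links(example, alpha=0.01)
```

With three tests and `alpha=0.01` the corrected threshold is about 0.0033, so only the link `("a", "b")` survives in this toy example.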

arXiv:1403.3638 [pdf, other]

doi 10.1016/j.jedc.2014.08.016

Networked relationships in the e-MID Interbank market: A trading model with memory

Authors: Giulia Iori, Rosario N. Mantegna, Luca Marotta, Salvatore Miccichè, James Porter, Michele Tumminello

Abstract: Interbank markets are fundamental for bank liquidity management. In this paper, we introduce a model of interbank trading with memory. Our model reproduces features of preferential trading patterns in the e-MID market recently empirically observed through the method of statistically validated networks. The memory mechanism is used to introduce a proxy of trust in the model. The key idea is that a lender, having lent many times to a borrower in the past, is more likely to lend to that borrower again in the future than to other borrowers, with which the lender has never (or has infrequently) interacted. The core of the model depends on only one parameter representing the initial attractiveness of all the banks as borrowers. Model outcomes and real data are compared through a variety of measures that describe the structure and properties of trading networks, including number of statistically validated links, bidirectional links, and 3-motifs. Refinements of the pairing method are also proposed, in order to capture finite memory and reciprocity in the model. The model is implemented within the Mason framework in Java.

Submitted 14 March, 2014; originally announced March 2014.

Comments: 37 pages, 10 figures

Journal ref: JEDC, 50, 98-116, (2015)

arXiv:1402.6573 [pdf, ps, other]

doi 10.1038/srep05132

A comparative analysis of the statistical properties of large mobile phone calling networks

Authors: Ming-Xia Li, Zhi-Qiang Jiang, Wen-Jie Xie, Salvatore Miccichè, Michele Tumminello, Wei-Xing Zhou, Rosario N. Mantegna

Abstract: Mobile phone calling is one of the most widely used communication methods in modern society. The records of calls among mobile phone users provide us with a valuable proxy for the understanding of human communication patterns embedded in social networks. Mobile phone users call each other forming a directed calling network. If only reciprocal calls are considered, we obtain an undirected mutual calling network. The preferential communication behavior between two connected users can be statistically tested and it results in two Bonferroni networks with statistically validated edges. We perform a comparative analysis of the statistical properties of these four networks, which are constructed from the calling records of more than nine million individuals in Shanghai over a period of 110 days. We find that these networks share many common structural properties and also exhibit idiosyncratic features when compared with previously studied large mobile calling networks. The empirical findings provide us an intriguing picture of a representative large social network that might shed new light on the modelling of large social networks.

Submitted 6 May, 2014; v1 submitted 25 February, 2014; originally announced February 2014.

Journal ref: Scientific Reports 4, 5132 (2014)

arXiv:1401.0462 [pdf, other]

Emergence of statistically validated financial intraday lead-lag relationships

Authors: Chester Curme, Michele Tumminello, Rosario N. Mantegna, H. Eugene Stanley, Dror Y. Kenett

Abstract: According to the leading models in modern finance, the presence of intraday lead-lag relationships between financial assets is negligible in efficient markets. With the advance of technology, however, markets have become more sophisticated. To determine whether this has resulted in an improved market efficiency, we investigate whether statistically significant lagged correlation relationships exist in financial markets. We introduce a numerical method to statistically validate links in correlation-based networks, and employ our method to study lagged correlation networks of equity returns in financial markets. Crucially, our statistical validation of lead-lag relationships accounts for multiple hypothesis testing over all stock pairs. In an analysis of intraday transaction data from the periods 2002-2003 and 2011-2012, we find a striking growth in the networks as we increase the frequency with which we sample returns. We compute how the number of validated links and the magnitude of correlations change with increasing sampling frequency, and compare the results between the two data sets. Finally, we compare topological properties of the directed correlation-based networks from the two periods using the in-degree and out-degree distributions and an analysis of three-node motifs. Our analysis suggests a growth in both the efficiency and instability of financial markets over the past decade.

Submitted 2 January, 2014; originally announced January 2014.

arXiv:1306.4769 [pdf, other]

doi 10.1103/PhysRevE.88.012806

Evolution of correlation structure of industrial indices of US equity markets

Authors: Giuseppe Buccheri, Stefano Marmi, Rosario N. Mantegna

Abstract: We investigate the dynamics of correlations present between pairs of industry indices of US stocks traded in US markets by studying correlation based networks and spectral properties of the correlation matrix. The study is performed by using 49 industry index time series computed by K. French and E. Fama during the time period from July 1969 to December 2011, spanning more than 40 years. We show that the correlation between industry indices presents both a fast and a slow dynamics. The slow dynamics has a time scale longer than five years showing that a different degree of diversification of the investment is possible in different periods of time. On top of this slow dynamics, we also detect a fast dynamics associated with exogenous or endogenous events. The fast time scale we use is a monthly time scale and the evaluation time period is a 3 month time period. By investigating the correlation dynamics monthly, we are able to detect two examples of fast variations in the first and second eigenvalue of the correlation matrix. The first occurs during the dot-com bubble (from March 1999 to April 2001) and the second occurs during the period of highest impact of the subprime crisis (from August 2008 to August 2009).

Submitted 20 June, 2013; originally announced June 2013.

Comments: 8 pages, 10 figures

arXiv:1306.3769 [pdf, other]

doi 10.1371/journal.pone.0094414

Multi-scale analysis of the European airspace using network community detection

Authors: Gérald Gurtner, Stefania Vitali, Marco Cipolla, Fabrizio Lillo, Rosario Nunzio Mantegna, Salvatore Miccichè, Simone Pozzi

Abstract: We show that the European airspace can be represented as a multi-scale traffic network whose nodes are airports, sectors, or navigation points and links are defined and weighted according to the traffic of flights between the nodes. By using a unique database of the air traffic in the European airspace, we investigate the architecture of these networks with a special emphasis on their community structure. We propose that unsupervised network community detection algorithms can be used to monitor the current use of the airspaces and improve it by guiding the design of new ones. Specifically, we compare the performance of three community detection algorithms, also by using a null model which takes into account the spatial distance between nodes, and we discuss their ability to find communities that could be used to define new control units of the airspace.

Submitted 17 June, 2013; originally announced June 2013.

Comments: 22 pages, 14 figures

Journal ref: PLoS ONE 9(5): e94414 (2014)

arXiv:1211.6356 [pdf, ps, other]

doi 10.1088/1367-2630/15/3/033033

Scale-free relaxation of a wave packet in a quantum well with power-law tails

Authors: Salvatore Miccichè, Andreas Buchleitner, Fabrizio Lillo, Rosario N. Mantegna, Tobias Paul, Sandro Wimberger

Abstract: We propose a setup for which a power-law decay is predicted to be observable for generic and realistic conditions. The system we study is very simple: A quantum wave packet initially prepared in a potential well with (i) tails asymptotically decaying like ~ x^{-2} and (ii) an eigenvalue spectrum that shows a continuous part attached to the ground or equilibrium state. We analytically derive the asymptotic decay law from the spectral properties for generic, confined initial states. Our findings are supported by realistic numerical simulations for state-of-the-art expansion experiments with cold atoms.

Submitted 10 February, 2013; v1 submitted 27 November, 2012; originally announced November 2012.

Comments: improved and extended version

Journal ref: New J. Phys. vol. 15, 033033 (2013)

arXiv:1207.3300 [pdf, other]

doi 10.1080/14697688.2014.931593

How news affect the trading behavior of different categories of investors in a financial market

Authors: Fabrizio Lillo, Salvatore Miccichè, Michele Tumminello, Jyrki Piilo, Rosario Nunzio Mantegna

Abstract: We investigate the trading behavior of a large set of single investors trading the highly liquid Nokia stock over the period 2003-2008 with the aim of determining the relative role of endogenous and exogenous factors that may affect their behavior. As endogenous factors we consider returns and volatility, whereas the exogenous factors we use are the total daily number of news and a semantic variable based on a sentiment analysis of news. Linear regression and partial correlation analysis of data show that different categories of investors are differently correlated to these factors. Governmental and non profit organizations are weakly sensitive to news and returns or volatility, and, typically, they are more correlated with the former than with the latter. Households and companies, on the contrary, are very sensitive to both endogenous and exogenous factors, and volatility and returns are, on average, much more relevant than the number of news and sentiment, respectively. Finally, financial institutions and foreign organizations are intermediate between these two cases, in terms of both the total explanatory power of these factors and their relative importance.

Submitted 13 July, 2012; originally announced July 2012.

Comments: 30 pages, 4 figures and 5 tables

Journal ref: Quantitative Finance, 15(2), 213-229, (2015)

arXiv:1107.3942 [pdf, other]

doi 10.1088/1367-2630/14/1/013041

Identification of clusters of investors from their real trading activity in a financial market

Authors: Michele Tumminello, Fabrizio Lillo, Jyrki Piilo, Rosario N. Mantegna

Abstract: We use statistically validated networks, a recently introduced method to validate links in a bipartite system, to identify clusters of investors trading in a financial market. Specifically, we investigate a special database that allows us to track the trading activity of individual investors of the stock Nokia. We find that many statistically detected clusters of investors show a very high degree of synchronization in the time when they decide to trade and in the trading action taken. We investigate the composition of these clusters and we find that several of them show an over-expression of specific categories of investors.

Submitted 20 July, 2011; originally announced July 2011.

Comments: 25 pages, 5 figures

arXiv:1103.5555 [pdf, other]

doi 10.1103/PhysRevE.84.026108

Evolution of worldwide stock markets, correlation structure and correlation based graphs

Authors: Dong-Ming Song, Michele Tumminello, Wei-Xing Zhou, Rosario N. Mantegna

Abstract: We investigate the daily correlation present among market indices of stock exchanges located all over the world in the time period Jan 1996 - Jul 2009. We discover that the correlation among market indices presents both a fast and a slow dynamics. The slow dynamics reflects the development and consolidation of globalization. The fast dynamics is associated with critical events that originate in a specific country or region of the world and rapidly affect the global system. We provide evidence that the short term timescale of correlation among market indices is less than 3 trading months (about 60 trading days). The average values of the non diagonal elements of the correlation matrix, correlation based graphs and the spectral properties of the largest eigenvalues and eigenvectors of the correlation matrix carry information about the fast and slow dynamics of correlation of market indices. We introduce a measure of mutual information based on link co-occurrence in networks, in order to detect the fast dynamics of successive changes of correlation based graphs in a quantitative way.

Submitted 29 March, 2011; originally announced March 2011.

Comments: 8 pages, 11 figures

Journal ref: Physical Review E 84 (2), 026108 (2011)

arXiv:1103.2234 [pdf, other]

doi 10.1016/j.jedc.2013.11.010

Do firms share the same functional form of their growth rate distribution? A new statistical test

Authors: Josè T. Lunardi, Salvatore Miccichè, Fabrizio Lillo, Rosario N. Mantegna, Mauro Gallegati

Abstract: We introduce a new statistical test of the hypothesis that a balanced panel of firms have the same growth rate distribution or, more generally, that they share the same functional form of growth rate distribution. We applied the test to European Union and US publicly quoted manufacturing firms data, considering functional forms belonging to the Subbotin family of distributions. While our hypotheses are rejected for the vast majority of sets at the sector level, we cannot reject them at the subsector level, indicating that homogeneous panels of firms could be described by a common functional form of growth rate distribution.

Submitted 11 March, 2011; originally announced March 2011.

Comments: 17 pages, 3 figures, 2 tables

Journal ref: JEDC 39 140-164 (2014)

arXiv:1102.0687 [pdf, other]

Trading activity and price impact in parallel markets: SETS vs. off-book market at the London Stock Exchange

Authors: Angelo Carollo, Gabriella Vaglica, Fabrizio Lillo, Rosario N. Mantegna

Abstract: We empirically study the trading activity in the electronic on-book segment and in the dealership off-book segment of the London Stock Exchange, investigating separately the trading of active market members and of other market participants which are non-members. We find that (i) the volume distribution of off-book transactions has a significantly fatter tail than the one of on-book transactions, (ii) groups of members and non-members can be classified in categories according to their trading profile, (iii) there is a strong anticorrelation between the daily inventory variation of a market member due to the on-book market transactions and inventory variation due to the off-book market transactions with non-members, and (iv) the autocorrelation of the sign of the orders of non-members in the off-book market is slowly decaying. We also analyze the on-book price impact function over time, both for positive and negative lags, of the electronic trades and of the off-book trades. The unconditional impact curves are very different for the electronic trades and the off-book trades. Moreover there is a small dependence of impact on the volume for the on-book electronic trades, while the shape and magnitude of impact function of off-book transactions strongly depend on volume.

Submitted 3 February, 2011; originally announced February 2011.

Comments: 16 pages, 9 figures

arXiv:1011.4161 [pdf, other]

doi 10.1088/1742-5468/2011/01/P01019

Community characterization of heterogeneous complex systems

Authors: Michele Tumminello, Salvatore Miccichè, Fabrizio Lillo, Jan Varho, Jyrki Piilo, Rosario N. Mantegna

Abstract: We introduce an analytical statistical method to characterize the communities detected in heterogeneous complex systems. By posing a suitable null hypothesis, our method makes use of the hypergeometric distribution to assess the probability that a given property is over-expressed in the elements of a community with respect to all the elements of the investigated set. We apply our method to two specific complex networks, namely a network of world movies and a network of physics preprints. The characterization of the elements and of the communities is done in terms of languages and countries for the movie network and of journals and subject categories for papers. We find that our method is able to characterize clearly the identified communities. Moreover our method works well both for large and for small communities.

Submitted 18 November, 2010; originally announced November 2010.

Comments: 8 pages, 1 figure and 2 tables

Journal ref: J. Stat. Mech., P01019, (2011)
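
The hypergeometric over-expression idea mentioned in the abstract above can be sketched as follows. This is only an illustrative reading of the abstract, not the authors' implementation: it assumes a one-sided upper-tail p-value, and the function name and argument names are hypothetical. Given a set of `N` elements of which `K` have a property, and a community of `n` elements of which `k` have it, the sketch returns the probability of observing at least `k` such elements by chance.

```python
from math import comb


def hypergeom_overexpression_pvalue(N, K, n, k):
    """Upper-tail probability P(X >= k) for a hypergeometric variable X.

    N: total number of elements in the investigated set
    K: number of elements in the set that have the property
    n: number of elements in the community
    k: number of community elements that have the property
    """
    upper = min(K, n)          # X cannot exceed either K or n
    total = comb(N, n)         # number of ways to pick the community
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy example: 10 elements, 4 with the property, community of 3 with 2 hits.
p = hypergeom_overexpression_pvalue(N=10, K=4, n=3, k=2)
```

In the toy example the tail probability is (36 + 4) / 120 = 1/3, so two hits out of three would not indicate over-expression. In practice such p-values would be compared against a threshold corrected for the number of properties and communities tested.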

arXiv:1008.1414 [pdf, other]

doi 10.1371/journal.pone.0017994

Statistically validated networks in bipartite complex systems

Authors: Michele Tumminello, Salvatore Miccichè, Fabrizio Lillo, Jyrki Piilo, Rosario N. Mantegna

Abstract: Many complex systems present an intrinsic bipartite nature and are often described and modeled in terms of networks [1-5]. Examples include movies and actors [1, 2, 4], authors and scientific papers [6-9], email accounts and emails [10], plants and animals that pollinate them [11, 12]. Bipartite networks are often very heterogeneous in the number of relationships that the elements of one set establish with the elements of the other set. When one constructs a projected network with nodes from only one set, the system heterogeneity makes it very difficult to identify preferential links between the elements. Here we introduce an unsupervised method to statistically validate each link of the projected network against a null hypothesis taking into account the heterogeneity of the system. We apply our method to three different systems, namely the set of clusters of orthologous genes (COG) in completely sequenced genomes [13, 14], a set of daily returns of 500 US financial stocks, and the set of world movies of the IMDb database [15]. In all these systems, both different in size and level of heterogeneity, we find that our method is able to detect network structures which are informative about the system and are not simply an expression of its heterogeneity. Specifically, our method (i) identifies the preferential relationships between the elements, (ii) naturally highlights the clustered structure of investigated systems, and (iii) allows links to be classified according to the type of statistically validated relationships between the connected nodes.

Submitted 8 August, 2010; originally announced August 2010.

Comments: Main text: 13 pages, 3 figures, and 1 Table. Supplementary information: 15 pages, 3 figures, and 2 Tables

Journal ref: PLOS ONE 6 (3) e17994 (2011)

arXiv:1004.4272 [pdf, other]

When do improved covariance matrix estimators enhance portfolio optimization? An empirical comparative study of nine estimators

Authors: Ester Pantaleo, Michele Tumminello, Fabrizio Lillo, Rosario N. Mantegna

Abstract: The use of improved covariance matrix estimators as an alternative to the sample estimator is considered an important approach for enhancing portfolio optimization. Here we empirically compare the performance of 9 improved covariance estimation procedures by using daily returns of 90 highly capitalized US stocks for the period 1997-2007. We find that the usefulness of covariance matrix estimators strongly depends on the ratio between estimation period T and number of stocks N, on the presence or absence of short selling, and on the performance metric considered. When short selling is allowed, several estimation methods achieve a realized risk that is significantly smaller than the one obtained with the sample covariance method. This is particularly true when T/N is close to one. Moreover many estimators reduce the fraction of negative portfolio weights, while little improvement is achieved in the degree of diversification. On the contrary when short selling is not allowed and T>N, the considered methods are unable to outperform the sample covariance in terms of realized risk but can give much more diversified portfolios than the one obtained with the sample covariance. When T<N the use of the sample covariance matrix and of the pseudoinverse gives portfolios with very poor performance.

Submitted 24 April, 2010; originally announced April 2010.

Comments: 30 pages

arXiv:1003.2981 [pdf, other]

doi 10.1088/1367-2630/12/7/075031

Statistical identification with hidden Markov models of large order splitting strategies in an equity market

Authors: Gabriella Vaglica, Fabrizio Lillo, Rosario N. Mantegna

Abstract: Large trades in a financial market are usually split into smaller parts and traded incrementally over extended periods of time. We address these large trades as hidden orders. In order to identify and characterize hidden orders we fit hidden Markov models to the time series of the sign of the tick by tick inventory variation of market members of the Spanish Stock Exchange. Our methodology probabilistically detects trading sequences, which are characterized by a net majority of buy or sell transactions. We interpret these patches of sequential buying or selling transactions as proxies of the traded hidden orders. We find that the time, volume and number of transactions size distributions of these patches are fat tailed. Long patches are characterized by a high fraction of market orders and a low participation rate, while short patches have a large fraction of limit orders and a high participation rate. We observe the existence of a buy-sell asymmetry in the number, average length, average fraction of market orders and average participation rate of the detected patches. The detected asymmetry clearly depends on the local market trend. We also compare the hidden Markov models patches with those obtained with the segmentation method used in Vaglica et al. (2008) and we conclude that the former ones can be interpreted as a partition of the latter ones.

Submitted 15 March, 2010; originally announced March 2010.

Comments: 26 pages, 12 figures

arXiv:0908.0202 [pdf, other]

doi 10.1103/PhysRevE.80.066102

Market impact and trading profile of large trading orders in stock markets

Authors: Esteban Moro, Javier Vicente, Luis G. Moyano, Austin Gerig, J. Doyne Farmer, Gabriella Vaglica, Fabrizio Lillo, Rosario N. Mantegna

Abstract: We empirically study the market impact of trading orders. We are specifically interested in large trading orders that are executed incrementally, which we call hidden orders. These are reconstructed based on information about market member codes using data from the Spanish Stock Market and the London Stock Exchange. We find that market impact is strongly concave, approximately increasing as the square root of order size. Furthermore, as a given order is executed, the impact grows in time according to a power-law; after the order is finished, it reverts to a level of about 0.5-0.7 of its value at its peak. We observe that hidden orders are executed at a rate that more or less matches trading in the overall market, except for small deviations at the beginning and end of the order.

Submitted 3 August, 2009; originally announced August 2009.

Comments: 9 pages, 7 figures

arXiv:0809.4615 [pdf, ps, other]

doi 10.1016/j.jebo.2010.01.004

Correlation, hierarchies, and networks in financial markets

Authors: M. Tumminello, F. Lillo, R. N. Mantegna

Abstract: We discuss some methods to quantitatively investigate the properties of correlation matrices. Correlation matrices play an important role in portfolio optimization and in several other quantitative descriptions of asset price dynamics in financial markets. Specifically, we discuss how to define and obtain hierarchical trees, correlation based trees and networks from a correlation matrix. The hierarchical clustering and other procedures performed on the correlation matrix to detect statistically reliable aspects of the correlation matrix are seen as filtering procedures of the correlation matrix. We also discuss a method to associate a hierarchically nested factor model to a hierarchical tree obtained from a correlation matrix. The information retained in filtering procedures and its stability with respect to statistical fluctuations is quantified by using the Kullback-Leibler distance.

Submitted 26 September, 2008; originally announced September 2008.

Comments: 37 pages, 9 figures, 3 tables

Journal ref: J. Econ. Behav. Organ. 75, pp. 40-58 (2010)

arXiv:0803.2608 [pdf, other]

doi 10.1140/epjb/e2008-00276-8

Statistical properties of thermodynamically predicted RNA secondary structures in viral genomes

Authors: Marco Spanò, Fabrizio Lillo, Salvatore Miccichè, Rosario N. Mantegna

Abstract: By performing a comprehensive study on 1832 segments of 1212 complete genomes of viruses, we show that in viral genomes the hairpin structures of thermodynamically predicted RNA secondary structures are more abundant than expected under a simple random null hypothesis. The detected hairpin structures of RNA secondary structures are present both in coding and in noncoding regions for the four groups of viruses categorized as dsDNA, dsRNA, ssDNA and ssRNA. For all groups hairpin structures of RNA secondary structures are detected more frequently than expected for a random null hypothesis in noncoding rather than in coding regions. However, potential RNA secondary structures are also present in coding regions of the dsDNA group. In fact we detect evolutionarily conserved RNA secondary structures in conserved coding and noncoding regions of a large set of complete genomes of dsDNA herpesviruses.

Submitted 18 March, 2008; originally announced March 2008.

Comments: 9 pages, 2 figures

Journal ref: Eur. Phys. J. B, 65, 323-331, (2008)

arXiv:0802.1600 [pdf, ps, other]

doi 10.1140/epjb/e2008-00225-7

Generation of hierarchically correlated multivariate symbolic sequences

Authors: Mi. Tumminello, F. Lillo, R. N. Mantegna

Abstract: We introduce an algorithm to generate multivariate series of symbols from a finite alphabet with a given hierarchical structure of similarities. The target hierarchical structure of similarities is arbitrary, for instance the one obtained by some hierarchical clustering procedure as applied to an empirical matrix of Hamming distances. The algorithm can be interpreted as the finite alphabet equivalent of the recently introduced hierarchically nested factor model (M. Tumminello et al. EPL 78 (3) 30006 (2007)). The algorithm is based on a generating mechanism that is different from the one used in the mutation rate approach. We apply the proposed methodology for investigating the relationship between the bootstrap value associated with a node of a phylogeny and the probability of finding that node in the true phylogeny.

Submitted 12 February, 2008; originally announced February 2008.

Comments: 7 pages, 6 figures, 1 table

Journal ref: Eur. Phys. J. B 65 (3): 333-340 (2008)

arXiv:0710.0576 [pdf, ps, other]

Shrinkage and spectral filtering of correlation matrices: a comparison via the Kullback-Leibler distance

Authors: M. Tumminello, F. Lillo, R. N. Mantegna

Abstract: The problem of filtering information from large correlation matrices is of great importance in many applications. We have recently proposed the use of the Kullback-Leibler distance to measure the performance of filtering algorithms in recovering the underlying correlation matrix when the variables are described by a multivariate Gaussian distribution. Here we use the Kullback-Leibler distance to investigate the performance of filtering methods based on Random Matrix Theory and on the shrinkage technique. We also present some results on the application of the Kullback-Leibler distance to multivariate data that are not Gaussian distributed.

Submitted 2 October, 2007; originally announced October 2007.

Comments: 11 pages, 4 figures, Presented at the Workshop "Random Matrix Theory: From Fundamental Physics To Application", Krakow, Poland, May 3-5, 2007

Journal ref: Acta Phys. Pol. B 38 (13), 4079-4088 (2007)

arXiv:0707.0385 [pdf, ps, other]

doi 10.1088/1367-2630/10/4/043019

Specialization of strategies and herding behavior of trading firms in a financial market

Authors: Fabrizio Lillo, Esteban Moro, Gabriella Vaglica, Rosario N. Mantegna

Abstract: The understanding of complex social or economic systems is an important scientific challenge. Here we present a comprehensive study of the Spanish Stock Exchange showing that most financial firms trading in that market are characterized by a resulting strategy and can be classified in groups of firms with different specialization. A few large firms act overall as trending firms, whereas many heterogeneous firms act as reversing firms. The herding properties of these two groups are markedly different and consistently observed over a four-year period of trading.

Submitted 3 July, 2007; originally announced July 2007.

Comments: 8 pages, 5 figures

arXiv:0706.0168 [pdf, ps, other]

doi 10.1103/PhysRevE.76.031123

Kullback-Leibler distance as a measure of the information filtered from multivariate data

Authors: Michele Tumminello, Fabrizio Lillo, Rosario Nunzio Mantegna

Abstract: We show that the Kullback-Leibler distance is a good measure of the statistical uncertainty of correlation matrices estimated by using a finite set of data. For correlation matrices of multivariate Gaussian variables we analytically determine the expected values of the Kullback-Leibler distance of a sample correlation matrix from a reference model and we show that the expected values are known also when the specific model is unknown. We propose to make use of the Kullback-Leibler distance to estimate the information extracted from a correlation matrix by correlation filtering procedures. We also show how to use this distance to measure the stability of filtering procedures with respect to statistical uncertainty. We explain the effectiveness of our method by comparing four filtering procedures, two of them being based on spectral analysis and the other two on hierarchical clustering. We compare these techniques as applied both to simulations of factor models and empirical data. We investigate the ability of these filtering procedures to recover the correlation matrix of models from simulations. We discuss this ability in terms of both the heterogeneity of model parameters and the length of data series. We also show that the two spectral techniques are typically more informative about the sample correlation matrix than techniques based on hierarchical clustering, whereas the latter are more stable with respect to statistical uncertainty.

Submitted 1 June, 2007; originally announced June 2007.

Comments: 13 pages, 6 figures

Journal ref: Phys. Rev. E 76, 031123 (2007)
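
The Kullback-Leibler distance between two correlation matrices mentioned in the abstract above can be illustrated with a minimal sketch. It assumes zero-mean multivariate Gaussian variables and uses the standard closed form K(S1, S2) = 1/2 [log(det S2 / det S1) + tr(S2^-1 S1) - n]; whether the paper adopts exactly this ordering of arguments is an assumption, and the function names `inverse_and_logdet` and `kl_distance` are hypothetical helpers written for this illustration.

```python
import math


def inverse_and_logdet(matrix):
    """Return (inverse, log-determinant) of a small positive-definite matrix.

    Uses Gauss-Jordan elimination with partial pivoting on [A | I].
    """
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    logdet = 0.0
    for col in range(n):
        # Pick the row with the largest pivot to limit round-off error.
        pivot_row = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        if abs(pivot) < 1e-12:
            raise ValueError("matrix is singular")
        # For a positive-definite matrix det > 0, so log|det| equals log det.
        logdet += math.log(abs(pivot))
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    inverse = [row[n:] for row in aug]
    return inverse, logdet


def kl_distance(sigma_1, sigma_2):
    """Kullback-Leibler distance K(P1 || P2) between zero-mean Gaussians
    with correlation (or covariance) matrices sigma_1 and sigma_2."""
    n = len(sigma_1)
    _, logdet_1 = inverse_and_logdet(sigma_1)
    inv_2, logdet_2 = inverse_and_logdet(sigma_2)
    # trace(sigma_2^{-1} sigma_1)
    trace = sum(inv_2[i][k] * sigma_1[k][i] for i in range(n) for k in range(n))
    return 0.5 * (logdet_2 - logdet_1 + trace - n)
```

In this sketch a filtered matrix would be passed as one argument and the sample or model matrix as the other; the distance is zero when the two matrices coincide and is not symmetric in its arguments.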

arXiv:0704.2003 [pdf, ps, other]

doi 10.1103/PhysRevE.77.036110

Scaling laws of strategic behaviour and size heterogeneity in agent dynamics

Authors: Gabriella Vaglica, Fabrizio Lillo, Esteban Moro, Rosario N. Mantegna

Abstract: The dynamics of many socioeconomic systems is determined by the decision making process of agents. The decision process depends on agents' characteristics, such as preferences, risk aversion, behavioral biases, etc. In addition, in some systems the size of agents can be highly heterogeneous, leading to very different impacts of agents on the system dynamics. The large size of some agents poses challenging problems to agents who want to control their impact, either by forcing the system in a given direction or by hiding their intentionality. Here we consider the financial market as a model system, and we study empirically how agents strategically adjust the properties of large orders in order to meet their preference and minimize their impact. We quantify this strategic behavior by detecting scaling relations of allometric nature between the variables characterizing the trading activity of different institutions. We observe power law distributions in the investment time horizon, in the number of transactions needed to execute a large order and in the traded value exchanged by large institutions, and we show that heterogeneity of agents is a key ingredient for the emergence of some aggregate properties characterizing this complex system.

Submitted 16 April, 2007; originally announced April 2007.

Comments: 6 pages, 3 figures

arXiv:physics/0701335 [pdf, ps, other]

Diffusive behavior and the modeling of characteristic times in limit order executions

Authors: Zoltan Eisler, Janos Kertesz, Fabrizio Lillo, Rosario N. Mantegna

Abstract: We present an empirical study of the first passage time (FPT) of order book prices needed to observe a prescribed price change Delta, the time to fill (TTF) for executed limit orders and the time to cancel (TTC) for canceled ones in a double auction market. We find that the distribution of all three quantities decays asymptotically as a power law, but that of FPT has significantly fatter tails than that of TTF. Thus a simple first passage time model cannot account for the observed TTF of limit orders. We propose that the origin of this difference is the presence of cancellations. We outline a simple model, which assumes that prices are characterized by the empirically observed distribution of the first passage time and orders are canceled randomly with lifetimes that are asymptotically power law distributed with an exponent lambda_LT. In spite of the simplifying assumptions of the model, the inclusion of cancellations is enough to account for the above observations and enables one to estimate characteristics of the cancellation strategies from empirical data.

Submitted 21 December, 2008; v1 submitted 30 January, 2007; originally announced January 2007.

Comments: 17 pages, 9 figures, 6 tables, to appear in Quantitative Finance

arXiv:physics/0609036 [pdf, ps, other]

doi 10.1117/12.729619

Economic sector identification in a set of stocks traded at the New York Stock Exchange: a comparative analysis

Authors: C. Coronnello, M. Tumminello, F. Lillo, S. Miccichè, R. N. Mantegna

Abstract: We review some methods recently used in the literature to detect the existence of a certain degree of common behavior of stock returns belonging to the same economic sector. Specifically, we discuss methods based on random matrix theory and hierarchical clustering techniques. We apply these methods to a set of stocks traded at the New York Stock Exchange. The investigated time series are recorded at a daily time horizon. All the considered methods are able to detect economic information and the presence of clusters characterized by the economic sector of stocks. However, different methodologies provide different information about the considered set. Our comparative analysis suggests that the application of just a single method may not be able to extract all the economic information present in the correlation coefficient matrix of a set of stocks.

Submitted 5 September, 2006; originally announced September 2006.

Comments: 13 pages, 8 figures, 2 Tables

Journal ref: Proceedings Volume 6601, Noise and Stochastics in Complex Systems and Finance; 66010T (2007)

arXiv:physics/0608279 [pdf, ps, other]

The Tenth Article of Ettore Majorana

Authors: Rosario Nunzio Mantegna

Abstract: This year is the centenary of the birth of Ettore Majorana, one of the major Italian physicists of all time. In this note we briefly sketch a few biographical details about Ettore Majorana and introduce and discuss the main points of Majorana's 10th article. In his article Majorana explicitly considers quantum mechanics as an irreducible statistical theory because the theory is not able to describe the time evolution of a single particle or atom in a precise environment at a deterministic level. This lack of determinism at the level of an elementary physical system motivated him to suggest a formal analogy between statistical laws observed in physics and in the social sciences. We hope the occasion of the centenary of the birth of Ettore Majorana will be useful to remember and to reconsider not only his exceptional achievements in theoretical physics but also his fresh and original views on the role of statistical laws in physics and in other disciplines such as the social sciences.

Submitted 29 August, 2006; originally announced August 2006.

Comments: 3 pages, to appear in Europhysics News 37/4 July/August 2006

arXiv:physics/0608032 [pdf, ps, other]

Market reaction to temporary liquidity crises and the permanent market impact

Authors: Adam Ponzi, Fabrizio Lillo, Rosario N. Mantegna

Abstract: We study the relaxation dynamics of the bid-ask spread and of the midprice after a sudden, large variation of the spread, corresponding to a temporary crisis of liquidity in a double auction financial market. We find that the spread decays very slowly to its normal value as a consequence of the strategic limit order placement of liquidity providers. We consider several quantities, such as order placement rates and distribution, that affect the decay of the spread. We measure the permanent impact both of a generic event altering the spread and of a single transaction and we find an approximately linear relation between immediate and permanent impact in both cases.

Submitted 3 August, 2006; originally announced August 2006.

Comments: 12 pages, 12 figures

arXiv:physics/0605251 [pdf, ps, other]

doi 10.1140/epjb/e2006-00414-4

Correlation based networks of equity returns sampled at different time horizons

Authors: M. Tumminello, T. Di Matteo, T. Aste, R. N. Mantegna

Abstract: We investigate the planar maximally filtered graphs of the portfolio of the 300 most capitalized stocks traded at the New York Stock Exchange during the time period 2001-2003. Topological properties such as the average length of shortest paths, the betweenness and the degree are computed on different planar maximally filtered graphs generated by sampling the returns at different time horizons ranging from 5 min up to one trading day. This analysis confirms that the selected stocks compose a hierarchical system progressively structuring as the sampling time horizon increases. Finally, a cluster formation, associated with economic sectors, is quantitatively investigated.

Submitted 3 April, 2007; v1 submitted 30 May, 2006; originally announced May 2006.

Comments: 9 pages, 8 figures

Journal ref: Eur. Phys. J. B 55 (2): 209-217 (2007)

arXiv:physics/0605116 [pdf, ps, other]

doi 10.1142/S0218127407018415

Spanning Trees and bootstrap reliability estimation in correlation based networks

Authors: M. Tumminello, C. Coronnello, F. Lillo, S. Miccichè, R. N. Mantegna

Abstract: We introduce a new technique to associate a spanning tree to the average linkage cluster analysis. We term this tree the Average Linkage Minimum Spanning Tree. We also introduce a technique to associate a value of reliability to links of correlation based graphs by using bootstrap replicas of data. Both techniques are applied to the portfolio of the 300 most capitalized stocks traded at the New York Stock Exchange during the time period 2001-2003. We show that the Average Linkage Minimum Spanning Tree recognizes economic sectors and sub-sectors as communities in the network slightly better than the Minimum Spanning Tree does. We also show that the average reliability of links in the Minimum Spanning Tree is slightly greater than the average reliability of links in the Average Linkage Minimum Spanning Tree.

Submitted 15 May, 2006; originally announced May 2006.

Comments: 17 pages, 3 figures

Journal ref: Int. J. Bifurcation Chaos 17 (7), 2319-2329 (2007)
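
For readers unfamiliar with the Minimum Spanning Tree that the abstract above uses as its baseline, the following is a minimal sketch of how such a tree can be extracted from a correlation matrix. It assumes the commonly used correlation distance d = sqrt(2(1 - rho)) and Kruskal's algorithm; it does not implement the Average Linkage Minimum Spanning Tree or the bootstrap reliability introduced in the paper, and the function name `minimum_spanning_tree` is a hypothetical choice for this illustration.

```python
import math


def minimum_spanning_tree(corr):
    """Return the MST links (i, j, distance) of a correlation matrix.

    Assumption: distances are d_ij = sqrt(2 * (1 - rho_ij)), so strongly
    correlated pairs are close. Kruskal's algorithm with union-find.
    """
    n = len(corr)
    edges = sorted(
        (math.sqrt(2.0 * (1.0 - corr[i][j])), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for distance, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # The link joins two separate components, so it is kept.
            parent[root_i] = root_j
            tree.append((i, j, distance))
    return tree
```

A tree over n stocks always has n - 1 links, and a bootstrap reliability in the spirit of the abstract could be estimated by recomputing the tree on resampled data and counting how often each link reappears.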

arXiv:cond-mat/0511726 [pdf, ps, other]

doi 10.1209/0295-5075/78/30006

Hierarchically nested factor model from multivariate data

Authors: M. Tumminello, F. Lillo, R. N. Mantegna

Abstract: We show how to achieve a statistical description of the hierarchical structure of a multivariate data set. Specifically we show that the similarity matrix resulting from a hierarchical clustering procedure is the correlation matrix of a factor model, the hierarchically nested factor model. In this model, factors are mutually independent and hierarchically organized. Finally, we use a bootstrap based procedure to reduce the number of factors in the model with the aim of retaining only those factors significantly robust with respect to the statistical uncertainty due to the finite length of data records.

Submitted 2 April, 2007; v1 submitted 30 November, 2005; originally announced November 2005.

Comments: 7 pages, 5 figures; accepted for publication in Europhys. Lett. ; the Appendix corresponds to the additional material of the accepted letter.

Journal ref: Europhys. Lett. 78, 30006 (2007)
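
The "similarity matrix resulting from a hierarchical clustering procedure" in the abstract above can be illustrated with a short sketch. It assumes average linkage clustering (the abstract does not say which linkage is used) and assigns to each pair of elements the average correlation at which their clusters merge; the function name `average_linkage_filter` is hypothetical, and the construction of the factor model itself is not shown.

```python
from itertools import combinations


def average_linkage_filter(corr):
    """Return the similarity matrix induced by average linkage clustering.

    Entry (i, j) is the average correlation between the two clusters that
    are merged at the node where i and j first join the same cluster.
    """
    n = len(corr)
    clusters = [[i] for i in range(n)]
    filtered = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    def average_correlation(a, b):
        return sum(corr[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the pair of clusters with the largest average correlation.
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: average_correlation(*pair),
        )
        rho = average_correlation(a, b)
        for i in a:
            for j in b:
                filtered[i][j] = filtered[j][i] = rho
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return filtered
```

The resulting matrix has at most n - 1 distinct off-diagonal values, one per node of the dendrogram, which is what makes it a filtered version of the original correlation matrix.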

arXiv:physics/0508118 [pdf, ps, other]

Correlation filtering in financial time series

Authors: T. Aste, T. Di Matteo, M. Tumminello, R. N. Mantegna

Abstract: We apply a method to filter relevant information from the correlation coefficient matrix by extracting a network of relevant interactions. This method succeeds in generating networks with the same hierarchical structure as the Minimum Spanning Tree but containing a larger number of links, resulting in a richer network topology allowing loops and cliques. In Tumminello et al. (PNAS, 2005), we have shown that this method, applied to a financial portfolio of 100 stocks in the USA equity markets, is efficient in filtering relevant information about the clustering of the system and its hierarchical structure both on the whole system and within each cluster. In particular, we have found that triangular loops and 4 element cliques have important and significant relations with the market structure and properties. Here we apply this filtering procedure to the analysis of correlation in two different kinds of interest rate time series (16 Eurodollars and 34 US interest rates).

Submitted 17 August, 2005; originally announced August 2005.

Comments: 10 pages, 7 figures

Journal ref: in Noise and Fluctuations in Econophysics and Finance, edited by D. Abbott, J.-P. Bouchaud, X. Gabaix, J. L. McCauley, Proc. of SPIE, Vol. 5848 (SPIE, Bellingham, WA, 2005) 100-109. (Invited Paper)

arXiv:cond-mat/0508122 [pdf, ps, other]

Sector identification in a set of stock return time series traded at the London Stock Exchange

Authors: C. Coronnello, M. Tumminello, F. Lillo, S. Miccichè, R. N. Mantegna

Abstract: We compare some methods recently used in the literature to detect the existence of a certain degree of common behavior of stock returns belonging to the same economic sector. Specifically, we discuss methods based on random matrix theory and hierarchical clustering techniques. We apply these methods to a portfolio of stocks traded at the London Stock Exchange. The investigated time series are recorded both at a daily time horizon and at a 5-minute time horizon. The correlation coefficient matrix is very different at different time horizons, confirming that more structured correlation coefficient matrices are observed for long time horizons. All the considered methods are able to detect economic information and the presence of clusters characterized by the economic sector of stocks. However, different methods present a different degree of sensitivity with respect to different sectors. Our comparative analysis suggests that the application of just a single method may not be able to extract all the economic information present in the correlation coefficient matrix of a stock portfolio.

Submitted 4 August, 2005; originally announced August 2005.

Comments: 28 pages, 13 figures, 3 Tables. Proceedings of the conference on "Applications of Random Matrices to Economy and other Complex Systems", Krakow (Poland), May 25-28 2005. Submitted for publication to Acta Phys. Pol

Journal ref: Acta Phys. Pol. B 36 (2005) 2653-2679

Showing 1–50 of 85 results for author: Mantegna, R N