-
Operator-Based Detecting, Learning, and Stabilizing Unstable Periodic Orbits of Chaotic Attractors
Authors:
Ali Tavasoli,
Heman Shakeri
Abstract:
This paper examines the use of operator-theoretic approaches to the analysis of chaotic systems through the lens of their unstable periodic orbits (UPOs). Our approach involves three data-driven steps for detecting, identifying, and stabilizing UPOs. We demonstrate the use of kernel integral operators within delay coordinates as an innovative method for UPO detection. For identifying the dynamic behavior associated with each individual UPO, we utilize the Koopman operator to present the dynamics as linear equations in the space of Koopman eigenfunctions. This allows for characterizing the chaotic attractor by investigating its principal dynamical modes across varying UPOs. We extend this methodology into an interpretable machine learning framework aimed at stabilizing strange attractors on their UPOs. To illustrate the efficacy of our approach, we apply it to the Lorenz attractor as a case study.
Submitted 7 September, 2023;
originally announced October 2023.
-
Characterizing the load profile in power grids by Koopman mode decomposition of interconnected dynamics
Authors:
Ali Tavasoli,
Behnaz Moradijamei,
Heman Shakeri
Abstract:
Electricity load forecasting is crucial for effectively managing and optimizing power grids. Over the past few decades, various statistical and deep learning approaches have been used to develop load forecasting models. This paper presents an interpretable machine learning approach that identifies load dynamics using data-driven methods within an operator-theoretic framework. We represent the load data using the Koopman operator, which is inherent to the underlying dynamics. By computing the corresponding eigenfunctions, we decompose the load dynamics into coherent spatiotemporal patterns that are the most robust features of the dynamics. Each pattern evolves independently according to its single frequency, making its predictability based on linear dynamics. We emphasize that the load dynamics are constructed based on coherent spatiotemporal patterns that are intrinsic to the dynamics and are capable of encoding rich dynamical features at multiple time scales. These features are related to complex interactions over interconnected power grids and different exogenous effects. To implement the Koopman operator approach more efficiently, we cluster the load data using a modern kernel-based clustering approach and identify power stations with similar load patterns, particularly those with synchronized dynamics. We evaluate our approach using a large-scale dataset from a renewable electric power system within the continental European electricity system and show that the Koopman-based approach outperforms a deep learning (LSTM) architecture in terms of accuracy and computational efficiency. The code for this paper has been deposited in a GitHub repository, which can be accessed at the following address github.com/Shakeri-Lab/Power-Grids.
Submitted 16 April, 2023;
originally announced April 2023.
-
MAD-FC: A Fold Change Visualization with Readability, Proportionality, and Symmetry
Authors:
Bruce A. Corliss,
Yaotian Wang,
Francis P. Driscoll,
Heman Shakeri,
Philip E. Bourne
Abstract:
We propose a fold change visualization that demonstrates a combination of properties from log and linear plots of fold change. A useful fold change visualization can exhibit: (1) readability, where fold change values are recoverable from datapoint position; (2) proportionality, where fold change values of the same direction are proportionally distant from the point of no change; (3) symmetry, where positive and negative fold changes are equidistant to the point of no change; and (4) high dynamic range, where datapoint values are discernable across orders of magnitude. A linear visualization has readability and partial proportionality but lacks high dynamic range and symmetry (because negative direction fold changes are bound between [0, 1] while positive are between [1, $\infty$]). Log plots of fold change have partial readability, high dynamic range, and symmetry, but lack proportionality because of the log transform. We outline a new transform and visualization, named mirrored axis distortion of fold change (MAD-FC), that extends a linear visualization of fold change data to exhibit readability, proportionality, and symmetry (but still has the limited dynamic range of linear plots). We illustrate the use of MAD-FC with biomedical data using various fold change charts. We argue that MAD-FC plots may be a more useful visualization than log or linear plots for applications that require a limited dynamic range (approximately $\pm$2 orders of magnitude or $\pm$8 units in log2 space).
Submitted 19 March, 2023;
originally announced March 2023.
-
Contra-Analysis for Determining Negligible Effect Size in Scientific Research
Authors:
Bruce A. Corliss,
Yaotian Wang,
Heman Shakeri,
Philip E. Bourne
Abstract:
Scientific experiments study interventions that show evidence of an effect size that is meaningfully large, negligibly small, or inconclusively broad. Previously, we proposed contra-analysis as a decision-making process to help determine which interventions have a meaningfully large effect by using contra plots to compare effect size across broadly related experiments. Here, we extend the use of contra plots to determine which results have evidence of negligible (near-zero) effect size. Determining if an effect size is negligible is important for eliminating alternative scientific explanations and identifying approximate independence between an intervention and the variable measured. We illustrate that contra plots can score negligible effect size across studies, inform the selection of a threshold for negligible effect based on broadly related results, and determine which results have evidence of negligible effect with a hypothesis test. No other data visualization can carry out all three of these tasks for analyzing negligible effect size. We demonstrate this analysis technique on real data from biomedical research. This new application of contra plots can differentiate statistically insignificant results with high strength (narrow and near-zero interval estimate of effect size) from those with low strength (broad interval estimate of effect size). Such a designation could help resolve the File Drawer problem in science, where statistically insignificant results are underreported because their interpretation is ambiguous and nonstandard. With our proposed procedure, results designated with negligible effect will be considered strong and publishable evidence of near-zero effect size.
Submitted 16 March, 2023;
originally announced March 2023.
-
Leveraging Wastewater Monitoring for COVID-19 Forecasting in the US: a Deep Learning study
Authors:
Mehrdad Fazli,
Heman Shakeri
Abstract:
The outburst of COVID-19 in late 2019 was the start of a health crisis that shook the world and took millions of lives in the ensuing years. Many governments and health officials failed to arrest the rapid circulation of infection in their communities. The long incubation period and the large proportion of asymptomatic cases made COVID-19 particularly elusive to track. However, wastewater monitoring soon became a promising data source in addition to conventional indicators such as confirmed daily cases, hospitalizations, and deaths. Despite the consensus on the effectiveness of wastewater viral load data, there is a lack of methodological approaches that leverage viral load to improve COVID-19 forecasting. This paper proposes using deep learning to automatically discover the relationship between daily confirmed cases and viral load data. We trained one Deep Temporal Convolutional Network (DeepTCN) and one Temporal Fusion Transformer (TFT) model to build a global forecasting model. We supplement the daily confirmed cases with viral loads and other socio-economic factors as covariates to the models. Our results suggest that TFT outperforms DeepTCN and learns a better association between viral load and daily cases. We demonstrated that equipping the models with the viral load improves their forecasting performance significantly. Moreover, viral load is shown to be the second most predictive input, following the containment and health index. Our results reveal the feasibility of training a location-agnostic deep-learning model to capture the dynamics of infection diffusion when wastewater viral load data is provided.
Submitted 16 December, 2022;
originally announced December 2022.
-
Contra-Analysis: Prioritizing Meaningful Effect Size in Scientific Research
Authors:
Bruce A. Corliss,
Yaotian Wang,
Heman Shakeri,
Philip E. Bourne
Abstract:
At every phase of scientific research, scientists must decide how to allocate limited resources to pursue the research inquiries with the greatest potential. This prioritization dictates which controlled interventions are studied, awarded funding, published, reproduced with repeated experiments, investigated in related contexts, and translated for societal use. There are many factors that influence this decision-making, but interventions with larger effect size are often favored because they exert the greatest influence on the system studied. To inform these decisions, scientists must compare effect size across studies with dissimilar experiment designs to identify the interventions with the largest effect. These studies are often only loosely related in nature, using experiments with a combination of different populations, conditions, timepoints, measurement techniques, and experiment models that measure the same phenomenon with a continuous variable. We name this assessment contra-analysis and propose to use credible intervals of the relative difference in means to compare effect size across studies in a meritocracy between competing interventions. We propose a data visualization, the contra plot, that allows scientists to score and rank effect size between studies that measure the same phenomenon, aid in determining an appropriate threshold for meaningful effect, and perform hypothesis tests to determine which interventions have meaningful effect size. We illustrate the use of contra plots with real biomedical research data. Contra-analysis promotes a practical interpretation of effect size and facilitates the prioritization of scientific research.
Submitted 10 October, 2022;
originally announced October 2022.
-
The Least Difference in Means: A Statistic for Effect Size Strength and Practical Significance
Authors:
Bruce A. Corliss,
Yaotian Wang,
Heman Shakeri,
Philip E. Bourne
Abstract:
With limited resources, scientific inquiries must be prioritized for further study, funding, and translation based on their practical significance: whether the effect size is large enough to be meaningful in the real world. Doing so must evaluate a result's effect strength, defined as a conservative assessment of practical significance. We propose the least difference in means ($δ_L$) as a two-sample statistic that can quantify effect strength and perform a hypothesis test to determine if a result has a meaningful effect size. To facilitate consensus, $δ_L$ allows scientists to compare effect strength between related results and choose different thresholds for hypothesis testing without recalculation. Both $δ_L$ and the relative $δ_L$ outperform other candidate statistics in identifying results with higher effect strength. We use real data to demonstrate how the relative $δ_L$ compares effect strength across broadly related experiments. The relative $δ_L$ can prioritize research based on the strength of their results.
Submitted 24 May, 2022;
originally announced May 2022.
-
GeoTyper: Automated Pipeline from Raw scRNA-Seq Data to Cell Type Identification
Authors:
Cecily Wolfe,
Yayi Feng,
David Chen,
Edwin Purcell,
Anne Talkington,
Sepideh Dolatshahi,
Heman Shakeri
Abstract:
The cellular composition of the tumor microenvironment can directly impact cancer progression and the efficacy of therapeutics. Understanding immune cell activity, the body's natural defense mechanism, in the vicinity of cancerous cells is essential for developing beneficial treatments. Single cell RNA sequencing (scRNA-seq) enables the examination of gene expression on an individual cell basis, providing crucial information regarding both the disturbances in cell functioning caused by cancer and cell-cell communication in the tumor microenvironment. This novel technique generates large amounts of data, which require proper processing. Various tools exist to facilitate this processing but need to be organized to standardize the workflow from data wrangling to visualization, cell type identification, and analysis of changes in cellular activity, both from the standpoint of malignant cells and immune stromal cells that eliminate them. We aimed to develop a standardized pipeline (GeoTyper, https://github.com/celineyayifeng/GeoTyper) that integrates multiple scRNA-seq tools for processing raw sequence data extracted from NCBI GEO, visualization of results, statistical analysis, and cell type identification. This pipeline leverages existing tools, such as Cellranger from 10X Genomics, Alevin, and Seurat, to cluster cells and identify cell types based on gene expression profiles. We successfully tested and validated the pipeline on several publicly available scRNA-seq datasets, resulting in clusters corresponding to distinct cell types. By determining the cell types and their respective frequencies in the tumor microenvironment across multiple cancers, this workflow will help quantify changes in gene expression related to cell-cell communication and identify possible therapeutic targets.
Submitted 2 May, 2022;
originally announced May 2022.
-
Using Machine Learning to Evaluate Real Estate Prices Using Location Big Data
Authors:
Walter Coleman,
Ben Johann,
Nicholas Pasternak,
Jaya Vellayan,
Natasha Foutz,
Heman Shakeri
Abstract:
With everyone trying to enter the real estate market nowadays, knowing the proper valuations for residential and commercial properties has become crucial. Past researchers have utilized static real estate data (e.g. number of beds, baths, square footage) or even a combination of real estate and demographic information to predict property prices. In this investigation, we attempted to improve upon past research by exploring a different approach: we wanted to determine if mobile location data could be used to improve the predictive power of popular regression and tree-based models. To prepare our data for our models, we processed the mobility data by attaching it to individual properties from the real estate data that aggregated users within 500 meters of the property for each day of the week. We removed people that lived within 500 meters of each property, so each property's aggregated mobility data only contained non-resident census features. On top of these dynamic census features, we also included static census features, including the number of people in the area, the average proportion of people commuting, and the number of residents in the area. Finally, we tested multiple models to predict real estate prices. Our proposed model is two stacked random forest modules combined using a ridge regression that uses the random forest outputs as predictors. The first random forest model used static features only and the second random forest model used dynamic features only. Comparing our models with and without the dynamic mobile location features shows that the model with dynamic mobile location features achieves 3% lower mean squared error than the same model without dynamic mobile location features.
Submitted 2 May, 2022;
originally announced May 2022.
-
The Most Difference in Means: A Statistic for the Strength of Null and Near-Zero Results
Authors:
Bruce A. Corliss,
Taylor R. Brown,
Tingting Zhang,
Kevin A. Janes,
Heman Shakeri,
Philip E. Bourne
Abstract:
Statistical insignificance does not suggest the absence of effect, yet scientists must often use null results as evidence of negligible (near-zero) effect size to falsify scientific hypotheses. Doing so must assess a result's null strength, defined as the evidence for a negligible effect size. Such an assessment would differentiate strong null results that suggest a negligible effect size from weak null results that suggest a broad range of potential effect sizes. We propose the most difference in means ($δ_M$) as a two-sample statistic that can both quantify null strength and perform a hypothesis test for negligible effect size. To facilitate consensus when interpreting results, our statistic allows scientists to conclude that a result has negligible effect size using different thresholds with no recalculation required. To assist with selecting a threshold, $δ_M$ can also compare null strength between related results. Both $δ_M$ and the relative form of $δ_M$ outperform other candidate statistics in comparing null strength. We compile broadly related results and use the relative $δ_M$ to compare null strength across different treatments, measurement methods, and experiment models. Reporting the relative $δ_M$ may provide a technical solution to the file drawer problem by encouraging the publication of null and near-zero results.
Submitted 24 May, 2022; v1 submitted 4 January, 2022;
originally announced January 2022.
-
A purely data-driven framework for prediction, optimization, and control of networked processes: application to networked SIS epidemic model
Authors:
Ali Tavasoli,
Teague Henry,
Heman Shakeri
Abstract:
Networks are landmarks of many complex phenomena where interweaving interactions between different agents transform simple local rule-sets into nonlinear emergent behaviors. While some recent studies unveil associations between the network structure and the underlying dynamical process, identifying stochastic nonlinear dynamical processes continues to be an outstanding problem. Here we develop a simple data-driven framework based on operator-theoretic techniques to identify and control stochastic nonlinear dynamics taking place over large-scale networks. The proposed approach requires no prior knowledge of the network structure and identifies the underlying dynamics solely using a collection of two-step snapshots of the states. This data-driven system identification is achieved by using the Koopman operator to find a low dimensional representation of the dynamical patterns that evolve linearly. Further, we use the global linear Koopman model to solve critical control problems by applying to model predictive control (MPC)--typically, a challenging proposition when applied to large networks. We show that our proposed approach tackles this by converting the original nonlinear programming into a more tractable optimization problem that is both convex and with far fewer variables.
Submitted 31 July, 2021;
originally announced August 2021.
-
Maximizing the algebraic connectivity in multilayer networks with arbitrary interconnections
Authors:
Ali Tavasoli,
Ehsan Ardjmand,
Heman Shakeri
Abstract:
The second smallest eigenvalue of the Laplacian matrix is determinative in characterizing many network properties and is known as algebraic connectivity. In this paper, we investigate the problem of maximizing algebraic connectivity in multilayer networks by allocating interlink weights subject to a budget while allowing arbitrary interconnections. For budgets below a threshold, we identify an upper-bound for maximum algebraic connectivity which is independent of interconnections pattern and is reachable with satisfying a certain regularity condition. For efficient numerical approaches in regions of no analytical solution, we cast the problem into a convex framework that explores the problem from several perspectives and, particularly, transforms into a graph embedding problem that is easier to interpret and related to the optimum diffusion phase. Allowing arbitrary interconnections entails regions of multiple transitions, giving more diverse diffusion phases with respect to one-to-one interconnection case. When there is no limitation on the interconnections pattern, we derive several analytical results characterizing the optimal weights by individual Fiedler vectors. We use the ratio of algebraic connectivity and the layer sizes to explain the results. Finally, we study the placement of a limited number of interlinks by greedy heuristics, using the Fiedler vector components of each layer.
Submitted 2 September, 2020; v1 submitted 29 August, 2020;
originally announced August 2020.
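As a rough illustration of the quantity studied in the abstract above, the following sketch estimates the algebraic connectivity (the second smallest Laplacian eigenvalue) of a small weighted graph by power iteration on a shifted Laplacian. It is not the authors' optimization method: the function name `algebraic_connectivity`, the shift of twice the maximum weighted degree, and the toy two-layer graph with interlink weight `p` are all assumptions made for illustration.

```python
import math
import random


def algebraic_connectivity(n, weighted_edges, iters=2000, seed=0):
    """Estimate the second smallest Laplacian eigenvalue of an undirected graph.

    n: number of nodes, labelled 0..n-1 (hypothetical convention)
    weighted_edges: list of (u, v, weight) tuples with weight > 0
    """
    deg = [0.0] * n
    adj = [[] for _ in range(n)]
    for u, v, w in weighted_edges:
        deg[u] += w
        deg[v] += w
        adj[u].append((v, w))
        adj[v].append((u, w))

    # Gershgorin bound: every Laplacian eigenvalue is at most 2 * max degree,
    # so shift * I - L is positive semidefinite.
    shift = 2.0 * max(deg)

    def deflate_and_normalize(x):
        # Remove the all-ones component (eigenvalue 0), then scale to unit length.
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        norm = math.sqrt(sum(xi * xi for xi in x))
        return [xi / norm for xi in x] if norm > 0 else None

    rng = random.Random(seed)
    x = deflate_and_normalize([rng.random() for _ in range(n)])
    for _ in range(iters):
        if x is None:
            return 0.0
        # y = (shift * I - L) x, computed row by row from the adjacency lists.
        y = [(shift - deg[i]) * x[i] + sum(w * x[j] for j, w in adj[i])
             for i in range(n)]
        x = deflate_and_normalize(y)
    if x is None:
        return 0.0
    # Rayleigh quotient x^T L x for a unit vector x.
    return sum(w * (x[u] - x[v]) ** 2 for u, v, w in weighted_edges)


# Toy example: two single-edge layers {0, 1} and {2, 3}, joined by
# one-to-one interlinks of weight p (an illustrative choice).
p = 0.5
edges = [(0, 1, 1.0), (2, 3, 1.0), (0, 2, p), (1, 3, p)]
lam2 = algebraic_connectivity(4, edges)
print(round(lam2, 6))
```

For this particular toy graph the Laplacian eigenvalues work out by hand to 0, 2p, 2, and 2 + 2p, so the estimate should approach min(2p, 2). Power iteration is chosen here only to keep the sketch dependency-free; a dense or sparse eigensolver would be the usual choice for larger graphs.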
-
A new method for quantifying network cyclic structure to improve community detection
Authors:
Behnaz Moradi-Jamei,
Heman Shakeri,
Pietro Poggi-Corradini,
Michael J. Higgins
Abstract:
A distinguishing property of communities in networks is that cycles are more prevalent within communities than across communities. Thus, the detection of these communities may be aided through the incorporation of measures of the local "richness" of the cyclic structure. In this paper, we introduce renewal non-backtracking random walks (RNBRW) as a way of quantifying this structure. RNBRW gives a weight to each edge equal to the probability that a non-backtracking random walk completes a cycle with that edge. Hence, edges with larger weights may be thought of as more important to the formation of cycles. Of note, since separate random walks can be performed in parallel, RNBRW weights can be estimated very quickly, even for large graphs. We give simulation results showing that pre-weighting edges through RNBRW may substantially improve the performance of common community detection algorithms. Our results suggest that RNBRW is especially efficient for the challenging case of detecting communities in sparse graphs.
Submitted 11 October, 2019; v1 submitted 2 October, 2019;
originally announced October 2019.
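The edge-weighting idea described in the abstract above can be sketched as a small Monte Carlo simulation: run many non-backtracking random walks, stop each one when it steps onto an already visited node, and credit the edge that closed the cycle. This is a minimal sketch under assumptions, not the authors' implementation: the function name `rnbrw_weights`, the uniform choice of starting edge and direction, the handling of dead ends (the walk is simply discarded), and the normalization by the number of completed cycles are all illustrative choices.

```python
import random
from collections import defaultdict


def rnbrw_weights(edges, n_walks=2000, seed=0):
    """Estimate cycle-completion weights for the edges of a simple undirected graph.

    edges: list of (u, v) pairs with sortable node labels, no self-loops
    Returns a dict mapping each edge (as a sorted tuple) to its estimated weight.
    """
    rng = random.Random(seed)
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    counts = {tuple(sorted(e)): 0 for e in edges}
    for _ in range(n_walks):
        # Start on a uniformly random edge, in a random direction.
        u, v = rng.choice(edges)
        if rng.random() < 0.5:
            u, v = v, u
        visited = {u, v}
        prev, cur = u, v
        while True:
            # Non-backtracking: never step straight back to the previous node.
            options = [w for w in adj[cur] if w != prev]
            if not options:
                break  # dead end, this walk completes no cycle
            nxt = rng.choice(options)
            if nxt in visited:
                # The edge (cur, nxt) closes a cycle; credit it and renew.
                counts[tuple(sorted((cur, nxt)))] += 1
                break
            visited.add(nxt)
            prev, cur = cur, nxt

    total = sum(counts.values())
    return {e: (c / total if total else 0.0) for e, c in counts.items()}


# Toy graph: a triangle 0-1-2 with a pendant edge 2-3.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
weights = rnbrw_weights(toy_edges)
for edge, weight in sorted(weights.items()):
    print(edge, round(weight, 3))
```

In the toy graph the pendant edge (2, 3) lies on no cycle, so it can never be the closing edge and receives weight zero, while the three triangle edges share all of the weight. Because each walk is independent, the loop over walks could be split across workers, which is the parallelism the abstract refers to.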
-
Designing Optimal Multiplex Networks for Certain Laplacian Spectral Properties
Authors:
Heman Shakeri,
Ali Tavassoli,
Ehsan Ardjmand,
Pietro Poggi-Corradini
Abstract:
We discuss the design of interlayer edges in a multiplex network, under a limited budget, with the goal of improving its overall performance. We analyze the following three problems separately; first, we maximize the smallest nonzero eigenvalue, also known as the algebraic connectivity; secondly, we minimize the largest eigenvalue, also known as the spectral radius; and finally, we minimize the spectral width. Maximizing the algebraic connectivity requires identical weights on the interlayer edges for budgets less than a threshold value. However, for larger budgets, the optimal weights are generally non-uniform. The dual formulation transforms the problem into a graph realization (embedding) problem that allows us to give a fuller picture. Namely, before the threshold budget, the optimal realization is one-dimensional with nodes in the same layer embedded to a single point; while, beyond the threshold, the optimal embeddings generally unfold into spaces with dimension bounded by the multiplicity of the algebraic connectivity. Finally, for extremely large budgets the embeddings revert again to lower dimensions. Minimizing the largest eigenvalue is driven by the spectral radius of the individual networks and its corresponding eigenvector. Before a threshold, the total budget is distributed among interlayer edges corresponding to the nodal lines of this eigenvector, and the optimal largest eigenvalue of the Laplacian remains constant. For larger budgets, the weight distribution tends to be almost uniform. In the dual picture, the optimal graph embedding is one-dimensional and non-homogeneous at first and beyond this threshold, the optimal embedding expands to be multi-dimensional, and for larger values of the budget, the two layers fill the embedding space. Finally, we show how these two problems are connected to minimizing the spectral width.
Submitted 28 May, 2020; v1 submitted 4 March, 2019;
originally announced March 2019.
-
A new method for quantifying network cyclic structure to improve community detection
Authors:
Behnaz Moradi,
Heman Shakeri,
Pietro Poggi-Corradini,
Michael Higgins
Abstract:
A distinguishing property of communities in networks is that cycles are more prevalent within communities than across communities. Thus, the detection of these communities may be aided through the incorporation of measures of the local "richness" of the cyclic structure. In this paper, we introduce renewal non-backtracking random walks (RNBRW) as a way of quantifying this structure. RNBRW gives a weight to each edge equal to the probability that a non-backtracking random walk completes a cycle with that edge. Hence, edges with larger weights may be thought of as more important to the formation of cycles. Of note, since separate random walks can be performed in parallel, RNBRW weights can be estimated very quickly, even for large graphs. We give simulation results showing that pre-weighting edges through RNBRW may substantially improve the performance of common community detection algorithms. Our results suggest that RNBRW is especially efficient for the challenging case of detecting communities in sparse graphs.
Submitted 18 October, 2019; v1 submitted 18 May, 2018;
originally announced May 2018.
-
Generalization of Effective Conductance Centrality for Egonetworks
Authors:
Heman Shakeri,
Behnaz Moradi-Jamei,
Pietro Poggi-Corradini,
Nathan Albin,
Caterina Scoglio
Abstract:
We study the popular centrality measure known as effective conductance or in some circles as information centrality. This is an important notion of centrality for undirected networks, with many applications, e.g., for random walks, electrical resistor networks, epidemic spreading, etc. In this paper, we first reinterpret this measure in terms of modulus (energy) of families of walks on the network. This modulus centrality measure coincides with the effective conductance measure on simple undirected networks, and extends it to much more general situations, e.g., directed networks as well. Secondly, we study a variation of this modulus approach in the egocentric network paradigm. Egonetworks are networks formed around a focal node (ego) with a specific order of neighborhoods. We propose efficient analytical and approximate methods for computing these measures on both undirected and directed networks. Finally, we describe a simple method inspired by the modulus point-of-view, called shell degree, which proved to be a useful tool for network science.
Submitted 26 July, 2018; v1 submitted 7 May, 2017;
originally announced May 2017.
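For readers unfamiliar with effective conductance on an undirected network, the classical electrical definition can be sketched as follows. This covers only the standard Laplacian-based computation, not the modulus formulation or the egonetwork variants from the paper; the function name `effective_conductance` and the edge-list input format are assumptions made for this example.

```python
def effective_conductance(n, edges, a, b):
    """Effective conductance between distinct nodes a and b.

    n: number of nodes, labelled 0..n-1.
    edges: list of (u, v, conductance) for an undirected, connected graph.
    Hypothetical helper, for illustration only.
    """
    # Build the weighted graph Laplacian.
    lap = [[0.0] * n for _ in range(n)]
    for u, v, c in edges:
        lap[u][u] += c
        lap[v][v] += c
        lap[u][v] -= c
        lap[v][u] -= c

    # Ground node b (drop its row and column) and inject unit current at a.
    keep = [i for i in range(n) if i != b]
    m = len(keep)
    aug = [[lap[i][j] for j in keep] + [1.0 if i == a else 0.0] for i in keep]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[piv][col]) < 1e-12:
            raise ValueError("graph must be connected")
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(m):
            if r != col:
                f = aug[r][col] / aug[col][col]
                for k in range(col, m + 1):
                    aug[r][k] -= f * aug[col][k]

    # The voltage at a under unit current equals the effective resistance.
    ia = keep.index(a)
    resistance = aug[ia][m] / aug[ia][ia]
    return 1.0 / resistance


if __name__ == "__main__":
    path = [(0, 1, 1.0), (1, 2, 1.0)]
    print(effective_conductance(3, path, 0, 2))
```

On a path of three nodes with unit conductances, the two unit resistors are in series, so the effective conductance between the endpoints is 0.5.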
-
Network clustering and community detection using modulus of families of loops
Authors:
Heman Shakeri,
Pietro Poggi-Corradini,
Nathan Albin,
Caterina Scoglio
Abstract:
We study the structure of loops in networks using the notion of modulus of loop families. We introduce a new measure of network clustering by quantifying the richness of families of (simple) loops. Modulus tries to minimize the expected overlap among loops by spreading the expected link-usage optimally. We propose weighting networks using these expected link-usages to improve classical community detection algorithms. We show that the proposed method enhances the performance of certain algorithms, such as spectral partitioning and modularity maximization heuristics, on standard benchmarks.
Submitted 26 December, 2016; v1 submitted 2 September, 2016;
originally announced September 2016.
-
GEMFsim: A Stochastic Simulator for the Generalized Epidemic Modeling Framework
Authors:
Faryad Darabi Sahneh,
Aram Vajdi,
Heman Shakeri,
Futing Fan,
Caterina Scoglio
Abstract:
The recently proposed generalized epidemic modeling framework (GEMF) \cite{sahneh2013generalized} lays the groundwork for systematically constructing a broad spectrum of stochastic spreading processes over complex networks. This article builds an algorithm for exact, continuous-time numerical simulation of GEMF-based processes. Moreover, the implementation of this algorithm, GEMFsim, is available in popular scientific programming platforms such as MATLAB, R, Python, and C; GEMFsim facilitates simulating stochastic spreading models that fit in the GEMF framework. Using these simulations, one can examine the accuracy of mean-field-type approximations that are commonly used for analytical study of spreading processes on complex networks.
Submitted 7 April, 2016;
originally announced April 2016.
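To make "exact, continuous-time simulation" of a spreading process concrete, here is a generic Gillespie-style sketch of an SIS (susceptible-infected-susceptible) process on a network. It is not GEMFsim's code or API, and it covers only one simple model rather than the full GEMF class; the function name `simulate_sis` and the parameters `beta` (per-link infection rate) and `delta` (recovery rate) are assumptions made for this example.

```python
import random


def simulate_sis(adj, initially_infected, beta, delta, t_max, seed=0):
    """Event-driven SIS simulation; returns a list of (time, infected count).

    adj: dict mapping each node to a list of neighbours.
    Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    infected = set(initially_infected)
    t = 0.0
    history = [(t, len(infected))]
    while True:
        # Enumerate every possible transition and its rate.
        events = []  # (rate, node, becomes_infected)
        if delta > 0:
            for node in infected:
                events.append((delta, node, False))
        for node in adj:
            if node not in infected:
                k = sum(1 for nb in adj[node] if nb in infected)
                if k > 0 and beta > 0:
                    events.append((beta * k, node, True))
        total_rate = sum(rate for rate, _, _ in events)
        if total_rate == 0:
            break  # absorbing state: nothing can happen any more

        # Time to the next event is exponential with the total rate.
        t += rng.expovariate(total_rate)
        if t > t_max:
            break

        # Pick which event fires, with probability proportional to its rate.
        weights = [rate for rate, _, _ in events]
        _, node, becomes_infected = rng.choices(events, weights=weights)[0]
        if becomes_infected:
            infected.add(node)
        else:
            infected.discard(node)
        history.append((t, len(infected)))
    return history


if __name__ == "__main__":
    ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
    for time, count in simulate_sis(ring, [0], beta=1.0, delta=0.5, t_max=5.0):
        print(round(time, 3), count)
```

Averaging the infected count over many such runs gives an empirical curve that could be compared against a mean-field approximation, which is the use case the abstract mentions.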
-
Maximizing Algebraic Connectivity in Interconnected Networks
Authors:
Heman Shakeri,
Nathan Albin,
Faryad Darabi Sahneh,
Pietro Poggi-Corradini,
Caterina Scoglio
Abstract:
Algebraic connectivity, the second eigenvalue of the Laplacian matrix, is a measure of node and link connectivity on networks. When studying interconnected networks it is useful to consider a multiplex model, where the component networks operate together with inter-layer links among them. In order to have a well-connected multilayer structure, it is necessary to optimally design these inter-layer links considering realistic constraints. In this work, we solve the problem of finding an optimal weight distribution for one-to-one inter-layer links under budget constraint. We show that for the special multiplex configurations with identical layers, the uniform weight distribution is always optimal. On the other hand, when the two layers are arbitrary, increasing the budget reveals the existence of two different regimes. Up to a certain threshold budget, the second eigenvalue of the supra-Laplacian is simple, the optimal weight distribution is uniform, and the Fiedler vector is constant on each layer. Increasing the budget past the threshold, the optimal weight distribution can be non-uniform. The interesting consequence of this result is that there is no need to solve the optimization problem when the available budget is less than the threshold, which can be easily found analytically.
Submitted 22 October, 2015;
originally announced October 2015.
-
Developing of New Facets of Indirect Modeling in the Geosciences
Authors:
Hamed Owladeghaffari,
Hadi Shakeri,
Mostafa Sharifzadeh
Abstract:
In this paper, we describe some applications of Self Organizing feature map Neuro-Fuzzy Inference System (SONFIS) and Self Organizing feature map Rough Set (SORST) in the analysis of permeability at a dam site and lost circulation in the drilling of three wells in Iran. Elicitation of the best rules on the information tables, exploration of the dominant structures on the behaviour of systems while they fall into the balance of the second granulation level (rules), and highlighting of the most effective attributes (parameters) on the selected systems are some of the benefits of the proposed methods. In a separate process, using complex network (graph) theory as another method in the not-1:1 modelling branch, the mechanical behaviour of a rock joint has been investigated. Keywords: Information Granules; SONFIS; SORST; Complex Networks; Permeability; Lost Circulation; Mechanical Behavior of a Rock Joint
Submitted 28 August, 2008;
originally announced August 2008.