-
Transform then Explore: a Simple and Effective Technique for Exploratory Combinatorial Optimization with Reinforcement Learning
Authors:
Tianle Pu,
Changjun Fan,
Mutian Shen,
Yizhou Lu,
Li Zeng,
Zohar Nussinov,
Chao Chen,
Zhong Liu
Abstract:
Many complex problems encountered in both production and daily life can be conceptualized as combinatorial optimization problems (COPs) over graphs. In recent years, reinforcement learning (RL) based models have emerged as a promising direction; they treat COP solving as a heuristic learning problem. However, current finite-horizon-MDP based RL models have inherent limitations. They are not allowed to explore adequately to improve solutions at test time, which may be necessary given the complexity of NP-hard optimization tasks. Some recent attempts address this issue by focusing on reward design and state feature engineering, which are tedious and ad hoc. In this work, we instead propose a much simpler but more effective technique, named gauge transformation (GT). The technique originates from physics, but is very effective in enabling RL agents to explore and continuously improve solutions at test time. Moreover, GT is very simple: it can be implemented in fewer than 10 lines of Python code and can be applied to a vast majority of RL models. Experimentally, we show that traditional RL models with the GT technique achieve state-of-the-art performance on the MaxCut problem. Furthermore, since GT is independent of any particular RL model, it can be seamlessly integrated into various RL frameworks, paving the way for more effective exploration when these models solve general COPs.
Submitted 6 April, 2024;
originally announced April 2024.
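The abstract above does not spell out the transformation, so the following is only a sketch of the textbook gauge transformation for Ising-type models, not the authors' implementation. It assumes the usual mapping of MaxCut to an Ising energy with couplings `J = -w` and shows that re-gauging with the current solution maps that solution onto the all-(+1) reference state while leaving the energy unchanged. The function names `ising_energy` and `gauge_transform` are hypothetical.

```python
import numpy as np

def ising_energy(J, s):
    """Ising energy H = -sum_{i<j} J_ij s_i s_j for symmetric J with zero diagonal."""
    return -0.5 * float(s @ J @ s)

def gauge_transform(J, s, sigma):
    """Apply a gauge sigma (entries +/-1): J_ij -> J_ij*sigma_i*sigma_j, s_i -> s_i*sigma_i."""
    return J * np.outer(sigma, sigma), s * sigma

rng = np.random.default_rng(0)
n = 6
# Random unweighted graph; the MaxCut-to-Ising mapping J = -w is an assumption of this sketch.
upper = np.triu(rng.integers(0, 2, size=(n, n)), k=1).astype(float)
J = -(upper + upper.T)
s = rng.choice([-1, 1], size=n)  # some current solution (side of the cut for each node)

# Choosing sigma = s maps the current solution onto the all-(+1) state.
J_new, s_new = gauge_transform(J, s, sigma=s)
print(s_new, ising_energy(J, s), ising_energy(J_new, s_new))
```

Because every `sigma_i**2` equals 1, each product `J_ij s_i s_j` is unchanged, so a search restarted from the all-(+1) state on `J_new` continues from the same solution quality.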
-
A new nature inspired modularity function adapted for unsupervised learning involving spatially embedded networks: A comparative analysis
Authors:
Raj Kishore,
Zohar Nussinov,
Kisor Kumar Sahu
Abstract:
Unsupervised machine learning methods can be of great help in many traditional engineering disciplines, where huge amounts of labeled data are not readily available or are extremely difficult or costly to generate. Two specific examples include the structure of granular materials and the atomic structure of metallic glasses. While the former is critically important for global industries worth several hundred billion dollars, the latter is still a big puzzle in fundamental science. One thing common to both examples is that the particles are the elements of ensembles embedded in Euclidean space, and one can create a spatially embedded network to represent their key features. Some recent studies show that clustering, which generically refers to unsupervised learning, holds great promise in partitioning these networks. In many complex networks, the spatial information of the nodes plays a very important role in determining the network properties, so understanding the structure of such networks is crucial. We have compared the performance of our newly developed modularity function with some of the well-known modularity functions. We performed this comparison by finding the best partition in 2D and 3D granular assemblies. We show that for the class of networks considered in this article, our method produces much better results than the competing methods.
Submitted 18 July, 2020;
originally announced July 2020.
-
Unsupervised Community Detection with a Potts Model Hamiltonian, an Efficient Algorithmic Solution, and Applications in Digital Pathology
Authors:
Brendon Lutnick,
Wen Dong,
Zohar Nussinov,
Pinaki Sarder
Abstract:
Unsupervised segmentation of large images using a Potts model Hamiltonian is unique in that segmentation is governed by a resolution parameter which scales the sensitivity to small clusters. Here, the input image is first modeled as a graph, which is then segmented by minimizing a Hamiltonian cost function defined on the graph and the respective segments. However, there exists no closed-form solution of this optimization, and using previous iterative algorithmic solution techniques, the problem scales quadratically in the input length. Therefore, while Potts model segmentation gives accurate segmentation, it is grossly underutilized as an unsupervised learning technique. We propose a fast statistical down-sampling of input image pixels based on the respective color features, and a new iterative method to minimize the Potts model energy considering the pixel-to-segment relationship. This method is generalizable and can be extended to image pixel texture features as well as spatial features. We demonstrate that this new method is highly efficient, and outperforms existing methods for Potts model based image segmentation. We demonstrate the application of our method in medical microscopy image segmentation, particularly in segmenting the renal glomerular micro-environment in renal pathology. Our method is not limited to image segmentation, and can be extended to any image/data segmentation/clustering task for arbitrary datasets with discrete features.
Submitted 4 February, 2020;
originally announced February 2020.
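To make the role of the resolution parameter concrete, here is a minimal sketch of a Potts-type cost function on an unweighted graph. The specific form H = -1/2 Σ_{i≠j} [A_ij - γ(1 - A_ij)] δ(σ_i, σ_j) is one form used in the community detection literature and is an assumption here; the abstract does not give the exact Hamiltonian the paper minimizes for images. The name `potts_energy` is hypothetical.

```python
import numpy as np

def potts_energy(A, labels, gamma=1.0):
    """Potts-type energy: present edges inside a segment lower the energy,
    missing edges inside a segment raise it by gamma (the resolution parameter)."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]  # delta(sigma_i, sigma_j)
    np.fill_diagonal(same, False)              # exclude i == j
    weights = A - gamma * (1.0 - A)
    return -0.5 * float(np.sum(weights[same]))

# Two triangles (0,1,2) and (3,4,5) joined by the single edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

print(potts_energy(A, [0, 0, 0, 1, 1, 1]))  # two segments
print(potts_energy(A, [0, 0, 0, 0, 0, 0]))  # one merged segment
```

In this sketch a larger `gamma` penalizes missing internal edges more strongly and so favors smaller, denser segments.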
-
Visual Machine Learning: Insight through Eigenvectors, Chladni patterns and community detection in 2D particulate structures
Authors:
Raj Kishore,
S. Swayamjyoti,
Shreeja Das,
Ajay K. Gogineni,
Zohar Nussinov,
D. Solenov,
Kisor K. Sahu
Abstract:
Machine learning (ML) is quickly emerging as a powerful tool with diverse applications across an extremely broad spectrum of disciplines and commercial endeavors. Typically, ML is used as a black box that provides little illuminating rationalization of its output. In the current work, we aim to better understand the generic intuition underlying unsupervised ML with a focus on physical systems. The systems that are studied here as test cases comprise six different 2-dimensional (2-D) particulate systems of different complexities. It is noted that the findings of this study are generic to any unsupervised ML problem and are not restricted to materials systems alone. Three rudimentary unsupervised ML techniques are employed on the adjacency (connectivity) matrix of the six studied systems: (i) using the principal eigenvalue and eigenvectors of the adjacency matrix, (ii) spectral decomposition, and (iii) a Potts model based community detection technique in which a modularity function is maximized. We demonstrate that, while solving a completely classical problem, the ML technique produces features that are distinctly connected to quantum mechanical solutions. Dissecting these features helps us to understand the deep connection between the classical non-linear world and the quantum mechanical linear world through the kaleidoscope of ML techniques, which might have far-reaching consequences both in the arena of physical sciences and ML.
Submitted 2 January, 2020;
originally announced January 2020.
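Technique (i) in the abstract above, the principal eigenvalue and eigenvector of an adjacency matrix, can be sketched as follows. This is a generic illustration on a toy path graph, not the particulate systems studied in the paper; the helper name `principal_eigenpair` is hypothetical.

```python
import numpy as np

def principal_eigenpair(A):
    """Return the largest eigenvalue of a symmetric adjacency matrix and its eigenvector."""
    vals, vecs = np.linalg.eigh(A)  # eigh returns eigenvalues in ascending order
    v = vecs[:, -1]
    if v.sum() < 0:                 # fix the arbitrary overall sign
        v = -v
    return vals[-1], v

# Path graph on three nodes: 0 - 1 - 2.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])

lam, v = principal_eigenpair(A)
print(lam, v)
```

For a connected graph the entries of this eigenvector all share one sign, and the central node of the path receives the largest weight.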
-
Inference of hidden structures in complex physical systems by multi-scale clustering
Authors:
Z. Nussinov,
P. Ronhovde,
Dandan Hu,
S. Chakrabarty,
M. Sahu,
Bo Sun,
N. A. Mauro,
K. K. Sahu
Abstract:
We survey the application of a relatively new branch of statistical physics, "community detection", to data mining. In particular, we focus on the diagnosis of materials and automated image segmentation. Community detection describes the quest of partitioning a complex system involving many elements into optimally decoupled subsets or communities of such elements. We review a multiresolution variant which is used to ascertain structures at different spatial and temporal scales. Significant patterns are obtained by examining the correlations between different independent solvers. Similar to other combinatorial optimization problems in the NP complexity class, community detection exhibits several phases. Typically, illuminating orders are revealed by choosing parameters that lead to extremal information theory correlations.
Submitted 14 January, 2016; v1 submitted 5 March, 2015;
originally announced March 2015.
-
An interacting replica approach applied to the traveling salesman problem
Authors:
Bo Sun,
Blake Leonard,
Peter Ronhovde,
Zohar Nussinov
Abstract:
We present a physics-inspired heuristic method for solving combinatorial optimization problems. Our approach is specifically motivated by the desire to avoid trapping in metastable local minima, a common occurrence in hard problems with multiple extrema. Our method involves (i) coupling otherwise independent simulations of a system ("replicas") via geometrical distances as well as (ii) probabilistic inference applied to the solutions found by individual replicas. The ensemble of replicas evolves so as to maximize the inter-replica correlation while simultaneously minimizing the local intra-replica cost function (e.g., the total path length in the Traveling Salesman Problem within each replica). We demonstrate how our method improves the performance of rudimentary local optimization schemes long applied to the NP-hard Traveling Salesman Problem. In particular, we apply our method to the well-known "$k$-opt" algorithm and examine two particular cases, $k=2$ and $k=3$. With the aid of geometrical coupling alone, we are able to determine the optimum tour length on systems of up to $280$ cities (an order of magnitude larger than the largest systems typically solved by the bare $k=3$ opt). The probabilistic replica-based inference approach improves $k$-opt even further: it determines the optimal solution of a problem with $318$ cities and finds tours whose total length is close to that of the optimal solutions for other systems with a larger number of cities.
Submitted 14 March, 2016; v1 submitted 27 June, 2014;
originally announced June 2014.
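For readers unfamiliar with the baseline that the abstract above builds on, here is a minimal sketch of the bare $k=2$ local search (2-opt) for the Traveling Salesman Problem. It does not include the replica coupling or the probabilistic inference described in the abstract; the function names and the four-city example are illustrative assumptions.

```python
import math

def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    n = len(tour)
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]]) for i in range(n))

def two_opt(points, tour):
    """Repeatedly reverse a segment whenever swapping two edges shortens the tour."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city, nothing to swap
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                delta = math.dist(a, c) + math.dist(b, d) - math.dist(a, b) - math.dist(c, d)
                if delta < -1e-12:
                    # Replace edges (a,b) and (c,d) by (a,c) and (b,d).
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour

# Corners of the unit square, visited in a self-crossing order.
points = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
best = two_opt(points, [0, 1, 2, 3])
print(best, tour_length(points, best))
```

The search stops at the first tour that no single 2-opt move can shorten, which is the kind of local minimum the replica approach is designed to escape.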
-
Improving the performance of algorithms to find communities in networks
Authors:
Richard K. Darst,
Zohar Nussinov,
Santo Fortunato
Abstract:
Many algorithms to detect communities in networks typically work without any information on the cluster structure to be found, as one has no a priori knowledge of it, in general. Not surprisingly, knowing some features of the unknown partition could help its identification, yielding an improvement of the performance of the method. Here we show that, if the number of clusters were known beforehand, standard methods, like modularity optimization, would considerably gain in accuracy, mitigating the severe resolution bias that undermines the reliability of the results of the original unconstrained version. The number of clusters can be inferred from the spectra of the recently introduced non-backtracking and flow matrices, even in benchmark graphs with realistic community structure. The limitation of such a two-step procedure is the overhead of computing the spectra.
Submitted 1 December, 2014; v1 submitted 15 November, 2013;
originally announced November 2013.
-
Algorithm independent bounds on community detection problems and associated transitions in stochastic block model graphs
Authors:
Richard K. Darst,
David R. Reichman,
Peter Ronhovde,
Zohar Nussinov
Abstract:
We derive rigorous bounds for well-defined community structure in complex networks for a stochastic block model (SBM) benchmark. In particular, we analyze the effect of inter-community "noise" (inter-community edges) on any "community detection" algorithm's ability to correctly group nodes assigned to a planted partition, a problem which has been proven to be NP complete in a standard rendition. Our result does not rely on the use of any one particular algorithm nor on the analysis of the limitations of inference. Rather, we turn the problem on its head and work backwards to examine when, in the first place, well-defined structure may exist in SBMs. The method that we introduce here could potentially be applied to other computational problems. The objective of community detection algorithms is to partition a given network into optimally disjoint subgraphs (or communities). Similar to k-SAT and other combinatorial optimization problems, "community detection" exhibits different phases. Networks that lie in the "unsolvable phase" lack well-defined structure and thus have no partition that is meaningful. Solvable systems splinter into two disparate phases: those in the "hard" phase and those in the "easy" phase. As befits its name, within the easy phase, a partition is easy to achieve by known algorithms. When a network lies in the hard phase, it still has an underlying structure, yet finding a meaningful partition which can be checked in polynomial time requires an exhaustive computational effort that rapidly increases with the size of the graph. When taken together, (i) the rigorous results that we report here on when graphs have an underlying structure and (ii) recent results concerning the limits of rather general algorithms suggest bounds on the hard phase.
Submitted 10 July, 2014; v1 submitted 24 June, 2013;
originally announced June 2013.
-
An edge density definition of overlapping and weighted graph communities
Authors:
Richard K. Darst,
David R. Reichman,
Peter Ronhovde,
Zohar Nussinov
Abstract:
Community detection in networks refers to the process of seeking strongly internally connected groups of nodes which are weakly externally connected. In this work, we introduce and study a community definition based on internal edge density. Beginning with the simple concept that edge density equals number of edges divided by maximal number of edges, we apply this definition to a variety of node and community arrangements to show that our definition yields sensible results. Our community definition is equivalent to that of the Absolute Potts Model community detection method (Phys. Rev. E 81, 046114 (2010)), and the performance of that method validates the usefulness of our definition across a wide variety of network types. We discuss how this definition can be extended to weighted graphs and multigraphs, and how the definition is capable of handling overlapping communities and local algorithms. We further validate our definition against the recently proposed Affiliation Graph Model (arXiv:1205.6228 [cs.SI]) and show that we can precisely solve these benchmarks. More than proposing an end-all community definition, we explain how studying the detailed properties of community definitions is important in order to validate that definitions do not have negative analytic properties. We urge that community definitions be separated from community detection algorithms and propose that community definitions be further evaluated by criteria such as these.
Submitted 14 January, 2013;
originally announced January 2013.
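The starting point named in the abstract above, edge density as the number of edges divided by the maximal number of edges, can be sketched for a simple undirected, unweighted graph as follows. The function name `edge_density` and the choice of n(n-1)/2 as the maximal edge count (no self-loops, no multi-edges) are assumptions of this illustration; the paper's extensions to weighted graphs and multigraphs are not covered.

```python
def edge_density(nodes, edges):
    """Internal edge density of a node set: internal edges / maximal possible edges."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0  # convention chosen for this sketch: no pairs, no density
    # Count each undirected internal edge once, ignoring self-loops.
    internal = {frozenset((u, v)) for u, v in edges
                if u != v and u in nodes and v in nodes}
    return len(internal) / (n * (n - 1) / 2)

# A triangle (0, 1, 2) with a pendant node 3 attached to node 2.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(edge_density({0, 1, 2}, edges))     # the triangle alone
print(edge_density({0, 1, 2, 3}, edges))  # triangle plus pendant node
```

Adding the weakly attached node lowers the density, which is the intuition behind using internal edge density to delimit a community.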
-
Local multiresolution order in community detection
Authors:
Peter Ronhovde,
Zohar Nussinov
Abstract:
Community detection algorithms attempt to find the best clusters of nodes in an arbitrary complex network. Multi-scale ("multiresolution") community detection extends the problem to identify the best network scale(s) for these clusters. The latter task is generally accomplished by analyzing community stability simultaneously for all clusters in the network. In the current work, we extend this general approach to define local multiresolution methods, which enable the extraction of well-defined local communities even if the global community structure is vaguely defined in an average sense. Toward this end, we propose measures analogous to variation of information and normalized mutual information that are used to quantitatively identify the best resolution(s) at the community level based on correlations between clusters in independently-solved systems. We demonstrate our method on two constructed networks as well as a real network and draw inferences about local community strength. Our approach is independent of the applied community detection algorithm save for the inherent requirement that the method be able to identify communities across different network scales, with appropriate changes to account for how different resolutions are evaluated or defined in a particular community detection method. It should, in principle, easily adapt to alternative community comparison measures.
Submitted 18 November, 2014; v1 submitted 24 August, 2012;
originally announced August 2012.
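The abstract above proposes community-level analogues of variation of information and normalized mutual information. As background, here is a minimal sketch of the standard global versions of those two measures for comparing two partitions of the same nodes. It is not the paper's local measures; the function names and the normalization 2I/(H_a + H_b) are assumptions of this illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (natural log) of a partition given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two partitions of the same node set."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y])) for (x, y), c in pab.items())

def nmi(a, b):
    """Normalized mutual information, 2 I(a;b) / (H(a) + H(b))."""
    ha, hb = entropy(a), entropy(b)
    if ha + hb == 0:
        return 1.0  # both partitions are a single cluster
    return 2 * mutual_information(a, b) / (ha + hb)

def variation_of_information(a, b):
    """Variation of information, H(a) + H(b) - 2 I(a;b)."""
    return entropy(a) + entropy(b) - 2 * mutual_information(a, b)

p1 = [0, 0, 1, 1]
p2 = [1, 1, 0, 0]  # same partition as p1 with relabeled clusters
p3 = [0, 1, 0, 1]  # statistically independent of p1
print(nmi(p1, p2), variation_of_information(p1, p2))
print(nmi(p1, p3), variation_of_information(p1, p3))
```

Both measures depend only on how nodes are grouped, not on the label names, which is why they are suited to comparing independently solved replicas.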
-
Automatic Segmentation of Fluorescence Lifetime Microscopy Images of Cells Using Multi-Resolution Community Detection
Authors:
Dandan Hu,
Pinaki Sarder,
Peter Ronhovde,
Sandra Orthaus,
Samuel Achilefu,
Zohar Nussinov
Abstract:
We have developed an automatic method for segmenting fluorescence lifetime (FLT) imaging microscopy (FLIM) images of cells inspired by a multi-resolution community detection (MCD) based network segmentation method. The image processing problem is framed as identifying segments with respective average FLTs against a background in FLIM images. The proposed method segments a FLIM image for a given resolution of the network composed using image pixels as the nodes and similarity between the pixels as the edges. In the resulting segmentation, low network resolution leads to larger segments and high network resolution leads to smaller segments. Further, the mean-square error (MSE) in estimating the FLT segments in a FLIM image using the proposed method was found to be consistently decreasing with increasing resolution of the corresponding network. The proposed MCD method outperformed a popular spectral clustering based method in performing FLIM image segmentation. The spectral segmentation method introduced noisy segments in its output at high resolution. It was unable to offer a consistent decrease in MSE with increasing resolution.
Submitted 7 May, 2013; v1 submitted 22 August, 2012;
originally announced August 2012.
-
A Replica Inference Approach to Unsupervised Multi-Scale Image Segmentation
Authors:
Dandan Hu,
Peter Ronhovde,
Zohar Nussinov
Abstract:
We apply a replica inference based Potts model method to unsupervised image segmentation on multiple scales. This approach was inspired by the statistical mechanics problem of "community detection" and its phase diagram. Specifically, the problem is cast as identifying tightly bound clusters ("communities" or "solutes") against a background or "solvent". Within our multiresolution approach, we compute information theory based correlations among multiple solutions ("replicas") of the same graph over a range of resolutions. Significant multiresolution structures are identified by replica correlations as manifest in information theory overlaps. With the aid of these correlations as well as thermodynamic measures, the phase diagram of the corresponding Potts model is analyzed both at zero and finite temperatures. Optimal parameters corresponding to a sensible unsupervised segmentation correspond to the "easy phase" of the Potts model. Our algorithm is fast and shown to be at least as accurate as the best algorithms to date and to be especially suited to the detection of camouflaged images.
Submitted 28 June, 2011;
originally announced June 2011.
-
A Novel Approach Applied to the Largest Clique Problem
Authors:
Vladimir Gudkov,
Shmuel Nussinov,
Zohar Nussinov
Abstract:
A novel approach to complex problems has been previously applied to graph classification and the graph equivalence problem. Here we apply it to the NP complete problem of finding the largest perfect clique within a graph $G$.
Submitted 17 September, 2002;
originally announced September 2002.