-
Hyperbolic Benchmarking Unveils Network Topology-Feature Relationship in GNN Performance
Authors:
Roya Aliakbarisani,
Robert Jankowski,
M. Ángeles Serrano,
Marián Boguñá
Abstract:
Graph Neural Networks (GNNs) have excelled in predicting graph properties in various applications ranging from identifying trends in social networks to drug discovery and malware detection. With the abundance of new architectures and increased complexity, GNNs are becoming highly specialized when tested on a few well-known datasets. However, how the performance of GNNs depends on the topological and feature properties of graphs is still an open question. In this work, we introduce a comprehensive benchmarking framework for graph machine learning, focusing on the performance of GNNs across varied network structures. Utilizing the geometric soft configuration model in hyperbolic space, we generate synthetic networks with realistic topological properties and node feature vectors. This approach enables us to assess the impact of network properties, such as topology-feature correlation, degree distributions, local density of triangles (or clustering), and homophily, on the effectiveness of different GNN architectures. Our results highlight the dependency of model performance on the interplay between network structure and node features, providing insights for model selection in various scenarios. This study contributes to the field by offering a versatile tool for evaluating GNNs, thereby assisting in developing and selecting suitable models based on specific data characteristics.
Submitted 4 June, 2024;
originally announced June 2024.
-
Feature-aware ultra-low dimensional reduction of real networks
Authors:
Robert Jankowski,
Pegah Hozhabrierdi,
Marián Boguñá,
M. Ángeles Serrano
Abstract:
In existing models and embedding methods of networked systems, node features describing their qualities are usually overlooked in favor of focusing solely on node connectivity. This study introduces $FiD$-Mercator, a model-based ultra-low dimensional reduction technique that integrates node features with network structure to create $D$-dimensional maps of complex networks in a hyperbolic space. This embedding method efficiently uses features as an initial condition, guiding the search of nodes' coordinates towards an optimal solution. The research reveals that downstream task performance improves with the correlation between network connectivity and features, emphasizing the importance of such correlation for enhancing the description and predictability of real networks. Simultaneously, hyperbolic embedding's ability to reproduce local network properties remains unaffected by the inclusion of features. The findings highlight the necessity for developing network embedding techniques capable of exploiting such correlations to optimize both network structure and feature association jointly in the future.
Submitted 10 June, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Distilling Script Knowledge from Large Language Models for Constrained Language Planning
Authors:
Siyu Yuan,
Jiangjie Chen,
Ziquan Fu,
Xuyang Ge,
Soham Shah,
Charles Robert Jankowski,
Yanghua Xiao,
Deqing Yang
Abstract:
In everyday life, humans often plan their actions by following step-by-step instructions in the form of goal-oriented scripts. Previous work has exploited language models (LMs) to plan for abstract goals of stereotypical activities (e.g., "make a cake"), but leaves more specific goals with multi-facet constraints understudied (e.g., "make a cake for diabetics"). In this paper, we define the task of constrained language planning for the first time. We propose an overgenerate-then-filter approach to improve large language models (LLMs) on this task, and use it to distill a novel constrained language planning dataset, CoScript, which consists of 55,000 scripts. Empirical results demonstrate that our method significantly improves the constrained language planning ability of LLMs, especially on constraint faithfulness. Furthermore, CoScript is demonstrated to be quite effective in endowing smaller LMs with constrained language planning ability.
Submitted 26 May, 2023; v1 submitted 9 May, 2023;
originally announced May 2023.
-
The D-Mercator method for the multidimensional hyperbolic embedding of real networks
Authors:
Robert Jankowski,
Antoine Allard,
Marián Boguñá,
M. Ángeles Serrano
Abstract:
One of the pillars of the geometric approach to networks has been the development of model-based mapping tools that embed real networks in their latent geometry. In particular, the tool Mercator embeds networks into the hyperbolic plane. However, some real networks are better described by the multidimensional formulation of the underlying geometric model. Here, we introduce $D$-Mercator, a model-based embedding method that produces multidimensional maps of real networks into the $(D+1)$-hyperbolic space, where the similarity subspace is represented as a $D$-sphere. We used $D$-Mercator to produce multidimensional hyperbolic maps of real networks and estimated their intrinsic dimensionality in terms of navigability and community structure. Multidimensional representations of real networks are instrumental in the identification of factors that determine connectivity and in elucidating fundamental issues that hinge on dimensionality, such as the presence of universality in critical behavior.
Submitted 14 November, 2023; v1 submitted 13 April, 2023;
originally announced April 2023.
-
Determining crucial factors for the popularity of scientific articles
Authors:
Robert Jankowski,
Julian Sienkiewicz
Abstract:
Using a set of over 70,000 records from the PLOS One journal, consisting of 37 lexical, sentiment and bibliographic variables, we perform an analysis backed by machine learning methods to predict the popularity class of scientific papers, defined by the number of times they have been viewed. Our study shows correlations among the features and recovers a threshold for the number of views that yields the best prediction results in terms of the Matthews correlation coefficient. Moreover, by creating a variable importance plot for a random forest classifier, we are able to reduce the number of features while keeping similar predictability and determine crucial factors responsible for the popularity.
Submitted 27 January, 2020;
originally announced January 2020.
-
Parameter estimation by fixed point of function of information processing intensity
Authors:
Rober Jankowski,
Marcin Makowski,
Edward W. Piotrowski
Abstract:
We present a new method of estimating the dispersion of a distribution which is based on the surprising property of a function that measures information processing intensity. It turns out that this function has a maximum at its fixed point. We use a fixed-point equation to estimate the parameter of the distribution that is of interest to us. We illustrate the estimation method by using the example of an exponential distribution. The codes of programs that calculate the experimental values of the information processing intensity are presented.
Submitted 31 March, 2014;
originally announced April 2014.
-
Oscillatory Properties of Solutions of the Fourth Order Difference Equations with Quasidifferences
Authors:
Robert Jankowski,
Ewa Schmeidel,
Joanna Zonenberg
Abstract:
A class of fourth-order neutral type difference equations with quasidifferences and deviating arguments is considered. Our approach is based on studying the considered equation as a four-dimensional difference system. Sufficient conditions under which the considered equation has no quickly oscillatory solutions are given.
Submitted 31 March, 2014; v1 submitted 12 January, 2014;
originally announced January 2014.
-
Almost Oscillation Criteria for Second Order Neutral Difference Equation with Quasidifferences
Authors:
Ewa Schmeidel,
Robert Jankowski
Abstract:
Using the Riccati transformation techniques, we will extend some almost oscillation criteria for the second-order nonlinear neutral difference equation with quasidifferences $$Δ\left(r_n\left(Δ\left(x_n+c x_{n-k}\right)\right)^γ\right)+q_nx_{n+1}^α=e_n.$$
Submitted 28 February, 2014; v1 submitted 25 September, 2013;
originally announced September 2013.
-
On the existence of bounded solutions for nonlinear second order neutral difference equations
Authors:
Marek Galewski,
Magdalena Nockowska Rosiak,
Robert Jankowski,
Ewa Schmeidel
Abstract:
Using techniques connected with the measure of noncompactness, we investigate the neutral difference equation of the following form \begin{equation*} Δ\left(r_{n}\left(Δ\left(x_{n}+p_{n}x_{n-k}\right) \right) ^γ\right) +q_{n}x_{n}^α+a_{n}f(x_{n})=0, \end{equation*} where $x:{\mathbb{N}}_{0}\rightarrow {\mathbb{R}}$, $a,p,q:{\mathbb{N}}_{0}\rightarrow {\mathbb{R}}$, $r:{\mathbb{N}}_{0}\rightarrow {\mathbb{R}}\setminus \{0\}$, $f\colon {\mathbb{R}}\rightarrow {\mathbb{R}}$ is a continuous function, $k$ is a given positive integer, $γ\leq 1$ is a ratio of odd positive integers, and $α$ is a nonnegative constant. Sufficient conditions for the existence of a bounded solution are obtained. A special type of stability and asymptotic stability are also studied. Some earlier results are generalized. We note that the solution which we obtain does not directly correspond to a fixed point of a certain continuous operator, since it is partially iterated. The method which we develop also allows difference equations with memory to be treated through techniques connected with the measure of noncompactness. Keywords: Difference equation, measures of noncompactness, Darbo's fixed point theorem, boundedness, stability. AMS Subject classification: 39A10, 39A22, 39A30.
Submitted 12 January, 2014; v1 submitted 9 April, 2013;
originally announced April 2013.