Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference

Authors: Christopher Wolters, Xiaoxuan Yang, Ulf Schlichtmann, Toyotaro Suzumura

Abstract: Large language models (LLMs) have recently transformed natural language processing, enabling machines to generate human-like text and engage in meaningful conversations. This development necessitates speed, efficiency, and accessibility in LLM inference as the computational and memory requirements of these systems grow exponentially. Meanwhile, advancements in computing and memory capabilities are lagging behind, exacerbated by the discontinuation of Moore's law. With LLMs exceeding the capacity of single GPUs, they require complex, expert-level configurations for parallel processing. Memory accesses become significantly more expensive than computation, posing a challenge for efficient scaling, known as the memory wall. Here, compute-in-memory (CIM) technologies offer a promising solution for accelerating AI inference by directly performing analog computations in memory, potentially reducing latency and power consumption. By closely integrating memory and compute elements, CIM eliminates the von Neumann bottleneck, reducing data movement and improving energy efficiency. This survey paper provides an overview and analysis of transformer-based models, reviewing various CIM architectures and exploring how they can address the imminent challenges of modern AI computing systems. We discuss transformer-related operators and their hardware acceleration schemes and highlight challenges, trends, and insights in corresponding CIM designs.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2402.16078 [pdf, other]

Beyond Spatio-Temporal Representations: Evolving Fourier Transform for Temporal Graphs

Authors: Anson Bastos, Kuldeep Singh, Abhishek Nadgeri, Manish Singh, Toyotaro Suzumura

Abstract: We present the Evolving Graph Fourier Transform (EFT), the first invertible spectral transform that captures evolving representations on temporal graphs. We motivate our work by the inadequacy of existing methods for capturing the evolving graph spectra, which are also computationally expensive due to the temporal aspect along with the graph vertex domain. We view the problem as an optimization over the Laplacian of the continuous time dynamic graph. Additionally, we propose pseudo-spectrum relaxations that decompose the transformation process, making it highly computationally efficient. The EFT method adeptly captures the evolving graph's structural and positional properties, making it effective for downstream tasks on evolving graphs. Hence, as a reference implementation, we develop a simple neural model induced with EFT for capturing evolving graph spectra. We empirically validate our theoretical findings on a number of large-scale and standard temporal graph benchmarks and demonstrate that our model achieves state-of-the-art performance.

Submitted 18 April, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

Comments: Accepted as a full conference paper in the International Conference on Learning Representations 2024

arXiv:2310.06489 [pdf]

Deep Learning for Automatic Detection and Facial Recognition in Japanese Macaques: Illuminating Social Networks

Authors: Julien Paulet, Axel Molina, Benjamin Beltzung, Takafumi Suzumura, Shinya Yamamoto, Cédric Sueur

Abstract: Individual identification plays a pivotal role in ecology and ethology, notably as a tool for understanding complex social structures. However, traditional identification methods often involve invasive physical tags and can prove both disruptive for animals and time-intensive for researchers. In recent years, the integration of deep learning in research has offered new methodological perspectives through the automation of complex tasks. Researchers increasingly harness object detection and recognition technologies to identify individuals in video footage. This study represents a preliminary exploration into the development of a non-invasive tool for face detection and individual identification of Japanese macaques (Macaca fuscata) through deep learning. The ultimate goal of this research is to use the identifications made on the dataset to automatically generate a social network representation of the studied population. The current main results are promising: (i) the creation of a Japanese macaque face detector (Faster-RCNN model), reaching an 82.2% accuracy, and (ii) the creation of an individual recognizer for the Kōjima island macaque population (YOLOv8n model), reaching an 83% accuracy. We also created a Kōjima population social network by traditional methods, based on co-occurrences in videos. Thus, we provide a benchmark against which the automatically generated network will be assessed for reliability. These preliminary results are a testament to the potential of this innovative approach to provide the scientific community with a tool for tracking individuals and social network studies in Japanese macaques.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.01224 [pdf, other]

doi 10.1145/3589132.3625644

Revisiting Mobility Modeling with Graph: A Graph Transformer Model for Next Point-of-Interest Recommendation

Authors: Xiaohang Xu, Toyotaro Suzumura, Jiawei Yong, Masatoshi Hanai, Chuang Yang, Hiroki Kanezashi, Renhe Jiang, Shintaro Fukushima

Abstract: Next Point-of-Interest (POI) recommendation plays a crucial role in urban mobility applications. Recently, POI recommendation models based on Graph Neural Networks (GNN) have been extensively studied; however, the effective incorporation of both spatial and temporal information into such GNN-based models remains challenging. Extracting distinct fine-grained features unique to each piece of information is difficult since temporal information often includes spatial information, as users tend to visit nearby POIs. To address the challenge, we propose Mobility Graph Transformer (MobGT) that enables us to fully leverage graphs to capture both the spatial and temporal features in users' mobility patterns. MobGT combines individual spatial and temporal graph encoders to capture unique features and global user-location relations. Additionally, it incorporates a mobility encoder based on Graph Transformer to extract higher-order information between POIs. To address the long-tailed problem in spatial-temporal data, MobGT introduces a novel loss function, Tail Loss. Experimental results demonstrate that MobGT outperforms state-of-the-art models on various datasets and metrics, achieving 24% improvement on average. Our codes are available at https://github.com/Yukayo/MobGT.

Submitted 2 October, 2023; originally announced October 2023.

Comments: Accepted as a full paper of SIGSPATIAL 2023

arXiv:2308.08934 [pdf, ps, other]

On Data Imbalance in Molecular Property Prediction with Pre-training

Authors: Limin Wang, Masatoshi Hanai, Toyotaro Suzumura, Shun Takashige, Kenjiro Taura

Abstract: Revealing and analyzing the various properties of materials is an essential and critical issue in the development of materials, including batteries, semiconductors, catalysts, and pharmaceuticals. Traditionally, these properties have been determined through theoretical calculations and simulations. However, it is not practical to perform such calculations on every single candidate material. Recently, a method combining theoretical calculation and machine learning has emerged that involves training machine learning models on a subset of theoretical calculation results to construct a surrogate model that can be applied to the remaining materials. On the other hand, a technique called pre-training is used to improve the accuracy of machine learning models. Pre-training involves training the model on a pretext task, which is different from the target task, before training the model on the target task. This process aims to extract the input data features, stabilizing the learning process and improving its accuracy. However, in the case of molecular property prediction, there is a strong imbalance in the distribution of input data and features, which may lead to biased learning towards frequently occurring data during pre-training. In this study, we propose an effective pre-training method that addresses the imbalance in input data. We aim to improve the final accuracy by modifying the loss function of the existing representative pre-training method, node masking, to compensate for the imbalance. We have investigated and assessed the impact of our proposed imbalance compensation on pre-training and the final prediction accuracy through experiments and evaluations using a benchmark of molecular property prediction models.

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2308.08129 [pdf, other]

Is Self-Supervised Pretraining Good for Extrapolation in Molecular Property Prediction?

Authors: Shun Takashige, Masatoshi Hanai, Toyotaro Suzumura, Limin Wang, Kenjiro Taura

Abstract: The prediction of material properties plays a crucial role in the development and discovery of materials in diverse applications, such as batteries, semiconductors, catalysts, and pharmaceuticals. Recently, there has been a growing interest in employing data-driven approaches by using machine learning technologies, in combination with conventional theoretical calculations. In material science, the prediction of unobserved values, commonly referred to as extrapolation, is particularly critical for property prediction as it enables researchers to gain insight into materials beyond the limits of available data. However, even with the recent advancements in powerful machine learning models, accurate extrapolation is still widely recognized as a significantly challenging problem. On the other hand, self-supervised pretraining is a machine learning technique where a model is first trained on unlabeled data using relatively simple pretext tasks before being trained on labeled data for target tasks. As self-supervised pretraining can effectively utilize material data without observed property values, it has the potential to improve the model's extrapolation ability. In this paper, we clarify how such self-supervised pretraining can enhance extrapolation performance. We propose an experimental framework for the demonstration and empirically reveal that while models were unable to accurately extrapolate absolute property values, self-supervised pretraining enables them to learn relative tendencies of unobserved property values and improve extrapolation performance.

Submitted 15 August, 2023; originally announced August 2023.

arXiv:2307.06576 [pdf, other]

doi 10.1145/3604915.3608801

Going Beyond Local: Global Graph-Enhanced Personalized News Recommendations

Authors: Boming Yang, Dairui Liu, Toyotaro Suzumura, Ruihai Dong, Irene Li

Abstract: Precisely recommending candidate news articles to users has always been a core challenge for personalized news recommendation systems. Most recent works primarily focus on using advanced natural language processing techniques to extract semantic information from rich textual data, employing content-based methods derived from local historical news. However, this approach lacks a global perspective, failing to account for users' hidden motivations and behaviors beyond semantic information. To address this challenge, we propose a novel model called GLORY (Global-LOcal news Recommendation sYstem), which combines global representations learned from other users with local representations to enhance personalized recommendation systems. We accomplish this by constructing a Global-aware Historical News Encoder, which includes a global news graph and employs gated graph neural networks to enrich news representations, thereby fusing historical news representations by a historical news aggregator. Similarly, we extend this approach to a Global Candidate News Encoder, utilizing a global entity graph and a candidate news aggregator to enhance candidate news representation. Evaluation results on two public news datasets demonstrate that our method outperforms existing approaches. Furthermore, our model offers more diverse recommendations.

Submitted 26 September, 2023; v1 submitted 13 July, 2023; originally announced July 2023.

Comments: Recsys 2023, Best Student Paper

arXiv:2304.09105 [pdf, ps, other]

Exploring 360-Degree View of Customers for Lookalike Modeling

Authors: Md Mostafizur Rahman, Daisuke Kikuta, Satyen Abrol, Yu Hirate, Toyotaro Suzumura, Pablo Loyola, Takuma Ebisu, Manoj Kondapaka

Abstract: Lookalike models are based on the assumption that user similarity plays an important role towards product selling and enhancing the existing advertising campaigns from a very large user base. Challenges associated with these models reside in the heterogeneity of the user base and its sparsity. In this work, we propose a novel framework that unifies customers' different behaviors or features, such as demographics, buying behaviors on different platforms, and customer loyalty behaviors, and builds a lookalike model to improve customer targeting for Rakuten Group, Inc. Extensive experiments on real e-commerce and travel datasets demonstrate the effectiveness of our proposed lookalike model for the user targeting task.

Submitted 17 April, 2023; originally announced April 2023.

Journal ref: SIGIR 2023

arXiv:2301.12929 [pdf, other]

Can Persistent Homology provide an efficient alternative for Evaluation of Knowledge Graph Completion Methods?

Authors: Anson Bastos, Kuldeep Singh, Abhishek Nadgeri, Johannes Hoffart, Toyotaro Suzumura, Manish Singh

Abstract: In this paper we present a novel method, $\textit{Knowledge Persistence}$ ($\mathcal{KP}$), for faster evaluation of Knowledge Graph (KG) completion approaches. Current ranking-based evaluation is quadratic in the size of the KG, leading to long evaluation times and consequently a high carbon footprint. $\mathcal{KP}$ addresses this by representing the topology of the KG completion methods through the lens of topological data analysis, concretely using persistent homology. The characteristics of persistent homology allow $\mathcal{KP}$ to evaluate the quality of the KG completion looking only at a fraction of the data. Experimental results on standard datasets show that the proposed metric is highly correlated with ranking metrics (Hits@N, MR, MRR). Performance evaluation shows that $\mathcal{KP}$ is computationally efficient: In some cases, the evaluation time (validation+test) of a KG completion method has been reduced from 18 hours (using Hits@10) to 27 seconds (using $\mathcal{KP}$), and on average (across methods & data) reduces the evaluation time (validation+test) by $\approx$ $\textbf{99.96}\%$.

Submitted 31 January, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: To appear in proceedings of The Web Conference 2023 (WWW'23)

arXiv:2212.05989

MegaCRN: Meta-Graph Convolutional Recurrent Network for Spatio-Temporal Modeling

Authors: Renhe Jiang, Zhaonan Wang, Jiawei Yong, Puneet Jeph, Quanjun Chen, Yasumasa Kobayashi, Xuan Song, Toyotaro Suzumura, Shintaro Fukushima

Abstract: Spatio-temporal modeling as a canonical task of multivariate time series forecasting has been a significant research topic in the AI community. To address the underlying heterogeneity and non-stationarity implied in the graph streams, in this study, we propose Spatio-Temporal Meta-Graph Learning as a novel Graph Structure Learning mechanism on spatio-temporal data. Specifically, we implement this idea into Meta-Graph Convolutional Recurrent Network (MegaCRN) by plugging the Meta-Graph Learner powered by a Meta-Node Bank into GCRN encoder-decoder. We conduct a comprehensive evaluation on two benchmark datasets (METR-LA and PEMS-BAY) and a large-scale spatio-temporal dataset that contains a variety of non-stationary phenomena. Our model outperformed the state of the art to a large degree on all three datasets (over 27% MAE and 34% RMSE). Besides, through a series of qualitative evaluations, we demonstrate that our model can explicitly disentangle locations and time slots with different patterns and be robustly adaptive to different anomalous situations. Codes and datasets are available at https://github.com/deepkashiwa20/MegaCRN.

Submitted 19 April, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

Comments: Rejected by AIJ. We withdraw for now and shall further work on the manuscript

arXiv:2211.14701 [pdf, other]

Spatio-Temporal Meta-Graph Learning for Traffic Forecasting

Authors: Renhe Jiang, Zhaonan Wang, Jiawei Yong, Puneet Jeph, Quanjun Chen, Yasumasa Kobayashi, Xuan Song, Shintaro Fukushima, Toyotaro Suzumura

Abstract: Traffic forecasting as a canonical task of multivariate time series forecasting has been a significant research topic in the AI community. To address the spatio-temporal heterogeneity and non-stationarity implied in the traffic stream, in this study, we propose Spatio-Temporal Meta-Graph Learning as a novel Graph Structure Learning mechanism on spatio-temporal data. Specifically, we implement this idea into Meta-Graph Convolutional Recurrent Network (MegaCRN) by plugging the Meta-Graph Learner powered by a Meta-Node Bank into GCRN encoder-decoder. We conduct a comprehensive evaluation on two benchmark datasets (i.e., METR-LA and PEMS-BAY) and a new large-scale traffic speed dataset called EXPY-TKY that covers 1843 expressway road links in Tokyo. Our model outperformed the state of the art on all three datasets. Besides, through a series of qualitative evaluations, we demonstrate that our model can explicitly disentangle the road links and time slots with different patterns and be robustly adaptive to any anomalous traffic situations. Codes and datasets are available at https://github.com/deepkashiwa20/MegaCRN.

Submitted 6 March, 2023; v1 submitted 26 November, 2022; originally announced November 2022.

Comments: Accepted by AAAI 2023

arXiv:2211.11979 [pdf, other]

Learnable Spectral Wavelets on Dynamic Graphs to Capture Global Interactions

Authors: Anson Bastos, Abhishek Nadgeri, Kuldeep Singh, Toyotaro Suzumura, Manish Singh

Abstract: Learning on evolving (dynamic) graphs has caught the attention of researchers as static methods exhibit limited performance in this setting. The existing methods for dynamic graphs learn spatial features by local neighborhood aggregation, which essentially only captures the low pass signals and local interactions. In this work, we go beyond current approaches to incorporate global features for effectively learning representations of a dynamically evolving graph. We propose to do so by capturing the spectrum of the dynamic graph. Since static methods to learn the graph spectrum would not consider the history of the evolution of the spectrum as the graph evolves with time, we propose a novel approach to learn the graph wavelets to capture these evolving spectra. Further, we propose a framework that integrates the dynamically captured spectra in the form of these learnable wavelets into spatial features for incorporating local and global interactions. Experiments on eight standard datasets show that our method significantly outperforms related methods on various tasks for dynamic graphs.

Submitted 21 November, 2022; originally announced November 2022.

Comments: Accepted for publication in AAAI 2023

arXiv:2206.14024 [pdf, other]

doi 10.1063/5.0129791

BOTAN: BOnd TArgeting Network for prediction of slow glassy dynamics by machine learning relative motion

Authors: Hayato Shiba, Masatoshi Hanai, Toyotaro Suzumura, Takashi Shimokawabe

Abstract: Recent developments in machine learning have enabled accurate predictions of the dynamics of slow structural relaxation in glass-forming systems. However, existing machine-learning models for these tasks are mostly designed such that they learn a single dynamic quantity and relate it to the structural features of glassy liquids. In this study, we propose a graph neural network model, "BOnd TArgeting Network (BOTAN)", that learns relative motion between neighboring pairs of particles, in addition to the self-motion of particles. By relating the structural features to these two different dynamical variables, the model autonomously acquires the ability to discern how different dynamical processes, strain fluctuations and particle rearrangements, affect the self-motion of particles undergoing slow relaxation, and thus can predict with high precision how slow structural relaxation develops in space and time.

Submitted 2 February, 2023; v1 submitted 28 June, 2022; originally announced June 2022.

Comments: 11 pages, 10 figures, a typo fixed in the title; accepted in J. Chem. Phys

Journal ref: J. Chem. Phys. 158, 084503 (2023)

arXiv:2205.12102 [pdf, other]

KQGC: Knowledge Graph Embedding with Smoothing Effects of Graph Convolutions for Recommendation

Authors: Daisuke Kikuta, Toyotaro Suzumura, Md Mostafizur Rahman, Yu Hirate, Satyen Abrol, Manoj Kondapaka, Takuma Ebisu, Pablo Loyola

Abstract: Leveraging graphs on recommender systems has gained popularity with the development of graph representation learning (GRL). In particular, knowledge graph embedding (KGE) and graph neural networks (GNNs) are representative GRL approaches, which have achieved state-of-the-art performance on several recommendation tasks. Furthermore, the combination of KGE and GNNs (KG-GNNs) has been explored and found effective in much of the academic literature. One of the main characteristics of GNNs is their ability to retain structural properties among neighbors in the resulting dense representation, which is usually coined as smoothing. The smoothing is especially desired in the presence of homophilic graphs, such as the ones we find on recommender systems. In this paper, we propose a new model for recommender systems named Knowledge Query-based Graph Convolution (KQGC). In contrast to existing KG-GNNs, KQGC focuses on the smoothing, and leverages a simple linear graph convolution for smoothing KGE. A pre-trained KGE is fed into KQGC, and it is smoothed by aggregating neighbor knowledge queries, which allow entity-embeddings to be aligned on appropriate vector points for smoothing KGE effectively. We apply the proposed KQGC to a recommendation task that targets prospective users for specific products. Extensive experiments on a real E-commerce dataset demonstrate the effectiveness of KQGC.

Submitted 23 May, 2022; originally announced May 2022.

Comments: 9 pages, 6 figures

arXiv:2203.14188 [pdf, ps, other]

mdx: A Cloud Platform for Supporting Data Science and Cross-Disciplinary Research Collaborations

Authors: Toyotaro Suzumura, Akiyoshi Sugiki, Hiroyuki Takizawa, Akira Imakura, Hiroshi Nakamura, Kenjiro Taura, Tomohiro Kudoh, Toshihiro Hanawa, Yuji Sekiya, Hiroki Kobayashi, Shin Matsushima, Yohei Kuga, Ryo Nakamura, Renhe Jiang, Junya Kawase, Masatoshi Hanai, Hiroshi Miyazaki, Tsutomu Ishizaki, Daisuke Shimotoku, Daisuke Miyamoto, Kento Aida, Atsuko Takefusa, Takashi Kurimoto, Koji Sasayama, Naoya Kitagawa, et al. (8 additional authors not shown)

Abstract: The growing amount of data and advances in data science have created a need for a new kind of cloud platform that provides users with flexibility, strong security, and the ability to couple with supercomputers and edge devices through high-performance networks. We have built such a nation-wide cloud platform, called "mdx", to meet this need. The mdx platform's virtualization service, jointly operated by 9 national universities and 2 national research institutes in Japan, launched in 2021, and more features are in development. Currently mdx is used by researchers in a wide variety of domains, including materials informatics, geo-spatial information science, life science, astronomical science, economics, social science, and computer science. This paper provides an overview of the mdx platform, details the motivation for its development, reports its current status, and outlines its future plans.

Submitted 26 March, 2022; originally announced March 2022.

arXiv:2203.12363 [pdf, other]

Ethereum Fraud Detection with Heterogeneous Graph Neural Networks

Authors: Hiroki Kanezashi, Toyotaro Suzumura, Xin Liu, Takahiro Hirofuchi

Abstract: While transactions with cryptocurrencies such as Ethereum are becoming more prevalent, fraud and other criminal transactions are not uncommon. Graph analysis algorithms and machine learning techniques detect suspicious transactions that lead to phishing in large transaction networks. Many graph neural network (GNN) models have been proposed to apply deep learning techniques to graph structures. Although there is research on phishing detection using GNN models in the Ethereum transaction network, models that address the scale of the number of vertices and edges and the imbalance of labels have not yet been studied. In this paper, we compared the model performance of GNN models on the actual Ethereum transaction network dataset and phishing reported label data to exhaustively compare and verify which GNN models and hyperparameters produce the best accuracy. Specifically, we evaluated the model performance of representative homogeneous GNN models which consider single-type nodes and edges and heterogeneous GNN models which support different types of nodes and edges. We showed that heterogeneous models had better model performance than homogeneous models. In particular, the RGCN model achieved the best performance in the overall metrics.

Submitted 4 July, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

Comments: 8 pages, 5 figures, Accepted to KDD'22 Workshop on Mining and Learning with Graphs

arXiv:2201.09332 [pdf, other]

How Expressive are Transformers in Spectral Domain for Graphs?

Authors: Anson Bastos, Abhishek Nadgeri, Kuldeep Singh, Hiroki Kanezashi, Toyotaro Suzumura, Isaiah Onando Mulang'

Abstract: The recent works proposing transformer-based models for graphs have proven the inadequacy of Vanilla Transformer for graph representation learning. To understand this inadequacy, there is a need to investigate if spectral analysis of the transformer will reveal insights into its expressive power. Similar studies already established that spectral analysis of Graph neural networks (GNNs) provides extra perspectives on their expressiveness. In this work, we systematically study and establish the link between the spatial and spectral domain in the realm of the transformer. We further provide a theoretical analysis and prove that the spatial attention mechanism in the transformer cannot effectively capture the desired frequency response, thus, inherently limiting its expressiveness in spectral space. Therefore, we propose FeTA, a framework that aims to perform attention over the entire graph spectrum (i.e., actual frequency components of the graphs) analogous to the attention in spatial space. Empirical results suggest that FeTA provides homogeneous performance gain against vanilla transformer across all tasks on standard benchmarks and can easily be extended to GNN-based models with low-pass characteristics (e.g., GAT).

Submitted 15 July, 2022; v1 submitted 23 January, 2022; originally announced January 2022.

Comments: Accepted in Transactions on Machine Learning Research

arXiv:2109.07893 [pdf, other]

doi 10.1145/3458817.3480858

Efficient Scaling of Dynamic Graph Neural Networks

Authors: Venkatesan T. Chakaravarthy, Shivmaran S. Pandian, Saurabh Raje, Yogish Sabharwal, Toyotaro Suzumura, Shashanka Ubaru

Abstract: We present distributed algorithms for training dynamic Graph Neural Networks (GNN) on large scale graphs spanning multi-node, multi-GPU systems. To the best of our knowledge, this is the first scaling study on dynamic GNN. We devise mechanisms for reducing the GPU memory usage and identify two execution time bottlenecks: CPU-GPU data transfer and communication volume. Exploiting properties of dynamic graphs, we design a graph difference-based strategy to significantly reduce the transfer time. We develop a simple but effective data distribution technique under which the communication volume remains fixed and linear in the input size, for any number of GPUs. Our experiments using billion-size graphs on a system of 128 GPUs show that: (i) the distribution scheme achieves up to 30x speedup on 128 GPUs; (ii) the graph-difference technique reduces the transfer time by a factor of up to 4.1x and the overall execution time by up to 40%.

Submitted 16 September, 2021; originally announced September 2021.

Comments: Conference version to appear in the proceedings of SC'21

ACM Class: I.2.6; C.2.4

arXiv:2105.10094 [pdf, ps, other]

Finding All Bounded-Length Simple Cycles in a Directed Graph

Authors: Anshul Gupta, Toyotaro Suzumura

Abstract: A new efficient algorithm is presented for finding all simple cycles that satisfy a length constraint in a directed graph. When the number of vertices is non-trivial, most cycle-finding problems are of practical interest for sparse graphs only. We show that for a class of sparse graphs in which the vertex degrees are almost uniform, our algorithm can find all cycles of length less than or equal to $k$ in $O((c+n)(k-1)d^k)$ steps, where $n$ is the number of vertices, $c$ is the total number of cycles discovered, $d$ is the average degree of the graph's vertices, and $k > 1$. While our analysis for the running time addresses only a class of sparse graphs, we provide empirical and experimental evidence of the efficiency of the algorithm for general sparse graphs. This algorithm is a significant improvement over the only other deterministic algorithm for this problem known to us; it also lends itself to massive parallelism. Experimental results of a serial implementation on some large real-world graphs are presented.

Submitted 26 May, 2021; v1 submitted 20 May, 2021; originally announced May 2021.

ACM Class: G.2.2; I.1.2
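
To make the problem statement in the abstract above concrete, here is a minimal brute-force sketch that enumerates every simple cycle with at most $k$ edges by depth-first search. It is not the algorithm from the paper and makes no claim to the stated $O((c+n)(k-1)d^k)$ bound. The function name `bounded_simple_cycles` and the adjacency-dictionary input format are assumptions chosen for illustration, and the graph is assumed to have comparable vertex labels and no parallel edges.

```python
def bounded_simple_cycles(adj, k):
    """Return all simple cycles with at most k edges in a directed graph.

    adj: dict mapping each vertex to an iterable of its successors
         (hypothetical input format chosen for this sketch).
    Each cycle is reported once, as a vertex list that starts at the
    smallest vertex on the cycle.
    """
    cycles = []

    def dfs(start, node, path, on_path):
        for nxt in adj.get(node, ()):
            if nxt == start:
                # Closing edge found: the cycle has len(path) edges.
                cycles.append(list(path))
            elif nxt > start and nxt not in on_path and len(path) < k:
                # Only visit vertices larger than start so that every
                # cycle is discovered from its smallest vertex only.
                on_path.add(nxt)
                path.append(nxt)
                dfs(start, nxt, path, on_path)
                path.pop()
                on_path.remove(nxt)

    for s in sorted(adj):
        dfs(s, s, [s], {s})
    return cycles


if __name__ == "__main__":
    graph = {0: [1], 1: [2, 0], 2: [0, 3], 3: [1]}
    print(sorted(bounded_simple_cycles(graph, 3)))
```

The path length is capped at `k` vertices, so the search depth is bounded by the length constraint rather than by the graph size.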

arXiv:2103.14620 [pdf, other]

LiGCN: Label-interpretable Graph Convolutional Networks for Multi-label Text Classification

Authors: Irene Li, Aosong Feng, Hao Wu, Tianxiao Li, Toyotaro Suzumura, Ruihai Dong

Abstract: Multi-label text classification (MLTC) is an attractive and challenging task in natural language processing (NLP). Compared with single-label text classification, MLTC has a wider range of applications in practice. In this paper, we propose a label-interpretable graph convolutional network model to solve the MLTC problem by modeling tokens and labels as nodes in a heterogeneous graph. In this way, we are able to take into account multiple relationships including token-level relationships. Besides, the model allows better interpretability for predicted labels as the token-label edges are exposed. We evaluate our method on four real-world datasets and it achieves competitive scores against selected baseline methods. Specifically, this model achieves a gain of 0.14 on the F1 score in the small label set MLTC, and 0.07 in the large label set scenario.

Submitted 22 May, 2022; v1 submitted 26 March, 2021; originally announced March 2021.

Comments: 8 tables, 3 figures

Journal ref: DLG4NLP Workshop, NAACL 2022

arXiv:2101.07026 [pdf, ps, other]

Time-Efficient and High-Quality Graph Partitioning for Graph Dynamic Scaling

Authors: Masatoshi Hanai, Nikos Tziritas, Toyotaro Suzumura, Wentong Cai, Georgios Theodoropoulos

Abstract: The dynamic scaling of distributed computations plays an important role in the utilization of elastic computational resources, such as the cloud. It enables the provisioning and de-provisioning of resources to match dynamic resource availability and demands. In the case of distributed graph processing, changing the number of the graph partitions while maintaining high partitioning quality imposes serious computational overheads as typically a time-consuming graph partitioning algorithm needs to execute each time repartitioning is required. In this paper, we propose a dynamic scaling method that can efficiently change the number of graph partitions while keeping its quality high. Our idea is based on two techniques: preprocessing and very fast edge partitioning, called graph edge ordering and chunk-based edge partitioning, respectively. The former converts the graph data into an ordered edge list in such a way that edges with high locality are closer to each other. The latter immediately divides the ordered edge list into an arbitrary number of high-quality partitions. The evaluation with the real-world billion-scale graphs demonstrates that our proposed approach significantly reduces the repartitioning time, while the partitioning quality it achieves is on par with that of the best existing static method.

Submitted 18 January, 2021; originally announced January 2021.

Comments: 21 pages, 15 figures. Under review
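
The chunk-based step described in the abstract above can be sketched in a few lines, assuming the edge list has already been ordered by some locality-preserving preprocessing (the ordering itself is not reproduced here). The names `chunk_edge_partition` and `replication_factor` are hypothetical, and the replication factor is a commonly used edge-partitioning quality measure rather than one the abstract names.

```python
def chunk_edge_partition(ordered_edges, num_partitions):
    """Split an already-ordered edge list into contiguous, near-equal chunks.

    ordered_edges: list of (u, v) tuples, assumed to be pre-ordered so that
                   edges with high locality sit close together.
    num_partitions: any positive integer; changing it only changes the
                    slice boundaries, no reordering is needed.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    base, extra = divmod(len(ordered_edges), num_partitions)
    partitions = []
    start = 0
    for i in range(num_partitions):
        # The first `extra` chunks take one additional edge.
        size = base + (1 if i < extra else 0)
        partitions.append(ordered_edges[start:start + size])
        start += size
    return partitions


def replication_factor(partitions):
    """Average number of partitions each vertex appears in (lower is better)."""
    vertex_parts = {}
    for pid, part in enumerate(partitions):
        for u, v in part:
            vertex_parts.setdefault(u, set()).add(pid)
            vertex_parts.setdefault(v, set()).add(pid)
    if not vertex_parts:
        return 0.0
    return sum(len(p) for p in vertex_parts.values()) / len(vertex_parts)


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    for k in (2, 3):
        parts = chunk_edge_partition(edges, k)
        print(k, [len(p) for p in parts], replication_factor(parts))
```

Because repartitioning reduces to recomputing slice boundaries, the cost of changing the partition count is linear in the number of edges; the quality of the result depends entirely on how good the precomputed ordering is.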

arXiv:2007.07095 [pdf, other]

Private Sources of Mobility Data Under COVID-19

Authors: Raquel Pérez Arnal, David Conesa, Sergio Alvarez-Napagao, Toyotaro Suzumura, Martí Català, Enric Alvarez, Dario Garcia-Gasulla

Abstract: The COVID-19 pandemic is changing the world in unprecedented and unpredictable ways. Human mobility is at the epicenter of that change, as the greatest facilitator for the spread of the virus. To study the change in mobility, to evaluate the efficiency of mobility restriction policies, and to facilitate a better response to possible future crises, we need to properly understand all mobility data sources at our disposal. Our work is dedicated to the study of private mobility sources, gathered and released by large technological companies. This data is of special interest because, unlike most public sources, it is focused on people, not transportation means; i.e., its unit of measurement is the closest thing to a person in a western society: a phone. Furthermore, the sample of society they cover is large and representative. On the other hand, this sort of data is not directly accessible for anonymity reasons. Thus, properly interpreting its patterns demands caution. Aware of that, we set forth to explore the behavior and inter-relations of private sources of mobility data in the context of Spain. This country represents a good experimental setting because of its large and fast pandemic peak, and for its implementation of a sustained, generalized lockdown. We find private mobility sources to be both correlated and complementary. Using them, we evaluate the efficiency of implemented policies, and provide insights into what new normal means in Spain.

Submitted 14 July, 2020; originally announced July 2020.

Comments: 14 pages, 8 figures, 1 table

arXiv:2006.05573 [pdf, other]

Global Data Science Project for COVID-19

Authors: Toyotaro Suzumura, Dario Garcia-Gasulla, Sergio Alvarez Napagao, Irene Li, Hiroshi Maruyama, Hiroki Kanezashi, Raquel Pérez-Arnal, Kunihiko Miyoshi, Euma Ishii, Keita Suzuki, Sayaka Shiba, Mariko Kurokawa, Yuta Kanzawa, Naomi Nakagawa, Masatoshi Hanai, Yixin Li, Tianxiao Li

Abstract: This paper aims at providing the summary of the Global Data Science Project (GDSC) for COVID-19 as of May 31, 2020. COVID-19 has largely impacted our societies through both direct and indirect effects transmitted by the policy measures to counter the spread of viruses. We quantitatively analysed the multifaceted impacts of the COVID-19 pandemic on our societies including people's mobility, health, and social behaviour changes. People's mobility has changed significantly due to the implementation of travel restriction and quarantine measures. Indeed, the physical distance has widened at international (cross-border), national and regional level. At international level, due to the travel restrictions, the number of international flights has plunged overall at around 88 percent during March. In particular, the number of flights connecting Europe dropped drastically in mid-March after the United States announced travel restrictions to Europe and the EU and participating countries agreed to close borders, at 84 percent decline compared to March 10th. Similarly, we examined the impacts of quarantine measures in three major cities: Tokyo (Japan), New York City (the United States), and Barcelona (Spain). Within all three cities, we found a significant decline in traffic volume. We also identified the increased concern for mental health through the analysis of posts on social networking services such as Twitter and Instagram.
Notably, in the beginning of April 2020, the number of posts with #depression on Instagram doubled, which might reflect the rise in mental health awareness among Instagram users. Besides, we identified the changes in a wide range of people's social behaviors, as well as economic impacts through the analysis of Instagram data and primary survey data.

Submitted 3 August, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

Comments: 42 pages, 49 figures

arXiv:2006.02950 [pdf, other]

The Impact of COVID-19 on Flight Networks

Authors: Toyotaro Suzumura, Hiroki Kanezashi, Mishal Dholakia, Euma Ishii, Sergio Alvarez Napagao, Raquel Pérez-Arnal, Dario Garcia-Gasulla, Toshiaki Murofushi

Abstract: As COVID-19 transmissions spread worldwide, governments have announced and enforced travel restrictions to prevent further infections. Such restrictions have a direct effect on the volume of international flights among these countries, resulting in extensive social and economic costs. To better understand the situation in a quantitative manner, we used the Opensky network data to clarify flight patterns and flight densities around the world and observe relationships between flight numbers and new infections, and between flight numbers and the economy (unemployment rate) in Barcelona. We found that the number of daily flights gradually decreased and suddenly dropped 64% during the second half of March in 2020 after the US and Europe enacted travel restrictions. We also observed a 51% decrease in the global flight network density during this period. Regarding new COVID-19 cases, the world had an unexpected surge regardless of travel restrictions. Finally, the layoffs for temporary workers in the tourism and airplane business increased by 4.3 fold in the weeks following Spain's decision to close its borders.

Submitted 14 February, 2021; v1 submitted 4 June, 2020; originally announced June 2020.

Comments: 12 pages, 42 figures. Toyotaro Suzumura and Hiroki Kanezashi contributed equally to this work

arXiv:2005.12873 [pdf, other]

Benchmarking Graph Data Management and Processing Systems: A Survey

Authors: Miyuru Dayarathna, Toyotaro Suzumura

Abstract: The development of scalable, representative, and widely adopted benchmarks for graph data systems has been a question for which answers have been sought for decades. We conduct an in-depth study of the existing literature on benchmarks for graph data management and processing, covering 20 different benchmarks developed during the last 15 years. We categorize the benchmarks into three areas focusing on benchmarks for graph processing systems, graph database benchmarks, and big data benchmarks with graph processing workloads. This systematic approach allows us to identify multiple issues existing in this area, including i) few benchmarks exist which can produce high workload scenarios, ii) no significant work has been done on benchmarking graph stream processing as well as graph based machine learning, iii) benchmarks tend to use conventional metrics even though new meaningful metrics have been around for years, iv) an increasing number of big data benchmarks appear with graph processing workloads. Following these observations, we conclude the survey by describing key challenges for future research on graph data systems benchmarking.

Submitted 22 September, 2021; v1 submitted 26 May, 2020; originally announced May 2020.

Comments: 26 pages, 5 figures

ACM Class: A.1; E.1; H.2

arXiv:2004.10899 [pdf, other]

What are We Depressed about When We Talk about COVID19: Mental Health Analysis on Tweets Using Natural Language Processing

Authors: Irene Li, Yixin Li, Tianxiao Li, Sergio Alvarez-Napagao, Dario Garcia-Gasulla, Toyotaro Suzumura

Abstract: The outbreak of coronavirus disease 2019 (COVID-19) recently has affected human life to a great extent. Besides direct physical and economic threats, the pandemic also indirectly impacts people's mental health conditions, which can be overwhelming but difficult to measure. The problem may come from various reasons such as unemployment status, stay-at-home policy, fear of the virus, and so forth. In this work, we focus on applying natural language processing (NLP) techniques to analyze tweets in terms of mental health. We trained deep models that classify each tweet into the following emotions: anger, anticipation, disgust, fear, joy, sadness, surprise and trust. We build the EmoCT (Emotion-Covid19-Tweet) dataset for the training purpose by manually labeling 1,000 English tweets. Furthermore, we propose and compare two methods to find out the reasons that are causing sadness and fear.

Submitted 8 June, 2020; v1 submitted 22 April, 2020; originally announced April 2020.

Comments: 7 pages, 7 figures

arXiv:1912.07701 [pdf, ps, other]

Exploring Multi-Banking Customer-to-Customer Relations in AML Context with Poincaré Embeddings

Authors: Lucia Larise Stavarache, Donatas Narbutis, Toyotaro Suzumura, Ray Harishankar, Augustas Žaltauskas

Abstract: In recent years money laundering schemes have grown in complexity and speed of realization, affecting financial institutions and millions of customers globally. Strengthened privacy policies, along with in-country regulations, make it hard for banks to inner- and cross-share, and report suspicious activities for the AML (Anti-Money Laundering) measures. Existing topologies and models for AML analysis and information sharing are subject to major limitations, such as compliance with regulatory constraints, extended infrastructure to run high-computation algorithms, data quality and span, proving cumbersome and costly to execute, federate, and interpret. This paper proposes a new topology for exploring multi-banking customer social relations in AML context -- customer-to-customer, customer-to-transaction, and transaction-to-transaction -- using a 3D modeling topological algebra formulated through Poincaré embeddings.

Submitted 22 June, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

Comments: NeurIPS 2019 Workshop on Robust AI in Financial Services (https://sites.google.com/view/robust-ai-in-fs-2019)

arXiv:1909.12946 [pdf, other]

Towards Federated Graph Learning for Collaborative Financial Crimes Detection

Authors: Toyotaro Suzumura, Yi Zhou, Nathalie Baracaldo, Guangnan Ye, Keith Houck, Ryo Kawahara, Ali Anwar, Lucia Larise Stavarache, Yuji Watanabe, Pablo Loyola, Daniel Klyashtorny, Heiko Ludwig, Kumar Bhaskaran

Abstract: Financial crime is a large and growing problem, in some way touching almost every financial institution. Financial institutions are the front line in the war against financial crime and accordingly, must devote substantial human and technology resources to this effort. Current processes to detect financial misconduct have limitations in their ability to effectively differentiate between malicious behavior and ordinary financial activity. These limitations tend to result in gross over-reporting of suspicious activity that necessitates time-intensive and costly manual review. Advances in technology used in this domain, including machine learning based approaches, can improve upon the effectiveness of financial institutions' existing processes. However, a key challenge that most financial institutions continue to face is that they address financial crimes in isolation without any insight from other firms. Where financial institutions address financial crimes through the lens of their own firm, perpetrators may devise sophisticated strategies that may span across institutions and geographies. Financial institutions continue to work relentlessly to advance their capabilities, forming partnerships across institutions to share insights, patterns and capabilities. These public-private partnerships are subject to stringent regulatory and data privacy requirements, thereby making it difficult to rely on traditional technology solutions.
In this paper, we propose a methodology to share key information across institutions by using a federated graph learning platform that enables us to build more accurate machine learning models by leveraging federated learning and also graph learning approaches. We demonstrated that our federated model outperforms the local model by 20% with the UK FCA TechSprint data set. This new platform opens up a door to efficiently detecting global money laundering activity.

Submitted 2 October, 2019; v1 submitted 19 September, 2019; originally announced September 2019.

arXiv:1909.10660 [pdf, other]

Exploring Graph Neural Networks for Stock Market Predictions with Rolling Window Analysis

Authors: Daiki Matsunaga, Toyotaro Suzumura, Toshihiro Takahashi

Abstract: Recently, there has been a surge of interest in the use of machine learning to help aid in the accurate predictions of financial markets. Despite the exciting advances in this cross-section of finance and AI, many of the current approaches are limited to using technical analysis to capture historical trends of each stock price and thus limited to certain experimental setups to obtain good prediction results. On the other hand, professional investors additionally use their rich knowledge of inter-market and inter-company relations to map the connectivity of companies and events, and use this map to make better market predictions. For instance, they would predict the movement of a certain company's stock price based not only on its former stock price trends but also on the performance of its suppliers or customers, the overall industry, macroeconomic factors and trade policies. This paper investigates the effectiveness of work at the intersection of market predictions and graph neural networks, which hold the potential to mimic the ways in which investors make decisions by incorporating company knowledge graphs directly into the predictive model. The main goal of this work is to test the validity of this approach across different markets and longer time horizons for backtesting using rolling window analysis. In this work, we concentrate on the prediction of individual stock prices in the Japanese Nikkei 225 market over a period of roughly 20 years.
For the knowledge graph, we use the Nikkei Value Search data, which is a rich dataset showing mainly supplier relations among Japanese and foreign companies. Our preliminary results show a 29.5% increase and a 2.2-fold increase in the return ratio and Sharpe ratio, respectively, when compared to the market benchmark, as well as a 6.32% increase and 1.3-fold increase, respectively, compared to the baseline LSTM model. △ Less

Submitted 27 November, 2019; v1 submitted 23 September, 2019; originally announced September 2019.

Comments: NeurIPS 2019 Workshop on Robust AI in Financial Services: Data, Fairness, Explainability, Trustworthiness, and Privacy (Robust AI in FS), Vancouver, Canada

arXiv:1908.05855 [pdf, ps, other]

Distributed Edge Partitioning for Trillion-edge Graphs

Authors: Masatoshi Hanai, Toyotaro Suzumura, Wen Jun Tan, Elvis Liu, Georgios Theodoropoulos, Wentong Cai

Abstract: We propose Distributed Neighbor Expansion (Distributed NE), a parallel and distributed graph partitioning method that can scale to trillion-edge graphs while providing high partitioning quality. Distributed NE is based on a new heuristic, called parallel expansion, where each partition is constructed in parallel by greedily expanding its edge set from a single vertex in such a way that the increase of the vertex cuts becomes locally minimal. We theoretically prove that the proposed method has an upper bound on the partitioning quality. The empirical evaluation with various graphs shows that the proposed method produces higher-quality partitions than the state-of-the-art distributed graph partitioning algorithms. The performance evaluation shows that the space efficiency of the proposed method is an order-of-magnitude better than the existing algorithms, keeping its time efficiency comparable. As a result, Distributed NE can partition a trillion-edge graph using only 256 machines within 70 minutes.

Submitted 21 September, 2019; v1 submitted 16 August, 2019; originally announced August 2019.

Comments: VLDB 2020, Code in http://www.masahanai.jp/DistributedNE/

arXiv:1902.10191 [pdf, other]

EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs

Authors: Aldo Pareja, Giacomo Domeniconi, Jie Chen, Tengfei Ma, Toyotaro Suzumura, Hiroki Kanezashi, Tim Kaler, Tao B. Schardl, Charles E. Leiserson

Abstract: Graph representation learning resurges as a trending research subject owing to the widespread use of deep learning for Euclidean data, which inspires various creative designs of neural networks in the non-Euclidean domain, particularly graphs. With the success of these graph neural networks (GNN) in the static setting, we approach further practical scenarios where the graph dynamically evolves. Existing approaches typically resort to node embeddings and use a recurrent neural network (RNN, broadly speaking) to regulate the embeddings and learn the temporal dynamics. These methods require the knowledge of a node in the full time span (including both training and testing) and are less applicable to the frequent change of the node set. In some extreme scenarios, the node sets at different time steps may completely differ. To resolve this challenge, we propose EvolveGCN, which adapts the graph convolutional network (GCN) model along the temporal dimension without resorting to node embeddings. The proposed approach captures the dynamism of the graph sequence through using an RNN to evolve the GCN parameters. Two architectures are considered for the parameter evolution. We evaluate the proposed approach on tasks including link prediction, edge classification, and node classification. The experimental results indicate a generally higher performance of EvolveGCN compared with related approaches. The code is available at https://github.com/IBM/EvolveGCN.

Submitted 18 November, 2019; v1 submitted 26 February, 2019; originally announced February 2019.

Comments: AAAI 2020. The code is available at https://github.com/IBM/EvolveGCN

arXiv:1812.10321 [pdf, other]

doi 10.1109/HiPC.2018.00019

Adaptive Pattern Matching with Reinforcement Learning for Dynamic Graphs

Authors: Hiroki Kanezashi, Toyotaro Suzumura, Dario Garcia-Gasulla, Min-hwan Oh, Satoshi Matsuoka

Abstract: Graph pattern matching algorithms to handle million-scale dynamic graphs are widely used in many applications such as social network analytics and suspicious transaction detections from financial networks. On the other hand, the computation complexity of many graph pattern matching algorithms is expensive, and it is not affordable to extract patterns from million-scale graphs. Moreover, most real-world networks are time-evolving, updating their structures continuously, which makes it harder to update and output newly matched patterns in real time. Many incremental graph pattern matching algorithms which reduce the number of updates have been proposed to handle such dynamic graphs. However, it is still challenging to recompute vertices in the incremental graph pattern matching algorithms in a single process, and that prevents the real-time analysis. We propose an incremental graph pattern matching algorithm to deal with time-evolving graph data and also propose an adaptive optimization system based on reinforcement learning to recompute vertices in the incremental process more efficiently. Then we discuss the qualitative efficiency of our system with several types of data graphs and pattern graphs. We evaluate the performance using million-scale attributed and time-evolving social graphs. Our incremental algorithm is up to 10.1 times faster than an existing graph pattern matching and 1.95 times faster with the adaptive systems in a computation node than naive incremental processing.

Submitted 21 December, 2018; originally announced December 2018.

Comments: 10 pages and 11 figures

arXiv:1812.00076 [pdf, ps, other]

Scalable Graph Learning for Anti-Money Laundering: A First Look

Authors: Mark Weber, Jie Chen, Toyotaro Suzumura, Aldo Pareja, Tengfei Ma, Hiroki Kanezashi, Tim Kaler, Charles E. Leiserson, Tao B. Schardl

Abstract: Organized crime inflicts human suffering on a genocidal scale: the Mexican drug cartels have murdered 150,000 people since 2006, upwards of 700,000 people per year are "exported" in a human trafficking industry enslaving an estimated 40 million people. These nefarious industries rely on sophisticated money laundering schemes to operate. Despite tremendous resources dedicated to anti-money laundering (AML) only a tiny fraction of illicit activity is prevented. The research community can help. In this brief paper, we map the structural and behavioral dynamics driving the technical challenge. We review AML methods, current and emergent. We provide a first look at scalable graph convolutional neural networks for forensic analysis of financial data, which is massive, dense, and dynamic. We report preliminary experimental results using a large synthetic graph (1M nodes, 9M edges) generated by a data simulator we created called AMLSim. We consider opportunities for high performance efficiency, in terms of computation and memory, and we share results from a simple graph compression experiment. Our results support our working hypothesis that graph deep learning for AML bears great promise in the fight against criminal financial activity.

Submitted 30 November, 2018; originally announced December 2018.

Comments: NeurIPS 2018 Workshop on Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy, and Privacy, Montreal, Canada

arXiv:1808.06251 [pdf, other]

doi 10.1109/BigData.2016.7840991

An incremental local-first community detection method for dynamic graphs

Authors: Hiroki Kanezashi, Toyotaro Suzumura

Abstract: Community detections for large-scale real world networks have been more popular in social analytics. In particular, dynamically growing network analyses become important to find long-term trends and detect anomalies. In order to analyze such networks, we need to obtain many snapshots and apply same analytic methods to them. However, it is inefficient to extract communities from these whole newly generated networks with little differences every time, and then it is impossible to follow the network growths in the real time. We proposed an incremental community detection algorithm for high-volume graph streams. It is based on the top of a well-known batch-oriented algorithm named DEMON[1]. We also evaluated performance and precisions of our proposed incremental algorithm with real-world big networks with up to 410,236 vertices and 2,439,437 edges and computed in less than one second to detect communities in an incremental fashion - which achieves up to 107 times faster than the original algorithm without sacrificing accuracies.

Submitted 19 August, 2018; originally announced August 2018.

Comments: 8 pages, 7 figures and 3 pseudo codes, 2016 IEEE International Conference on Big Data (Big Data)

arXiv:1804.07152 [pdf, other]

Scalable attribute-aware network embedding with locality

Authors: Weiyi Liu, Zhining Liu, Toyotaro Suzumura, Guangmin Hu

Abstract: Adding attributes for nodes to network embedding helps to improve the ability of the learned joint representation to depict features from topology and attributes simultaneously. Recent research on the joint embedding has exhibited a promising performance on a variety of tasks by jointly embedding the two spaces. However, due to the indispensable requirement of globality based information, present approaches contain a flaw of in-scalability. Here we propose \emph{SANE}, a scalable attribute-aware network embedding algorithm with locality, to learn the joint representation from topology and attributes. By enforcing the alignment of a local linear relationship between each node and its K-nearest neighbors in topology and attribute space, the joint embedding representations are more informative comparing with a single representation from topology or attributes alone. And we argue that the locality in \emph{SANE} is the key to learning the joint representation at scale. By using several real-world networks from diverse domains, We demonstrate the efficacy of \emph{SANE} in performance and scalability aspect. Overall, for performance on label classification, SANE successfully reaches up to the highest F1-score on most datasets, and even closer to the baseline method that needs label information as extra inputs, compared with other state-of-the-art joint representation algorithms. What's more, \emph{SANE} has an up to 71.4\% performance gain compared with the single topology-based algorithm.
For scalability, we have demonstrated the linearly time complexity of \emph{SANE}. In addition, we intuitively observe that when the network size scales to 100,000 nodes, the "learning joint embedding" step of \emph{SANE} only takes $\approx10$ seconds.

Submitted 29 April, 2018; v1 submitted 17 April, 2018; originally announced April 2018.

arXiv:1802.03057 [pdf, ps, other]

System G Distributed Graph Database

Authors: Gabriel Tanase, Toyotaro Suzumura, Jinho Lee, Chun-Fu Chen, Jason Crawford, Hiroki Kanezashi, Song Zhang, Warut D. Vijitbenjaronk

Abstract: Motivated by the need to extract knowledge and value from interconnected data, graph analytics on big data is a very active area of research in both industry and academia. To support graph analytics efficiently a large number of in memory graph libraries, graph processing systems and graph databases have emerged. Projects in each of these categories focus on particular aspects such as static versus dynamic graphs, off line versus on line processing, small versus large graphs, etc. While there has been much advance in graph processing in the past decades, there is still a need for a fast graph processing, using a cluster of machines with distributed storage. In this paper, we discuss a novel distributed graph database called System G designed for efficient graph data storage and processing on modern computing architectures. In particular we describe a single node graph database and a runtime and communication layer that allows us to compose a distributed graph database from multiple single node instances. From various industry requirements, we find that fast insertions and large volume concurrent queries are critical parts of the graph databases and we optimize our database for such features. We experimentally show the efficiency of System G for storing data and processing graph queries on state-of-the-art platforms.

Submitted 8 February, 2018; originally announced February 2018.

arXiv:1709.03551 [pdf, other]

Principled Multilayer Network Embedding

Authors: Weiyi Liu, Pin-Yu Chen, Sailung Yeung, Toyotaro Suzumura, Lingli Chen

Abstract: Multilayer network analysis has become a vital tool for understanding different relationships and their interactions in a complex system, where each layer in a multilayer network depicts the topological structure of a group of nodes corresponding to a particular relationship. The interactions among different layers imply how the interplay of different relations on the topology of each layer. For a single-layer network, network embedding methods have been proposed to project the nodes in a network into a continuous vector space with a relatively small number of dimensions, where the space embeds the social representations among nodes. These algorithms have been proved to have a better performance on a variety of regular graph analysis tasks, such as link prediction, or multi-label classification. In this paper, by extending a standard graph mining into multilayer network, we have proposed three methods ("network aggregation," "results aggregation" and "layer co-analysis") to project a multilayer network into a continuous vector space. From the evaluation, we have proved that comparing with regular link prediction methods, "layer co-analysis" achieved the best performance on most of the datasets, while "network aggregation" and "results aggregation" also have better performance than regular link prediction methods.

Submitted 14 September, 2017; v1 submitted 11 September, 2017; originally announced September 2017.

arXiv:1709.03545 [pdf, other]

Learning Graph Topological Features via GAN

Authors: Weiyi Liu, Hal Cooper, Min Hwan Oh, Sailung Yeung, Pin-Yu Chen, Toyotaro Suzumura, Lingli Chen

Abstract: Inspired by the generation power of generative adversarial networks (GANs) in image domains, we introduce a novel hierarchical architecture for learning characteristic topological features from a single arbitrary input graph via GANs. The hierarchical architecture consisting of multiple GANs preserves both local and global topological features and automatically partitions the input graph into representative stages for feature learning. The stages facilitate reconstruction and can be used as indicators of the importance of the associated topological structures. Experiments show that our method produces subgraphs retaining a wide range of topological features, even in early reconstruction stages (unlike a single GAN, which cannot easily identify such features, let alone reconstruct the original graph). This paper is firstline research on combining the use of GANs and graph topological analysis.

Submitted 8 October, 2019; v1 submitted 11 September, 2017; originally announced September 2017.

arXiv:1707.09872 [pdf, other]

Full-Network Embedding in a Multimodal Embedding Pipeline

Authors: Armand Vilalta, Dario Garcia-Gasulla, Ferran Parés, Eduard Ayguadé, Jesus Labarta, Ulises Cortés, Toyotaro Suzumura

Abstract: The current state-of-the-art for image annotation and image retrieval tasks is obtained through deep neural networks, which combine an image representation and a text representation into a shared embedding space. In this paper we evaluate the impact of using the Full-Network embedding in this setting, replacing the original image representation in a competitive multimodal embedding generation scheme. Unlike the one-layer image embeddings typically used by most approaches, the Full-Network embedding provides a multi-scale representation of images, which results in richer characterizations. To measure the influence of the Full-Network embedding, we evaluate its performance on three different datasets, and compare the results with the original multimodal embedding generation scheme when using a one-layer image embedding, and with the rest of the state-of-the-art. Results for image annotation and image retrieval tasks indicate that the Full-Network embedding is consistently superior to the one-layer embedding. These results motivate the integration of the Full-Network embedding on any multimodal embedding generation scheme, something feasible thanks to the flexibility of the approach.

Submitted 9 August, 2017; v1 submitted 24 July, 2017; originally announced July 2017.

Comments: In 2nd Workshop on Semantic Deep Learning (SemDeep-2) at the 12th International Conference on Computational Semantics (IWCS) 2017

arXiv:1707.07465 [pdf, other]

Building Graph Representations of Deep Vector Embeddings

Authors: Dario Garcia-Gasulla, Armand Vilalta, Ferran Parés, Jonatan Moreno, Eduard Ayguadé, Jesus Labarta, Ulises Cortés, Toyotaro Suzumura

Abstract: Patterns stored within pre-trained deep neural networks compose large and powerful descriptive languages that can be used for many different purposes. Typically, deep network representations are implemented within vector embedding spaces, which enables the use of traditional machine learning algorithms on top of them. In this short paper we propose the construction of a graph embedding space instead, introducing a methodology to transform the knowledge coded within a deep convolutional network into a topological space (i.e. a network). We outline how such graph can hold data instances, data features, relations between instances and features, and relations among features. Finally, we introduce some preliminary experiments to illustrate how the resultant graph embedding space can be exploited through graph analytics algorithms.

Submitted 9 August, 2017; v1 submitted 24 July, 2017; originally announced July 2017.

Comments: Accepted at the 2nd Workshop on Semantic Deep Learning (SemDeep-2)

arXiv:1707.06197 [pdf, other]

Can GAN Learn Topological Features of a Graph?

Authors: Weiyi Liu, Pin-Yu Chen, Hal Cooper, Min Hwan Oh, Sailung Yeung, Toyotaro Suzumura

Abstract: This paper is first-line research expanding GANs into graph topology analysis. By leveraging the hierarchical connectivity structure of a graph, we have demonstrated that generative adversarial networks (GANs) can successfully capture topological features of any arbitrary graph, and rank edge sets by different stages according to their contribution to topology reconstruction. Moreover, in addition to acting as an indicator of graph reconstruction, we find that these stages can also preserve important topological features in a graph.

Submitted 19 July, 2017; originally announced July 2017.

arXiv:1705.07706 [pdf, other]

An Out-of-the-box Full-network Embedding for Convolutional Neural Networks

Authors: Dario Garcia-Gasulla, Armand Vilalta, Ferran Parés, Jonatan Moreno, Eduard Ayguadé, Jesus Labarta, Ulises Cortés, Toyotaro Suzumura

Abstract: Transfer learning for feature extraction can be used to exploit deep representations in contexts where there is very few training data, where there are limited computational resources, or when tuning the hyper-parameters needed for training is not an option. While previous contributions to feature extraction propose embeddings based on a single layer of the network, in this paper we propose a full-network embedding which successfully integrates convolutional and fully connected features, coming from all layers of a deep convolutional neural network. To do so, the embedding normalizes features in the context of the problem, and discretizes their values to reduce noise and regularize the embedding space. Significantly, this also reduces the computational cost of processing the resultant representations. The proposed method is shown to outperform single layer embeddings on several image classification tasks, while also being more robust to the choice of the pre-trained model used for obtaining the initial features. The performance gap in classification accuracy between thoroughly tuned solutions and the full-network embedding is also reduced, which makes of the proposed approach a competitive solution for a large set of applications.

Submitted 22 May, 2017; originally announced May 2017.

arXiv:1704.06841 [pdf]

Medical Text Classification using Convolutional Neural Networks

Authors: Mark Hughes, Irene Li, Spyros Kotoulas, Toyotaro Suzumura

Abstract: We present an approach to automatically classify clinical text at a sentence level. We are using deep convolutional neural networks to represent complex features. We train the network on a dataset providing a broad categorization of health information. Through a detailed evaluation, we demonstrate that our method outperforms several approaches widely used in natural language processing tasks by about 15%.

Submitted 22 April, 2017; originally announced April 2017.

arXiv:1703.09307 [pdf, other]

Fluid Communities: A Competitive, Scalable and Diverse Community Detection Algorithm

Authors: Ferran Parés, Dario Garcia-Gasulla, Armand Vilalta, Jonatan Moreno, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, Toyotaro Suzumura

Abstract: We introduce a community detection algorithm (Fluid Communities) based on the idea of fluids interacting in an environment, expanding and contracting as a result of that interaction. Fluid Communities is based on the propagation methodology, which represents the state-of-the-art in terms of computational cost and scalability. While being highly efficient, Fluid Communities is able to find communities in synthetic graphs with an accuracy close to the current best alternatives. Additionally, Fluid Communities is the first propagation-based algorithm capable of identifying a variable number of communities in network. To illustrate the relevance of the algorithm, we evaluate the diversity of the communities found by Fluid Communities, and find them to be significantly different from the ones found by alternative methods.

Submitted 9 October, 2017; v1 submitted 27 March, 2017; originally announced March 2017.

Comments: Accepted at the 6th International Conference on Complex Networks and Their Applications

arXiv:1703.01127 [pdf, other]

On the Behavior of Convolutional Nets for Feature Extraction

Authors: Dario Garcia-Gasulla, Ferran Parés, Armand Vilalta, Jonatan Moreno, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, Toyotaro Suzumura

Abstract: Deep neural networks are representation learning techniques. During training, a deep net is capable of generating a descriptive language of unprecedented size and detail in machine learning. Extracting the descriptive language coded within a trained CNN model (in the case of image data), and reusing it for other purposes is a field of interest, as it provides access to the visual descriptors previously learnt by the CNN after processing millions of images, without requiring an expensive training phase. Contributions to this field (commonly known as feature representation transfer or transfer learning) have been purely empirical so far, extracting all CNN features from a single layer close to the output and testing their performance by feeding them to a classifier. This approach has provided consistent results, although its relevance is limited to classification tasks. In a completely different approach, in this paper we statistically measure the discriminative power of every single feature found within a deep CNN, when used for characterizing every class of 11 datasets. We seek to provide new insights into the behavior of CNN features, particularly the ones from convolutional layers, as this can be relevant for their application to knowledge representation and reasoning. Our results confirm that low and middle level features may behave differently to high level features, but only under certain conditions.
We find that all CNN features can be used for knowledge representation purposes both by their presence or by their absence, doubling the information a single CNN feature may provide. We also study how much noise these features may include, and propose a thresholding approach to discard most of it. All these insights have a direct application to the generation of CNN embedding spaces.

Submitted 29 January, 2018; v1 submitted 3 March, 2017; originally announced March 2017.

Comments: Published in the Journal of Artificial Intelligence Research (JAIR), Special Track on Deep Learning, Knowledge Representation, and Reasoning

arXiv:1611.09084 [pdf, other]

Hierarchical Hyperlink Prediction for the WWW

Authors: Dario Garcia-Gasulla, Eduard Ayguadé, Jesús Labarta, Ulises Cortés, Toyotaro Suzumura

Abstract: The hyperlink prediction task, that of proposing new links between webpages, can be used to improve search engines, expand the visibility of web pages, and increase the connectivity and navigability of the web. Hyperlink prediction is typically performed on webgraphs composed by thousands or millions of vertices, where on average each webpage contains less than fifty links. Algorithms processing graphs so large and sparse require to be both scalable and precise, a challenging combination. Similarity-based algorithms are among the most scalable solutions within the link prediction field, due to their parallel nature and computational simplicity. These algorithms independently explore the nearby topological features of every missing link from the graph in order to determine its likelihood. Unfortunately, the precision of similarity-based algorithms is limited, which has prevented their broad application so far. In this work we explore the performance of similarity-based algorithms for the particular problem of hyperlink prediction on large webgraphs, and propose a novel method which assumes the existence of hierarchical properties. We evaluate this new approach on several webgraphs and compare its performance with that of the current best similarity-based algorithms. Its remarkable performance leads us to argue on the applicability of the proposal, identifying several use cases of hyperlink prediction.
We also describes the approach we took for the computation of large-scale graphs from the perspective of high-performance computing, providing details on the implementation and parallelization of code.

Submitted 28 November, 2016; originally announced November 2016.

Comments: Submitted to Transactions on Internet Technology journal

arXiv:1507.08818 [pdf, other]

A Visual Embedding for the Unsupervised Extraction of Abstract Semantics

Authors: D. Garcia-Gasulla, J. Béjar, U. Cortés, E. Ayguadé, J. Labarta, T. Suzumura, R. Chen

Abstract: Vector-space word representations obtained from neural network models have been shown to enable semantic operations based on vector arithmetic. In this paper, we explore the existence of similar information on vector representations of images. For that purpose we define a methodology to obtain large, sparse vector representations of image classes, and generate vectors through the state-of-the-art deep learning architecture GoogLeNet for 20K images obtained from ImageNet. We first evaluate the resultant vector-space semantics through its correlation with WordNet distances, and find vector distances to be strongly correlated with linguistic semantics. We then explore the location of images within the vector space, finding elements close in WordNet to be clustered together, regardless of significant visual variances (e.g. 118 dog types). More surprisingly, we find that the space unsupervisedly separates complex classes without prior knowledge (e.g. living things). Afterwards, we consider vector arithmetics. Although we are unable to obtain meaningful results on this regard, we discuss the various problem we encountered, and how we consider to solve them. Finally, we discuss the impact of our research for cognitive systems, focusing on the role of the architecture being used.

Submitted 16 December, 2016; v1 submitted 31 July, 2015; originally announced July 2015.

Comments: 14 pages, 5 figures, accepted at Cognitive Systems Research

arXiv:1505.04542 [pdf, other]

doi 10.1145/2771774.2771776

Scalable Parallel Numerical Constraint Solver Using Global Load Balancing

Authors: Daisuke Ishii, Kazuki Yoshizoe, Toyotaro Suzumura

Abstract: We present a scalable parallel solver for numerical constraint satisfaction problems (NCSPs). Our parallelization scheme consists of homogeneous worker solvers, each of which runs on an available core and communicates with others via the global load balancing (GLB) method. The parallel solver is implemented with X10 that provides an implementation of GLB as a library. In experiments, several NCSPs from the literature were solved and attained up to 516-fold speedup using 600 cores of the TSUBAME2.5 supercomputer.

Submitted 18 May, 2015; originally announced May 2015.

Comments: To be presented at X10'15 Workshop

ACM Class: D.1.3

arXiv:1411.1507 [pdf, ps, other]

doi 10.1007/978-3-319-10428-7_30

Scalable Parallel Numerical CSP Solver

Authors: Daisuke Ishii, Kazuki Yoshizoe, Toyotaro Suzumura

Abstract: We present a parallel solver for numerical constraint satisfaction problems (NCSPs) that can scale on a number of cores. Our proposed method runs worker solvers on the available cores and simultaneously the workers cooperate for the search space distribution and balancing. In the experiments, we attained up to 119-fold speedup using 256 cores of a parallel computer.

Submitted 6 November, 2014; originally announced November 2014.

Comments: The final publication is available at Springer

Showing 1–49 of 49 results for author: Suzumura, T