-
A Method for Network Intrusion Detection Using Flow Sequence and BERT Framework
Authors:
Loc Gia Nguyen,
Kohei Watabe
Abstract:
A Network Intrusion Detection System (NIDS) is a tool that identifies potential threats to a network. Recently, different flow-based NIDS designs utilizing Machine Learning (ML) algorithms have been proposed as solutions to detect intrusions efficiently. However, conventional ML-based classifiers have not seen widespread adoption in the real world due to their poor domain adaptation capability. In this research, our goal is to explore the possibility of using sequences of flows to improve the domain adaptation capability of network intrusion detection systems. Our proposal employs natural language processing techniques and the Bidirectional Encoder Representations from Transformers (BERT) framework, which is an effective technique for modeling data with respect to its context. Early empirical results show that our approach improves domain adaptation capability compared to previous approaches. The proposed approach provides a new research method for building a robust intrusion detection system.
Submitted 25 October, 2023;
originally announced October 2023.
-
An Accurate Graph Generative Model with Tunable Features
Authors:
Takahiro Yokoyama,
Yoshiki Sato,
Sho Tsugawa,
Kohei Watabe
Abstract:
A graph is a very common and powerful data structure used for modeling communication and social networks. Models that generate graphs with arbitrary features are important basic technologies in repeated simulations of networks and prediction of topology changes. Although existing generative models for graphs are useful for providing graphs similar to real-world graphs, graph generation models with tunable features have been less explored in the field. Previously, we proposed GraphTune, a generative model for graphs that continuously tunes specific graph features of generated graphs while maintaining most of the features of a given graph dataset. However, the tuning accuracy of graph features in GraphTune has not been sufficient for practical applications. In this paper, we propose a method to improve the accuracy of GraphTune by adding a new mechanism to feed back errors of graph features of generated graphs and by training them alternately and independently. Experiments on a real-world graph dataset showed that the features in the generated graphs are tuned more accurately than with conventional models.
Submitted 3 September, 2023;
originally announced September 2023.
-
Flow-based Network Intrusion Detection Based on BERT Masked Language Model
Authors:
Loc Gia Nguyen,
Kohei Watabe
Abstract:
A Network Intrusion Detection System (NIDS) is an important tool that identifies potential threats to a network. Recently, different flow-based NIDS designs utilizing Machine Learning (ML) algorithms have been proposed as potential solutions to detect intrusions efficiently. However, conventional ML-based classifiers have not seen widespread adoption in the real world due to their poor domain adaptation capability. In this research, our goal is to explore the possibility of improving the domain adaptation capability of NIDS. Our proposal employs Natural Language Processing (NLP) techniques and the Bidirectional Encoder Representations from Transformers (BERT) framework. The proposed method achieved positive results when tested on data from different domains.
Submitted 7 June, 2023;
originally announced June 2023.
-
Identifying Influential Brokers on Social Media from Social Network Structure
Authors:
Sho Tsugawa,
Kohei Watabe
Abstract:
Identifying influencers in a given social network has become an important research problem for various applications, including accelerating the spread of information in viral marketing and preventing the spread of fake news and rumors. The literature contains a rich body of studies on identifying influential source spreaders who can spread their own messages to many other nodes. In contrast, the identification of influential brokers who can spread other nodes' messages to many nodes has not been fully explored. Theoretical and empirical studies suggest that involvement of both influential source spreaders and brokers is a key to facilitating large-scale information diffusion cascades. Therefore, this paper explores ways to identify influential brokers from a given social network. By using three social media datasets, we investigate the characteristics of influential brokers by comparing them with influential source spreaders and central nodes obtained from centrality measures. Our results show that (i) most of the influential source spreaders are not influential brokers (and vice versa) and (ii) the overlap between central nodes and influential brokers is small (less than 15%) in Twitter datasets. We also tackle the problem of identifying influential brokers from centrality measures and node embeddings, and we examine the effectiveness of social network features in the broker identification task. Our results show that (iii) although a single centrality measure cannot characterize influential brokers well, prediction models using node embedding features achieve F$_1$ scores of 0.35--0.68, suggesting the effectiveness of social network features for identifying influential brokers.
Submitted 1 August, 2022;
originally announced August 2022.
-
GraphTune: A Learning-based Graph Generative Model with Tunable Structural Features
Authors:
Kohei Watabe,
Shohei Nakazawa,
Yoshiki Sato,
Sho Tsugawa,
Kenji Nakagawa
Abstract:
Generative models for graphs have been actively studied for decades, and they have a wide range of applications. Recently, learning-based graph generation that reproduces real-world graphs has been attracting the attention of many researchers. Although several generative models that utilize modern machine learning technologies have been proposed, conditional generation of general graphs has been less explored in the field. In this paper, we propose a generative model that allows us to tune the value of a global-level structural feature as a condition. Our model, called GraphTune, makes it possible to tune the value of any structural feature of generated graphs using Long Short Term Memory (LSTM) and a Conditional Variational AutoEncoder (CVAE). We performed comparative evaluations of GraphTune and conventional models on a real graph dataset. The evaluations show that GraphTune tunes the value of a global-level structural feature more clearly than conventional models.
Submitted 5 April, 2023; v1 submitted 27 January, 2022;
originally announced January 2022.
-
A transformer-based deep learning approach for classifying brain metastases into primary organ sites using clinical whole brain MRI
Authors:
Qing Lyu,
Sanjeev V. Namjoshi,
Emory McTyre,
Umit Topaloglu,
Richard Barcus,
Michael D. Chan,
Christina K. Cramer,
Waldemar Debinski,
Metin N. Gurcan,
Glenn J. Lesser,
Hui-Kuan Lin,
Reginald F. Munden,
Boris C. Pasche,
Kiran Kumar Solingapuram Sai,
Roy E. Strowd,
Stephen B. Tatter,
Kounosuke Watabe,
Wei Zhang,
Ge Wang,
Christopher T. Whitlow
Abstract:
Treatment decisions for brain metastatic disease rely on knowledge of the primary organ site, and are currently made with biopsy and histology. Here we develop a novel deep learning approach for accurate non-invasive digital histology with whole-brain MRI data. Our IRB-approved single-site retrospective study comprised patients (n=1,399) referred for MRI treatment-planning and gamma knife radiosurgery over 21 years. Contrast-enhanced T1-weighted and T2-weighted Fluid-Attenuated Inversion Recovery brain MRI exams (n=1,582) were preprocessed and input to the proposed deep learning workflow for tumor segmentation, modality transfer, and primary site classification into one of five classes. Ten-fold cross-validation generated overall AUC of 0.878 (95%CI:0.873,0.883), lung class AUC of 0.889 (95%CI:0.883,0.895), breast class AUC of 0.873 (95%CI:0.860,0.886), melanoma class AUC of 0.852 (95%CI:0.842,0.862), renal class AUC of 0.830 (95%CI:0.809,0.851), and other class AUC of 0.822 (95%CI:0.805,0.839). These data establish that whole-brain imaging features are discriminative enough to allow accurate diagnosis of the primary organ site of malignancy. Our end-to-end deep radiomic approach has great potential for classifying metastatic tumor types from whole-brain MRI images. Further refinement may offer an invaluable clinical tool to expedite primary cancer site identification for precision treatment and improved outcomes.
Submitted 20 April, 2022; v1 submitted 7 October, 2021;
originally announced October 2021.
-
A Tunable Model for Graph Generation Using LSTM and Conditional VAE
Authors:
Shohei Nakazawa,
Yoshiki Sato,
Kenji Nakagawa,
Sho Tsugawa,
Kohei Watabe
Abstract:
With the development of graph applications, generative models for graphs have become more crucial. Classically, stochastic models that generate graphs with a pre-defined probability of edges and nodes have been studied. Recently, some models that reproduce the structural features of graphs by learning from actual graph data using machine learning have been studied. However, in these conventional studies based on machine learning, structural features of graphs can be learned from data, but it is not possible to tune features and generate graphs with specific features. In this paper, we propose a generative model that can tune specific features while learning structural features of a graph from data. With a dataset of graphs with various features generated by a stochastic model, we confirm that our model can generate a graph with specific features.
Submitted 15 April, 2021;
originally announced April 2021.
-
Analysis of the Convergence Speed of the Arimoto-Blahut Algorithm by the Second Order Recurrence Formula
Authors:
Kenji Nakagawa,
Yoshinori Takei,
Shin-ichiro Hara,
Kohei Watabe
Abstract:
In this paper, we investigate the convergence speed of the Arimoto-Blahut algorithm. For many channel matrices the convergence is exponential, but for some channel matrices it is slower than exponential. By analyzing the Taylor expansion of the defining function of the Arimoto-Blahut algorithm, we will make the conditions clear for the exponential or slower convergence. The analysis of the slow convergence is new in this paper. Based on the analysis, we will compare the convergence speed of the Arimoto-Blahut algorithm numerically with the values obtained in our theorems for several channel matrices. The purpose of this paper is a complete understanding of the convergence speed of the Arimoto-Blahut algorithm.
Submitted 17 September, 2020;
originally announced September 2020.
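
For orientation, the Arimoto-Blahut iteration whose convergence speed the entry above analyzes can be sketched as follows. This is a minimal textbook-style sketch, not the authors' code or analysis: the function name `arimoto_blahut`, the parameters `tol` and `max_iter`, and the uniform starting distribution are all assumptions made for illustration.

```python
import math


def arimoto_blahut(channel, tol=1e-12, max_iter=10_000):
    """Estimate the capacity (in bits) of a discrete memoryless channel.

    channel[i][j] is P(output = j | input = i). Hypothetical helper for
    illustration; it assumes every row sums to 1.
    Returns (capacity_in_bits, input_distribution).
    """
    m, n = len(channel), len(channel[0])
    p = [1.0 / m] * m  # assumed starting point: uniform input distribution
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution induced by the current input distribution.
        q = [sum(p[i] * channel[i][j] for i in range(m)) for j in range(n)]
        # Kullback-Leibler divergence D(W(.|i) || q) for each input, in nats.
        d = [
            sum(w * math.log(w / q[j]) for j, w in enumerate(row) if w > 0)
            for row in channel
        ]
        c = [math.exp(di) for di in d]
        z = sum(pi * ci for pi, ci in zip(p, c))
        lower = math.log(z)       # lower bound on capacity (nats)
        upper = math.log(max(c))  # upper bound on capacity (nats)
        # Multiplicative update of the input distribution.
        p = [pi * ci / z for pi, ci in zip(p, c)]
        if upper - lower < tol:
            break
    return lower / math.log(2), p


# Binary symmetric channel with crossover probability 0.1.
capacity, dist = arimoto_blahut([[0.9, 0.1], [0.1, 0.9]])
print(capacity, dist)
```

For the binary symmetric channel in the usage line, the uniform input is already optimal, so the loop stops after one pass with a capacity of about 0.531 bits. Channels whose optimal input distribution lies on the boundary of the simplex are the ones where slow convergence is of interest, and this sketch makes no attempt to handle them specially.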
-
Analysis for the Slow Convergence in Arimoto Algorithm
Authors:
Kenji Nakagawa,
Yoshinori Takei,
Kohei Watabe
Abstract:
In this paper, we investigate the convergence speed of the Arimoto algorithm. By analyzing the Taylor expansion of the defining function of the Arimoto algorithm, we will clarify the conditions for the exponential or $1/N$ order convergence and calculate the convergence speed. We show that the convergence speed of the $1/N$ order is evaluated by the derivatives of the Kullback-Leibler divergence with respect to the input probabilities. The analysis for the convergence of the $1/N$ order is new in this paper. Based on the analysis, we will compare the convergence speed of the Arimoto algorithm with the theoretical values obtained in our theorems for several channel matrices.
Submitted 3 September, 2018;
originally announced September 2018.
-
On the Search Algorithm for the Output Distribution that Achieves the Channel Capacity
Authors:
Kenji Nakagawa,
Kohei Watabe,
Takuto Sabu
Abstract:
We consider a search algorithm for the output distribution that achieves the channel capacity of a discrete memoryless channel. We propose an algorithm based on iterated projections of an output distribution onto affine subspaces in the set of output distributions. The problem of channel capacity has a geometric structure similar to that of the smallest enclosing circle for a finite number of points in Euclidean space. The metric in Euclidean space is the Euclidean distance, and the metric in the space of output distributions is the Kullback-Leibler divergence. We consider these two problems based on Amari's $α$-geometry. We first consider the smallest enclosing circle in Euclidean space and develop an algorithm to find its center. Based on this investigation, we apply the obtained algorithm to the problem of channel capacity.
Submitted 8 January, 2016; v1 submitted 6 January, 2016;
originally announced January 2016.
-
Measuring the frequency of a Sr optical lattice clock using a 120-km coherent optical transfer
Authors:
F. -L. Hong,
M. Musha,
M. Takamoto,
H. Inaba,
S. Yanagimachi,
A. Takamizawa,
K. Watabe,
T. Ikegami,
M. Imae,
Y. Fujii,
M. Amemiya,
K. Nakagawa,
K. Ueda,
H. Katori
Abstract:
We demonstrate a precision frequency measurement using a phase-stabilized 120-km optical fiber link over a physical distance of 50 km. The transition frequency of the $^{87}$Sr optical lattice clock at the University of Tokyo is measured to be 429228004229874.1(2.4) Hz referenced to international atomic time (TAI). The measured frequency agrees with results obtained in Boulder and Paris at a $6 \times 10^{-16}$ fractional level, which matches the current best evaluations of Cs primary frequency standards. The results demonstrate the excellent functions of the intercity optical fibre link, and the great potential of optical lattice clocks for use in the redefinition of the second.
Submitted 12 November, 2008;
originally announced November 2008.
-
Limit distributions of two-dimensional quantum walks
Authors:
Kyohei Watabe,
Naoki Kobayashi,
Makoto Katori,
Norio Konno
Abstract:
One-parameter family of discrete-time quantum-walk models on the square lattice, which includes the Grover-walk model as a special case, is analytically studied. Convergence in the long-time limit $t \to \infty$ of all joint moments of two components of walker's pseudovelocity, $X_t/t$ and $Y_t/t$, is proved and the probability density of limit distribution is derived. Dependence of the two-dimensional limit density function on the parameter of quantum coin and initial four-component qudit of quantum walker is determined. Symmetry of limit distribution on a plane and localization around the origin are completely controlled. Comparison with numerical results of direct computer-simulations is also shown.
Submitted 19 June, 2008; v1 submitted 19 February, 2008;
originally announced February 2008.