-
CFTM: Continuous time fractional topic model
Authors:
Kei Nakagawa,
Kohei Hayashi,
Yugo Fujimoto
Abstract:
In this paper, we propose the Continuous Time Fractional Topic Model (cFTM), a new method for dynamic topic modeling. This approach incorporates fractional Brownian motion (fBm) to effectively identify positive or negative correlations in topic and word distributions over time, revealing long-term dependency or roughness. Our theoretical analysis shows that the cFTM can capture this long-term dependency or roughness in both topic and word distributions, mirroring the main characteristics of fBm. Moreover, we prove that the parameter estimation process for the cFTM is on par with that of LDA, a traditional topic model. To demonstrate the cFTM's properties, we conduct an empirical study using economic news articles. The results of this study support the model's ability to identify and track long-term dependency or roughness in topics over time.
Submitted 6 February, 2024; v1 submitted 29 January, 2024;
originally announced February 2024.
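The link between the Hurst index $H$ of fBm and positively or negatively correlated increments can be illustrated with a short sketch. This is not the cFTM implementation: the function name `fbm_increment_cov` is hypothetical, and the code uses only the standard covariance formula for unit-step increments of unit-variance fBm (fractional Gaussian noise).

```python
def fbm_increment_cov(lag: int, hurst: float) -> float:
    """Covariance between two unit-step fBm increments that are `lag` steps apart.

    Standard fractional Gaussian noise formula:
        0.5 * (|k + 1|^(2H) - 2 |k|^(2H) + |k - 1|^(2H))
    """
    k = abs(lag)
    two_h = 2.0 * hurst
    return 0.5 * (abs(k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


if __name__ == "__main__":
    # H > 1/2: positively correlated increments (long-term dependency).
    # H = 1/2: ordinary Brownian motion, uncorrelated increments.
    # H < 1/2: negatively correlated increments (roughness).
    for h in (0.3, 0.5, 0.7):
        print(h, fbm_increment_cov(1, h))
```

Under this formula, adjacent increments have positive covariance when $H > 1/2$, zero covariance when $H = 1/2$, and negative covariance when $H < 1/2$.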
-
Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning
Authors:
Takumi Yoshida,
Hiroyuki Hanada,
Kazuya Nakagawa,
Kouichi Taji,
Koji Tsuda,
Ichiro Takeuchi
Abstract:
Predictive pattern mining is an approach used to construct prediction models when the input is represented by structured data, such as sets, graphs, and sequences. The main idea behind predictive pattern mining is to build a prediction model by considering substructures, such as subsets, subgraphs, and subsequences (referred to as patterns), present in the structured data as features of the model. The primary challenge in predictive pattern mining lies in the exponential growth of the number of patterns with the complexity of the structured data. In this study, we propose the Safe Pattern Pruning (SPP) method to address the explosion of pattern numbers in predictive pattern mining. We also discuss how it can be effectively employed throughout the entire model building process in practical data analysis. To demonstrate the effectiveness of the proposed method, we conduct numerical experiments on regression and classification problems involving sets, graphs, and sequences.
Submitted 23 June, 2023;
originally announced June 2023.
-
What Makes An Apology More Effective? Exploring Anthropomorphism, Individual Differences, And Emotion In Human-Automation Trust Repair
Authors:
Peggy Pei-Ying Lu,
Makoto Konishi,
Shin Sano,
Sho Hiruta,
Francis Ken Nakagawa
Abstract:
Recent advances in technology have allowed an automation system to recognize its errors and repair trust more actively than ever. While previous research has called for further studies of different human factors and design features, their effect on human-automation trust repair scenarios remains unknown, especially concerning emotions. This paper seeks to fill such gaps by investigating the impact of anthropomorphism, users' individual differences, and emotional responses on human-automation trust repair. Our experiment manipulated various types of trust violations and apology messages with different emotionally expressive anthropomorphic cues. While no significant effect from the different apology representations was found, our participants displayed polarizing attitudes toward the anthropomorphic cues. We also found that (1) some personality traits, such as openness and conscientiousness, negatively correlate with the effectiveness of the apology messages, and (2) a person's emotional response toward a trust violation positively correlates with the effectiveness of the apology messages.
Submitted 1 December, 2022; v1 submitted 18 November, 2022;
originally announced November 2022.
-
Uncertainty Aware Trader-Company Method: Interpretable Stock Price Prediction Capturing Uncertainty
Authors:
Yugo Fujimoto,
Kei Nakagawa,
Kentaro Imajo,
Kentaro Minami
Abstract:
Machine learning is an increasingly popular tool with some success in predicting stock prices. One promising method is the Trader-Company (TC) method, which takes into account the dynamism of the stock market and has both high predictive power and interpretability. Machine learning-based stock prediction methods, including the TC method, have concentrated on point prediction. However, point prediction in the absence of uncertainty estimates lacks credibility quantification and raises concerns about safety. The challenge in this paper is to make an investment strategy that combines high predictive power and the ability to quantify uncertainty. We propose a novel approach called the Uncertainty Aware Trader-Company (UTC) method. The core idea of this approach is to combine the strengths of both frameworks by merging the TC method with probabilistic modeling, which provides probabilistic predictions and uncertainty estimations. We expect this to retain the predictive power and interpretability of the TC method while capturing the uncertainty. We theoretically prove that the proposed method estimates the posterior variance and does not introduce additional biases from the original TC method. We conduct a comprehensive evaluation of our approach on synthetic and real market datasets. We confirm with synthetic data that the UTC method can detect situations where the uncertainty increases and the prediction is difficult. We also confirm that the UTC method can detect abrupt changes in data generating distributions. We demonstrate with real market data that the UTC method can achieve higher returns and lower risks than baselines.
Submitted 2 November, 2022; v1 submitted 30 October, 2022;
originally announced October 2022.
-
On a Proof of the Convergence Speed of a Second-order Recurrence Formula in the Arimoto-Blahut Algorithm
Authors:
Kenji Nakagawa,
Yoshinori Takei,
Shin-ichiro Hara
Abstract:
In [8] (Nakagawa et al., IEEE Trans. IT, 2021), we investigated the convergence speed of the Arimoto-Blahut algorithm. In [8], the convergence of the order $O(1/N)$ was analyzed by focusing on the second-order nonlinear recurrence formula consisting of the first- and second-order terms of the Taylor expansion of the defining function of the Arimoto-Blahut algorithm. However, in [8], an infinite number of inequalities were assumed as a "conjecture," and proofs were given based on the conjecture. In this paper, we report a proof of the convergence of the order $O(1/N)$ for a class of channel matrices without assuming the conjecture. The correctness of the proof is confirmed by several numerical examples.
Submitted 11 September, 2022;
originally announced September 2022.
-
GraphTune: A Learning-based Graph Generative Model with Tunable Structural Features
Authors:
Kohei Watabe,
Shohei Nakazawa,
Yoshiki Sato,
Sho Tsugawa,
Kenji Nakagawa
Abstract:
Generative models for graphs have been actively studied for decades, and they have a wide range of applications. Recently, learning-based graph generation that reproduces real-world graphs has been attracting the attention of many researchers. Although several generative models that utilize modern machine learning technologies have been proposed, conditional generation of general graphs has been less explored in the field. In this paper, we propose a generative model that allows us to tune the value of a global-level structural feature as a condition. Our model, called GraphTune, makes it possible to tune the value of any structural feature of generated graphs using Long Short Term Memory (LSTM) and a Conditional Variational AutoEncoder (CVAE). We performed comparative evaluations of GraphTune and conventional models on a real graph dataset. The evaluations show that GraphTune tunes the value of a global-level structural feature more clearly than conventional models.
Submitted 5 April, 2023; v1 submitted 27 January, 2022;
originally announced January 2022.
-
Fractional SDE-Net: Generation of Time Series Data with Long-term Memory
Authors:
Kohei Hayashi,
Kei Nakagawa
Abstract:
In this paper, we focus on the generation of time-series data using neural networks. It is often the case that input time-series data have only one realized (and usually irregularly sampled) path, which makes it difficult to extract time-series characteristics, and its noise structure is more complicated than the i.i.d. type. Time-series data, especially from hydrology, telecommunications, economics, and finance, exhibit long-term memory, also called long-range dependency (LRD). The main purpose of this paper is to artificially generate time series with the help of neural networks, taking the LRD of paths into account. We propose fSDE-Net: neural fractional Stochastic Differential Equation Network. It generalizes the neural stochastic differential equation model by using fractional Brownian motion with a Hurst index larger than half, which exhibits the LRD property. We derive the solver of fSDE-Net and theoretically analyze the existence and uniqueness of the solution to fSDE-Net. Our experiments with artificial and real time-series data demonstrate that the fSDE-Net model can replicate distributional properties well.
Submitted 23 August, 2022; v1 submitted 16 January, 2022;
originally announced January 2022.
-
Improving Nonparametric Classification via Local Radial Regression with an Application to Stock Prediction
Authors:
Ruixing Cao,
Akifumi Okuno,
Kei Nakagawa,
Hidetoshi Shimodaira
Abstract:
For supervised classification problems, this paper considers estimating the query's label probability through local regression using observed covariates. The well-known nonparametric kernel smoother and $k$-nearest neighbor ($k$-NN) estimator, which take the label average over a ball around the query, are consistent but asymptotically biased, particularly for a large radius of the ball. To eradicate such bias, local polynomial regression (LPoR) and multiscale $k$-NN (MS-$k$-NN) learn the bias term by local regression around the query and extrapolate it to the query itself. However, their theoretical optimality has been shown for the limit of an infinite number of training samples. To correct the asymptotic bias with fewer observations, this paper proposes local radial regression (LRR) and its logistic regression variant, local radial logistic regression (LRLR), by combining the advantages of LPoR and MS-$k$-NN. The idea is quite simple: we fit the local regression to observed labels by taking only the radial distance as the explanatory variable and then extrapolate the estimated label probability to zero distance. The usefulness of the proposed method is shown theoretically and experimentally. We prove the convergence rate of the $L^2$ risk for LRR with reference to MS-$k$-NN, and our numerical experiments, including real-world datasets of daily stock indices, demonstrate that LRLR outperforms LPoR and MS-$k$-NN.
Submitted 21 July, 2022; v1 submitted 27 December, 2021;
originally announced December 2021.
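The idea stated in this abstract (regress labels on radial distance only, then extrapolate to zero distance) can be sketched as follows. This is an illustrative simplification, not the authors' estimator: the function name `local_radial_extrapolation`, the choice of the `k` nearest neighbours, the unweighted least-squares fit, and the polynomial `degree` are all assumptions, and the result is not constrained to lie in [0, 1] as the logistic variant would be.

```python
import numpy as np


def local_radial_extrapolation(X, y, query, k=20, degree=1):
    """Fit neighbour labels against radial distance and return the fit at r = 0.

    X: (n, d) covariates, y: (n,) labels, query: (d,) query point.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Radial distance from the query is the only explanatory variable.
    r = np.linalg.norm(X - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(r)[:k]
    # Ordinary least-squares polynomial fit of label on distance.
    coeffs = np.polyfit(r[nearest], y[nearest], deg=degree)
    # Extrapolate the fitted curve to zero distance (the query itself).
    return float(np.polyval(coeffs, 0.0))


if __name__ == "__main__":
    # Toy data: the label mean drifts linearly with distance from the query.
    X = np.linspace(-1.0, 1.0, 41).reshape(-1, 1)
    y = 0.5 + 0.3 * np.abs(X[:, 0])
    print(local_radial_extrapolation(X, y, query=[0.0]))
```

A plain average over the same neighbours would be pulled away from the value at the query by the distance-dependent drift, which is the bias that the extrapolation step is meant to remove.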
-
WebRTC-based measurement tool for peer-to-peer applications and preliminary findings with real users
Authors:
Kosuke Nakagawa,
Manabu Tsukada,
Keiichi Shima,
Hiroshi Esaki
Abstract:
Direct peer-to-peer (P2P) communication is often used to minimize the end-to-end latency for real-time applications that require accurate synchronization, such as remote musical ensembles. However, there are few studies on the performance of P2P communication between home network environments, thus hindering the deployment of services that require synchronization. In this study, we developed a P2P performance measurement tool using the Web Real-Time Communication (WebRTC) statistics application programming interface. Using this tool, we can easily measure P2P performance between home network environments on a web browser without downloading client applications. We also verified the reliability of round-trip time (RTT) measurements using WebRTC and confirmed that our system could provide the necessary measurement accuracy for RTT and jitter measurements for real-time applications. In addition, we measured the performance of a full mesh topology connection with 10 users in an actual environment in Japan. Consequently, we found that only 66% of the peer connections had a latency of 30 ms or less, which is the minimum requirement for high synchronization applications, such as musical ensembles.
Submitted 3 December, 2021;
originally announced December 2021.
-
Training of deep cross-modality conversion models with a small dataset, and their application in megavoltage CT to kilovoltage CT conversion
Authors:
Sho Ozaki,
Shizuo Kaji,
Kanabu Nawa,
Toshikazu Imae,
Atsushi Aoki,
Takahiro Nakamoto,
Takeshi Ohta,
Yuki Nozawa,
Hideomi Yamashita,
Akihiro Haga,
Keiichi Nakagawa
Abstract:
In recent years, deep-learning-based image processing has emerged as a valuable tool for medical imaging owing to its high performance. However, the quality of deep-learning-based methods heavily relies on the amount of training data; the high cost of acquiring a large dataset is a limitation to their utilization in medical fields. Herein, based on deep learning, we developed a computed tomography (CT) modality conversion method requiring only a few unsupervised images. The proposed method is based on CycleGAN with several extensions tailored for CT images, which aims at preserving the structure in the processed images and reducing the amount of training data. This method was applied to realize the conversion of megavoltage computed tomography (MVCT) to kilovoltage computed tomography (kVCT) images. Training was conducted using several datasets acquired from patients with head and neck cancer. The size of the datasets ranged from 16 slices (two patients) to 2745 slices (137 patients) for MVCT and 2824 slices (98 patients) for kVCT. The required size of the training data was found to be as small as a few hundred slices. By statistical and visual evaluations, the quality improvement and structure preservation of the MVCT images converted by the proposed model were investigated. As a clinical benefit, it was observed by medical doctors that the converted images enhanced the precision of contouring. We developed an MVCT to kVCT conversion model based on deep learning, which can be trained using only a few hundred unpaired images. The stability of the model against changes in data size was demonstrated. This study promotes the reliable use of deep learning in clinical medicine by partially answering commonly asked questions, such as "Is our data sufficient?" and "How much data should we acquire?"
Submitted 5 April, 2022; v1 submitted 12 July, 2021;
originally announced July 2021.
-
A Tunable Model for Graph Generation Using LSTM and Conditional VAE
Authors:
Shohei Nakazawa,
Yoshiki Sato,
Kenji Nakagawa,
Sho Tsugawa,
Kohei Watabe
Abstract:
With the development of graph applications, generative models for graphs have become more crucial. Classically, stochastic models that generate graphs with a pre-defined probability of edges and nodes have been studied. Recently, some models that reproduce the structural features of graphs by learning from actual graph data using machine learning have been studied. However, in these conventional studies based on machine learning, structural features of graphs can be learned from data, but it is not possible to tune features and generate graphs with specific features. In this paper, we propose a generative model that can tune specific features, while learning structural features of a graph from data. With a dataset of graphs with various features generated by a stochastic model, we confirm that our model can generate a graph with specific features.
Submitted 15 April, 2021;
originally announced April 2021.
-
Controlling False Discovery Rates under Cross-Sectional Correlations
Authors:
Junpei Komiyama,
Masaya Abe,
Kei Nakagawa,
Kenichiro McAlinn
Abstract:
We consider controlling the false discovery rate for testing many time series with an unknown cross-sectional correlation structure. Given a large number of hypotheses, false and missing discoveries can plague an analysis. While many procedures have been proposed to control false discovery, most of them either assume independent hypotheses or lack statistical power. A problem of particular interest is in financial asset pricing, where the goal is to determine which "factors" lead to excess returns out of a large number of potential factors. Our contribution is two-fold. First, we show the consistency of Fama and French's prominent method under multiple testing. Second, we propose a novel method for false discovery control using double bootstrapping. We achieve superior statistical power to existing methods and prove that the false discovery rate is controlled. Simulations and a real data application illustrate the efficacy of our method over existing methods.
Submitted 9 June, 2021; v1 submitted 15 February, 2021;
originally announced February 2021.
-
Mean-Variance Efficient Reinforcement Learning by Expected Quadratic Utility Maximization
Authors:
Masahiro Kato,
Kei Nakagawa,
Kenshi Abe,
Tetsuro Morimura
Abstract:
Risk management is critical in decision making, and mean-variance (MV) trade-off is one of the most common criteria. However, in reinforcement learning (RL) for sequential decision making under uncertainty, most of the existing methods for MV control suffer from computational difficulties caused by the double sampling problem. In this paper, in contrast to strict MV control, we consider learning MV efficient policies that achieve Pareto efficiency regarding MV trade-off. To achieve this purpose, we train an agent to maximize the expected quadratic utility function, a common objective of risk management in finance and economics. We call our approach direct expected quadratic utility maximization (EQUM). The EQUM does not suffer from the double sampling issue because it does not include gradient estimation of variance. We confirm that the maximizer of the objective in the EQUM directly corresponds to an MV efficient policy under a certain condition. We conduct experiments with benchmark settings to demonstrate the effectiveness of the EQUM.
Submitted 5 September, 2021; v1 submitted 3 October, 2020;
originally announced October 2020.
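The abstract's point that a quadratic-utility objective avoids estimating a variance gradient can be illustrated with a small sketch. The utility form $u(R) = R - \frac{\alpha}{2} R^2$ used here is a common textbook choice and is an assumption, as are the function name `expected_quadratic_utility` and the parameter `alpha`; the paper's exact objective is not given in the abstract.

```python
import numpy as np


def expected_quadratic_utility(returns, alpha):
    """Sample average of u(R) = R - (alpha / 2) * R**2 (assumed utility form)."""
    r = np.asarray(returns, dtype=float)
    # A plain per-sample average: no separate variance estimate is needed.
    return float(np.mean(r - 0.5 * alpha * r ** 2))


if __name__ == "__main__":
    returns = np.array([1.0, 2.0, 3.0])
    alpha = 0.5
    direct = expected_quadratic_utility(returns, alpha)
    # Same quantity written in mean-variance form: E[R] - (alpha/2) (Var[R] + E[R]^2).
    mv_form = returns.mean() - 0.5 * alpha * (returns.var() + returns.mean() ** 2)
    print(direct, mv_form)
```

Because the objective is an expectation of a per-sample quantity, a single batch of sampled returns yields an unbiased estimate, whereas an explicit variance term involves the square of an expectation.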
-
Analysis of the Convergence Speed of the Arimoto-Blahut Algorithm by the Second Order Recurrence Formula
Authors:
Kenji Nakagawa,
Yoshinori Takei,
Shin-ichiro Hara,
Kohei Watabe
Abstract:
In this paper, we investigate the convergence speed of the Arimoto-Blahut algorithm. For many channel matrices the convergence is exponential, but for some channel matrices it is slower than exponential. By analyzing the Taylor expansion of the defining function of the Arimoto-Blahut algorithm, we will make the conditions clear for the exponential or slower convergence. The analysis of the slow convergence is new in this paper. Based on the analysis, we will compare the convergence speed of the Arimoto-Blahut algorithm numerically with the values obtained in our theorems for several channel matrices. The purpose of this paper is a complete understanding of the convergence speed of the Arimoto-Blahut algorithm.
Submitted 17 September, 2020;
originally announced September 2020.
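For reference, the algorithm whose convergence speed is analyzed here can be sketched in a few lines. This is the textbook form of the Arimoto-Blahut iteration for channel capacity, not the authors' code; the function names, the fixed iteration count `n_iter`, and the uniform starting distribution are illustrative choices.

```python
import numpy as np


def _row_divergences(P, q):
    """D[i] = KL(P[i] || q) in nats, with the convention 0 * log 0 = 0."""
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(P > 0, np.log(P / q), 0.0)
    return np.sum(P * log_ratio, axis=1)


def arimoto_blahut(P, n_iter=200):
    """Return (capacity estimate in bits, input distribution).

    P[i, j] is the probability of output j given input i (rows sum to 1).
    """
    P = np.asarray(P, dtype=float)
    lam = np.full(P.shape[0], 1.0 / P.shape[0])  # uniform initial input distribution
    for _ in range(n_iter):
        q = lam @ P                              # induced output distribution
        lam = lam * np.exp(_row_divergences(P, q))
        lam /= lam.sum()                         # renormalise to a distribution
    # Mutual information at the final input distribution (a lower bound on capacity).
    capacity_nats = float(lam @ _row_divergences(P, lam @ P))
    return capacity_nats / np.log(2.0), lam


if __name__ == "__main__":
    p = 0.1  # crossover probability of a binary symmetric channel
    capacity, lam = arimoto_blahut([[1 - p, p], [p, 1 - p]])
    print(capacity, lam)
```

For a binary symmetric channel the result can be checked against the closed form $1 - H_2(p)$, where $H_2$ is the binary entropy function in bits.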
-
TPLVM: Portfolio Construction by Student's $t$-process Latent Variable Model
Authors:
Yusuke Uchiyama,
Kei Nakagawa
Abstract:
Optimal asset allocation is a key topic in modern finance theory. To realize the optimal asset allocation based on an investor's risk aversion, various portfolio construction methods have been proposed. Recently, applications of machine learning have been growing rapidly in the area of finance. In this article, we propose the Student's $t$-process latent variable model (TPLVM) to describe non-Gaussian fluctuations of financial time series by lower dimensional latent variables. Subsequently, we apply the TPLVM to the minimum-variance portfolio as an alternative to existing nonlinear factor models. To test the performance of the proposed portfolio, we construct minimum-variance portfolios of global stock market indices based on the TPLVM or the Gaussian process latent variable model. By comparing these portfolios, we confirm that the proposed portfolio outperforms that of the existing Gaussian process latent variable model.
Submitted 28 January, 2020;
originally announced February 2020.
-
A Robust Transferable Deep Learning Framework for Cross-sectional Investment Strategy
Authors:
Kei Nakagawa,
Masaya Abe,
Junpei Komiyama
Abstract:
Stock return predictability is an important research theme as it reflects our economic and social organization, and significant efforts are made to explain the dynamism therein. Statistics of strong explanatory power, called "factors," have been proposed to summarize the essence of predictive stock returns. Although machine learning methods are increasingly popular in stock return prediction, inference of stock returns is highly elusive, and most investors still rely, at least partly, on their intuition to make better decisions. The challenge here is to make an investment strategy that is consistent over a reasonably long period, with minimal human decision-making in the entire process. To this end, we propose a new stock return prediction framework that we call Ranked Information Coefficient Neural Network (RIC-NN). RIC-NN is a deep learning approach and includes the following three novel ideas: (1) nonlinear multi-factor approach, (2) stopping criteria with ranked information coefficient (rank IC), and (3) deep transfer learning among multiple regions. Experimental comparison with the stocks in the Morgan Stanley Capital International (MSCI) indices shows that RIC-NN outperforms not only off-the-shelf machine learning methods but also the average return of major equity investment funds in the last fourteen years.
Submitted 2 October, 2019;
originally announced October 2019.
-
Use of Ghost Cytometry to Differentiate Cells with Similar Gross Morphologic Characteristics
Authors:
Hiroaki Adachi,
Yoko Kawamura,
Keiji Nakagawa,
Ryoichi Horisaki,
Issei Sato,
Satoko Yamaguchi,
Katsuhito Fujiu,
Kayo Waki,
Hiroyuki Noji,
Sadao Ota
Abstract:
Imaging flow cytometry shows significant potential for increasing our understanding of heterogeneous and complex life systems and is useful for biomedical applications. Ghost cytometry is a recently proposed approach for directly analyzing compressively measured signals, thereby relieving the computational bottleneck observed in high-throughput cytometry based on morphological information. While this image-free approach could distinguish different cell types using the same fluorescence staining method, further strict controls are sometimes required to clearly demonstrate that the classification is based on detailed morphologic analysis. In this study, we show that ghost cytometry can be used to classify cell populations of the same type but with different fluorescence distributions in space, supporting the strength of our image-free approach for morphologic cell analysis.
Submitted 22 March, 2019;
originally announced March 2019.
-
Deep Recurrent Factor Model: Interpretable Non-Linear and Time-Varying Multi-Factor Model
Authors:
Kei Nakagawa,
Tomoki Ito,
Masaya Abe,
Kiyoshi Izumi
Abstract:
A linear multi-factor model is one of the most important tools in equity portfolio management. The linear multi-factor models are widely used because they can be easily interpreted. However, financial markets are not linear and their accuracy is limited. Recently, deep learning methods were proposed to predict stock return in terms of the multi-factor model. Although these methods perform quite well, they have significant disadvantages such as a lack of transparency and limitations in the interpretability of the prediction. It is thus difficult for institutional investors to use black-box-type machine learning techniques in actual investment practice because they should show accountability to their customers. Consequently, the solution we propose is based on LSTM with LRP. Specifically, we extend the linear multi-factor model to be non-linear and time-varying with LSTM. Then, we approximate and linearize the learned LSTM models by LRP. We call this LSTM+LRP model a deep recurrent factor model. Finally, we perform an empirical analysis of the Japanese stock market and show that our recurrent model has better predictive capability than the traditional linear model and fully-connected deep learning methods.
Submitted 20 January, 2019;
originally announced January 2019.
-
Visual enhancement of Cone-beam CT by use of CycleGAN
Authors:
S. Kida,
S. Kaji,
K. Nawa,
T. Imae,
T. Nakamoto,
S. Ozaki,
T. Ohta,
Y. Nozawa,
K. Nakagawa
Abstract:
Cone-beam computed tomography (CBCT) offers advantages over conventional fan-beam CT in that it requires a shorter time and less exposure to obtain images. CBCT has found a wide variety of applications in patient positioning for image-guided radiation therapy, extracting radiomic information for designing patient-specific treatment, and computing fractional dose distributions for adaptive radiation therapy. However, CBCT images suffer from low soft-tissue contrast, noise, and artifacts compared to conventional fan-beam CT images. Therefore, it is essential to improve the image quality of CBCT. In this paper, we propose a synthetic approach to translate CBCT images with deep neural networks. Our method requires only unpaired and unaligned CBCT images and planning fan-beam CT (PlanCT) images for training. Once trained, 3D reconstructed CBCT images can be directly translated to high-quality PlanCT-like images. We demonstrate the effectiveness of our method with images obtained from 24 prostate patients, and we provide a statistical and visual comparison. The image quality of the translated images shows substantial improvement in voxel values, spatial uniformity, and artifact suppression compared to those of the original CBCT. The anatomical structures of the original CBCT images were also well preserved in the translated images. Our method enables more accurate adaptive radiation therapy, and opens up new applications for CBCT that hinge on high-quality images.
Submitted 25 November, 2019; v1 submitted 17 January, 2019;
originally announced January 2019.
-
Analysis for the Slow Convergence in Arimoto Algorithm
Authors:
Kenji Nakagawa,
Yoshinori Takei,
Kohei Watabe
Abstract:
In this paper, we investigate the convergence speed of the Arimoto algorithm. By analyzing the Taylor expansion of the defining function of the Arimoto algorithm, we will clarify the conditions for the exponential or $1/N$ order convergence and calculate the convergence speed. We show that the convergence speed of the $1/N$ order is evaluated by the derivatives of the Kullback-Leibler divergence with respect to the input probabilities. The analysis for the convergence of the $1/N$ order is new in this paper. Based on the analysis, we will compare the convergence speed of the Arimoto algorithm with the theoretical values obtained in our theorems for several channel matrices.
Submitted 3 September, 2018;
originally announced September 2018.
-
SIMD Vectorization for the Lennard-Jones Potential with AVX2 and AVX-512 instructions
Authors:
Hiroshi Watanabe,
Koh M. Nakagawa
Abstract:
This work describes the SIMD vectorization of the force calculation of the Lennard-Jones potential with Intel AVX2 and AVX-512 instruction sets. Since the force-calculation kernel of the molecular dynamics method involves indirect access to memory, the data layout is one of the most important factors in vectorization. We find that the Array of Structures (AoS) with padding exhibits better performance than Structure of Arrays (SoA) with appropriate vectorization and optimizations. In particular, AoS with 512-bit width exhibits the best performance among the architectures. While the difference in performance between AoS and SoA is significant for the vectorization with AVX2, that with AVX-512 is minor. The effect of other optimization techniques, such as software pipelining together with vectorization, is also discussed. We present results for benchmarks on three CPU architectures: Intel Haswell (HSW), Knights Landing (KNL), and Skylake (SKL). The performance gains by vectorization are about 42\% on HSW compared with the code optimized without vectorization. On KNL, the hand-vectorized codes exhibit 34\% better performance than the codes vectorized automatically by the Intel compiler. On SKL, the code vectorized with AVX2 exhibits slightly better performance than that vectorized with AVX-512.
Submitted 22 October, 2018; v1 submitted 13 June, 2018;
originally announced June 2018.
-
$\tilde{O}(n^{1/3})$-Space Algorithm for the Grid Graph Reachability Problem
Authors:
Ryo Ashida,
Kotaro Nakagawa
Abstract:
The directed graph reachability problem takes as input an $n$-vertex directed graph $G=(V,E)$, and two distinguished vertices $s$ and $t$. The problem is to determine whether there exists a path from $s$ to $t$ in $G$. This is a canonical complete problem for the class NL. Asano et al. proposed an $\tilde{O}(\sqrt{n})$ space and polynomial time algorithm for the directed grid and planar graph reachability problem. The main result of this paper is to show that the directed graph reachability problem restricted to grid graphs can be solved in polynomial time using only $\tilde{O}(n^{1/3})$ space.
Submitted 20 September, 2019; v1 submitted 19 March, 2018;
originally announced March 2018.
-
Horizontal Product Differentiation in Varian's Model of Sales
Authors:
Kuninori Nakagawa
Abstract:
We consider the explicit introduction of firms' choice of location to Varian's model of sales for a two-stage spatial competition model based on a standard Hotelling's linear city model. This model is the formalization of Varian's model of sales in the context of Hotelling's spatial competition. We obtain three main results. First, we show that there exists a subgame perfect equilibrium in which each firm chooses a symmetric mixed strategy equilibrium profile. This equilibrium includes symmetric location pairs and asymmetric location pairs. Second, the equilibrium behaviors in our model are randomized at both location and price stages. Third, we show that expected profits in a subgame perfect equilibrium are equal to the maximum monopoly profit from an uninformed market. Thus, even when product differentiation is explicitly introduced into a Varian-type model, Varian's implication can be retained; the opportunity for profit in an informed market is lost with competition.
Submitted 7 May, 2017;
originally announced May 2017.
-
On the Search Algorithm for the Output Distribution that Achieves the Channel Capacity
Authors:
Kenji Nakagawa,
Kohei Watabe,
Takuto Sabu
Abstract:
We consider a search algorithm for the output distribution that achieves the channel capacity of a discrete memoryless channel. We will propose an algorithm by iterated projections of an output distribution onto affine subspaces in the set of output distributions. The problem of channel capacity has a geometric structure similar to that of the smallest enclosing circle for a finite number of points in the Euclidean space. The metric in the Euclidean space is the Euclidean distance and the metric in the space of output distributions is the Kullback-Leibler divergence. We consider these two problems based on Amari's $\alpha$-geometry. Then, we first consider the smallest enclosing circle in the Euclidean space and develop an algorithm to find the center of the smallest enclosing circle. Based on the investigation, we will apply the obtained algorithm to the problem of channel capacity.
Submitted 8 January, 2016; v1 submitted 6 January, 2016;
originally announced January 2016.