-
Text-Based Correlation Matrix in Multi-Asset Allocation
Authors:
Yasuhiro Nakayama,
Tomochika Sawaki,
Issei Furuya,
Shunsuke Tamura
Abstract:
The purpose of this study is to estimate the correlation structure between multiple assets using financial text analysis. In recent years, against the background of rising inflation in the global economy and monetary policy tightening by central banks, the correlation structure between assets, especially their interest rate sensitivity and inflation sensitivity, has changed dramatically, increasing its impact on the performance of investors' portfolios. Estimating a robust correlation structure has therefore become more important in portfolio management. However, correlation coefficients estimated only from historical price data observed in the financial market lag behind actual changes, are prone to prediction errors because financial time series are nonstationary, and are hard to interpret from a fundamentals viewpoint when a phase change occurs. In this study, we applied natural language processing to news text and central bank text and evaluated how accurately future changes in correlation coefficients can be predicted. The results suggest that this method is useful compared with prediction from ordinary time series data.
Submitted 23 May, 2024;
originally announced May 2024.
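The time lag of price-based correlation estimates mentioned in the abstract above can be illustrated with a rolling-window Pearson correlation. This is only a minimal sketch on synthetic data, not the authors' method: the series, the window length of 20, the break point at index 50, and the helper names `pearson` and `rolling_correlation` are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def rolling_correlation(xs, ys, window):
    """Element k is the correlation over indices k .. k + window - 1."""
    return [
        pearson(xs[k:k + window], ys[k:k + window])
        for k in range(len(xs) - window + 1)
    ]


# Synthetic, deterministic "returns" (hypothetical data):
# the two assets move together before index 50 and in opposite
# directions from index 50 onwards.
BREAK = 50
WINDOW = 20
asset_a = [((7 * t) % 11) - 5 for t in range(100)]
asset_b = [a if t < BREAK else -a for t, a in enumerate(asset_a)]

rolling = rolling_correlation(asset_a, asset_b, WINDOW)

# Window ending 3 observations after the break (indices 33..52):
# the estimate is still positive although the true relation has flipped.
shortly_after_break = rolling[BREAK + 3 - WINDOW]
# Window lying entirely after the break (indices 50..69).
fully_after_break = rolling[BREAK]
print(shortly_after_break, fully_after_break)
```

The estimate only reaches the new regime once the whole window lies after the break, which is the kind of delay that motivates looking for additional, faster-moving signals.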
-
Practical Repetition-Aware Grammar Compression
Authors:
Isamu Furuya
Abstract:
The goal of grammar compression is to construct a small context-free grammar that uniquely generates the input text. Among grammar compression methods, RePair is known for its good practical compression performance. MR-RePair was recently proposed as an improvement to RePair that constructs small context-free grammars for repetitive text data. However, a compact encoding scheme has not been discussed for MR-RePair. We propose a practical encoding method for MR-RePair and show its effectiveness through comparative experiments. Moreover, we extend MR-RePair to run-length context-free grammars and design a novel variant for them called RL-MR-RePair. We experimentally demonstrate that a compression scheme consisting of RL-MR-RePair and the proposed encoding method shows good performance on real repetitive datasets.
Submitted 29 October, 2019;
originally announced October 2019.
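A run-length context-free grammar, as referred to in the abstract above, is commonly defined as a context-free grammar that additionally allows rules of the form X → Y^k (k copies of Y). The sketch below only shows how such a grammar expands back to its text; it is not RL-MR-RePair or the proposed encoding, and the rule representation (`"concat"` and `"run"` tuples) and the function name `expand` are assumptions made for illustration.

```python
def expand(symbol, rules):
    """Expand a symbol of a run-length grammar into the string it derives.

    `rules` maps a non-terminal to either
      ("concat", [s1, s2, ...])  -> concatenation of the listed symbols
      ("run", child, count)      -> `count` repetitions of `child`
    Any symbol without a rule is treated as a terminal.
    """
    rule = rules.get(symbol)
    if rule is None:
        return symbol
    if rule[0] == "concat":
        return "".join(expand(s, rules) for s in rule[1])
    if rule[0] == "run":
        _, child, count = rule
        return expand(child, rules) * count
    raise ValueError(f"unknown rule kind: {rule[0]!r}")


# Hypothetical example grammar for the text "abababab" + "c" + "abababab".
example_rules = {
    "A": ("concat", ["a", "b"]),
    "B": ("run", "A", 4),          # B -> A^4, a single run-length rule
    "S": ("concat", ["B", "c", "B"]),
}
print(expand("S", example_rules))
```

Without the run-length rule, deriving the four repetitions of `A` from binary rules alone would take two extra rules (A2 → A A, A4 → A2 A2), and longer runs would need proportionally more.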
-
Re-Pair In Small Space
Authors:
Dominik Köppl,
Tomohiro I,
Isamu Furuya,
Yoshimasa Takabatake,
Kensuke Sakai,
Keisuke Goto
Abstract:
Re-Pair is a grammar compression scheme with favorable compression rates. Computing Re-Pair comes with the cost of maintaining large frequency tables, which makes it hard to compute Re-Pair on large-scale data sets. As a solution to this problem we present, given a text of length $n$ whose characters are drawn from an integer alphabet, an $O(n^2) \cap O(n^2 \lg \log_τ n \lg \lg \lg n / \log_τ n)$ time algorithm computing Re-Pair in $n \lg \max(n,τ)$ bits of space including the text space, where $τ$ is the number of terminals and non-terminals. The algorithm works in the restore model, supporting the recovery of the original input within the time of the Re-Pair computation with $O(\lg n)$ additional bits of working space. We give variants of our solution that work in parallel or in the external memory model.
Submitted 16 November, 2019; v1 submitted 13 August, 2019;
originally announced August 2019.
-
MR-RePair: Grammar Compression based on Maximal Repeats
Authors:
Isamu Furuya,
Takuya Takagi,
Yuto Nakashima,
Shunsuke Inenaga,
Hideo Bannai,
Takuya Kida
Abstract:
We analyze the grammar generation algorithm of the RePair compression algorithm and show the relation between a grammar generated by RePair and maximal repeats. We reveal that RePair replaces, step by step, the most frequent pairs within the corresponding most frequent maximal repeats. Then, we design a novel variant of RePair, called MR-RePair, which substitutes the most frequent maximal repeats at once instead of substituting the most frequent pairs consecutively. We implemented MR-RePair and compared the size of the grammar generated by MR-RePair to that generated by RePair on several text corpora. Our experiments show that MR-RePair generates more compact grammars than RePair does, especially for highly repetitive texts.
Submitted 18 February, 2019; v1 submitted 12 November, 2018;
originally announced November 2018.
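The pair-replacement loop that the abstract above attributes to RePair can be sketched naively as follows. This is a quadratic-time illustration of plain RePair only, not MR-RePair and not an efficient implementation; the function names, the tuple representation of non-terminals, and the tie-breaking among equally frequent pairs are assumptions made for illustration.

```python
from collections import Counter


def count_pairs(seq):
    """Count non-overlapping occurrences of each adjacent pair."""
    counts = Counter()
    last_counted = {}
    for i in range(len(seq) - 1):
        pair = (seq[i], seq[i + 1])
        # A pair such as ("a", "a") overlaps itself in "aaa"; skip the overlap.
        if last_counted.get(pair) == i - 1:
            continue
        counts[pair] += 1
        last_counted[pair] = i
    return counts


def repair(text):
    """Repeatedly replace the most frequent pair by a fresh non-terminal."""
    seq = list(text)
    rules = {}
    while True:
        counts = count_pairs(seq)
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:
            break  # no pair repeats, so another rule would not shrink anything
        nonterminal = ("R", len(rules))  # hypothetical naming scheme
        rules[nonterminal] = pair
        replaced, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                replaced.append(nonterminal)
                i += 2
            else:
                replaced.append(seq[i])
                i += 1
        seq = replaced
    return seq, rules


def expand(symbol, rules):
    """Recover the string derived by a symbol (terminals are 1-char strings)."""
    if symbol in rules:
        left, right = rules[symbol]
        return expand(left, rules) + expand(right, rules)
    return symbol


sample = "abracadabra abracadabra"
start_sequence, grammar_rules = repair(sample)
restored = "".join(expand(s, grammar_rules) for s in start_sequence)
print(len(sample), len(start_sequence), len(grammar_rules))
```

Each iteration rescans the whole sequence, which keeps the sketch short. Practical RePair implementations maintain pair frequencies incrementally instead of recounting.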
-
Compaction of Church Numerals for Higher-Order Compression
Authors:
Isamu Furuya,
Takuya Kida
Abstract:
In this study, we address the problem of compacting Church numerals. Church numerals appear as a representation of the repetitive part of data in higher-order compression. We propose a novel decomposition scheme for a natural number using tetration, which leads to a compact representation of $λ$-terms equivalent to the original Church numerals. For a natural number $n$, we prove that the size of the $λ$-term obtained by the proposed method is $O(({\rm slog}_{2}n)^{\log n/ \log \log n})$. Moreover, we confirmed experimentally that the proposed method outperforms a binary expression of Church numerals when $n$ is less than approximately 10000.
Submitted 10 November, 2017; v1 submitted 30 June, 2017;
originally announced June 2017.
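To see why towers of powers give short terms for Church numerals, recall that exponentiation in the Church encoding is plain application: applying the numeral for $n$ to the numeral for $m$ yields the numeral for $m^n$. The sketch below demonstrates only this standard fact with Python closures; it is not the decomposition scheme proposed in the paper, and the helper names `church`, `to_int`, and `power` are assumptions made for illustration.

```python
def church(n):
    """Church numeral for n: a function that applies f to x exactly n times."""
    def numeral(f):
        def apply_n_times(x):
            for _ in range(n):
                x = f(x)
            return x
        return apply_n_times
    return numeral


def to_int(numeral):
    """Decode a Church numeral by counting applications of a successor."""
    return numeral(lambda k: k + 1)(0)


def power(base, exponent):
    """Church exponentiation: exponent(base) is the numeral for base ** exponent."""
    return exponent(base)


# The numeral 2 written directly as a lambda term: λf.λx.f (f x).
two = lambda f: lambda x: f(f(x))

four = power(two, two)            # 2 ** 2
sixteen = power(two, four)        # 2 ** (2 ** 2)
big = power(two, sixteen)         # 2 ** (2 ** (2 ** 2))

print(to_int(four), to_int(sixteen), to_int(big))
```

The term for `big` is built from only four occurrences of `two`, whereas the unary Church numeral for the same value applies `f` 65536 times. Compact representations of this kind are what a tetration-based decomposition aims to exploit.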