-
Lightweight Zero-shot Text-to-Speech with Mixture of Adapters
Authors:
Kenichi Fujita,
Takanori Ashihara,
Marc Delcroix,
Yusuke Ijima
Abstract:
The advancements in zero-shot text-to-speech (TTS) methods, based on large-scale models, have demonstrated high fidelity in reproducing speaker characteristics. However, these models are too large for practical daily use. We propose a lightweight zero-shot TTS method using a mixture of adapters (MoA). Our proposed method incorporates MoA modules into the decoder and the variance adapter of a non-autoregressive TTS model. These modules enhance the ability to adapt to a wide variety of speakers in a zero-shot manner by selecting appropriate adapters associated with speaker characteristics on the basis of speaker embeddings. Our method achieves high-quality speech synthesis with minimal additional parameters. Through objective and subjective evaluations, we confirmed that our method achieves better performance than the baseline with less than 40% of the parameters at 1.9 times faster inference speed. Audio samples are available on our demo page (https://ntt-hilab-gensp.github.io/is2024lightweightTTS/).
Submitted 1 July, 2024;
originally announced July 2024.
-
Development of an Estimation Method for the Seismic Motion Reproducibility of a Three-dimensional Ground Structure Model by combining Surface-observed Seismic Motion and Three-dimensional Seismic Motion Analysis
Authors:
Tsuyoshi Ichimura,
Kohei Fujita,
Ryota Kusakabe,
Hiroyuki Fujiwara,
Muneo Hori,
Maddegedara Lalith
Abstract:
The ground structure can substantially influence seismic ground motion, underscoring the need to develop a ground structure model with sufficient reliability in terms of ground motion estimation for earthquake damage mitigation. While many methods for generating ground structure models have been proposed and used in practice, there remains room for enhancing their reliability. In this study, amid many candidate 3D ground structure models generated from geotechnical engineering knowledge, we propose a method for selecting a credible 3D ground structure model capable of reproducing observed earthquake ground motion, utilizing seismic ground motion data solely observed at the ground surface and employing 3D seismic ground motion analysis. Through a numerical experiment, we illustrate the efficacy of this approach. By conducting $10^2$-$10^3$ cases of fast 3D seismic wave propagation analyses using graphics processing units (GPUs), we demonstrate that a credible 3D ground structure model is selected according to the quantity of seismic motion information. We show the effectiveness of the proposed method by showing that the accuracy of seismic motions using ground structure models that were selected from the pool of candidate models is higher than that using ground structure models that were not selected from the pool of candidate models.
Submitted 26 April, 2024;
originally announced April 2024.
-
Low-ordered Orthogonal Voxel Finite Element with INT8 Tensor Cores for GPU-based Explicit Elastic Wave Propagation Analysis
Authors:
Tsuyoshi Ichimura,
Kohei Fujita,
Muneo Hori,
Maddegedara Lalith
Abstract:
Faster explicit elastic wavefield simulations are required for large and complex three-dimensional media using a structured finite element method. Such wavefield simulations are suitable for GPUs, which have exhibited improved computational performance in recent years, and the use of GPUs is expected to speed up such simulations. However, available computational performance on GPUs is typically not fully exploited, and the conventional method involves some numerical dispersion. Thus, in this paper, we propose an explicit structured-mesh wavefield simulation method that uses INT8 Tensor Cores and reduces numerical dispersion to speed up computation on GPUs. The proposed method was implemented for GPUs, and its performance was evaluated in a simulation experiment of a real-world problem. The results demonstrate that the proposed method is 17.0 times faster than the conventional method.
Submitted 21 April, 2024;
originally announced April 2024.
-
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis
Authors:
Kenichi Fujita,
Atsushi Ando,
Yusuke Ijima
Abstract:
This paper proposes a speech rhythm-based method for speaker embeddings to model phoneme duration using a few utterances by the target speaker. Speech rhythm is one of the essential factors among speaker characteristics, along with acoustic features such as F0, for reproducing individual utterances in speech synthesis. A novel feature of the proposed method is the rhythm-based embeddings extracted from phonemes and their durations, which are known to be related to speaking rhythm. They are extracted with a speaker identification model similar to the conventional spectral feature-based one. We conducted three experiments, speaker embeddings generation, speech synthesis with generated embeddings, and embedding space analysis, to evaluate the performance. The proposed method demonstrated a moderate speaker identification performance (15.2% EER), even with only phonemes and their duration information. The objective and subjective evaluation results demonstrated that the proposed method can synthesize speech with speech rhythm closer to the target speaker than the conventional method. We also visualized the embeddings to evaluate the relationship between the distance of the embeddings and the perceptual similarity. The visualization of the embedding space and the relation analysis between the closeness indicated that the distribution of embeddings reflects the subjective and objective similarity.
Submitted 10 February, 2024;
originally announced February 2024.
-
Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters
Authors:
Kenichi Fujita,
Hiroshi Sato,
Takanori Ashihara,
Hiroki Kanagawa,
Marc Delcroix,
Takafumi Moriya,
Yusuke Ijima
Abstract:
The zero-shot text-to-speech (TTS) method, based on speaker embeddings extracted from reference speech using self-supervised learning (SSL) speech representations, can reproduce speaker characteristics very accurately. However, this approach suffers from degradation in speech synthesis quality when the reference speech contains noise. In this paper, we propose a noise-robust zero-shot TTS method. We incorporated adapters into the SSL model, which we fine-tuned with the TTS model using noisy reference speech. In addition, to further improve performance, we adopted a speech enhancement (SE) front-end. With these improvements, our proposed SSL-based zero-shot TTS achieved high-quality speech synthesis with noisy reference speech. Through the objective and subjective evaluations, we confirmed that the proposed method is highly robust to noise in reference speech, and effectively works in combination with SE.
Submitted 10 January, 2024;
originally announced January 2024.
-
Characteristics of networks generated by kernel growing neural gas
Authors:
Kazuhisa Fujita
Abstract:
This research aims to develop kernel GNG, a kernelized version of the growing neural gas (GNG) algorithm, and to investigate the features of the networks generated by the kernel GNG. The GNG is an unsupervised artificial neural network that can transform a dataset into an undirected graph, thereby extracting the features of the dataset as a graph. The GNG is widely used in vector quantization, clustering, and 3D graphics. Kernel methods are often used to map a dataset to feature space, with support vector machines being the most prominent application. This paper introduces the kernel GNG approach and explores the characteristics of the networks generated by kernel GNG. Five kernels, including Gaussian, Laplacian, Cauchy, inverse multiquadric, and log kernels, are used in this study. The results of this study show that the average degree and the average clustering coefficient decrease as the kernel parameter increases for Gaussian, Laplacian, Cauchy, and IMQ kernels. If we avoid more edges and a higher clustering coefficient (or more triangles), the kernel GNG with a larger value of the parameter will be more appropriate.
Submitted 25 August, 2023; v1 submitted 16 August, 2023;
originally announced August 2023.
-
An efficient and straightforward online quantization method for a data stream through remove-birth updating
Authors:
Kazuhisa Fujita
Abstract:
The growth of network-connected devices has led to an exponential increase in data generation, creating significant challenges for efficient data analysis. This data is generated continuously, creating a dynamic flow known as a data stream. The characteristics of a data stream may change dynamically, and this change is known as concept drift. Consequently, a method for handling data streams must efficiently reduce their volume while dynamically adapting to these changing characteristics. This paper proposes a simple online vector quantization method for concept drift. The proposed method identifies and replaces units with low win probability through remove-birth updating, thus achieving a rapid adaptation to concept drift. Furthermore, the results of this study show that the proposed method can generate minimal dead units even in the presence of concept drift. This study also suggests that some metrics calculated from the proposed method will be helpful for drift detection.
Submitted 25 December, 2023; v1 submitted 21 June, 2023;
originally announced June 2023.
-
Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model
Authors:
Kenichi Fujita,
Takanori Ashihara,
Hiroki Kanagawa,
Takafumi Moriya,
Yusuke Ijima
Abstract:
This paper proposes a zero-shot text-to-speech (TTS) conditioned by a self-supervised speech-representation model acquired through self-supervised learning (SSL). Conventional methods with embedding vectors from x-vector or global style tokens still have a gap in reproducing the speaker characteristics of unseen speakers. A novel point of the proposed method is the direct use of the SSL model to obtain embedding vectors from speech representations trained with a large amount of data. We also introduce the separate conditioning of acoustic features and a phoneme duration predictor to obtain the disentangled embeddings between rhythm-based speaker characteristics and acoustic-feature-based ones. The disentangled embeddings will enable us to achieve better reproduction performance for unseen speakers and rhythm transfer conditioned by different speeches. Objective and subjective evaluations showed that the proposed method can synthesize speech with improved similarity and achieve speech-rhythm transfer.
Submitted 24 April, 2023;
originally announced April 2023.
-
Gaussian Process Classification Bandits
Authors:
Tatsuya Hayashi,
Naoki Ito,
Koji Tabata,
Atsuyoshi Nakamura,
Katsumasa Fujita,
Yoshinori Harada,
Tamiki Komatsuzaki
Abstract:
Classification bandits are multi-armed bandit problems whose task is to classify a given set of arms into either positive or negative class depending on whether the rate of the arms with the expected reward of at least h is not less than w for given thresholds h and w. We study a special classification bandit problem in which arms correspond to points x in d-dimensional real space with expected rewards f(x) which are generated according to a Gaussian process prior. We develop a framework algorithm for the problem using various arm selection policies and propose policies called FCB and FTSV. We show a smaller sample complexity upper bound for FCB than that for the existing algorithm of the level set estimation, in which whether f(x) is at least h or not must be decided for every arm's x. Arm selection policies depending on an estimated rate of arms with rewards of at least h are also proposed and shown to improve empirical sample complexity. According to our experimental results, the rate-estimation versions of FCB and FTSV, together with that of the popular active learning policy that selects the point with the maximum variance, outperform other policies for synthetic functions, and the version of FTSV is also the best performer for our real-world dataset.
Submitted 26 December, 2022;
originally announced December 2022.
-
Physics-informed neural network method for modelling beam-wall interactions
Authors:
Kazuhiro Fujita
Abstract:
A mesh-free approach for modelling beam-wall interactions in particle accelerators is proposed. The key idea of our method is to use a deep neural network as a surrogate for the solution to a set of partial differential equations involving the particle beam, and the surface impedance concept. The proposed approach is applied to the coupling impedance of an accelerator vacuum chamber with thin conductive coating, and also verified in comparison with the existing analytical formula.
Submitted 4 January, 2022; v1 submitted 21 December, 2021;
originally announced December 2021.
-
AlphaDDA: Strategies for Adjusting the Playing Strength of a Fully Trained AlphaZero System to a Suitable Human Training Partner
Authors:
Kazuhisa Fujita
Abstract:
Artificial intelligence (AI) has achieved superhuman performance in board games such as Go, chess, and Othello (Reversi). In other words, the AI system surpasses the level of a strong human expert player in such games. In this context, it is difficult for a human player to enjoy playing the games with the AI. To keep human players entertained and immersed in a game, the AI is required to dynamically balance its skill with that of the human player. To address this issue, we propose AlphaDDA, an AlphaZero-based AI with dynamic difficulty adjustment (DDA). AlphaDDA consists of a deep neural network (DNN) and a Monte Carlo tree search, as in AlphaZero. AlphaDDA learns and plays a game the same way as AlphaZero, but can change its skills. AlphaDDA estimates the value of the game state from only the board state using the DNN. AlphaDDA changes a parameter dominantly controlling its skills according to the estimated value. Consequently, AlphaDDA adjusts its skills according to a game state. AlphaDDA can adjust its skill using only the state of a game without any prior knowledge regarding an opponent. In this study, AlphaDDA plays Connect4, Othello, and 6x6 Othello with other AI agents. Other AI agents are AlphaZero, Monte Carlo tree search, the minimax algorithm, and a random player. This study shows that AlphaDDA can balance its skill with that of the other AI agents, except for a random player. The DDA ability of AlphaDDA is based on an accurate estimation of the value from the state of a game. We believe that the AlphaDDA approach for DDA can be used for any game AI system if the DNN can accurately estimate the value of the game state and we know a parameter controlling the skills of the AI system.
Submitted 20 September, 2022; v1 submitted 11 November, 2021;
originally announced November 2021.
-
Estimation of the number of clusters on d-dimensional sphere
Authors:
Kazuhisa Fujita
Abstract:
Spherical data is distributed on the sphere. Such data appears in various fields such as meteorology, biology, and natural language processing. However, methods for analyzing spherical data are not yet sufficiently developed. One important issue is estimating the number of clusters in spherical data. To address this issue, I propose a new method called Spherical X-means (SX-means) that can estimate the number of clusters on a d-dimensional sphere. SX-means is a model-based method that assumes the data is generated from a mixture of von Mises-Fisher distributions. The present paper explains the proposed method and shows its performance in estimating the number of clusters.
Submitted 13 May, 2021; v1 submitted 15 November, 2020;
originally announced November 2020.
-
Approximate spectral clustering using both reference vectors and topology of the network generated by growing neural gas
Authors:
Kazuhisa Fujita
Abstract:
Spectral clustering (SC) is one of the most popular clustering methods and often outperforms traditional clustering methods. SC uses the eigenvectors of a Laplacian matrix calculated from a similarity matrix of a dataset. SC has serious drawbacks: the significant increases in the time complexity derived from the computation of eigenvectors and in the memory space complexity required to store the similarity matrix. To address these issues, I develop a new approximate spectral clustering method using the network generated by growing neural gas (GNG), called ASC with GNG in this study. ASC with GNG uses not only reference vectors for vector quantization but also the topology of the network to extract the topological relationship between data points in a dataset. ASC with GNG calculates the similarity matrix from both the reference vectors and the topology of the network generated by GNG. Using the network generated from a dataset by GNG, ASC with GNG reduces the computational and space complexities and improves clustering quality. In this study, I demonstrate that ASC with GNG effectively reduces the computational time. Moreover, this study shows that ASC with GNG provides clustering performance equal to or better than that of SC.
Submitted 12 August, 2021; v1 submitted 15 September, 2020;
originally announced September 2020.
-
Topology design of two-fluid heat exchange
Authors:
Hiroki Kobayashi,
Kentaro Yaji,
Shintaro Yamasaki,
Kikuo Fujita
Abstract:
Heat exchangers are devices that typically transfer heat between two fluids. The performance of a heat exchanger such as heat transfer rate and pressure loss strongly depends on the flow regime in the heat transfer system. In this paper, we present a density-based topology optimization method for a two-fluid heat exchange system, which achieves a maximum heat transfer rate under fixed pressure loss. We propose a representation model accounting for three states, i.e., two fluids and a solid wall between the two fluids, by using a single design variable field. The key aspect of the proposed model is that mixing of the two fluids can be essentially prevented without any penalty scheme. This is because the solid constantly exists between the two fluids due to the use of the single design variable field. We demonstrate the effectiveness of the proposed approach through three-dimensional numerical examples in which an optimized design is compared with a simple reference design, and the effects of design conditions (i.e., Reynolds number, Prandtl number, design domain size, and flow arrangements) are investigated.
Submitted 5 May, 2020;
originally announced May 2020.
-
A Feedback Shift Correction in Predicting Conversion Rates under Delayed Feedback
Authors:
Shota Yasui,
Gota Morishita,
Komei Fujita,
Masashi Shibata
Abstract:
In display advertising, predicting the conversion rate, that is, the probability that a user takes a predefined action on an advertiser's website, such as purchasing goods, is fundamental in estimating the value of displaying the advertisement. However, there is a relatively long time delay between a click and its resultant conversion. Because of the delayed feedback, some positive instances at the training period are labeled as negative because some conversions have not yet occurred when training data are gathered. As a result, the conditional label distributions differ between the training data and the production environment. This situation is referred to as a feedback shift. We address this problem by using an importance weight approach typically used for covariate shift correction. We prove its consistency for the feedback shift. Results in both offline and online experiments show that our proposed method outperforms the existing method.
Submitted 5 February, 2020;
originally announced February 2020.
-
Deep learning generates custom-made logistic regression models for explaining how breast cancer subtypes are classified
Authors:
Takuma Shibahara,
Chisa Wada,
Yasuho Yamashita,
Kazuhiro Fujita,
Masamichi Sato,
Junichi Kuwata,
Atsushi Okamoto,
Yoshimasa Ono
Abstract:
Differentiating the intrinsic subtypes of breast cancer is crucial for deciding the best treatment strategy. Deep learning can predict the subtypes from genetic information more accurately than conventional statistical methods, but to date, deep learning has not been directly utilized to examine which genes are associated with which subtypes. To clarify the mechanisms embedded in the intrinsic subtypes, we developed an explainable deep learning model called a point-wise linear (PWL) model that generates a custom-made logistic regression for each patient. Logistic regression, which is familiar to both physicians and medical informatics researchers, allows us to analyze the importance of the feature variables, and the PWL model harnesses these practical abilities of logistic regression. In this study, we show that analyzing breast cancer subtypes is clinically beneficial for patients and one of the best ways to validate the capability of the PWL model. First, we trained the PWL model with RNA-seq data to predict PAM50 intrinsic subtypes and applied it to the 41/50 genes of PAM50 through the subtype prediction task. Second, we developed a deep enrichment analysis method to reveal the relationships between the PAM50 subtypes and the copy numbers of breast cancer. Our findings showed that the PWL model utilized genes relevant to the cell cycle-related pathways. These preliminary successes in breast cancer subtype analysis demonstrate the potential of our analysis strategy to clarify the mechanisms underlying breast cancer and improve overall clinical outcomes.
Submitted 18 July, 2022; v1 submitted 20 January, 2020;
originally announced January 2020.
-
Fatigue-Aware Ad Creative Selection
Authors:
Daisuke Moriwaki,
Komei Fujita,
Shota Yasui,
Takahiro Hoshino
Abstract:
In online display advertising, selecting the most effective ad creative (ad image) for each impression is a crucial task for DSPs (Demand-Side Platforms) to fulfill their goals (click-through rate, number of conversions, revenue, and brand improvement). As widely recognized in the marketing literature, the effect of an ad creative changes with the number of repetitive ad exposures. In this study, we propose an efficient and easy-to-implement ad creative selection algorithm that explicitly considers the user's psychological status when selecting ad creatives. The proposed system was deployed in a real-world production environment and tested against the baseline algorithms. The results show the superiority of the proposed algorithm.
Submitted 14 January, 2020; v1 submitted 20 August, 2019;
originally announced August 2019.
-
What you get is not always what you see: pitfalls in solar array assessment using overhead imagery
Authors:
Wei Hu,
Kyle Bradbury,
Jordan M. Malof,
Boning Li,
Bohao Huang,
Artem Streltsov,
K. Sydny Fujita,
Ben Hoen
Abstract:
Effective integration planning for small, distributed solar photovoltaic (PV) arrays into electric power grids requires access to high quality data: the location and power capacity of individual solar PV arrays. Unfortunately, national databases of small-scale solar PV do not exist; those that do are limited in their spatial resolution, typically aggregated up to state or national levels. While several promising approaches for solar PV detection have been published, strategies for evaluating the performance of these models are often highly heterogeneous from study to study. The resulting comparison of these methods for practical applications for energy assessments becomes challenging and may imply that the reported performance evaluations are overly optimistic. The heterogeneity comes in many forms, each of which we explore in this work: the level of spatial aggregation, the validation of ground truth, inconsistencies in the training and validation datasets, and the degree of diversity of the locations and sensors from which the training and validation data originate. For each, we discuss emerging practices from the literature to address them or suggest directions of future research. As part of our investigation, we evaluate solar PV identification performance in two large regions. Our findings suggest that traditional performance evaluation of the automated identification of solar PV from satellite imagery may be optimistic due to common limitations in the validation process. The takeaways from this work are intended to inform and catalyze the large-scale practical application of automated solar PV assessment techniques by energy researchers and professionals.
Submitted 25 July, 2022; v1 submitted 28 February, 2019;
originally announced February 2019.
-
End-to-End Argument Mining for Discussion Threads Based on Parallel Constrained Pointer Architecture
Authors:
Gaku Morio,
Katsuhide Fujita
Abstract:
Argument Mining (AM) is a relatively recent discipline that concentrates on extracting claims or premises from discourse and inferring their structures. However, many existing works do not sufficiently consider micro-level AM on discussion threads. In this paper, we tackle AM for discussion threads. Our main contributions are as follows: (1) A novel combination scheme focusing on micro-level inner- and inter-post schemes for a discussion thread. (2) Annotation of large-scale civic discussion threads with the scheme. (3) Parallel constrained pointer architecture (PCPA), a novel end-to-end technique to discriminate sentence types, inner-post relations, and inter-post interactions simultaneously. The experimental results demonstrate that our proposed model achieves better accuracy in relation extraction than existing state-of-the-art models.
Submitted 3 September, 2018;
originally announced September 2018.
-
Identifying exogenous and endogenous activity in social media
Authors:
Kazuki Fujita,
Alexey Medvedev,
Shinsuke Koyama,
Renaud Lambiotte,
Shigeru Shinomoto
Abstract:
The occurrence of new events in a system is typically driven by external causes and by previous events taking place inside the system. This is a general statement that applies to a range of situations including, more recently, the activity of users in online social networks (OSNs). Here we develop a method for extracting, from a series of posting times, the relative contributions of exogenous influences (e.g., news media) and endogenous influences (e.g., information cascades). The method is based on fitting a generalized linear model (GLM) equipped with a self-excitation mechanism. We test the method with synthetic data generated by a nonlinear Hawkes process, and apply it to a real time series of tweets with a given hashtag. In the empirical dataset, the estimated contributions of exogenous and endogenous volumes are close to the amounts of original tweets and retweets, respectively. We conclude by discussing possible applications of the method, for instance in online marketing.
Submitted 2 August, 2018;
originally announced August 2018.
-
Implicit Low-Order Unstructured Finite-Element Multiple Simulation Enhanced by Dense Computation using OpenACC
Authors:
Takuma Yamaguchi,
Kohei Fujita,
Tsuyoshi Ichimura,
Muneo Hori,
Maddegedara Lalith,
Kengo Nakajima
Abstract:
In this paper, we develop a low-order three-dimensional finite-element solver for fast multiple-case crust deformation analysis on GPU-based systems. Based on a high-performance solver designed for massively parallel CPU-based systems, we modify the algorithm to reduce random data access, and then insert OpenACC directives. The developed solver on ten Reedbush-H nodes (20 P100 GPUs) attained a speedup of 14.2 times over 20 K computer nodes, which is high considering the peak memory bandwidth ratio of 11.4 between the two systems. On the newest Volta generation V100 GPUs, the solver attained a further 2.45 times speedup over P100 GPUs. As a demonstrative example, we computed 368 cases of crustal deformation analyses of northeast Japan with 400 million degrees of freedom. The total procedure of algorithm modification and porting implementation took only two weeks, showing that a high performance improvement was achieved with low development cost. With the developed solver, we can expect improved reliability of crust-deformation analyses through many-case analyses on a wide range of GPU-based systems.
Submitted 24 October, 2017;
originally announced October 2017.
-
On Upper Bounds on the Church-Rosser Theorem
Authors:
Ken-etsu Fujita
Abstract:
The Church-Rosser theorem in the type-free lambda-calculus is well investigated both for beta-equality and beta-reduction. We provide a new proof of the theorem for beta-equality with no use of parallel reductions, but simply with Takahashi's translation (Gross-Knuth strategy). Based on this, upper bounds for reduction sequences on the theorem are obtained as the fourth level of the Grzegorczyk hierarchy.
Submitted 3 January, 2017;
originally announced January 2017.
-
Extract an essential skeleton of a character as a graph from a character image
Authors:
Kazuhisa Fujita
Abstract:
This paper aims to construct a graph representing the essential skeleton of a character from an image that includes a machine-printed or handwritten character, using the growing neural gas (GNG) method and the relative network graph (RNG) algorithm. The visual system in our brain can recognize printed and handwritten characters easily, robustly, and precisely. How does our brain robustly recognize characters? The visual processing in our brain uses the essential features of an object, such as crosses and corners. These features would be helpful for character recognition by a computer. However, extracting these features is difficult. If the skeleton of a character is represented as a graph, the features can be extracted more easily. To extract the skeleton of a character as a graph from an image, this paper proposes a new approach using the GNG and RNG algorithms. The approach successfully extracts skeleton graphs from images that include distorted, noisy, and handwritten characters.
Submitted 31 January, 2022; v1 submitted 13 June, 2015;
originally announced June 2015.