-
Lyricist-Singer Entropy Affects Lyric-Lyricist Classification Performance
Authors:
Mitsuki Morita,
Masato Kikuchi,
Tadachika Ozono
Abstract:
Although lyrics represent an essential component of music, few music information processing studies have been conducted on the characteristics of lyricists. Because these characteristics may be valuable for musical applications, such as recommendations, they warrant further study. We considered a potential method that extracts features representing the characteristics of lyricists from lyrics. Because these features must be identified prior to extraction, we focused on lyricists with easily identifiable features. We believe that it is desirable for singers to perform unique songs that share certain characteristics specific to the singer. Accordingly, we hypothesized that lyricists account for the unique characteristics of the singers they write lyrics for. In other words, lyric-lyricist classification performance, or the ease of capturing the features of a lyricist from the lyrics, may depend on the variety of singers. In this study, we observed a relationship between lyricist-singer entropy, that is, the variety of singers associated with a single lyricist, and lyric-lyricist classification performance. As an example, the lyricist-singer entropy is minimal when the lyricist writes lyrics for only one singer. In our experiments, we divided lyricists into five groups according to lyricist-singer entropy and assessed the lyric-lyricist classification performance within each group. The best F1 score was obtained for the group with the lowest lyricist-singer entropy. Our results suggest that further analyses of the features contributing to lyric-lyricist classification performance in the lowest lyricist-singer entropy group may improve the feature extraction task for lyricists.
Submitted 17 October, 2023;
originally announced October 2023.
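The lyricist-singer entropy mentioned in this abstract can be illustrated with a minimal sketch. It assumes the quantity is the Shannon entropy (in bits) of the distribution of singers over one lyricist's songs; the abstract does not give the exact definition, and the function name `lyricist_singer_entropy` is hypothetical.

```python
import math
from collections import Counter


def lyricist_singer_entropy(singers):
    """Shannon entropy (bits) of the singers a single lyricist wrote for.

    `singers` holds one singer name per song. This definition is an
    assumption for illustration, not taken from the paper.
    """
    counts = Counter(singers)
    total = sum(counts.values())
    # Sum -p * log2(p) over the observed singers.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# A lyricist who writes for only one singer has minimal entropy.
single = lyricist_singer_entropy(["Singer A", "Singer A", "Singer A"])
# A lyricist whose songs are spread evenly over more singers has higher entropy.
varied = lyricist_singer_entropy(["Singer A", "Singer B", "Singer C", "Singer D"])
```

Under this assumption, the single-singer case yields zero, which matches the abstract's remark that the entropy is minimal when a lyricist writes for only one singer.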
-
Conservative Likelihood Ratio Estimator for Infrequent Data Slightly above a Frequency Threshold
Authors:
Masato Kikuchi,
Yuhi Kusakabe,
Tadachika Ozono
Abstract:
A naive likelihood ratio (LR) estimation using the observed frequencies of events can overestimate LRs for infrequent data. One approach to avoid this problem is to use a frequency threshold and set the estimates to zero for frequencies below the threshold. This approach eliminates the computation of some estimates, thereby making practical tasks using LRs more efficient. However, it still overestimates LRs for low frequencies near the threshold. This study proposes a conservative estimator for low frequencies slightly above the threshold. Our experiment used LRs to predict the occurrence contexts of named entities from a corpus. The experimental results demonstrate that our estimator improves the prediction accuracy while maintaining efficiency in the context prediction task.
Submitted 28 October, 2022;
originally announced November 2022.
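The thresholding baseline described in this abstract can be sketched as follows. This is only an illustration of the naive frequency-based estimate with a cutoff, not the conservative estimator the paper proposes. The function name, the choice of which frequency the threshold applies to, and the handling of a zero denominator are all assumptions.

```python
def naive_lr_with_threshold(freq_num, total_num, freq_den, total_den, threshold=0):
    """Naive LR from observed frequencies, zeroed below a frequency threshold.

    freq_num / total_num estimates the numerator probability and
    freq_den / total_den the denominator probability. Applying the
    threshold to freq_num, and returning 0.0 when freq_den is zero,
    are illustrative assumptions.
    """
    if freq_num < threshold or freq_den == 0:
        return 0.0
    return (freq_num / total_num) / (freq_den / total_den)


# A single observation in a small sample yields a large ratio ...
lr_raw = naive_lr_with_threshold(1, 128, 1, 1024)
# ... while a threshold of 2 sets the same estimate to zero.
lr_cut = naive_lr_with_threshold(1, 128, 1, 1024, threshold=2)
```

An event seen exactly at the threshold still receives the full naive estimate, which is the overestimation near the threshold that the abstract points out.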
-
Improving Multi-class Classifier Using Likelihood Ratio Estimation with Regularization
Authors:
Masato Kikuchi,
Tadachika Ozono
Abstract:
The universal-set naive Bayes classifier (UNB)~\cite{Komiya:13}, defined using likelihood ratios (LRs), was proposed to address imbalanced classification problems. However, the LR estimator used in the UNB overestimates LRs for low-frequency data, degrading the classification performance. Our previous study~\cite{Kikuchi:19} proposed an effective LR estimator even for low-frequency data. This estimator uses regularization to suppress the overestimation, but we did not consider imbalanced data. In this paper, we integrated the estimator with the UNB. Our experiments with imbalanced data showed that our proposed classifier effectively adjusts the classification scores according to the class balance using regularization parameters and improves the classification performance.
Submitted 28 October, 2022;
originally announced October 2022.
-
Developing a Component Comment Extractor from Product Reviews on E-Commerce Sites
Authors:
Shogo Anda,
Masato Kikuchi,
Tadachika Ozono
Abstract:
Consumers often read product reviews to inform their buying decisions, and some consumers want to know about a specific component of a product. However, because typical sentences in product reviews contain various details, users must identify sentences about the components they want to know about among many reviews. Therefore, we aimed to develop a system that identifies and collects component and aspect information of products in sentences. Our BERT-based classifiers assign labels referring to components and aspects to sentences in reviews and extract sentences with comments on specific components and aspects. We determined proper labels based on the words identified through pattern matching from product reviews to create the training data. Because we could not use the words as labels, we carefully created labels covering the meanings of the words. However, the training data was imbalanced on component and aspect pairs. We introduced a data augmentation method using WordNet to reduce the bias. Our evaluation demonstrates that the system can determine labels for road bikes using pattern matching, covering more than 88% of the indicators of components and aspects on e-commerce sites. Moreover, our data augmentation method can improve the F1-measure on insufficient data from 0.66 to 0.76.
Submitted 13 July, 2022;
originally announced July 2022.
-
Online Classroom Evaluation System Based on Multi-Reaction Estimation
Authors:
Yanyi Peng,
Masato Kikuchi,
Tadachika Ozono
Abstract:
Compared with traditional face-to-face teaching, online learning is more convenient. However, during online classes, it is more difficult for teachers to observe all student reactions at the same time. Our system is designed to help teachers adjust the speed of their lessons by detecting student reactions. In this study, we estimate student head pose, hand poses, and expressions through the camera; these estimates are used as criteria for judging student participation. We then propose a method to evaluate classroom participation based on student head pose, hand poses, and facial expression recognition. The estimated result classifies the class quality as positive, neutral, or negative. With the help of the system, teachers can then rearrange the content of the class.
Submitted 17 December, 2021;
originally announced December 2021.
-
Matching Social Issues to Technologies for Civic Tech by Association Rule Mining using Weighted Casual Confidence
Authors:
Masato Kikuchi,
Shun Shiramatsu,
Ryota Kozakai,
Tadachika Ozono
Abstract:
More than 80 civic tech communities in Japan are developing information technology (IT) systems to solve their regional issues. Collaboration among such communities across different regions assists in solving their problems because some groups have limited IT knowledge and experience for this purpose. Our objective is to realize a civic tech matchmaking system to assist such communities in finding better partners with IT experience in their issues. In this study, as the first step toward collaboration, we acquire relevant social issues and information technologies by association rule mining. To this end, we distribute a questionnaire to members of civic tech communities and obtain answers on the issues they face and the technologies available to them. Subsequently, we match the relevant issues and technologies from the answers. However, most of the issues and technologies in this questionnaire data are infrequent, and there is a significant bias in their occurrence. Consequently, it is difficult to extract truly relevant issue-technology combinations with existing interestingness measures. Therefore, we introduce a new measure called weighted casual confidence, and show that our measure is effective for mining relevant issue-technology pairs.
Submitted 17 December, 2021;
originally announced December 2021.
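For context on the association rule mining mentioned in this abstract, the sketch below computes the standard confidence of a rule X -> Y, one of the existing interestingness measures. It is not the weighted casual confidence the paper introduces, whose definition the abstract does not give. The function name and the toy questionnaire answers are hypothetical.

```python
def rule_confidence(transactions, antecedent, consequent):
    """Standard confidence of the rule antecedent -> consequent.

    Confidence is the fraction of transactions containing the antecedent
    that also contain the consequent. Returns 0.0 when the antecedent
    never occurs (an illustrative choice).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    return n_both / n_antecedent if n_antecedent else 0.0


# Hypothetical questionnaire answers: each set mixes issues and technologies.
answers = [
    {"disaster prevention", "web mapping"},
    {"disaster prevention", "web mapping"},
    {"disaster prevention", "chatbot"},
    {"aging population", "chatbot"},
]
conf_map = rule_confidence(answers, {"disaster prevention"}, {"web mapping"})
conf_rare = rule_confidence(answers, {"aging population"}, {"chatbot"})
```

In this toy data the rule built on a single answer receives the maximum confidence, which hints at why infrequent items make standard measures hard to rely on.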
-
Product Information Browsing Support System Using Analytic Hierarchy Process
Authors:
Weijian Li,
Masato Kikuchi,
Tadachika Ozono
Abstract:
Large-scale e-commerce sites can collect and analyze a large number of user preferences and behaviors, and thus can recommend highly trusted products to users. However, it is very difficult for individuals or non-corporate groups to obtain large-scale user data. Therefore, we consider whether knowledge of the decision-making domain can be used to obtain user preferences and combine it with content-based filtering to design an information retrieval system. This study describes the process of building a product information browsing support system with high satisfaction based on product similarity and multiple other perspectives about products on the Internet. We present the architecture of the proposed system and explain the working principle of its constituent modules. Finally, we demonstrate the effectiveness of the proposed system through an evaluation experiment and a questionnaire.
Submitted 17 December, 2021;
originally announced December 2021.
-
Feature Selective Likelihood Ratio Estimator for Low- and Zero-frequency N-grams
Authors:
Masato Kikuchi,
Mitsuo Yoshida,
Kyoji Umemura,
Tadachika Ozono
Abstract:
In natural language processing (NLP), the likelihood ratios (LRs) of N-grams are often estimated from the frequency information. However, a corpus contains only a fraction of the possible N-grams, and most of them occur infrequently. Hence, we desire an LR estimator for low- and zero-frequency N-grams. One way to achieve this is to decompose the N-grams into discrete values, such as letters and words, and take the product of the LRs for the values. However, because this method deals with a large number of discrete values, the running time and memory usage for estimation are problematic. Moreover, use of unnecessary discrete values causes deterioration of the estimation accuracy. Therefore, this paper proposes combining the aforementioned method with the feature selection method used in document classification, and shows that our estimator provides effective and efficient estimation results for low- and zero-frequency N-grams.
Submitted 5 November, 2021;
originally announced November 2021.
-
Developing a Lecture Video Recording System Using Augmented Reality
Authors:
Yuma Ito,
Masato Kikuchi,
Tadachika Ozono,
Toramatsu Shintani
Abstract:
Assistive technology is a prerequisite for making a high-quality lecture video. It is therefore imperative to edit the lecture video after recording. In this study, we aim to reduce the cumbersome task of lecture video editing by developing a system that enables the addition of visual effects to the video while recording. In particular, we use augmented reality (AR) technology to digitize and display, in real time, the lecture materials, assistant agents, and other recording contents used by the lecturer. Our system realizes such a mechanism as a lecture recording environment. In addition, our AR-based system uses information on the lecturer's position and the progress of the lecture to support tasks that are difficult for the lecturer to perform alone while conducting the lecture. We evaluated the system functionality and performance, and verified the system's correct behavior. If the burden of making lecture videos can be reduced, the lecturer will be able to devote more time to improving the quality of lecture contents, which is expected to contribute to the improvement of lectures.
Submitted 11 October, 2021;
originally announced October 2021.
-
Development of an Extractive Title Generation System Using Titles of Papers of Top Conferences for Intermediate English Students
Authors:
Kento Kaku,
Masato Kikuchi,
Tadachika Ozono,
Toramatsu Shintani
Abstract:
The formulation of good academic paper titles in English is challenging for intermediate English authors (particularly students). This is because such authors are not aware of the type of titles that are generally in use. We aim to realize a support system for formulating more effective English titles for intermediate English and beginner authors. This study develops an extractive title generation system that formulates titles from keywords extracted from an abstract. Moreover, we realize a title evaluation model that can evaluate the appropriateness of paper titles. We train the model with titles of top-conference papers by using BERT. This paper describes the training data, implementation, and experimental results. The results show that our evaluation model can identify top-conference titles more effectively than intermediate English and beginner students.
Submitted 8 October, 2021;
originally announced October 2021.