A Single Linear Layer Yields Task-Adapted Low-Rank Matrices

Authors: Hwichan Kim, Shota Sasaki, Sho Hoshino, Ukyo Honda

Abstract: Low-Rank Adaptation (LoRA) is a widely used Parameter-Efficient Fine-Tuning (PEFT) method that updates an initial weight matrix $W_0$ with a delta matrix $ΔW$ consisting of two low-rank matrices $A$ and $B$. A previous study suggested that there is a correlation between $W_0$ and $ΔW$. In this study, we aim to delve deeper into the relationships between $W_0$ and the low-rank matrices $A$ and $B$ to further comprehend the behavior of LoRA. In particular, we analyze a conversion matrix that transforms $W_0$ into low-rank matrices, which encapsulates information about the relationships. Our analysis reveals that the conversion matrices are similar across layers. Inspired by these findings, we hypothesize that a single linear layer, which takes each layer's $W_0$ as input, can yield task-adapted low-rank matrices. To confirm this hypothesis, we devise a method named Conditionally Parameterized LoRA (CondLoRA) that updates initial weight matrices with low-rank matrices derived from a single linear layer. Our empirical results show that CondLoRA maintains performance on par with LoRA, despite the fact that the trainable parameters of CondLoRA are fewer than those of LoRA. Therefore, we conclude that "a single linear layer yields task-adapted low-rank matrices."

Submitted 22 March, 2024; originally announced March 2024.

Comments: Accepted at LREC-COLING 2024

arXiv:2403.05257 [pdf, other]

Cross-lingual Transfer or Machine Translation? On Data Augmentation for Monolingual Semantic Textual Similarity

Authors: Sho Hoshino, Akihiko Kato, Soichiro Murakami, Peinan Zhang

Abstract: Learning better sentence embeddings leads to improved performance for natural language understanding tasks including semantic textual similarity (STS) and natural language inference (NLI). As prior studies leverage large-scale labeled NLI datasets for fine-tuning masked language models to yield sentence embeddings, task performance for languages other than English is often left behind. In this study, we directly compared two data augmentation techniques as potential solutions for monolingual STS: (a) cross-lingual transfer that exploits English resources alone as training data to yield non-English sentence embeddings as zero-shot inference, and (b) machine translation that converts English data into pseudo non-English training data in advance. In our experiments on monolingual STS in Japanese and Korean, we find that the two data augmentation techniques yield performance on par. Rather, we find a superiority of the Wikipedia domain over the NLI domain for these languages, in contrast to prior studies that focused on NLI as training data. Combining our findings, we demonstrate that the cross-lingual transfer of Wikipedia data exhibits improved performance, and that native Wikipedia data can further improve performance for monolingual STS.

Submitted 8 March, 2024; originally announced March 2024.

Comments: LREC-COLING 2024

arXiv:2306.12719 [pdf, other]

Natural Language Generation for Advertising: A Survey

Authors: Soichiro Murakami, Sho Hoshino, Peinan Zhang

Abstract: Natural language generation methods have emerged as effective tools to help advertisers increase the number of online advertisements they produce. This survey entails a review of the research trends on this topic over the past decade, from template-based to extractive and abstractive approaches using neural networks. Additionally, key challenges and directions revealed through the survey, including metric optimization, faithfulness, diversity, multimodality, and the development of benchmark datasets, are discussed.

Submitted 22 June, 2023; originally announced June 2023.

arXiv:2204.11445 [pdf, other]

Aspect-based Analysis of Advertising Appeals for Search Engine Advertising

Authors: Soichiro Murakami, Peinan Zhang, Sho Hoshino, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura

Abstract: Writing an ad text that attracts people and persuades them to click or act is essential for the success of search engine advertising. Therefore, ad creators must consider various aspects of advertising appeals (A$^3$) such as the price, product features, and quality. However, products and services exhibit unique effective A$^3$ for different industries. In this work, we focus on exploring the effective A$^3$ for different industries with the aim of assisting the ad creation process. To this end, we created a dataset of advertising appeals and used an existing model that detects various aspects for ad texts. Our experiments demonstrated that different industries have their own effective A$^3$ and that the identification of the A$^3$ contributes to the estimation of advertising performance.

Submitted 25 April, 2022; originally announced April 2022.

Comments: Accepted by NAACL-HLT2022 Industry track

arXiv:2203.05153 [pdf, other]

Determining Existence of Logical Obstructions to the Distributed Task Solvability

Authors: Sou Hoshino

Abstract: To study distributed task solvability, Goubault, Ledent, and Rajsbaum devised a model of dynamic epistemic logic that is equivalent to the topological model for distributed computing. In the logical model, the unsolvability of a particular distributed task can be proven by finding a formula, called a logical obstruction. This logical method is very appealing because the concrete formulas that prevent a task from being solved would point to intuitive factors behind the unsolvability. However, it has not been well studied when a logical obstruction exists and how to systematically construct a concrete logical obstruction formula, if any. In addition, it has been proved that there are some tasks that are unsolvable but do not admit logical obstructions. In this paper, we propose a method to prove the non-existence of logical obstructions to the solvability of distributed tasks, based on the technique of simulation. Moreover, we give a method to determine whether a logical obstruction exists or not for a finite protocol and a finite task, and if it exists, to construct a concrete obstruction. Using this method, we demonstrate that the language of the standard epistemic logic, without distributed knowledge, does not admit a logical obstruction to the solvability of $k$-set agreement tasks. We also show that there is no logical obstruction for multi-round immediate snapshot even in the language of epistemic logic with distributed knowledge. In addition, for the know-all model, we provide a concrete obstruction formula that shows the unsolvability of the $k$-set agreement task.

Submitted 9 March, 2022; originally announced March 2022.

Comments: 22 pages, 5 figures

Showing 1–5 of 5 results for author: Hoshino, S