Showing 1–50 of 56 results for author: Kaneko, M

  1. arXiv:2407.03129  [pdf, other]

    cs.CL

    Social Bias Evaluation for Large Language Models Requires Prompt Variations

    Authors: Rem Hida, Masahiro Kaneko, Naoaki Okazaki

    Abstract: Warning: This paper contains examples of stereotypes and biases. Large Language Models (LLMs) exhibit considerable social biases, and various studies have tried to evaluate and mitigate these biases accurately. Previous studies use downstream tasks as prompts to examine the degree of social biases for evaluation and mitigation. While LLMs' output highly depends on prompts, previous studies evaluat…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2404.11262  [pdf, other]

    cs.CL

    Sampling-based Pseudo-Likelihood for Membership Inference Attacks

    Authors: Masahiro Kaneko, Youmi Ma, Yuki Wata, Naoaki Okazaki

    Abstract: Large Language Models (LLMs) are trained on large-scale web data, which makes it difficult to grasp the contribution of each text. This poses the risk of leaking inappropriate data such as benchmarks, personal information, and copyrighted texts in the training data. Membership Inference Attacks (MIA), which determine whether a given text is included in the model's training data, have been attracti…

    Submitted 17 April, 2024; originally announced April 2024.

  3. arXiv:2403.16139  [pdf, other]

    cs.CL

    A Little Leak Will Sink a Great Ship: Survey of Transparency for Large Language Models from Start to Finish

    Authors: Masahiro Kaneko, Timothy Baldwin

    Abstract: Large Language Models (LLMs) are trained on massive web-crawled corpora. This poses risks of leakage, including personal information, copyrighted texts, and benchmark datasets. Such leakage leads to undermining human trust in AI due to potential unauthorized generation of content or overestimation of performance. We establish the following three criteria concerning the leakage issues: (1) leakage…

    Submitted 24 March, 2024; originally announced March 2024.

  4. arXiv:2403.14159  [pdf, other]

    cs.RO math.OC

    Robust Locomotion via Zero-order Stochastic Nonlinear Model Predictive Control with Guard Saltation Matrix

    Authors: Sotaro Katayama, Noriaki Takasugi, Mitsuhisa Kaneko, Norio Nagatsuka, Masaya Kinoshita

    Abstract: This paper presents a stochastic/robust nonlinear model predictive control (NMPC) to enhance the robustness of legged locomotion against contact uncertainties. We integrate the contact uncertainties into the covariance propagation of the stochastic/robust NMPC framework by leveraging the guard saltation matrix and an extended Kalman filter-like covariance update. We achieve fast stochastic/robust NMPC…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 8 pages, 8 figures

  5. arXiv:2402.15987  [pdf, other]

    cs.CL cs.AI

    Likelihood-based Mitigation of Evaluation Bias in Large Language Models

    Authors: Masanari Ohi, Masahiro Kaneko, Ryuto Koike, Mengsay Loem, Naoaki Okazaki

    Abstract: Large Language Models (LLMs) are widely used to evaluate natural language generation tasks as automated metrics. However, the likelihood, a measure of LLM's plausibility for a sentence, can vary due to superficial differences in sentences, such as word order and sentence structure. It is therefore possible that there might be a likelihood bias if LLMs are used for evaluation: they might overrate s…

    Submitted 1 March, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

    Comments: 4 main pages

  6. arXiv:2402.14258  [pdf, other]

    cs.CL

    Eagle: Ethical Dataset Given from Real Interactions

    Authors: Masahiro Kaneko, Danushka Bollegala, Timothy Baldwin

    Abstract: Recent studies have demonstrated that large language models (LLMs) have ethical-related problems such as social biases, lack of moral reasoning, and generation of offensive content. The existing evaluation metrics and methods to address these ethical challenges use datasets intentionally created by instructing humans to create instances including ethical problems. Therefore, the data does not refl…

    Submitted 21 February, 2024; originally announced February 2024.

  7. arXiv:2401.15585  [pdf, other]

    cs.CL

    Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting

    Authors: Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki, Timothy Baldwin

    Abstract: There exist both scalable tasks, like reading comprehension and fact-checking, where model performance improves with model size, and unscalable tasks, like arithmetic reasoning and symbolic reasoning, where model performance does not necessarily improve with model size. Large language models (LLMs) equipped with Chain-of-Thought (CoT) prompting are able to make accurate incremental predictions eve…

    Submitted 28 January, 2024; originally announced January 2024.

  8. arXiv:2401.08511  [pdf, other]

    cs.CL

    The Gaps between Pre-train and Downstream Settings in Bias Evaluation and Debiasing

    Authors: Masahiro Kaneko, Danushka Bollegala, Timothy Baldwin

    Abstract: The output tendencies of Pre-trained Language Models (PLM) vary markedly before and after Fine-Tuning (FT) due to the updates to the model parameters. These divergences in output tendencies result in a gap in the social biases of PLMs. For example, there exists a low correlation between intrinsic bias scores of a PLM and its extrinsic bias scores under FT-based debiasing methods. Additionally, appl…

    Submitted 16 January, 2024; originally announced January 2024.

  9. arXiv:2312.08668  [pdf, other]

    cs.RO

    Versatile Telescopic-Wheeled-Legged Locomotion of Tachyon 3 via Full-Centroidal Nonlinear Model Predictive Control

    Authors: Sotaro Katayama, Noriaki Takasugi, Mitsuhisa Kaneko, Masaya Kinoshita

    Abstract: This paper presents a nonlinear model predictive control (NMPC) toward versatile motion generation for the telescopic-wheeled-legged robot Tachyon 3, the unique hardware structure of which poses challenges in control and motion planning. We apply the full-centroidal NMPC formulation with dedicated constraints that can capture the accurate kinematics and dynamics of Tachyon 3. We have developed a c…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: 8 pages, 9 figures

  10. arXiv:2311.08369  [pdf, other]

    cs.CL

    How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection

    Authors: Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki

    Abstract: To combat the misuse of Large Language Models (LLMs), many recent studies have presented LLM-generated-text detectors with promising performance. When users instruct LLMs to generate texts, the instruction can include different constraints depending on the user's need. However, most recent studies do not cover such diverse instruction patterns when creating datasets for LLM detection. In this pape…

    Submitted 12 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: under review

  11. arXiv:2311.08107  [pdf, other]

    cs.CL

    SAIE Framework: Support Alone Isn't Enough -- Advancing LLM Training with Adversarial Remarks

    Authors: Mengsay Loem, Masahiro Kaneko, Naoaki Okazaki

    Abstract: Large Language Models (LLMs) can justify or critique their predictions through discussions with other models or humans, thereby enriching their intrinsic understanding of instances. While proactive discussions in the inference phase have been shown to boost performance, such interactions have not been extensively explored during the training phase. We hypothesize that incorporating interactive dis…

    Submitted 29 February, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

  12. arXiv:2309.11439  [pdf, other]

    cs.CL

    Controlled Generation with Prompt Insertion for Natural Language Explanations in Grammatical Error Correction

    Authors: Masahiro Kaneko, Naoaki Okazaki

    Abstract: In Grammatical Error Correction (GEC), it is crucial to ensure the user's comprehension of a reason for correction. Existing studies present tokens, examples, and hints as to the basis for correction but do not directly explain the reasons for corrections. Although methods that use Large Language Models (LLMs) to provide direct explanations in natural language have been proposed for various tasks,…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: Work in progress

  13. arXiv:2309.09697  [pdf, other]

    cs.CL

    Evaluating Gender Bias of Pre-trained Language Models in Natural Language Inference by Considering All Labels

    Authors: Panatchakorn Anantaprayoon, Masahiro Kaneko, Naoaki Okazaki

    Abstract: Discriminatory gender biases have been found in Pre-trained Language Models (PLMs) for multiple languages. In Natural Language Inference (NLI), existing bias evaluation methods have focused on the prediction results of one specific label out of three labels, such as neutral. However, such evaluation methods can be inaccurate since unique biased inferences are associated with unique prediction labe…

    Submitted 18 May, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: LREC-COLING 2024

  14. arXiv:2309.09092  [pdf, other]

    cs.CL

    The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated

    Authors: Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki

    Abstract: Pre-trained language models trained on large-scale data have learned serious levels of social biases. Consequently, various methods have been proposed to debias pre-trained models. Debiasing methods need to mitigate only discriminatory bias information from the pre-trained models, while retaining information that is useful for the downstream tasks. In previous research, whether useful information…

    Submitted 16 September, 2023; originally announced September 2023.

    Comments: IJCNLP-AACL 2023

  15. arXiv:2309.07251  [pdf, other]

    cs.CL

    In-Contextual Gender Bias Suppression for Large Language Models

    Authors: Daisuke Oba, Masahiro Kaneko, Danushka Bollegala

    Abstract: Despite their impressive performance in a wide range of NLP tasks, Large Language Models (LLMs) have been reported to encode worrying levels of gender biases. Prior work has proposed debiasing methods that require human labelled examples, data augmentation and fine-tuning of LLMs, which are computationally costly. Moreover, one might not even have access to the model parameters for performing debi…

    Submitted 20 February, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: EACL 2024 Findings - Long Paper

  16. arXiv:2307.11729  [pdf, other]

    cs.CL

    OUTFOX: LLM-Generated Essay Detection Through In-Context Learning with Adversarially Generated Examples

    Authors: Ryuto Koike, Masahiro Kaneko, Naoaki Okazaki

    Abstract: Large Language Models (LLMs) have achieved human-level fluency in text generation, making it difficult to distinguish between human-written and LLM-generated texts. This poses a growing risk of misuse of LLMs and demands the development of detectors to identify LLM-generated texts. However, existing detectors lack robustness against attacks: they degrade detection accuracy by simply paraphrasing L…

    Submitted 18 February, 2024; v1 submitted 21 July, 2023; originally announced July 2023.

    Comments: AAAI 2024 camera ready. Code and dataset available at https://github.com/ryuryukke/OUTFOX

  17. arXiv:2305.18156  [pdf, other]

    cs.CL cs.AI

    Exploring Effectiveness of GPT-3 in Grammatical Error Correction: A Study on Performance and Controllability in Prompt-Based Methods

    Authors: Mengsay Loem, Masahiro Kaneko, Sho Takase, Naoaki Okazaki

    Abstract: Large-scale pre-trained language models such as GPT-3 have shown remarkable performance across various natural language processing tasks. However, applying prompt-based methods with GPT-3 for Grammatical Error Correction (GEC) tasks and their controllability remains underexplored. Controllability in GEC is crucial for real-world applications, particularly in educational settings, where the ability…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: Accepted in BEA 2023

  18. arXiv:2305.11862  [pdf, other]

    cs.CL

    Reducing Sequence Length by Predicting Edit Operations with Large Language Models

    Authors: Masahiro Kaneko, Naoaki Okazaki

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance in various tasks and gained significant attention. LLMs are also used for local sequence transduction tasks, including grammatical error correction (GEC) and formality style transfer, where most tokens in a source text are kept unchanged. However, the models that generate all target tokens in such tasks have a tendency to simply…

    Submitted 20 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: EMNLP2023

  19. arXiv:2305.11789  [pdf, other]

    cs.CL

    Solving NLP Problems through Human-System Collaboration: A Discussion-based Approach

    Authors: Masahiro Kaneko, Graham Neubig, Naoaki Okazaki

    Abstract: Humans work together to solve common problems by having discussions, explaining, and agreeing or disagreeing with each other. Similarly, if a system can have discussions with humans when solving tasks, it can improve the system's performance and reliability. In previous research on explainability, it has only been possible for the system to make predictions and for humans to ask questions about th…

    Submitted 30 January, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: EACL2024 Findings

  20. arXiv:2301.12074  [pdf, other]

    cs.CL

    Comparing Intrinsic Gender Bias Evaluation Measures without using Human Annotated Examples

    Authors: Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki

    Abstract: Numerous types of social biases have been identified in pre-trained language models (PLMs), and various intrinsic bias evaluation measures have been proposed for quantifying those social biases. Prior works have relied on human annotated examples to compare existing intrinsic bias evaluation measures. However, this approach is not easily adaptable to different languages nor amenable to large scale…

    Submitted 27 January, 2023; originally announced January 2023.

    Comments: EACL 2023

  21. arXiv:2210.02938  [pdf, other]

    cs.CL

    Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks

    Authors: Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki

    Abstract: We study the relationship between task-agnostic intrinsic and task-specific extrinsic social bias evaluation measures for Masked Language Models (MLMs), and find that there exists only a weak correlation between these two types of evaluation measures. Moreover, we find that MLMs debiased using different methods still re-learn social biases during fine-tuning on downstream tasks. We identify the so…

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: COLING 2022

  22. arXiv:2207.13354  [pdf, other]

    cs.CL

    Are Neighbors Enough? Multi-Head Neural n-gram can be Alternative to Self-attention

    Authors: Mengsay Loem, Sho Takase, Masahiro Kaneko, Naoaki Okazaki

    Abstract: Impressive performance of Transformer has been attributed to self-attention, where dependencies between entire input in a sequence are considered at every position. In this work, we reform the neural $n$-gram model, which focuses on only several surrounding representations of each position, with the multi-head mechanism as in Vaswani et al. (2017). Through experiments on sequence-to-sequence tasks,…

    Submitted 27 July, 2022; originally announced July 2022.

  23. arXiv:2205.09867  [pdf, other]

    cs.CL

    Gender Bias in Meta-Embeddings

    Authors: Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki

    Abstract: Different methods have been proposed to develop meta-embeddings from a given set of source embeddings. However, the source embeddings can contain unfair gender-related biases, and how these influence the meta-embeddings has not been studied yet. We study the gender bias in meta-embeddings created under three different settings: (1) meta-embedding multiple sources without performing any debiasing (…

    Submitted 6 October, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: Findings of EMNLP 2022

  24. arXiv:2205.00551  [pdf, other]

    cs.CL

    Gender Bias in Masked Language Models for Multiple Languages

    Authors: Masahiro Kaneko, Aizhan Imankulova, Danushka Bollegala, Naoaki Okazaki

    Abstract: Masked Language Models (MLMs) pre-trained by predicting masked tokens on large corpora have been used successfully in natural language processing tasks for a variety of languages. Unfortunately, it was reported that MLMs also learn discriminative biases regarding attributes such as gender and race. Because most studies have focused on MLMs in English, the bias of MLMs in other languages has rarely…

    Submitted 4 May, 2022; v1 submitted 1 May, 2022; originally announced May 2022.

    Comments: NAACL 2022

  25. arXiv:2203.07523  [pdf, other]

    cs.CL

    Sense Embeddings are also Biased--Evaluating Social Biases in Static and Contextualised Sense Embeddings

    Authors: Yi Zhou, Masahiro Kaneko, Danushka Bollegala

    Abstract: Sense embedding learning methods learn different embeddings for the different senses of an ambiguous word. One sense of an ambiguous word might be socially biased while its other senses remain unbiased. In comparison to the numerous prior works evaluating the social biases in pretrained word embeddings, the biases in sense embeddings have been relatively understudied. We create a benchmark dataset…

    Submitted 16 March, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: Accepted to ACL 2022

  26. arXiv:2203.07085  [pdf, other]

    cs.CL

    Interpretability for Language Learners Using Example-Based Grammatical Error Correction

    Authors: Masahiro Kaneko, Sho Takase, Ayana Niwa, Naoaki Okazaki

    Abstract: Grammatical Error Correction (GEC) should not focus only on high accuracy of corrections but also on interpretability for language learning. However, existing neural-based GEC models mainly aim at improving accuracy, and their interpretability has not been explored. A promising approach for improving interpretability is an example-based method, which uses similar retrieved examples to generate cor…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: ACL 2022

  27. arXiv:2201.06199  [pdf, other]

    cs.CL

    Proficiency Matters Quality Estimation in Grammatical Error Correction

    Authors: Yujin Takahashi, Masahiro Kaneko, Masato Mita, Mamoru Komachi

    Abstract: This study investigates how supervised quality estimation (QE) models of grammatical error correction (GEC) are affected by the learners' proficiency with the data. QE models for GEC evaluations in prior work have obtained a high correlation with manual evaluations. However, when functioning in a real-world context, the data used for the reported results have limitations because prior works were b…

    Submitted 16 January, 2022; originally announced January 2022.

    Comments: 6 pages (4 pages + references)

  28. arXiv:2201.05313  [pdf, other]

    cs.CL

    ExtraPhrase: Efficient Data Augmentation for Abstractive Summarization

    Authors: Mengsay Loem, Sho Takase, Masahiro Kaneko, Naoaki Okazaki

    Abstract: Neural models trained with large amounts of parallel data have achieved impressive performance in abstractive summarization tasks. However, large-scale parallel corpora are expensive and challenging to construct. In this work, we introduce a low-cost and effective strategy, ExtraPhrase, to augment training data for abstractive summarization tasks. ExtraPhrase constructs pseudo training data in two…

    Submitted 14 January, 2022; originally announced January 2022.

  29. arXiv:2104.08478  [pdf, other]

    cs.CL

    Sentence Concatenation Approach to Data Augmentation for Neural Machine Translation

    Authors: Seiichiro Kondo, Kengo Hotate, Masahiro Kaneko, Mamoru Komachi

    Abstract: Neural machine translation (NMT) has recently gained widespread attention because of its high translation accuracy. However, it shows poor performance in the translation of long sentences, which is a major issue in low-resource languages. It is assumed that this issue is caused by an insufficient number of long sentences in the training data. Therefore, this study proposes a simple data augmentation…

    Submitted 17 April, 2021; originally announced April 2021.

    Comments: 7 pages; camera-ready for NAACL Student Research Workshop 2021

  30. arXiv:2104.07848  [pdf, other]

    cs.CL

    Comparison of Grammatical Error Correction Using Back-Translation Models

    Authors: Aomi Koyama, Kengo Hotate, Masahiro Kaneko, Mamoru Komachi

    Abstract: Grammatical error correction (GEC) suffers from a lack of sufficient parallel data. Therefore, GEC studies have developed various methods to generate pseudo data, which comprise pairs of grammatical and artificially produced ungrammatical sentences. Currently, a mainstream approach to generate pseudo data is back-translation (BT). Most previous GEC studies using BT have employed the same architect…

    Submitted 15 April, 2021; originally announced April 2021.

    Comments: 10 pages; camera-ready for NAACL Student Research Workshop 2021

  31. arXiv:2104.07496  [pdf, other]

    cs.CL

    Unmasking the Mask -- Evaluating Social Biases in Masked Language Models

    Authors: Masahiro Kaneko, Danushka Bollegala

    Abstract: Masked Language Models (MLMs) have shown superior performances in numerous downstream NLP tasks when used as text encoders. Unfortunately, MLMs also demonstrate significantly worrying levels of social biases. We show that the previously proposed evaluation metrics for quantifying the social biases in MLMs are problematic due to the following reasons: (1) prediction accuracy of the masked tokens itself…

    Submitted 15 April, 2021; originally announced April 2021.

  32. arXiv:2104.07410  [pdf, other]

    cs.CL

    Simultaneous Multi-Pivot Neural Machine Translation

    Authors: Raj Dabre, Aizhan Imankulova, Masahiro Kaneko, Abhisek Chakrabarty

    Abstract: Parallel corpora are indispensable for training neural machine translation (NMT) models, and parallel corpora for most language pairs do not exist or are scarce. In such cases, pivot language NMT can be helpful where a pivot language is used such that there exist parallel corpora between the source and pivot and pivot and target languages. Naturally, the quality of pivot language translation is mo…

    Submitted 15 April, 2021; originally announced April 2021.

    Comments: preliminary work. pardon the messy writing and mistakes. will be submitted to emnlp after major overhaul

  33. arXiv:2101.09525  [pdf, other]

    cs.CL

    Dictionary-based Debiasing of Pre-trained Word Embeddings

    Authors: Masahiro Kaneko, Danushka Bollegala

    Abstract: Word embeddings trained on large corpora have been shown to encode high levels of unfair discriminatory gender, racial, religious and ethnic biases. In contrast, human-written dictionaries describe the meanings of words in a concise, objective and unbiased manner. We propose a method for debiasing pre-trained word embeddings using dictionaries, without requiring access to the original training r…

    Submitted 23 January, 2021; originally announced January 2021.

    Comments: EACL 2021

  34. arXiv:2101.09523  [pdf, other]

    cs.CL

    Debiasing Pre-trained Contextualised Embeddings

    Authors: Masahiro Kaneko, Danushka Bollegala

    Abstract: In comparison to the numerous debiasing methods proposed for the static non-contextualised word embeddings, the discriminative biases in contextualised embeddings have received relatively little attention. We propose a fine-tuning method that can be applied at token- or sentence-levels to debias pre-trained contextualised embeddings. Our proposed method can be applied to any pre-trained contextual…

    Submitted 23 January, 2021; originally announced January 2021.

    Comments: EACL 2021

  35. arXiv:2010.13094  [pdf, other]

    cs.CL

    Autoencoding Improves Pre-trained Word Embeddings

    Authors: Masahiro Kaneko, Danushka Bollegala

    Abstract: Prior work investigating the geometry of pre-trained word embeddings has shown that word embeddings are distributed in a narrow cone, and that by centering and projecting using principal component vectors one can increase the accuracy of a given set of pre-trained word embeddings. However, theoretically, this post-processing step is equivalent to applying a linear autoencoder to minimise the squared…

    Submitted 27 October, 2020; v1 submitted 25 October, 2020; originally announced October 2020.

    Comments: COLING 2020

  36. arXiv:2010.03155  [pdf, other]

    cs.CL

    A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction

    Authors: Masato Mita, Shun Kiyono, Masahiro Kaneko, Jun Suzuki, Kentaro Inui

    Abstract: Existing approaches for grammatical error correction (GEC) largely rely on supervised learning with manually created GEC datasets. However, there has been little focus on verifying and ensuring the quality of the datasets, and on how lower-quality data might affect GEC performance. We indeed found that there is a non-negligible amount of "noise" where errors were inappropriately edited or left unc…

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: accepted by EMNLP 2020 (Findings)

  37. arXiv:2009.12275  [pdf, other]

    cs.NI

    Energy Efficient Resource Allocation Optimization in Fog Radio Access Networks with Outdated Channel Knowledge

    Authors: Thi Ha Ly Dinh, Megumi Kaneko, Ellen Hidemi Fukuda, Lila Boukhatem

    Abstract: Fog Radio Access Networks (F-RAN) are gaining worldwide interest for enabling mobile edge computing for Beyond 5G. However, to realize the future real-time and delay-sensitive applications, F-RAN tailored radio resource allocation and interference management become necessary. This work investigates user association and beamforming issues for providing energy efficient F-RANs. We formulate the ene…

    Submitted 25 September, 2020; originally announced September 2020.

  38. arXiv:2005.00987  [pdf, other]

    cs.CL

    Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction

    Authors: Masahiro Kaneko, Masato Mita, Shun Kiyono, Jun Suzuki, Kentaro Inui

    Abstract: This paper investigates how to effectively incorporate a pre-trained masked language model (MLM), such as BERT, into an encoder-decoder (EncDec) model for grammatical error correction (GEC). The answer to this question is not as straightforward as one might expect because the previous common methods for incorporating an MLM into an EncDec model have potential drawbacks when applied to GEC. For exam…

    Submitted 31 May, 2020; v1 submitted 3 May, 2020; originally announced May 2020.

    Comments: Accepted as a short paper to the 58th Annual Conference of the Association for Computational Linguistics (ACL-2020)

    Journal ref: Association for Computational Linguistics (ACL-2020)

  39. arXiv:2004.03180  [pdf, other]

    cs.CL

    Towards Multimodal Simultaneous Neural Machine Translation

    Authors: Aizhan Imankulova, Masahiro Kaneko, Tosho Hirasawa, Mamoru Komachi

    Abstract: Simultaneous translation involves translating a sentence before the speaker's utterance is completed in order to realize real-time understanding in multiple languages. This task is significantly more challenging than the general full sentence translation because of the shortage of input information during decoding. To alleviate this shortage, we propose multimodal simultaneous neural machine trans…

    Submitted 23 October, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: 10 pages; WMT 2020

  40. arXiv:1908.11313  [pdf, ps, other]

    cs.NI cs.IT eess.SP math.OC

    Power and Beam Optimization for Uplink Millimeter-Wave Hotspot Communication Systems

    Authors: Rafail Ismayilov, Bernd Holfeld, Renato L. G. Cavalcante, Megumi Kaneko

    Abstract: We propose an effective interference management and beamforming mechanism for uplink communication systems that yields fair allocation of rates. In particular, we consider a hotspot area of a millimeter-wave (mmWave) access network consisting of multiple user equipment (UE) in the uplink and multiple access points (APs) with directional antennas and adjustable beam widths and directions (beam conf…

    Submitted 9 August, 2021; v1 submitted 29 August, 2019; originally announced August 2019.

  41. arXiv:1906.00742  [pdf, other]

    cs.CL cs.LG

    Gender-preserving Debiasing for Pre-trained Word Embeddings

    Authors: Masahiro Kaneko, Danushka Bollegala

    Abstract: Word embeddings learnt from massive text collections have demonstrated significant levels of discriminative biases such as gender, racial or ethnic biases, which in turn bias the down-stream NLP applications that use those word embeddings. Taking gender-bias as a working example, we propose a debiasing method that preserves non-discriminative gender-related information, while removing stereotypica…

    Submitted 3 June, 2019; originally announced June 2019.

    Comments: Accepted as a long paper to the 57th Annual Conference of the Association for Computational Linguistics (ACL-2019)

    Journal ref: Association for Computational Linguistics (ACL-2019)

  42. arXiv:1905.01312  [pdf, other]

    cs.CV

    TriDepth: Triangular Patch-based Deep Depth Prediction

    Authors: Masaya Kaneko, Ken Sakurada, Kiyoharu Aizawa

    Abstract: We propose a novel and efficient representation for single-view depth estimation using Convolutional Neural Networks (CNNs). Point-cloud is generally used for CNN-based 3D scene reconstruction; however, it has some drawbacks: (1) it is redundant as a representation for planar surfaces, and (2) no spatial relationships between points are available (e.g., texture and surface). As a more efficient repr…

    Submitted 11 March, 2020; v1 submitted 3 May, 2019; originally announced May 2019.

    Comments: Project webpage: https://meshdepth.github.io/

  43. arXiv:1904.11303  [pdf, ps, other]

    cs.NI eess.SP

    Joint Allocation Strategies of Power and Spreading Factors with Imperfect Orthogonality in LoRa Networks

    Authors: Licia Amichi, Megumi Kaneko, Ellen Hidemi Fukuda, Nancy El Rachkidy, Alexandre Guitton

    Abstract: The LoRa physical layer is one of the most promising Low Power Wide-Area Network (LPWAN) technologies for future Internet of Things (IoT) applications. It provides a flexible adaptation of coverage and data rate by allocating different Spreading Factors (SFs) and transmit powers to end-devices. We focus on improving throughput fairness while reducing energy consumption. Whereas most existing metho…

    Submitted 25 April, 2019; originally announced April 2019.

    Comments: 30 pages

  44. arXiv:1904.07334  [pdf, other]

    cs.CL

    Multi-Head Multi-Layer Attention to Deep Language Representations for Grammatical Error Detection

    Authors: Masahiro Kaneko, Mamoru Komachi

    Abstract: It is known that a deep neural network model pre-trained with large-scale data greatly improves the accuracy of various tasks, especially when there are resource constraints. However, the information needed to solve a given task can vary, and simply using the output of the final layer is not necessarily sufficient. Moreover, to our knowledge, exploiting large language representation models to dete…

    Submitted 15 April, 2019; originally announced April 2019.

    Comments: 12 pages; CICLing 2019

  45. arXiv:1904.02927  [pdf, other]

    cs.CL

    Cross-Corpora Evaluation and Analysis of Grammatical Error Correction Models --- Is Single-Corpus Evaluation Enough?

    Authors: Masato Mita, Tomoya Mizumoto, Masahiro Kaneko, Ryo Nagata, Kentaro Inui

    Abstract: This study explores the necessity of performing cross-corpora evaluation for grammatical error correction (GEC) models. GEC models have been previously evaluated based on a single commonly applied corpus: the CoNLL-2014 benchmark. However, the evaluation remains incomplete because the task difficulty varies depending on the test corpus and conditions such as the proficiency levels of the writers a…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: accepted by NAACL-HLT 2019

  46. arXiv:1902.10388  [pdf, ps, other]

    cs.NI

    Interference Management in NOMA-based Fog-Radio Access Networks via Joint Scheduling and Power Adaptation

    Authors: Itsikiantsoa Randrianantenaina, Megumi Kaneko, Hayssam Dahrouj, Hesham ElSawy, Mohamed-Slim Alouini

    Abstract: Non-Orthogonal Multiple Access (NOMA) and Fog Radio Access Networks (FRAN) are promising candidates within the 5G and beyond systems. This work examines the benefit of adopting NOMA in an FRAN architecture with constrained capacity fronthaul. The paper proposes methods for optimizing joint scheduling and power adaptation in the downlink of a NOMA-based FRAN with multiple resource blocks (RB). We c…

    Submitted 27 February, 2019; originally announced February 2019.

  47. arXiv:1811.03752  [pdf, other]

    cs.SE cs.LG

    DeepSaucer: Unified Environment for Verifying Deep Neural Networks

    Authors: Naoto Sato, Hironobu Kuruma, Masanori Kaneko, Yuichiroh Nakagawa, Hideto Ogawa, Thai Son Hoang, Michael Butler

    Abstract: In recent years, a number of methods for verifying DNNs have been developed. Because the approaches of the methods differ and have their own limitations, we think that a number of verification methods should be applied to a developed DNN. To apply a number of methods to the DNN, it is necessary to translate either the implementation of the DNN or the verification method so that one runs in the sam…

    Submitted 8 November, 2018; originally announced November 2018.

  48. arXiv:1809.09470  [pdf, ps, other]

    cs.NI

    SS5G: Collision Resolution Protocol for Delay and Energy Efficient LoRa Networks

    Authors: Nancy El Rachkidy, Alexandre Guitton, Megumi Kaneko

    Abstract: Future 5G and Internet of Things (IoT) applications will heavily rely on long-range communication technologies such as low-power wide-area networks (LPWANs). In particular, LoRaWAN, built on the LoRa physical layer, is attracting increasing interest from both academia and industry for enabling low-cost energy efficient IoT wireless sensor networks for, e.g., environmental monitoring over wide ar…

    Submitted 21 September, 2018; originally announced September 2018.

    Comments: arXiv admin note: text overlap with arXiv:1804.00503

  49. arXiv:1805.12361  [pdf, other]

    cs.NI

    Topology Control for Energy-Efficient Localization in Mobile Underwater Sensor Networks using Stackelberg Game

    Authors: Yali Yuan, Chencheng Liang, Megumi Kaneko, Xu Chen, Dieter Hogrefe

    Abstract: The characteristics of mobile Underwater Sensor Networks (UWSNs), such as low communication bandwidth, large propagation delay, and sparse deployment, pose challenging issues for successful localization of sensor nodes. In addition, sensor nodes in UWSNs are usually powered by batteries whose replacements introduce high cost and complexity. Thus, the critical problem in UWSNs is to enable each sen…

    Submitted 31 May, 2018; originally announced May 2018.

  50. arXiv:1804.00503  [pdf, ps, other]

    cs.NI cs.PF

    Decoding Superposed LoRa Signals

    Authors: Nancy El Rachkidy, Alexandre Guitton, Megumi Kaneko

    Abstract: Long-range low-power wireless communications, such as LoRa, are used in many IoT and environmental monitoring applications. They typically increase the communication range to several kilometers, at the cost of reducing the bitrate to a few bits per second. Collisions further reduce the performance of these communications. In this paper, we propose two algorithms to decode colliding signals: one a…

    Submitted 21 March, 2018; originally announced April 2018.