Showing 1–9 of 9 results for author: Banaei, M

  1. arXiv:2405.17604  [pdf, other]

    cs.LG cs.AI cs.CL

    LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters

    Authors: Klaudia Bałazy, Mohammadreza Banaei, Karl Aberer, Jacek Tabor

    Abstract: The recent trend in scaling language models has led to a growing demand for parameter-efficient tuning (PEFT) methods such as LoRA (Low-Rank Adaptation). LoRA consistently matches or surpasses the full fine-tuning baseline with fewer parameters. However, handling numerous task-specific or user-specific LoRA modules on top of a base model still presents significant storage challenges. To address th…

    Submitted 27 May, 2024; originally announced May 2024.

  2. arXiv:2401.05894  [pdf]

    eess.SY

    A Lightweight Energy Management Method for Hybrid PV/Battery/Load Systems

    Authors: Mohsen Banaei, Razgar Ebrahimy, Henrik Madsen

    Abstract: In this paper, a computationally lightweight algorithm is introduced for hybrid PV/Battery/Load systems that is price responsive, responds fast, does not require powerful hardware, and considers the operational limitations of the system. The method is applied to two buildings equipped with PV and battery. Simulation results show that the method can give results that are up to 3.9% more expensive t…

    Submitted 11 January, 2024; originally announced January 2024.

  3. arXiv:2310.15258  [pdf, other]

    cs.CL

    Breaking the Language Barrier: Improving Cross-Lingual Reasoning with Structured Self-Attention

    Authors: Negar Foroutan, Mohammadreza Banaei, Karl Aberer, Antoine Bosselut

    Abstract: In this work, we study whether multilingual language models (MultiLMs) can transfer logical reasoning abilities to other languages when they are fine-tuned for reasoning in a different language. We evaluate the cross-lingual reasoning abilities of MultiLMs in two schemes: (1) where the language of the context and the question remain the same in the new languages that are tested (i.e., the reasonin…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 - Findings

  4. arXiv:2310.08712  [pdf]

    eess.SY

    Nash Equilibrium of Joint Day-ahead Electricity Markets and Forward Contracts in Congested Power Systems

    Authors: Mohsen Banaei, Majid Oloomi Buygi, Hani Raouf-Sheybani, Razgar Ebrahimy, Henrik Madsen

    Abstract: Uncertainty in the output power of large-scale wind power plants (WPPs) can expose electricity market players to undesirable profit variations. Market players can hedge themselves against these risks by participating in forward contracts markets alongside the day-ahead markets. The participation of market players in these two markets affects their profits and also the prices and power quantiti…

    Submitted 12 October, 2023; originally announced October 2023.

  5. arXiv:2302.04045  [pdf, other]

    cs.CL

    Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models

    Authors: Mohammadreza Banaei, Klaudia Bałazy, Artur Kasymov, Rémi Lebret, Jacek Tabor, Karl Aberer

    Abstract: Recent transformer language models achieve outstanding results in many natural language processing (NLP) tasks. However, their enormous size often makes them impractical on memory-constrained devices, requiring practitioners to compress them to smaller networks. In this paper, we explore offline compression methods, meaning computationally-cheap approaches that do not require further fine-tuning o…

    Submitted 8 February, 2023; originally announced February 2023.

  6. arXiv:2205.12672  [pdf, other]

    cs.CL

    Discovering Language-neutral Sub-networks in Multilingual Language Models

    Authors: Negar Foroutan, Mohammadreza Banaei, Remi Lebret, Antoine Bosselut, Karl Aberer

    Abstract: Multilingual pre-trained language models transfer remarkably well on cross-lingual downstream tasks. However, the extent to which they learn language-neutral representations (i.e., shared representations that encode similar phenomena across languages), and the effect of such representations on cross-lingual transfer performance, remain open questions. In this work, we conceptualize language neutra…

    Submitted 30 October, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

  7. arXiv:2203.16162  [pdf, other]

    cs.LG cs.SI

    AdaGrid: Adaptive Grid Search for Link Prediction Training Objective

    Authors: Tim Poštuvan, Jiaxuan You, Mohammadreza Banaei, Rémi Lebret, Jure Leskovec

    Abstract: One of the most important factors that contribute to the success of a machine learning model is a good training objective. The training objective crucially influences the model's performance and generalization capabilities. This paper specifically focuses on the graph neural network training objective for link prediction, which has not been explored in the existing literature. Here, the training objective…

    Submitted 8 May, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

  8. Direction is what you need: Improving Word Embedding Compression in Large Language Models

    Authors: Klaudia Bałazy, Mohammadreza Banaei, Rémi Lebret, Jacek Tabor, Karl Aberer

    Abstract: The adoption of Transformer-based models in natural language processing (NLP) has led to great success using a massive number of parameters. However, due to deployment constraints in edge devices, there has been a rising interest in the compression of these models to improve their inference time and memory footprint. This paper presents a novel loss objective to compress token embeddings in the Tr…

    Submitted 3 August, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

  9. arXiv:2006.03564  [pdf, other]

    cs.CL cs.LG

    Spoken dialect identification in Twitter using a multi-filter architecture

    Authors: Mohammadreza Banaei, Rémi Lebret, Karl Aberer

    Abstract: This paper presents our approach for SwissText & KONVENS 2020 shared task 2, which is a multi-stage neural model for Swiss German (GSW) identification on Twitter. Our model outputs either GSW or non-GSW and is not meant to be used as a generic language identifier. Our architecture consists of two independent filters where the first one favors recall, and the second one favors precision (bot…

    Submitted 5 June, 2020; originally announced June 2020.