Skip to main content

Showing 1–1 of 1 results for author: Abdullaeva, I

Searching in archive cs. Search in all archives.
.
  1. arXiv:2404.06212  [pdf, other

    cs.CV cs.AI cs.LG

    OmniFusion Technical Report

    Authors: Elizaveta Goncharova, Anton Razzhigaev, Matvey Mikhalchuk, Maxim Kurkin, Irina Abdullaeva, Matvey Skripkin, Ivan Oseledets, Denis Dimitrov, Andrey Kuznetsov

    Abstract: Last year, multimodal architectures served up a revolution in AI-based approaches and solutions, extending the capabilities of large language models (LLM). We propose an \textit{OmniFusion} model based on a pretrained LLM and adapters for visual modality. We evaluated and compared several architecture design principles for better text and visual data coupling: MLP and transformer adapters, various… ▽ More

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 17 pages, 4 figures, 9 tables, 2 appendices

    MSC Class: 6804; 68T50 (Primary) ACM Class: I.2.7; I.2.10; I.4.9