Showing 1–1 of 1 results for author: Dong, D T

  1. arXiv:2311.07766  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Vision-Language Integration in Multimodal Video Transformers (Partially) Aligns with the Brain

    Authors: Dota Tianai Dong, Mariya Toneva

    Abstract: Integrating information from multiple modalities is arguably one of the essential prerequisites for grounding artificial intelligence systems with an understanding of the real world. Recent advances in video transformers that jointly learn from vision, text, and sound over time have made some progress toward this goal, but the degree to which these models integrate information from modalities stil…

    Submitted 13 November, 2023; originally announced November 2023.