Showing 1–4 of 4 results for author: Yin, E

  1. arXiv:2403.16071

    cs.AI cs.CV cs.MM

    Landmark-Guided Cross-Speaker Lip Reading with Mutual Information Regularization

    Authors: Linzhi Wu, Xingyu Zhang, Yakun Zhang, Changyan Zheng, Tiejun Liu, Liang Xie, Ye Yan, Erwei Yin

    Abstract: Lip reading, the process of interpreting silent speech from visual lip movements, has gained rising attention for its wide range of realistic applications. Deep learning approaches greatly improve current lip reading systems. However, lip reading in cross-speaker scenarios, where the speaker identity changes, poses a challenging problem due to inter-speaker variability. A well-trained lip reading s…

    Submitted 2 May, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

    Comments: To appear in LREC-COLING 2024

    Journal ref: The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

  2. arXiv:2311.02817

    cs.RO

    Safe-VLN: Collision Avoidance for Vision-and-Language Navigation of Autonomous Robots Operating in Continuous Environments

    Authors: Lu Yue, Dongliang Zhou, Liang Xie, Feitian Zhang, Ye Yan, Erwei Yin

    Abstract: The task of vision-and-language navigation in continuous environments (VLN-CE) aims at training an autonomous agent to perform low-level actions to navigate through 3D continuous surroundings using visual observations and language instructions. The significant potential of VLN-CE for mobile robots has been demonstrated across a large number of studies. However, most existing works in VLN-CE focus…

    Submitted 11 April, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

  3. arXiv:2308.12587

    cs.CV

    Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation

    Authors: Yibo Cui, Liang Xie, Yakun Zhang, Meishan Zhang, Ye Yan, Erwei Yin

    Abstract: Cross-modal alignment is one key challenge for Vision-and-Language Navigation (VLN). Most existing studies concentrate on mapping the global instruction or single sub-instruction to the corresponding trajectory. However, another critical problem of achieving fine-grained alignment at the entity level is seldom considered. To address this problem, we propose a novel Grounded Entity-Landmark Adaptiv…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: ICCV 2023 Oral

  4. A Novel Method for Comparative Analysis of DNA Sequences by Ramanujan-Fourier Transform

    Authors: Changchuan Yin, Xuemeng E. Yin, Jiasong Wang

    Abstract: Alignment-free sequence analysis approaches provide important alternatives to multiple sequence alignment (MSA) in biological sequence analysis because alignment-free approaches have low computational complexity and do not depend on a high level of sequence identity. However, most of the existing alignment-free methods do not employ the true full information content of sequences and thus can not acc…

    Submitted 27 June, 2014; v1 submitted 6 March, 2014; originally announced March 2014.

    Comments: Ramanujan-Fourier transform and DNA sequences

    MSC Class: 42A16