Showing 1–2 of 2 results for author: Fashandi, H

Searching in archive cs.
  1. arXiv:2309.07115  [pdf, other]

    cs.SD cs.CV cs.LG cs.MM eess.AS

    Getting More for Less: Using Weak Labels and AV-Mixup for Robust Audio-Visual Speaker Verification

    Authors: Anith Selvakumar, Homa Fashandi

    Abstract: Distance Metric Learning (DML) has typically dominated the audio-visual speaker verification problem space, owing to strong performance in new and unseen classes. In our work, we explored multitask learning techniques to further enhance DML, and show that an auxiliary task with even weak labels can increase the quality of the learned speaker representation without increasing model complexity durin…

    Submitted 13 June, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: Accepted to INTERSPEECH 2024

  2. arXiv:2106.04413  [pdf, other]

    cs.CV cs.LG

    Stochastic Whitening Batch Normalization

    Authors: Shengdong Zhang, Ehsan Nezhadarya, Homa Fashandi, Jiayi Liu, Darin Graham, Mohak Shah

    Abstract: Batch Normalization (BN) is a popular technique for training Deep Neural Networks (DNNs). BN uses scaling and shifting to normalize activations of mini-batches to accelerate convergence and improve generalization. The recently proposed Iterative Normalization (IterNorm) method improves these properties by whitening the activations iteratively using Newton's method. However, since Newton's method i…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: Accepted to the Main Conference of CVPR 2021