Showing 1–8 of 8 results for author: Mall, U

  1. arXiv:2406.00955

    cs.CV

    How Video Meetings Change Your Expression

    Authors: Sumit Sarin, Utkarsh Mall, Purva Tendulkar, Carl Vondrick

    Abstract: Do our facial expressions change when we speak over video calls? Given two unpaired sets of videos of people, we seek to automatically find spatio-temporal patterns that are distinctive of each set. Existing methods use discriminative approaches and perform post-hoc explainability analysis. Such methods are insufficient as they are unable to provide insights beyond obvious dataset biases, and the…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: Project webpage is available at: https://facet.cs.columbia.edu

  2. arXiv:2404.09941

    cs.CV cs.AI

    Evolving Interpretable Visual Classifiers with Large Language Models

    Authors: Mia Chiquier, Utkarsh Mall, Carl Vondrick

    Abstract: Multimodal pre-trained models, such as CLIP, are popular for zero-shot classification due to their open-vocabulary flexibility and high performance. However, vision-language models, which compute similarity scores between images and class labels, are largely black-box, with limited interpretability, risk for bias, and inability to discover new visual concepts not written down. Moreover, in practic…

    Submitted 15 April, 2024; originally announced April 2024.

  3. arXiv:2312.06960

    cs.CV cs.LG

    Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment

    Authors: Utkarsh Mall, Cheng Perng Phoo, Meilin Kelsey Liu, Carl Vondrick, Bharath Hariharan, Kavita Bala

    Abstract: We introduce a method to train vision-language models for remote-sensing images without using any textual annotations. Our key insight is to use co-located internet imagery taken on the ground as an intermediary for connecting remote-sensing images and language. Specifically, we train an image encoder for remote sensing images to align with the image encoder of CLIP using a large amount of paired…

    Submitted 11 December, 2023; originally announced December 2023.

  4. arXiv:2108.10967

    cs.CV

    Field-Guide-Inspired Zero-Shot Learning

    Authors: Utkarsh Mall, Bharath Hariharan, Kavita Bala

    Abstract: Modern recognition systems require large amounts of supervision to achieve accuracy. Adapting to new domains requires significant data from experts, which is onerous and can become too expensive. Zero-shot learning requires an annotated set of attributes for a novel category. Annotating the full set of attributes for a novel category proves to be a tedious and expensive task in deployment. This is…

    Submitted 24 August, 2021; originally announced August 2021.

    Comments: Accepted to ICCV 2021

  5. arXiv:2103.17070

    cs.CV

    PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering

    Authors: Jang Hyun Cho, Utkarsh Mall, Kavita Bala, Bharath Hariharan

    Abstract: We present a new framework for semantic segmentation without annotations via clustering. Off-the-shelf clustering methods are limited to curated, single-label, and object-centric images yet real-world data are dominantly uncurated, multi-label, and scene-centric. We extend clustering from images to pixels and assign separate cluster membership to different instances within each image. However, sol…

    Submitted 29 March, 2021; originally announced March 2021.

    Comments: CVPR 2021

  6. arXiv:2012.02897

    cs.CV

    Discovering Underground Maps from Fashion

    Authors: Utkarsh Mall, Kavita Bala, Tamara Berg, Kristen Grauman

    Abstract: The fashion sense -- meaning the clothing styles people wear -- in a geographical region can reveal information about that region. For example, it can reflect the kind of activities people do there, or the type of crowds that frequently visit the region (e.g., tourist hot spot, student neighborhood, business center). We propose a method to automatically create underground neighborhood maps of citi…

    Submitted 4 December, 2020; originally announced December 2020.

  7. arXiv:1908.11412

    cs.CV

    GeoStyle: Discovering Fashion Trends and Events

    Authors: Utkarsh Mall, Kevin Matzen, Bharath Hariharan, Noah Snavely, Kavita Bala

    Abstract: Understanding fashion styles and trends is of great potential interest to retailers and consumers alike. The photos people upload to social media are a historical and public data source of how people dress across the world and at different times. While we now have tools to automatically recognize the clothing and style attributes of what people are wearing in these photographs, we lack the ability…

    Submitted 29 August, 2019; originally announced August 2019.

    Comments: Accepted to ICCV 2019

  8. arXiv:1712.03380

    cs.GR cs.CV

    A Deep Recurrent Framework for Cleaning Motion Capture Data

    Authors: Utkarsh Mall, G. Roshan Lal, Siddhartha Chaudhuri, Parag Chaudhuri

    Abstract: We present a deep, bidirectional, recurrent framework for cleaning noisy and incomplete motion capture data. It exploits temporal coherence and joint correlations to infer adaptive filters for each joint in each frame. A single model can be trained to denoise a heterogeneous mix of action types, under substantial amounts of noise. A signal that has both noise and gaps is preprocessed with a second…

    Submitted 9 December, 2017; originally announced December 2017.