Showing 1–13 of 13 results for author: Khayatkhoei, M

  1. arXiv:2402.10401  [pdf, other]

    cs.LG cs.CV

    ManiFPT: Defining and Analyzing Fingerprints of Generative Models

    Authors: Hae Jin Song, Mahyar Khayatkhoei, Wael AbdAlmageed

    Abstract: Recent works have shown that generative models leave traces of their underlying generative process on the generated samples, broadly referred to as fingerprints of a generative model, and have studied their utility in detecting synthetic images from real ones. However, the extent to which these fingerprints can distinguish between various types of synthetic image and help identify the underlying g…

    Submitted 29 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: Accepted to CVPR 2024

  2. arXiv:2402.07384  [pdf, other]

    cs.CV cs.AI cs.LG

    Exploring Perceptual Limitation of Multimodal Large Language Models

    Authors: Jiarui Zhang, Jinyi Hu, Mahyar Khayatkhoei, Filip Ilievski, Maosong Sun

    Abstract: Multimodal Large Language Models (MLLMs) have recently shown remarkable perceptual capability in answering visual questions; however, little is known about the limits of their perception. In particular, while prior works have provided anecdotal evidence of MLLMs' sensitivity to object size, this phenomenon and its underlying causes have not been explored comprehensively. In this work, we quantitat…

    Submitted 11 February, 2024; originally announced February 2024.

    Comments: 14 pages, 14 figures, 3 tables

  3. arXiv:2311.17088  [pdf, other]

    cs.CV

    Unsupervised Multimodal Deepfake Detection Using Intra- and Cross-Modal Inconsistencies

    Authors: Mulin Tian, Mahyar Khayatkhoei, Joe Mathai, Wael AbdAlmageed

    Abstract: Deepfake videos present an increasing threat to society with potentially negative impact on criminal justice, democracy, and personal safety and privacy. Meanwhile, detecting deepfakes, at scale, remains a very challenging task that often requires labeled training data from existing deepfake generation methods. Further, even the most accurate supervised deepfake detection methods do not generalize…

    Submitted 20 June, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: 11 pages, 3 figures, 3 tables

  4. arXiv:2311.07141  [pdf, other]

    cs.LG cs.CY

    SABAF: Removing Strong Attribute Bias from Neural Networks with Adversarial Filtering

    Authors: Jiazhi Li, Mahyar Khayatkhoei, Jiageng Zhu, Hanchen Xie, Mohamed E. Hussein, Wael AbdAlmageed

    Abstract: Ensuring a neural network is not relying on protected attributes (e.g., race, sex, age) for prediction is crucial in advancing fair and trustworthy AI. While several promising methods for removing attribute bias in neural networks have been proposed, their limitations remain under-explored. To that end, in this work, we mathematically and empirically reveal the limitation of existing attribute bia…

    Submitted 16 November, 2023; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: 35 pages, 18 figures, 32 tables. This work is an extended version of our paper (arXiv:2310.04955). Code will be released at https://github.com/jiazhi412/strong_attribute_bias

  5. arXiv:2310.16033  [pdf, other]

    cs.CV cs.CL

    Towards Perceiving Small Visual Details in Zero-shot Visual Question Answering with Multimodal LLMs

    Authors: Jiarui Zhang, Mahyar Khayatkhoei, Prateek Chhikara, Filip Ilievski

    Abstract: Multimodal Large Language Models (MLLMs) have recently achieved promising zero-shot accuracy on visual question answering (VQA) -- a fundamental task affecting various downstream applications and domains. Given the great potential for the broad use of these models, it is important to investigate their limitations in dealing with different image and question properties. In this work, we investigate…

    Submitted 12 February, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: 20 pages, 12 figures, 7 tables

  6. arXiv:2310.04955  [pdf, other]

    cs.LG

    Information-Theoretic Bounds on The Removal of Attribute-Specific Bias From Neural Networks

    Authors: Jiazhi Li, Mahyar Khayatkhoei, Jiageng Zhu, Hanchen Xie, Mohamed E. Hussein, Wael AbdAlmageed

    Abstract: Ensuring a neural network is not relying on protected attributes (e.g., race, sex, age) for predictions is crucial in advancing fair and trustworthy AI. While several promising methods for removing attribute bias in neural networks have been proposed, their limitations remain under-explored. In this work, we mathematically and empirically reveal an important limitation of attribute bias removal me…

    Submitted 16 November, 2023; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: 15 pages, 4 figures, 3 tables. To appear in Algorithmic Fairness through the Lens of Time Workshop at NeurIPS 2023

  7. arXiv:2308.05707  [pdf, other]

    cs.LG cs.CV

    Shadow Datasets, New challenging datasets for Causal Representation Learning

    Authors: Jiageng Zhu, Hanchen Xie, Jianhua Wu, Jiazhi Li, Mahyar Khayatkhoei, Mohamed E. Hussein, Wael AbdAlmageed

    Abstract: Discovering causal relations among semantic factors is an emergent topic in representation learning. Most causal representation learning (CRL) methods are fully supervised, which is impractical due to costly labeling. To resolve this restriction, weakly supervised CRL methods were introduced. To evaluate CRL performance, four existing datasets, Pendulum, Flow, CelebA(BEARD) and CelebA(SMILE), are…

    Submitted 11 August, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

  8. arXiv:2306.09618  [pdf, other]

    cs.LG cs.CV

    Emergent Asymmetry of Precision and Recall for Measuring Fidelity and Diversity of Generative Models in High Dimensions

    Authors: Mahyar Khayatkhoei, Wael AbdAlmageed

    Abstract: Precision and Recall are two prominent metrics of generative performance, which were proposed to separately measure the fidelity and diversity of generative models. Given their central role in comparing and improving generative models, understanding their limitations is crucially important. To that end, in this work, we identify a critical flaw in the common approximation of these metrics using k…

    Submitted 18 July, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: To appear in ICML 2023. Updated proof in Appendix B

  9. arXiv:2306.00228  [pdf, other]

    cs.CV cs.AI cs.CL

    Using Visual Cropping to Enhance Fine-Detail Question Answering of BLIP-Family Models

    Authors: Jiarui Zhang, Mahyar Khayatkhoei, Prateek Chhikara, Filip Ilievski

    Abstract: Visual Question Answering is a challenging task, as it requires seamless interaction between perceptual, linguistic, and background knowledge systems. While the recent progress of visual and natural language models like BLIP has led to improved performance on this task, we lack understanding of the ability of such models to perform on different kinds of questions and reasoning types. As our initia…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: 16 pages, 5 figures, 7 tables

  10. arXiv:2305.07648  [pdf, other]

    cs.CV

    A Critical View of Vision-Based Long-Term Dynamics Prediction Under Environment Misalignment

    Authors: Hanchen Xie, Jiageng Zhu, Mahyar Khayatkhoei, Jiazhi Li, Mohamed E. Hussein, Wael AbdAlmageed

    Abstract: Dynamics prediction, which is the problem of predicting future states of scene objects based on current and prior states, is drawing increasing attention as an instance of learning physics. To solve this problem, Region Proposal Convolutional Interaction Network (RPCIN), a vision-based model, was proposed and achieved state-of-the-art performance in long-term prediction. RPCIN only takes raw image…

    Submitted 13 June, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: 14 pages, 5 figures, 10 tables. Accepted to ICML 2023

  11. arXiv:2010.01473  [pdf, other]

    cs.LG eess.IV stat.ML

    Spatial Frequency Bias in Convolutional Generative Adversarial Networks

    Authors: Mahyar Khayatkhoei, Ahmed Elgammal

    Abstract: As the success of Generative Adversarial Networks (GANs) on natural images quickly propels them into various real-life applications across different domains, it becomes more and more important to clearly understand their limitations. Specifically, understanding GANs' capability across the full spectrum of spatial frequencies, i.e. beyond the low-frequency dominant spectrum of natural images, is cr…

    Submitted 18 December, 2020; v1 submitted 3 October, 2020; originally announced October 2020.

  12. arXiv:1806.00880  [pdf, other]

    cs.LG cs.CV stat.ML

    Disconnected Manifold Learning for Generative Adversarial Networks

    Authors: Mahyar Khayatkhoei, Ahmed Elgammal, Maneesh Singh

    Abstract: Natural images may lie on a union of disjoint manifolds rather than one globally connected manifold, and this can cause several difficulties for the training of common Generative Adversarial Networks (GANs). In this work, we first show that single generator GANs are unable to correctly model a distribution supported on a disconnected manifold, and investigate how sample quality, mode dropping and…

    Submitted 10 January, 2019; v1 submitted 3 June, 2018; originally announced June 2018.

    Comments: NeurIPS 2018

  13. arXiv:1801.08607  [pdf, other]

    cs.HC cs.CE

    Interactive Diversity Optimization of Environments

    Authors: Glen Berseth, Mahyar Khayatkhoei, Brandon Haworth, Muhammad Usman, Mubbasir Kapadia, Petros Faloutsos

    Abstract: The design of a building requires an architect to balance a wide range of constraints: aesthetic, geometric, usability, lighting, safety, etc. At the same time, there is often a multiplicity of diverse designs that can meet these constraints equally well. Architects must use their skills and artistic vision to explore these rich but highly constrained design spaces. A number of computer-aided des…

    Submitted 22 January, 2018; originally announced January 2018.

    Comments: 20 pages