Showing 1–50 of 91 results for author: Samaras, D

  1. Predicting Visual Attention in Graphic Design Documents

    Authors: Souradeep Chakraborty, Zijun Wei, Conor Kelton, Seoyoung Ahn, Aruna Balasubramanian, Gregory J. Zelinsky, Dimitris Samaras

    Abstract: We present a model for predicting visual attention during the free viewing of graphic design documents. While existing works on this topic have aimed at predicting static saliency of graphic designs, our work is the first attempt to predict both spatial attention and dynamic temporal order in which the document regions are fixated by gaze using a deep learning based model. We propose a two-stage m…

    Submitted 2 July, 2024; originally announced July 2024.

    Journal ref: IEEE Transactions on Multimedia 25 (2022): 4478-4493

  2. arXiv:2406.11077  [pdf, other]

    cs.CV

    Learning Relighting and Intrinsic Decomposition in Neural Radiance Fields

    Authors: Yixiong Yang, Shilin Hu, Haoyu Wu, Ramon Baldrich, Dimitris Samaras, Maria Vanrell

    Abstract: The task of extracting intrinsic components, such as reflectance and shading, from neural radiance fields is of growing interest. However, current methods largely focus on synthetic scenes and isolated objects, overlooking the complexities of real scenes with backgrounds. To address this gap, our research introduces a method that combines relighting with intrinsic decomposition. By leveraging ligh…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Accepted by CVPR 2024 Workshop Neural Rendering Intelligence (NRI)

  3. arXiv:2406.02774  [pdf, other]

    cs.CV

    Diffusion-Refined VQA Annotations for Semi-Supervised Gaze Following

    Authors: Qiaomu Miao, Alexandros Graikos, Jingwei Zhang, Sounak Mondal, Minh Hoai, Dimitris Samaras

    Abstract: Training gaze following models requires a large number of images with gaze target coordinates annotated by human annotators, which is a laborious and inherently ambiguous process. We propose the first semi-supervised method for gaze following by introducing two novel priors to the task. We obtain the first prior using a large pretrained Visual Question Answering (VQA) model, where we compute Grad-…

    Submitted 4 June, 2024; originally announced June 2024.

  4. arXiv:2403.19920  [pdf, other]

    cs.CV

    MI-NeRF: Learning a Single Face NeRF from Multiple Identities

    Authors: Aggelina Chatziagapi, Grigorios G. Chrysos, Dimitris Samaras

    Abstract: In this work, we introduce a method that learns a single dynamic neural radiance field (NeRF) from monocular talking face videos of multiple identities. NeRFs have shown remarkable results in modeling the 4D dynamics and appearance of human faces. However, they require per-identity optimization. Although recent approaches have proposed techniques to reduce the training and rendering time, increasi…

    Submitted 2 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Project page: https://aggelinacha.github.io/MI-NeRF/

  5. arXiv:2403.17255  [pdf, other]

    eess.IV cs.CV

    Decoding the visual attention of pathologists to reveal their level of expertise

    Authors: Souradeep Chakraborty, Dana Perez, Paul Friedman, Natallia Sheuka, Constantin Friedman, Oksana Yaskiv, Rajarsi Gupta, Gregory J. Zelinsky, Joel H. Saltz, Dimitris Samaras

    Abstract: We present a method for classifying the expertise of a pathologist based on how they allocated their attention during a cancer reading. We engage this decoding task by developing a novel method for predicting the attention of pathologists as they read whole-slide images (WSIs) of prostate and make cancer grade classifications. Our ground truth measure of a pathologist's attention is the x, y and z…

    Submitted 25 March, 2024; originally announced March 2024.

  6. arXiv:2403.11107  [pdf, other]

    cs.CV

    Self-supervised co-salient object detection via feature correspondence at multiple scales

    Authors: Souradeep Chakraborty, Dimitris Samaras

    Abstract: Our paper introduces a novel two-stage self-supervised approach for detecting co-occurring salient objects (CoSOD) in image groups without requiring segmentation annotations. Unlike existing unsupervised methods that rely solely on patch-level information (e.g. clustering patch descriptors) or on computation heavy off-the-shelf components for CoSOD, our lightweight model leverages feature correspo…

    Submitted 27 March, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  7. arXiv:2402.03723  [pdf, other]

    cs.CV

    Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos

    Authors: Alfredo Rivero, ShahRukh Athar, Zhixin Shu, Dimitris Samaras

    Abstract: Creating controllable 3D human portraits from casual smartphone videos is highly desirable due to their immense value in AR/VR applications. The recent development of 3D Gaussian Splatting (3DGS) has shown improvements in rendering quality and training efficiency. However, it still remains a challenge to accurately model and disentangle head movements and facial expressions from a single-view capt…

    Submitted 6 February, 2024; originally announced February 2024.

  8. arXiv:2401.10236  [pdf, other]

    math.NA cs.AR cs.CE

    MORCIC: Model Order Reduction Techniques for Electromagnetic Models of Integrated Circuits

    Authors: Dimitrios Garyfallou, Athanasios Stefanou, Christos Giamouzis, Moschos Antoniadis, Georgios Chararas, Konstantinos Chatzis, Dimitris Samaras, Rafaela Themeli, Anastasios Michailidis, Vasiliki Gogolou, Nikos Zachos, Nestor Evmorfopoulos, Thomas Noulis, Vasilis F. Pavlidis, Alkiviadis Hatzopoulos, Elpida Chatzineofytou, Yiannis Moisiadis

    Abstract: Model order reduction (MOR) is crucial for the design process of integrated circuits. Specifically, the vast amount of passive RLCk elements in electromagnetic models extracted from physical layouts exacerbates the extraction time, the storage requirements, and, most critically, the post-layout simulation time of the analyzed circuits. The MORCIC project aims to overcome this problem by proposing…

    Submitted 14 November, 2023; originally announced January 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2311.08478

  9. arXiv:2312.15010  [pdf, other]

    cs.CV

    SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology

    Authors: Saarthak Kapse, Pushpak Pati, Srijan Das, Jingwei Zhang, Chao Chen, Maria Vakalopoulou, Joel Saltz, Dimitris Samaras, Rajarsi R. Gupta, Prateek Prasanna

    Abstract: Introducing interpretability and reasoning into Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) analysis is challenging, given the complexity of gigapixel slides. Traditionally, MIL interpretability is limited to identifying salient regions deemed pertinent for downstream tasks, offering little insight to the end-user (pathologist) regarding the rationale behind these selectio…

    Submitted 18 May, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  10. arXiv:2312.07330  [pdf, other]

    cs.CV

    Learned representation-guided diffusion models for large-image generation

    Authors: Alexandros Graikos, Srikar Yellapragada, Minh-Quan Le, Saarthak Kapse, Prateek Prasanna, Joel Saltz, Dimitris Samaras

    Abstract: To synthesize high-fidelity samples, diffusion models typically require auxiliary data to guide the generation process. However, it is impractical to procure the painstaking patch-level annotation effort required in specialized domains like histopathology and satellite imagery; it is often performed by domain experts and involves hundreds of millions of patches. Modern-day self-supervised learning…

    Submitted 28 March, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  11. arXiv:2311.06654  [pdf, other]

    cs.CV

    Unsupervised and semi-supervised co-salient object detection via segmentation frequency statistics

    Authors: Souradeep Chakraborty, Shujon Naha, Muhammet Bastan, Amit Kumar K C, Dimitris Samaras

    Abstract: In this paper, we address the detection of co-occurring salient objects (CoSOD) in an image group using frequency statistics in an unsupervised manner, which further enable us to develop a semi-supervised method. While previous works have mostly focused on fully supervised CoSOD, less attention has been allocated to detecting co-salient objects when limited segmentation annotations are available f…

    Submitted 11 November, 2023; originally announced November 2023.

    Comments: Accepted at IEEE WACV 2024

  12. arXiv:2310.05183  [pdf, other]

    cs.CE

    ChiMera: Learning with noisy labels by contrasting mixed-up augmentations

    Authors: Zixuan Liu, Xin Zhang, Junjun He, Dan Fu, Dimitris Samaras, Robby Tan, Xiao Wang, Sheng Wang

    Abstract: Learning with noisy labels has been studied to address incorrect label annotations in real-world applications. In this paper, we present ChiMera, a two-stage learning-from-noisy-labels framework based on semi-supervised learning, developed based on a novel contrastive learning technique MixCLR. The key idea of MixCLR is to learn and refine the representations of mixed augmentations from two differ…

    Submitted 8 October, 2023; originally announced October 2023.

  13. arXiv:2309.13097  [pdf, other]

    cs.CV

    Zero-Shot Object Counting with Language-Vision Models

    Authors: Jingyi Xu, Hieu Le, Dimitris Samaras

    Abstract: Class-agnostic object counting aims to count object instances of an arbitrary class at test time. It is challenging but also enables many potential applications. Current methods require human-annotated exemplars as inputs which are often unavailable for novel categories, especially for autonomous systems. Thus, we propose zero-shot object counting (ZSC), a new setting where only the class name is…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: Extended version of CVPR23 arXiv:2303.02001. Currently under review at T-PAMI

  14. arXiv:2309.11009  [pdf, other]

    cs.CV

    Controllable Dynamic Appearance for Neural 3D Portraits

    Authors: ShahRukh Athar, Zhixin Shu, Zexiang Xu, Fujun Luan, Sai Bi, Kalyan Sunkavalli, Dimitris Samaras

    Abstract: Recent advances in Neural Radiance Fields (NeRFs) have made it possible to reconstruct and reanimate dynamic portrait scenes with control over head-pose, facial expressions and viewing direction. However, training such models assumes photometric consistency over the deformed region e.g. the face must be evenly lit as it deforms with changing head-pose and facial expression. Such photometric consis…

    Submitted 21 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

  15. arXiv:2309.06439  [pdf, other]

    cs.CV

    Attention De-sparsification Matters: Inducing Diversity in Digital Pathology Representation Learning

    Authors: Saarthak Kapse, Srijan Das, Jingwei Zhang, Rajarsi R. Gupta, Joel Saltz, Dimitris Samaras, Prateek Prasanna

    Abstract: We propose DiRL, a Diversity-inducing Representation Learning technique for histopathology imaging. Self-supervised learning techniques, such as contrastive and non-contrastive approaches, have been shown to learn rich and effective representations of digitized tissue samples with limited pathologist supervision. Our analysis of vanilla SSL-pretrained models' attention distribution reveals an insi…

    Submitted 12 September, 2023; originally announced September 2023.

  16. arXiv:2309.00748  [pdf, other]

    cs.CV cs.LG

    PathLDM: Text conditioned Latent Diffusion Model for Histopathology

    Authors: Srikar Yellapragada, Alexandros Graikos, Prateek Prasanna, Tahsin Kurc, Joel Saltz, Dimitris Samaras

    Abstract: To achieve high-quality results, diffusion models must be trained on large datasets. This can be notably prohibitive for models in specialized domains, such as computational pathology. Conditioning on labeled data is known to help in data-efficient model training. Therefore, histopathology reports, which are rich in valuable clinical information, are an ideal choice as guidance for a histopatholog…

    Submitted 30 November, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: WACV 2024 publication

  17. arXiv:2307.09570  [pdf, other]

    eess.IV cs.CV

    SAM-Path: A Segment Anything Model for Semantic Segmentation in Digital Pathology

    Authors: Jingwei Zhang, Ke Ma, Saarthak Kapse, Joel Saltz, Maria Vakalopoulou, Prateek Prasanna, Dimitris Samaras

    Abstract: Semantic segmentations of pathological entities have crucial clinical value in computational pathology workflows. Foundation models, such as the Segment Anything Model (SAM), have been recently proposed for universal use in segmentation tasks. SAM shows remarkable promise in instance segmentation on natural images. However, the applicability of SAM to computational pathology tasks is limited due t…

    Submitted 12 July, 2023; originally announced July 2023.

    Comments: Submitted to MedAGI 2023

  18. arXiv:2307.07677  [pdf, other]

    cs.CV

    Learning from Pseudo-labeled Segmentation for Multi-Class Object Counting

    Authors: Jingyi Xu, Hieu Le, Dimitris Samaras

    Abstract: Class-agnostic counting (CAC) has numerous potential applications across various domains. The goal is to count objects of an arbitrary category during testing, based on only a few annotated exemplars. In this paper, we point out that the task of counting objects of interest when there are multiple object classes in the image (namely, multi-class object counting) is particularly challenging for cur…

    Submitted 14 July, 2023; originally announced July 2023.

  19. arXiv:2306.01900  [pdf, other]

    cs.CV

    Conditional Generation from Unconditional Diffusion Models using Denoiser Representations

    Authors: Alexandros Graikos, Srikar Yellapragada, Dimitris Samaras

    Abstract: Denoising diffusion models have gained popularity as a generative modeling technique for producing high-quality and diverse images. Applying these models to downstream tasks requires conditioning, which can take the form of text, class labels, or other forms of guidance. However, providing conditioning information to these models can be challenging, particularly when annotations are scarce or impr…

    Submitted 2 June, 2023; originally announced June 2023.

  20. arXiv:2304.13115  [pdf, other]

    cs.CV

    AVFace: Towards Detailed Audio-Visual 4D Face Reconstruction

    Authors: Aggelina Chatziagapi, Dimitris Samaras

    Abstract: In this work, we present a multimodal solution to the problem of 4D face reconstruction from monocular videos. 3D face reconstruction from 2D images is an under-constrained problem due to the ambiguity of depth. State-of-the-art methods try to solve this problem by leveraging visual information from a single image or video, whereas 3D mesh animation approaches rely more on audio. However, in most…

    Submitted 11 May, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

    Comments: Accepted by CVPR 2023. Project page: https://aggelinacha.github.io/AVFace/

  21. arXiv:2304.05482  [pdf, other]

    eess.IV cs.CV

    Computational Pathology: A Survey Review and The Way Forward

    Authors: Mahdi S. Hosseini, Babak Ehteshami Bejnordi, Vincent Quoc-Huy Trinh, Danial Hasan, Xingwen Li, Taehyo Kim, Haochen Zhang, Theodore Wu, Kajanan Chinniah, Sina Maghsoudlou, Ryan Zhang, Stephen Yang, Jiadai Zhu, Lyndon Chan, Samir Khaki, Andrei Buin, Fatemeh Chaji, Ala Salehi, Bich Ngoc Nguyen, Dimitris Samaras, Konstantinos N. Plataniotis

    Abstract: Computational Pathology (CPath) is an interdisciplinary science that augments developments of computational approaches to analyze and model medical histopathology images. The main objective for CPath is to develop infrastructure and workflows of digital diagnostics as an assistive CAD system for clinical pathology, facilitating transformational changes in the diagnosis and treatment of cancer that a…

    Submitted 27 January, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: Accepted in Elsevier Journal of Pathology Informatics (JPI) 2024

  22. arXiv:2304.05096  [pdf, other]

    cs.CV

    Generating Features with Increased Crop-related Diversity for Few-Shot Object Detection

    Authors: Jingyi Xu, Hieu Le, Dimitris Samaras

    Abstract: Two-stage object detectors generate object proposals and classify them to detect objects in images. These proposals often do not contain the objects perfectly but overlap with them in many possible ways, exhibiting great variability in the difficulty levels of the proposals. Training a robust classifier against this crop-related variability requires abundant training data, which is not available i…

    Submitted 11 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 23

  23. arXiv:2304.02255  [pdf, other]

    eess.IV cs.CV

    Topology-Guided Multi-Class Cell Context Generation for Digital Pathology

    Authors: Shahira Abousamra, Rajarsi Gupta, Tahsin Kurc, Dimitris Samaras, Joel Saltz, Chao Chen

    Abstract: In digital pathology, the spatial context of cells is important for cell classification, cancer diagnosis and prognosis. To model such complex cell context, however, is challenging. Cells form different mixtures, lineages, clusters and holes. To model such structural patterns in a learnable fashion, we introduce several mathematical tools from spatial statistics and topological data analysis. We i…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: To be published in proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023

  24. arXiv:2303.17712  [pdf, other]

    cs.CV

    S-VolSDF: Sparse Multi-View Stereo Regularization of Neural Implicit Surfaces

    Authors: Haoyu Wu, Alexandros Graikos, Dimitris Samaras

    Abstract: Neural rendering of implicit surfaces performs well in 3D vision applications. However, it requires dense input views as supervision. When only sparse input images are available, output quality drops significantly due to the shape-radiance ambiguity problem. We note that this ambiguity can be constrained when a 3D point is visible in multiple views, as is the case in multi-view stereo (MVS). We th…

    Submitted 2 September, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: ICCV 2023, Project page: https://hao-yu-wu.github.io/s-volsdf/

  25. arXiv:2303.15274  [pdf, other]

    cs.CV

    Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention

    Authors: Sounak Mondal, Zhibo Yang, Seoyoung Ahn, Dimitris Samaras, Gregory Zelinsky, Minh Hoai

    Abstract: Predicting human gaze is important in Human-Computer Interaction (HCI). However, to practically serve HCI applications, gaze prediction models must be scalable, fast, and accurate in their spatial and temporal gaze predictions. Recent scanpath prediction models focus on goal-directed attention (search). Such models are limited in their application due to a common approach relying on trained target…

    Submitted 2 July, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023

  26. arXiv:2303.12214  [pdf, other]

    cs.CV

    Prompt-MIL: Boosting Multi-Instance Learning Schemes via Task-specific Prompt Tuning

    Authors: Jingwei Zhang, Saarthak Kapse, Ke Ma, Prateek Prasanna, Joel Saltz, Maria Vakalopoulou, Dimitris Samaras

    Abstract: Whole slide image (WSI) classification is a critical task in computational pathology, requiring the processing of gigapixel-sized images, which is challenging for current deep-learning methods. Current state of the art methods are based on multi-instance learning schemes (MIL), which usually rely on pretrained features to represent the instances. Due to the lack of task-specific annotated data, th…

    Submitted 4 October, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: Accepted to MICCAI 2023 (Oral)

  27. arXiv:2303.09383  [pdf, other]

    cs.CV cs.AI

    Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers

    Authors: Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Ruoyu Xue, Gregory Zelinsky, Minh Hoai, Dimitris Samaras

    Abstract: Most models of visual attention aim at predicting either top-down or bottom-up control, as studied using different visual search and free-viewing tasks. In this paper we propose the Human Attention Transformer (HAT), a single model that predicts both forms of attention control. HAT uses a novel transformer-based architecture and a simplified foveated retina that collectively create a spatio-tempor…

    Submitted 30 March, 2024; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: CVPR 2024

  28. arXiv:2303.06522  [pdf, other]

    cs.CV

    Token Sparsification for Faster Medical Image Segmentation

    Authors: Lei Zhou, Huidong Liu, Joseph Bae, Junjun He, Dimitris Samaras, Prateek Prasanna

    Abstract: Can we use sparse tokens for dense prediction, e.g., segmentation? Although token sparsification has been applied to Vision Transformers (ViT) to accelerate classification, it is still unknown how to perform segmentation from sparse tokens. To this end, we reformulate segmentation as a sparse encoding -> token completion -> dense decoding (SCD) pipeline. We first empirically show that naively appl…

    Submitted 11 March, 2023; originally announced March 2023.

    Comments: Accepted to IPMI'23. Code will be available here: https://github.com/cvlab-stonybrook/TokenSparse-for-MedSeg

  29. arXiv:2303.02001  [pdf, other]

    cs.CV

    Zero-shot Object Counting

    Authors: Jingyi Xu, Hieu Le, Vu Nguyen, Viresh Ranjan, Dimitris Samaras

    Abstract: Class-agnostic object counting aims to count object instances of an arbitrary class at test time. It is challenging but also enables many potential applications. Current methods require human-annotated exemplars as inputs which are often unavailable for novel categories, especially for autonomous systems. Thus, we propose zero-shot object counting (ZSC), a new setting where only the class name is…

    Submitted 24 April, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  30. arXiv:2212.14215  [pdf, other]

    cs.CV

    Local Learning on Transformers via Feature Reconstruction

    Authors: Priyank Pathak, Jingwei Zhang, Dimitris Samaras

    Abstract: Transformers are becoming increasingly popular due to their superior performance over conventional convolutional neural networks (CNNs). However, transformers usually require a much larger amount of memory to train than CNNs, which prevents their application in many low resource settings. Local learning, which divides the network into several distinct modules and trains them individually, is a prom…

    Submitted 29 December, 2022; originally announced December 2022.

  31. arXiv:2212.12105  [pdf, other]

    cs.CV

    Precise Location Matching Improves Dense Contrastive Learning in Digital Pathology

    Authors: Jingwei Zhang, Saarthak Kapse, Ke Ma, Prateek Prasanna, Maria Vakalopoulou, Joel Saltz, Dimitris Samaras

    Abstract: Dense prediction tasks such as segmentation and detection of pathological entities hold crucial clinical value in computational pathology workflows. However, obtaining dense annotations on large cohorts is usually tedious and expensive. Contrastive learning (CL) is thus often employed to leverage large volumes of unlabeled data to pre-train the backbone network. To boost CL for dense prediction, s…

    Submitted 22 March, 2023; v1 submitted 22 December, 2022; originally announced December 2022.

    Comments: Accepted to IPMI 2023

  32. arXiv:2211.11062  [pdf, other]

    cs.CV

    Patch-level Gaze Distribution Prediction for Gaze Following

    Authors: Qiaomu Miao, Minh Hoai, Dimitris Samaras

    Abstract: Gaze following aims to predict where a person is looking in a scene, by predicting the target location, or indicating that the target is located outside the image. Recent works detect the gaze target by training a heatmap regression task with a pixel-wise mean-square error (MSE) loss, while formulating the in/out prediction task as a binary classification task. This training formulation puts a str…

    Submitted 20 November, 2022; originally announced November 2022.

    Comments: Accepted to WACV 2023

  33. arXiv:2209.13492  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Unraveling Key Elements Underlying Molecular Property Prediction: A Systematic Study

    Authors: Jianyuan Deng, Zhibo Yang, Hehe Wang, Iwao Ojima, Dimitris Samaras, Fusheng Wang

    Abstract: Artificial intelligence (AI) has been widely applied in drug discovery, with molecular property prediction as a major task. Despite booming techniques in molecular representation learning, key elements underlying molecular property prediction remain largely unexplored, which impedes further advancements in this field. Herein, we conduct an extensive evaluation of representative models using various…

    Submitted 2 September, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

  34. arXiv:2208.03486  [pdf, other]

    cs.CV cs.AI

    HaloAE: An HaloNet based Local Transformer Auto-Encoder for Anomaly Detection and Localization

    Authors: E. Mathian, H. Liu, L. Fernandez-Cuesta, D. Samaras, M. Foll, L. Chen

    Abstract: Unsupervised anomaly detection and localization is a crucial task as it is impossible to collect and label all possible anomalies. Many studies have emphasized the importance of integrating local and global information to achieve accurate segmentation of anomalies. To this end, there has been a growing interest in Transformer, which allows modeling long-range content interactions. However, global…

    Submitted 26 September, 2022; v1 submitted 6 August, 2022; originally announced August 2022.

    Comments: 21 pages, 6 figures, rejected to ECCV 2023

  35. Gigapixel Whole-Slide Images Classification using Locally Supervised Learning

    Authors: Jingwei Zhang, Xin Zhang, Ke Ma, Rajarsi Gupta, Joel Saltz, Maria Vakalopoulou, Dimitris Samaras

    Abstract: Histopathology whole slide images (WSIs) play a very important role in clinical studies and serve as the gold standard for many cancer diagnoses. However, generating automatic tools for processing WSIs is challenging due to their enormous sizes. Currently, to deal with this issue, conventional methods rely on a multiple instance learning (MIL) strategy to process a WSI at patch level. Although eff…

    Submitted 26 September, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

    Comments: Accepted to MICCAI 2022 Oral

    Journal ref: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2022

  36. arXiv:2207.01166  [pdf, other]

    cs.CV cs.AI

    Target-absent Human Attention

    Authors: Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Gregory Zelinsky, Minh Hoai, Dimitris Samaras

    Abstract: The prediction of human gaze behavior is important for building human-computer interactive systems that can anticipate a user's attention. Computer vision models have been developed to predict the fixations made by people as they search for target objects. But what about when the image has no target? Equally important is to know how people search when they cannot find a target, and when they would…

    Submitted 1 November, 2022; v1 submitted 3 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV2022

  37. arXiv:2206.09012  [pdf, other]

    cs.LG cs.CV

    Diffusion models as plug-and-play priors

    Authors: Alexandros Graikos, Nikolay Malkin, Nebojsa Jojic, Dimitris Samaras

    Abstract: We consider the problem of inferring high-dimensional data $\mathbf{x}$ in a model that consists of a prior $p(\mathbf{x})$ and an auxiliary differentiable constraint $c(\mathbf{x},\mathbf{y})$ on $\mathbf{x}$ given some additional information $\mathbf{y}$. In this paper, the prior is an independently trained denoising diffusion generative model. The auxiliary constraint is expected to have a differentiabl…

    Submitted 8 January, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022; code: https://github.com/AlexGraikos/diffusion_priors

  38. arXiv:2206.01742  [pdf, other]

    eess.IV cs.CV

    Learning Probabilistic Topological Representations Using Discrete Morse Theory

    Authors: Xiaoling Hu, Dimitris Samaras, Chao Chen

    Abstract: Accurate delineation of fine-scale structures is a very important yet challenging problem. Existing methods use topological information as an additional training loss, but are ultimately making pixel-wise predictions. In this paper, we propose the first deep learning based method to learn topological/structural representations. We use discrete Morse theory and persistent homology to construct an o…

    Submitted 1 October, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: 16 pages, 11 figures

  39. arXiv:2204.12283  [pdf]

    q-bio.QM cs.CV eess.IV

    A Novel Framework for Characterization of Tumor-Immune Spatial Relationships in Tumor Microenvironment

    Authors: Mahmudul Hasan, Jakub R. Kaczmarzyk, David Paredes, Lyanne Oblein, Jaymie Oentoro, Shahira Abousamra, Michael Horowitz, Dimitris Samaras, Chao Chen, Tahsin Kurc, Kenneth R. Shroyer, Joel Saltz

    Abstract: Understanding the impact of tumor biology on the composition of nearby cells often requires characterizing the impact of biologically distinct tumor regions. Biomarkers have been developed to label biologically distinct tumor regions, but challenges arise because of differences in the spatial extent and distribution of differentially labeled regions. In this work, we present a framework for system…

    Submitted 1 May, 2022; v1 submitted 23 April, 2022; originally announced April 2022.

  40. arXiv:2203.05573  [pdf, other]

    eess.IV cs.CV cs.LG

    Self Pre-training with Masked Autoencoders for Medical Image Classification and Segmentation

    Authors: Lei Zhou, Huidong Liu, Joseph Bae, Junjun He, Dimitris Samaras, Prateek Prasanna

    Abstract: Masked Autoencoder (MAE) has recently been shown to be effective in pre-training Vision Transformers (ViT) for natural image analysis. By reconstructing full images from partially masked inputs, a ViT encoder aggregates contextual information to infer masked image regions. We believe that this context aggregation ability is particularly essential to the medical image domain where each anatomical s…

    Submitted 21 April, 2023; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: ISBI2023 camera-ready version (no substantial difference from v1); Code is available at https://github.com/cvlab-stonybrook/SelfMedMAE

  41. Visual attention analysis of pathologists examining whole slide images of Prostate cancer

    Authors: Souradeep Chakraborty, Ke Ma, Rajarsi Gupta, Beatrice Knudsen, Gregory J. Zelinsky, Joel H. Saltz, Dimitris Samaras

    Abstract: We study the attention of pathologists as they examine whole-slide images (WSIs) of prostate cancer tissue using a digital microscope. To the best of our knowledge, our study is the first to report in detail how pathologists navigate WSIs of prostate cancer as they accumulate information for their diagnoses. We collected slide navigation data (i.e., viewport location, magnification level, and time…

    Submitted 2 May, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: ISBI 2022 (Oral presentation)

  42. arXiv:2201.07344  [pdf, other]

    eess.IV cs.CV

    Lung Swapping Autoencoder: Learning a Disentangled Structure-texture Representation of Chest Radiographs

    Authors: Lei Zhou, Joseph Bae, Huidong Liu, Gagandeep Singh, Jeremy Green, Amit Gupta, Dimitris Samaras, Prateek Prasanna

    Abstract: Well-labeled datasets of chest radiographs (CXRs) are difficult to acquire due to the high cost of annotation. Thus, it is desirable to learn a robust and transferable representation in an unsupervised manner to benefit tasks that lack labeled data. Unlike natural images, medical images have their own domain prior; e.g., we observe that many pulmonary diseases, such as COVID-19, manifest as ch…

    Submitted 18 January, 2022; originally announced January 2022.

    Comments: Extended version of the MICCAI 2021 paper (https://link.springer.com/chapter/10.1007/978-3-030-87234-2_33). The code is available at https://github.com/cvlab-stonybrook/LSAE

  43. arXiv:2111.11652  [pdf, other]

    cs.LG cs.CV

    CoDiM: Learning with Noisy Labels via Contrastive Semi-Supervised Learning

    Authors: Xin Zhang, Zixuan Liu, Kaiwen Xiao, Tian Shen, Junzhou Huang, Wei Yang, Dimitris Samaras, Xiao Han

    Abstract: Labels are costly and sometimes unreliable. Noisy label learning, semi-supervised learning, and contrastive learning are three different strategies for designing learning processes requiring less annotation cost. Semi-supervised learning and contrastive learning have been recently demonstrated to improve learning strategies that address datasets with noisy labels. Still, the inner connections betw…

    Submitted 22 November, 2021; originally announced November 2021.

    Comments: 19 Pages, 9 figures, conference paper

  44. arXiv:2110.04886  [pdf, other]

    cs.CV

    Multi-Class Cell Detection Using Spatial Context Representation

    Authors: Shahira Abousamra, David Belinsky, John Van Arnam, Felicia Allard, Eric Yee, Rajarsi Gupta, Tahsin Kurc, Dimitris Samaras, Joel Saltz, Chao Chen

    Abstract: In digital pathology, both detection and classification of cells are important for automatic diagnostic and prognostic tasks. Classifying cells into subtypes, such as tumor cells, lymphocytes or stromal cells is particularly challenging. Existing methods focus on morphological appearance of individual cells, whereas in practice pathologists often infer cell classes through their spatial context. I…

    Submitted 5 June, 2022; v1 submitted 10 October, 2021; originally announced October 2021.

  45. arXiv:2108.05465  [pdf, other]

    cs.CV

    SIDER: Single-Image Neural Optimization for Facial Geometric Detail Recovery

    Authors: Aggelina Chatziagapi, ShahRukh Athar, Francesc Moreno-Noguer, Dimitris Samaras

    Abstract: We present SIDER (Single-Image neural optimization for facial geometric DEtail Recovery), a novel photometric optimization method that recovers detailed facial geometry from a single image in an unsupervised manner. Inspired by classical techniques of coarse-to-fine optimization and recent advances in implicit neural representations of 3D shape, SIDER combines a geometry prior based on statistical…

    Submitted 11 August, 2021; originally announced August 2021.

    Comments: version 1.0.0

  46. arXiv:2108.04913  [pdf, other]

    cs.CV

    FLAME-in-NeRF : Neural control of Radiance Fields for Free View Face Animation

    Authors: ShahRukh Athar, Zhixin Shu, Dimitris Samaras

    Abstract: This paper presents a neural rendering method for controllable portrait video synthesis. Recent advances in volumetric neural rendering, such as neural radiance fields (NeRF), have enabled the photorealistic novel view synthesis of static scenes with impressive results. However, modeling dynamic and controllable objects as part of a scene with such scene representations is still challenging. In thi…

    Submitted 10 August, 2021; originally announced August 2021.

    Comments: version 1.0.0

  47. arXiv:2107.14287  [pdf, other]

    cs.CV

    Temporal Feature Warping for Video Shadow Detection

    Authors: Shilin Hu, Hieu Le, Dimitris Samaras

    Abstract: While single image shadow detection has been improving rapidly in recent years, video shadow detection remains a challenging task due to data scarcity and the difficulty in modelling temporal consistency. The current video shadow detection method achieves this goal via co-attention, which mostly exploits information that is temporally coherent but is not robust in detecting moving shadows and smal…

    Submitted 29 July, 2021; originally announced July 2021.

  48. arXiv:2107.06276  [pdf, other]

    eess.IV cs.CV

    Attention based CNN-LSTM Network for Pulmonary Embolism Prediction on Chest Computed Tomography Pulmonary Angiograms

    Authors: Sudhir Suman, Gagandeep Singh, Nicole Sakla, Rishabh Gattu, Jeremy Green, Tej Phatak, Dimitris Samaras, Prateek Prasanna

    Abstract: With more than 60,000 deaths annually in the United States, Pulmonary Embolism (PE) is among the most fatal cardiovascular diseases. It is caused by an artery blockage in the lung; confirming its presence is time-consuming and is prone to over-diagnosis. The utilization of automated PE detection systems is critical for diagnostic accuracy and efficiency. In this study we propose a two-stage attent…

    Submitted 13 July, 2021; originally announced July 2021.

    Comments: This work will be presented at MICCAI 2021

  49. arXiv:2106.05386  [pdf, other]

    cs.LG cs.AI

    Artificial Intelligence in Drug Discovery: Applications and Techniques

    Authors: Jianyuan Deng, Zhibo Yang, Iwao Ojima, Dimitris Samaras, Fusheng Wang

    Abstract: Artificial intelligence (AI) has been transforming the practice of drug discovery in the past decade. Various AI techniques have been used in a wide range of applications, such as virtual screening and drug design. In this survey, we first give an overview on drug discovery and discuss related applications, which can be reduced to two major tasks, i.e., molecular property prediction and molecule g…

    Submitted 2 November, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: Accepted to Briefings in Bioinformatics

  50. arXiv:2103.13538  [pdf, other]

    cs.CV

    Hierarchical Proxy-based Loss for Deep Metric Learning

    Authors: Zhibo Yang, Muhammet Bastan, Xinliang Zhu, Doug Gray, Dimitris Samaras

    Abstract: Proxy-based metric learning losses are superior to pair-based losses due to their fast convergence and low training complexity. However, existing proxy-based losses focus on learning class-discriminative features while overlooking the commonalities shared across classes which are potentially useful in describing and matching samples. Moreover, they ignore the implicit hierarchy of categories in re…

    Submitted 17 October, 2021; v1 submitted 24 March, 2021; originally announced March 2021.

    Comments: Accepted to WACV2022