Showing 1–26 of 26 results for author: Greenspan, M

Searching in archive cs.

  1. arXiv:2405.16304

    cs.LG cs.AI

    Federated Unsupervised Domain Generalization using Global and Local Alignment of Gradients

    Authors: Farhad Pourpanah, Mahdiyar Molahasani, Milad Soltany, Michael Greenspan, Ali Etemad

    Abstract: We address the problem of federated domain generalization in an unsupervised setting for the first time. We first theoretically establish a connection between domain shift and alignment of gradients in unsupervised federated learning and show that aligning the gradients at both client and server levels can facilitate the generalization of the model to new (target) domains. Building on this insight…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 23 pages, 4 figures

  2. arXiv:2311.09500

    cs.CV

    Pseudo-keypoint RKHS Learning for Self-supervised 6DoF Pose Estimation

    Authors: Yangzheng Wu, Michael Greenspan

    Abstract: This paper addresses the simulation-to-real domain gap in 6DoF PE, and proposes a novel self-supervised keypoint radial voting-based 6DoF PE framework, effectively narrowing this gap using a learnable kernel in RKHS. We formulate this domain gap as a distance in high-dimensional feature space, distinct from previous iterative matching methods. We propose an adapter network, which evolves the netwo…

    Submitted 17 November, 2023; v1 submitted 15 November, 2023; originally announced November 2023.

  3. arXiv:2309.01274

    cs.CV

    Diffusion Models with Deterministic Normalizing Flow Priors

    Authors: Mohsen Zand, Ali Etemad, Michael Greenspan

    Abstract: For faster sampling and higher sample quality, we propose DiNof ($\textbf{Di}$ffusion with $\textbf{No}$rmalizing $\textbf{f}$low priors), a technique that makes use of normalizing flows and diffusion models. We use normalizing flows to parameterize the noisy data at any arbitrary step of the diffusion process and utilize it as the prior in the reverse diffusion process. More specifically, the for…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: 12 pages, 7 figures

  4. arXiv:2308.16801

    cs.CV

    Multiscale Residual Learning of Graph Convolutional Sequence Chunks for Human Motion Prediction

    Authors: Mohsen Zand, Ali Etemad, Michael Greenspan

    Abstract: A new method is proposed for human motion prediction by learning temporal and spatial dependencies. Recently, multiscale graphs have been developed to model the human body at higher abstraction levels, resulting in more stable motion prediction. Current methods however predetermine scale levels and combine spatially proximal joints to generate coarser scales based on human priors, even though move…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: 13 pages

  5. arXiv:2308.07827

    cs.CV

    Learning Better Keypoints for Multi-Object 6DoF Pose Estimation

    Authors: Yangzheng Wu, Michael Greenspan

    Abstract: We address the problem of keypoint selection, and find that the performance of 6DoF pose estimation methods can be improved when pre-defined keypoint locations are learned, rather than being heuristically selected as has been the standard approach. We found that accuracy and efficiency can be improved by training a graph network to select a set of disperse keypoints with similarly distributed vote…

    Submitted 9 November, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

    Comments: WACV 2024

  6. arXiv:2307.03786

    cs.CV

    Context-aware Pedestrian Trajectory Prediction with Multimodal Transformer

    Authors: Haleh Damirchi, Michael Greenspan, Ali Etemad

    Abstract: We propose a novel solution for predicting future trajectories of pedestrians. Our method uses a multimodal encoder-decoder transformer architecture, which takes as input both pedestrian locations and ego-vehicle speeds. Notably, our decoder predicts the entire future trajectory in a single-pass and does not perform one-step-ahead prediction, which makes the method effective for embedded edge depl…

    Submitted 7 July, 2023; originally announced July 2023.

  7. arXiv:2306.15117

    cs.CV cs.AI

    Continual Learning for Out-of-Distribution Pedestrian Detection

    Authors: Mahdiyar Molahasani, Ali Etemad, Michael Greenspan

    Abstract: A continual learning solution is proposed to address the out-of-distribution generalization problem for pedestrian detection. While recent pedestrian detection models have achieved impressive performance on various datasets, they remain sensitive to shifts in the distribution of the inference data. Our method adopts and modifies Elastic Weight Consolidation to a backbone object detection network,…

    Submitted 26 June, 2023; originally announced June 2023.

  8. arXiv:2306.13275

    cs.LG cs.CV

    Can Continual Learning Improve Long-Tailed Recognition? Toward a Unified Framework

    Authors: Mahdiyar Molahasani, Michael Greenspan, Ali Etemad

    Abstract: The Long-Tailed Recognition (LTR) problem emerges in the context of learning from highly imbalanced datasets, in which the number of samples among different classes is heavily skewed. LTR methods aim to accurately learn a dataset comprising both a larger Head set and a smaller Tail set. We propose a theorem where under the assumption of strong convexity of the loss function, the weights of a learn…

    Submitted 22 June, 2023; originally announced June 2023.

  9. arXiv:2305.09401

    cs.CV cs.AI

    Diffusion Dataset Generation: Towards Closing the Sim2Real Gap for Pedestrian Detection

    Authors: Andrew Farley, Mohsen Zand, Michael Greenspan

    Abstract: We propose a method that augments a simulated dataset using diffusion models to improve the performance of pedestrian detection in real-world data. The high cost of collecting and annotating data in the real-world has motivated the use of simulation platforms to create training datasets. While simulated data is inexpensive to collect and annotate, it unfortunately does not always closely match the…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: 8 pages, 4 figures, Accepted to CRV2023 conference

  10. arXiv:2210.15456

    cs.CL cs.AI

    JECC: Commonsense Reasoning Tasks Derived from Interactive Fictions

    Authors: Mo Yu, Yi Gu, Xiaoxiao Guo, Yufei Feng, Xiaodan Zhu, Michael Greenspan, Murray Campbell, Chuang Gan

    Abstract: Commonsense reasoning simulates the human ability to make presumptions about our physical world, and it is an essential cornerstone in building general AI systems. We propose a new commonsense reasoning dataset based on human's Interactive Fiction (IF) gameplay walkthroughs as human players demonstrate plentiful and diverse commonsense reasoning. The new dataset provides a natural mixture of vario…

    Submitted 26 May, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: arXiv admin note: text overlap with arXiv:2010.09788

  11. arXiv:2210.08123

    cs.CV

    Keypoint Cascade Voting for Point Cloud Based 6DoF Pose Estimation

    Authors: Yangzheng Wu, Alireza Javaheri, Mohsen Zand, Michael Greenspan

    Abstract: We propose a novel keypoint voting 6DoF object pose estimation method, which takes pure unordered point cloud geometry as input without RGB information. The proposed cascaded keypoint voting method, called RCVPose3D, is based upon a novel architecture which separates the task of semantic segmentation from that of keypoint regression, thereby increasing the effectiveness of both and improving the u…

    Submitted 14 October, 2022; originally announced October 2022.

  12. arXiv:2207.06985

    cs.CV

    ObjectBox: From Centers to Boxes for Anchor-Free Object Detection

    Authors: Mohsen Zand, Ali Etemad, Michael Greenspan

    Abstract: We present ObjectBox, a novel single-stage anchor-free and highly generalizable object detection approach. As opposed to both existing anchor-based and anchor-free detectors, which are more biased toward specific object scales in their label assignments, we use only object center locations as positive samples and treat all objects equally in different feature levels regardless of the objects' size…

    Submitted 14 July, 2022; originally announced July 2022.

    Comments: ECCV 2022 Oral

  13. arXiv:2203.04857

    cs.CL

    Neuro-symbolic Natural Logic with Introspective Revision for Natural Language Inference

    Authors: Yufei Feng, Xiaoyu Yang, Xiaodan Zhu, Michael Greenspan

    Abstract: We introduce a neuro-symbolic natural logic framework based on reinforcement learning with introspective revision. The model samples and rewards specific reasoning paths through policy gradient, in which the introspective revision algorithm modifies intermediate symbolic reasoning steps to discover reward-earning operations as well as leverages external knowledge to alleviate spurious reasoning an…

    Submitted 5 June, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

    Comments: To appear at TACL 2022, MIT Press

  14. arXiv:2202.09942

    cs.CV cs.AI

    Multiscale Crowd Counting and Localization By Multitask Point Supervision

    Authors: Mohsen Zand, Haleh Damirchi, Andrew Farley, Mahdiyar Molahasani, Michael Greenspan, Ali Etemad

    Abstract: We propose a multitask approach for crowd counting and person localization in a unified framework. As the detection and localization tasks are well-correlated and can be jointly tackled, our model benefits from a multitask solution by learning multiscale representations of encoded crowd images, and subsequently fusing them. In contrast to the relatively more popular density-based methods, our mode…

    Submitted 20 February, 2022; originally announced February 2022.

    Comments: 4 pages + references, 3 figures, 2 tables, Accepted by ICASSP 2022 Conference

  15. arXiv:2106.06684

    cs.CV

    Multistream ValidNet: Improving 6D Object Pose Estimation by Automatic Multistream Validation

    Authors: Joy Mazumder, Mohsen Zand, Michael Greenspan

    Abstract: This work presents a novel approach to improve the results of pose estimation by detecting and distinguishing between the occurrence of True and False Positive results. It achieves this by training a binary classifier on the output of an arbitrary pose estimation algorithm, and returns a binary label indicating the validity of the result. We demonstrate that our approach improves upon a state-of-t…

    Submitted 12 June, 2021; originally announced June 2021.

    Comments: 6 pages, 2 figures, 2 tables. To appear in the proceedings of the 28th IEEE International Conference on Image Processing (IEEE - ICIP), September 19-22, 2021, Anchorage, Alaska, USA

  16. Oriented Bounding Boxes for Small and Freely Rotated Objects

    Authors: Mohsen Zand, Ali Etemad, Michael Greenspan

    Abstract: A novel object detection method is presented that handles freely rotated objects of arbitrary sizes, including tiny objects as small as $2\times 2$ pixels. Such tiny objects appear frequently in remotely sensed images, and present a challenge to recent object detection algorithms. More importantly, current object detection methods have been designed originally to accommodate axis-aligned bounding…

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: IEEE Transactions on Geoscience and Remote Sensing, 2021

  17. Flow-based Spatio-Temporal Structured Prediction of Motion Dynamics

    Authors: Mohsen Zand, Ali Etemad, Michael Greenspan

    Abstract: Conditional Normalizing Flows (CNFs) are flexible generative models capable of representing complicated distributions with high dimensionality and large interdimensional correlations, making them appealing for structured output learning. Their effectiveness in modelling multivariates spatio-temporal structured data has yet to be completely investigated. We propose MotionFlow as a novel normalizing…

    Submitted 4 September, 2023; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: 13 pages, LaTeX; typos corrected, updated, in IEEE Transactions on Pattern Analysis and Machine Intelligence

  18. arXiv:2104.02527

    cs.CV

    Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting

    Authors: Yangzheng Wu, Mohsen Zand, Ali Etemad, Michael Greenspan

    Abstract: We propose a novel keypoint voting scheme based on intersecting spheres, that is more accurate than existing schemes and allows for fewer, more disperse keypoints. The scheme is based upon the distance between points, which as a 1D quantity can be regressed more accurately than the 2D and 3D vector and offset quantities regressed in previous work, yielding more accurate keypoint localization. The…

    Submitted 12 July, 2022; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: ECCV 2022 Oral

  19. arXiv:2104.02424

    cs.CV

    Teacher-Student Adversarial Depth Hallucination to Improve Face Recognition

    Authors: Hardik Uppal, Alireza Sepas-Moghaddam, Michael Greenspan, Ali Etemad

    Abstract: We present the Teacher-Student Generative Adversarial Network (TS-GAN) to generate depth images from single RGB images in order to boost the performance of face recognition systems. For our method to generalize well across unseen datasets, we design two components in the architecture, a teacher and a student. The teacher, which itself consists of a generator and a discriminator, learns a latent ma…

    Submitted 29 August, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: 10 pages, 6 figures, Accepted to International Conference on Computer Vision 2021

  20. Procam Calibration from a Single Pose of a Planar Target

    Authors: Ghani O. Lawal, Michael Greenspan

    Abstract: A novel user friendly method is proposed for calibrating a procam system from a single pose of a planar chessboard target. The user simply needs to orient the chessboard in a single appropriate pose. A sequence of Gray Code patterns are projected onto the chessboard, which allows correspondences between the camera, projector and the chessboard to be automatically extracted. These correspondences a…

    Submitted 22 February, 2021; originally announced February 2021.

    Comments: 11 pages, 9 figures, 10 tables. Submitted to the VISAPP Conference. Stored in the SciTepress Digital Library: https://www.scitepress.org/PublicationsDetail.aspx?ID=rGG70YCQyOs=&t=1

    Journal ref: In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021) - Volume 5: VISAPP, pages 817-827

  21. Depth as Attention for Face Representation Learning

    Authors: Hardik Uppal, Alireza Sepas-Moghaddam, Michael Greenspan, Ali Etemad

    Abstract: Face representation learning solutions have recently achieved great success for various applications such as verification and identification. However, face recognition approaches that are based purely on RGB images rely solely on intensity information, and therefore are more sensitive to facial variations, notably pose, occlusions, and environmental changes such as illumination and background. A n…

    Submitted 5 April, 2021; v1 submitted 3 January, 2021; originally announced January 2021.

    Comments: 16 pages, 11 figures, Accepted to IEEE Transactions on Information Forensics and Security 2021

  22. arXiv:2011.04044

    cs.CL cs.AI

    Exploring End-to-End Differentiable Natural Logic Modeling

    Authors: Yufei Feng, Zi'ou Zheng, Quan Liu, Michael Greenspan, Xiaodan Zhu

    Abstract: We explore end-to-end trained differentiable models that integrate natural logic with neural networks, aiming to keep the backbone of natural language reasoning based on the natural logic formalism while introducing subsymbolic vector representations and neural components. The proposed model adapts module networks to model natural logic operations, which is enhanced with a memory component to mode…

    Submitted 8 November, 2020; originally announced November 2020.

    Comments: 10 pages

    Journal ref: COLING 2020

  23. arXiv:2010.09788

    cs.AI cs.CL

    Deriving Commonsense Inference Tasks from Interactive Fictions

    Authors: Mo Yu, Xiaoxiao Guo, Yufei Feng, Xiaodan Zhu, Michael Greenspan, Murray Campbell

    Abstract: Commonsense reasoning simulates the human ability to make presumptions about our physical world, and it is an indispensable cornerstone in building general AI systems. We propose a new commonsense reasoning dataset based on human's interactive fiction game playings as human players demonstrate plentiful and diverse commonsense reasoning. The new dataset mitigates several limitations of the prior a…

    Submitted 19 October, 2020; originally announced October 2020.

  24. arXiv:2004.02393

    cs.CL

    Learning to Recover Reasoning Chains for Multi-Hop Question Answering via Cooperative Games

    Authors: Yufei Feng, Mo Yu, Wenhan Xiong, Xiaoxiao Guo, Junjie Huang, Shiyu Chang, Murray Campbell, Michael Greenspan, Xiaodan Zhu

    Abstract: We propose the new problem of learning to recover reasoning chains from weakly supervised signals, i.e., the question-answer pairs. We propose a cooperative game approach to deal with this problem, in which how the evidence passages are selected and how the selected passages are connected are handled by two models that cooperate to select the most confident chains from a large set of candidates (f…

    Submitted 5 April, 2020; originally announced April 2020.

  25. arXiv:2003.00168

    cs.CV cs.AI

    Two-Level Attention-based Fusion Learning for RGB-D Face Recognition

    Authors: Hardik Uppal, Alireza Sepas-Moghaddam, Michael Greenspan, Ali Etemad

    Abstract: With recent advances in RGB-D sensing technologies as well as improvements in machine learning and fusion techniques, RGB-D facial recognition has become an active area of research. A novel attention aware method is proposed to fuse two image modalities, RGB and depth, for enhanced RGB-D facial recognition. The proposed method first extracts features from both modalities using a convolutional feat…

    Submitted 18 October, 2020; v1 submitted 28 February, 2020; originally announced March 2020.

    Comments: 8 pages, 4 figures, Accepted to International Conference on Pattern Recognition (ICPR) 2020

  26. Difference of Normals as a Multi-Scale Operator in Unorganized Point Clouds

    Authors: Yani Ioannou, Babak Taati, Robin Harrap, Michael Greenspan

    Abstract: A novel multi-scale operator for unorganized 3D point clouds is introduced. The Difference of Normals (DoN) provides a computationally efficient, multi-scale approach to processing large unorganized 3D point clouds. The application of DoN in the multi-scale filtering of two different real-world outdoor urban LIDAR scene datasets is quantitatively and qualitatively demonstrated. In both datasets th…

    Submitted 8 September, 2012; originally announced September 2012.

    Comments: To be published in proceedings of 3DIMPVT 2012

    Journal ref: Proceedings of the 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission (3DIMPVT)