Showing 1–21 of 21 results for author: Aliakbarian, S

  1. arXiv:2401.14785

    cs.CV

    SimpleEgo: Predicting Probabilistic Body Pose from Egocentric Cameras

    Authors: Hanz Cuevas-Velasquez, Charlie Hewitt, Sadegh Aliakbarian, Tadas Baltrušaitis

    Abstract: Our work addresses the problem of egocentric human pose estimation from downwards-facing cameras on head-mounted devices (HMD). This presents a challenging scenario, as parts of the body often fall outside of the image or are occluded. Previous solutions minimize this problem by using fish-eye camera lenses to capture a wider view, but these can present hardware design issues. They also predict 2D…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: Accepted in 3DV 2024

  2. arXiv:2312.00870

    cs.CV cs.AI cs.GR cs.LG

    3DiFACE: Diffusion-based Speech-driven 3D Facial Animation and Editing

    Authors: Balamurugan Thambiraja, Sadegh Aliakbarian, Darren Cosker, Justus Thies

    Abstract: We present 3DiFACE, a novel method for personalized speech-driven 3D facial animation and editing. While existing methods deterministically predict facial animations from speech, they overlook the inherent one-to-many relationship between speech and facial expressions, i.e., there are multiple reasonable facial expression animations matching an audio input. It is especially important in content cr…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Project page: https://balamuruganthambiraja.github.io/3DiFACE/

  3. arXiv:2308.11261

    cs.CV

    HMD-NeMo: Online 3D Avatar Motion Generation From Sparse Observations

    Authors: Sadegh Aliakbarian, Fatemeh Saleh, David Collier, Pashmina Cameron, Darren Cosker

    Abstract: Generating both plausible and accurate full body avatar motion is the key to the quality of immersive experiences in mixed reality scenarios. Head-Mounted Devices (HMDs) typically only provide a few input signals, such as head and hands 6-DoF. Recently, different approaches achieved impressive performance in generating full body motion given only head and hands signal. However, to the best of our…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: Accepted at ICCV 2023

  4. arXiv:2304.06024

    cs.CV cs.AI

    Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views

    Authors: Siwei Zhang, Qianli Ma, Yan Zhang, Sadegh Aliakbarian, Darren Cosker, Siyu Tang

    Abstract: Automatic perception of human behaviors during social interactions is crucial for AR/VR applications, and an essential component is estimation of plausible 3D human pose and shape of our social partners from the egocentric view. One of the biggest challenges of this task is severe body truncation due to close social distances in egocentric scenarios, which brings large pose ambiguities for unseen…

    Submitted 16 September, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Camera ready version for ICCV 2023, appendix included

  5. arXiv:2301.00023

    cs.CV

    Imitator: Personalized Speech-driven 3D Facial Animation

    Authors: Balamurugan Thambiraja, Ikhsanul Habibie, Sadegh Aliakbarian, Darren Cosker, Christian Theobalt, Justus Thies

    Abstract: Speech-driven 3D facial animation has been widely explored, with applications in gaming, character animation, virtual reality, and telepresence systems. State-of-the-art methods deform the face topology of the target actor to sync the input audio without considering the identity-specific speaking style and facial idiosyncrasies of the target actor, thus, resulting in unrealistic and inaccurate lip…

    Submitted 30 December, 2022; originally announced January 2023.

    Comments: https://youtu.be/JhXTdjiUCUw

  6. arXiv:2203.05789

    cs.CV cs.LG

    FLAG: Flow-based 3D Avatar Generation from Sparse Observations

    Authors: Sadegh Aliakbarian, Pashmina Cameron, Federica Bogo, Andrew Fitzgibbon, Thomas J. Cashman

    Abstract: To represent people in mixed reality applications for collaboration and communication, we need to generate realistic and faithful avatar poses. However, the signal streams that can be applied for this task from head-mounted devices (HMDs) are typically limited to head pose and hand pose estimates. While these signals are valuable, they are an incomplete representation of the human body, making it…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: Accepted at CVPR 2022

  7. arXiv:2012.02337

    cs.CV

    Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

    Authors: Fatemeh Saleh, Sadegh Aliakbarian, Hamid Rezatofighi, Mathieu Salzmann, Stephen Gould

    Abstract: Despite the recent advances in multiple object tracking (MOT), achieved by joint detection and tracking, dealing with long occlusions remains a challenge. This is due to the fact that such techniques tend to ignore the long-term motion information. In this paper, we introduce a probabilistic autoregressive motion model to score tracklet proposals by directly measuring their likelihood. This is ach…

    Submitted 9 December, 2020; v1 submitted 3 December, 2020; originally announced December 2020.

  8. arXiv:2010.04368

    cs.CV

    Deep Sequence Learning for Video Anticipation: From Discrete and Deterministic to Continuous and Stochastic

    Authors: Sadegh Aliakbarian

    Abstract: Video anticipation is the task of predicting one/multiple future representation(s) given limited, partial observation. This is a challenging task due to the fact that given limited observation, the future representation can be highly ambiguous. Based on the nature of the task, video anticipation can be considered from two viewpoints: the level of details and the level of determinism in the predict…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: The draft of my PhD thesis

  9. arXiv:2009.03075

    cs.CV

    Uncertainty Inspired RGB-D Saliency Detection

    Authors: Jing Zhang, Deng-Ping Fan, Yuchao Dai, Saeed Anwar, Fatemeh Saleh, Sadegh Aliakbarian, Nick Barnes

    Abstract: We propose the first stochastic framework to employ uncertainty for RGB-D saliency detection by learning from the data labeling process. Existing RGB-D saliency detection models treat this task as a point estimation problem by predicting a single saliency map following a deterministic learning pipeline. We argue that, however, the deterministic solution is relatively ill-posed. Inspired by the sal…

    Submitted 7 September, 2020; originally announced September 2020.

  10. arXiv:2004.07482

    cs.CV

    ArTIST: Autoregressive Trajectory Inpainting and Scoring for Tracking

    Authors: Fatemeh Saleh, Sadegh Aliakbarian, Mathieu Salzmann, Stephen Gould

    Abstract: One of the core components in online multiple object tracking (MOT) frameworks is associating new detections with existing tracklets, typically done via a scoring function. Despite the great advances in MOT, designing a reliable scoring function remains a challenge. In this paper, we introduce a probabilistic autoregressive generative model to score tracklet proposals by directly measuring the lik…

    Submitted 16 April, 2020; originally announced April 2020.

  11. arXiv:2004.06853

    eess.IV cs.CV cs.LG

    Mosaic Super-resolution via Sequential Feature Pyramid Networks

    Authors: Mehrdad Shoeiby, Mohammad Ali Armin, Sadegh Aliakbarian, Saeed Anwar, Lars Petersson

    Abstract: Advances in the design of multi-spectral cameras have led to great interests in a wide range of applications, from astronomy to autonomous driving. However, such cameras inherently suffer from a trade-off between the spatial and spectral resolution. In this paper, we propose to address this limitation by introducing a novel method to carry out super-resolution on raw mosaic images, multi-spectral…

    Submitted 14 April, 2020; originally announced April 2020.

    Comments: Accepted by IEEE CVPR Workshop

  12. arXiv:1912.08521

    cs.LG cs.CV stat.ML

    Contextually Plausible and Diverse 3D Human Motion Prediction

    Authors: Sadegh Aliakbarian, Fatemeh Sadat Saleh, Lars Petersson, Stephen Gould, Mathieu Salzmann

    Abstract: We tackle the task of diverse 3D human motion prediction, that is, forecasting multiple plausible future 3D poses given a sequence of observed 3D poses. In this context, a popular approach consists of using a Conditional Variational Autoencoder (CVAE). However, existing approaches that do so either fail to capture the diversity in human motion, or generate diverse but semantically implausible cont…

    Submitted 5 December, 2020; v1 submitted 18 December, 2019; originally announced December 2019.

  13. arXiv:1909.07577

    eess.IV cs.CV

    Multi-FAN: Multi-Spectral Mosaic Super-Resolution Via Multi-Scale Feature Aggregation Network

    Authors: Mehrdad Shoeiby, Sadegh Aliakbarian, Saeed Anwar, Lars Petersson

    Abstract: This paper introduces a novel method to super-resolve multi-spectral images captured by modern real-time single-shot mosaic image sensors, also known as multi-spectral cameras. Our contribution is two-fold. Firstly, we super-resolve multi-spectral images from mosaic images rather than image cubes, which helps to take into account the spatial offset of each wavelength. Secondly, we introduce an ext…

    Submitted 6 November, 2019; v1 submitted 16 September, 2019; originally announced September 2019.

  14. arXiv:1909.02221

    eess.IV cs.CV

    Super-resolved Chromatic Mapping of Snapshot Mosaic Image Sensors via a Texture Sensitive Residual Network

    Authors: Mehrdad Shoeiby, Lars Petersson, Mohammad Ali Armin, Sadegh Aliakbarian, Antonio Robles-Kelly

    Abstract: This paper introduces a novel method to simultaneously super-resolve and colour-predict images acquired by snapshot mosaic sensors. These sensors allow for spectral images to be acquired using low-power, small form factor, solid-state CMOS sensors that can operate at video frame rates without the need for complex optical setups. Despite their desirable traits, their main drawback stems from the fa…

    Submitted 5 September, 2019; originally announced September 2019.

  15. arXiv:1908.00733

    cs.LG cs.CV stat.ML

    Learning Variations in Human Motion via Mix-and-Match Perturbation

    Authors: Mohammad Sadegh Aliakbarian, Fatemeh Sadat Saleh, Mathieu Salzmann, Lars Petersson, Stephen Gould, Amirhossein Habibian

    Abstract: Human motion prediction is a stochastic process: Given an observed sequence of poses, multiple future motions are plausible. Existing approaches to modeling this stochasticity typically combine a random noise vector with information about the previous poses. This combination, however, is done in a deterministic manner, which gives the network the flexibility to learn to ignore the random noise. In…

    Submitted 24 February, 2020; v1 submitted 2 August, 2019; originally announced August 2019.

  16. arXiv:1810.09044

    cs.CV

    VIENA2: A Driving Anticipation Dataset

    Authors: Mohammad Sadegh Aliakbarian, Fatemeh Sadat Saleh, Mathieu Salzmann, Basura Fernando, Lars Petersson, Lars Andersson

    Abstract: Action anticipation is critical in scenarios where one needs to react before the action is finalized. This is, for instance, the case in automated driving, where a car needs to, e.g., avoid hitting pedestrians and respect traffic lights. While solutions have been proposed to tackle subsets of the driving anticipation tasks, by making use of diverse, task-specific sensors, there is no single datase…

    Submitted 29 October, 2018; v1 submitted 21 October, 2018; originally announced October 2018.

    Comments: Accepted in ACCV 2018

  17. arXiv:1807.06132

    cs.CV

    Effective Use of Synthetic Data for Urban Scene Semantic Segmentation

    Authors: Fatemeh Sadat Saleh, Mohammad Sadegh Aliakbarian, Mathieu Salzmann, Lars Petersson, Jose M. Alvarez

    Abstract: Training a deep network to perform semantic segmentation requires large amounts of labeled data. To alleviate the manual effort of annotating real images, researchers have investigated the use of synthetic data, which can be labeled automatically. Unfortunately, a network trained on synthetic data performs relatively poorly on real images. While this can be addressed by domain adaptation, existing…

    Submitted 16 July, 2018; originally announced July 2018.

    Comments: Accepted in European Conference on Computer Vision (ECCV), 2018

  18. arXiv:1708.04400

    cs.CV

    Bringing Background into the Foreground: Making All Classes Equal in Weakly-supervised Video Semantic Segmentation

    Authors: Fatemeh Sadat Saleh, Mohammad Sadegh Aliakbarian, Mathieu Salzmann, Lars Petersson, Jose M. Alvarez

    Abstract: Pixel-level annotations are expensive and time-consuming to obtain. Hence, weak supervision using only image tags could have a significant impact in semantic segmentation. Recent years have seen great progress in weakly-supervised semantic segmentation, whether from a single image or from videos. However, most existing methods are designed to handle a single background class. In practical applicat…

    Submitted 15 August, 2017; originally announced August 2017.

    Comments: 11 pages, 4 figures, 7 tables, Accepted in ICCV 2017

  19. Incorporating Network Built-in Priors in Weakly-supervised Semantic Segmentation

    Authors: Fatemeh Sadat Saleh, Mohammad Sadegh Aliakbarian, Mathieu Salzmann, Lars Petersson, Jose M. Alvarez, Stephen Gould

    Abstract: Pixel-level annotations are expensive and time consuming to obtain. Hence, weak supervision using only image tags could have a significant impact in semantic segmentation. Recently, CNN-based methods have proposed to fine-tune pre-trained networks using image tags. Without additional information, this leads to poor localization accuracy. This problem, however, was alleviated by making use of objec…

    Submitted 5 June, 2017; originally announced June 2017.

    Comments: 14 pages, 11 figures, 8 tables, Accepted in IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI)

  20. arXiv:1703.07023

    cs.CV

    Encouraging LSTMs to Anticipate Actions Very Early

    Authors: Mohammad Sadegh Aliakbarian, Fatemeh Sadat Saleh, Mathieu Salzmann, Basura Fernando, Lars Petersson, Lars Andersson

    Abstract: In contrast to the widely studied problem of recognizing an action given a complete sequence, action anticipation aims to identify the action from only partially available videos. As such, it is therefore key to the success of computer vision applications requiring to react as early as possible, such as autonomous navigation. In this paper, we propose a new action anticipation method that achieves…

    Submitted 13 August, 2017; v1 submitted 20 March, 2017; originally announced March 2017.

    Comments: 13 Pages, 7 Figures, 11 Tables. Accepted in ICCV 2017. arXiv admin note: text overlap with arXiv:1611.05520

  21. arXiv:1611.05520

    cs.CV

    Deep Action- and Context-Aware Sequence Learning for Activity Recognition and Anticipation

    Authors: Mohammad Sadegh Aliakbarian, Fatemehsadat Saleh, Basura Fernando, Mathieu Salzmann, Lars Petersson, Lars Andersson

    Abstract: Action recognition and anticipation are key to the success of many computer vision applications. Existing methods can roughly be grouped into those that extract global, context-aware representations of the entire image or sequence, and those that aim at focusing on the regions where the action occurs. While the former may suffer from the fact that context is not always reliable, the latter complet…

    Submitted 17 November, 2016; v1 submitted 16 November, 2016; originally announced November 2016.

    Comments: 10 pages, 4 figures, 7 tables