Showing 1–37 of 37 results for author: Mirmehdi, M

  1. Automated Radiology Report Generation: A Review of Recent Advances

    Authors: Phillip Sloan, Philip Clatworthy, Edwin Simpson, Majid Mirmehdi

    Abstract: Increasing demands on medical imaging departments are taking a toll on the radiologist's ability to deliver timely and accurate reports. Recent technological advances in artificial intelligence have demonstrated great potential for automatic radiology report generation (ARRG), sparking an explosion of research. This survey paper conducts a methodological review of contemporary ARRG approaches by w…

    Submitted 29 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: 24 pages, 8 figures, 6 tables. Accepted by IEEE Reviews in Biomedical Engineering

    MSC Class: 68T99 ACM Class: I.2; I.4; J.3

  2. arXiv:2404.08937  [pdf, other]

    cs.CV cs.AI cs.LG

    ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition

    Authors: Otto Brookes, Majid Mirmehdi, Hjalmar Kuhl, Tilo Burghardt

    Abstract: We show that chimpanzee behaviour understanding from camera traps can be enhanced by providing visual architectures with access to an embedding of text descriptions that detail species behaviours. In particular, we present a vision-language model which employs multi-modal decoding of visual features extracted directly from camera trap videos to process query tokens representing behaviours and outp…

    Submitted 13 April, 2024; originally announced April 2024.

  3. arXiv:2401.13554  [pdf, other]

    cs.CV

    PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition

    Authors: Otto Brookes, Majid Mirmehdi, Colleen Stephens, Samuel Angedakin, Katherine Corogenes, Dervla Dowd, Paula Dieguez, Thurston C. Hicks, Sorrel Jones, Kevin Lee, Vera Leinert, Juan Lapuente, Maureen S. McCarthy, Amelia Meier, Mizuki Murai, Emmanuelle Normand, Virginie Vergnes, Erin G. Wessling, Roman M. Wittig, Kevin Langergraber, Nuria Maldonado, Xinyu Yang, Klaus Zuberbuhler, Christophe Boesch, Mimi Arandjelovic, et al. (2 additional authors not shown)

    Abstract: We present the PanAf20K dataset, the largest and most diverse open-access annotated video dataset of great apes in their natural environment. It comprises more than 7 million frames across ~20,000 camera trap videos of chimpanzees and gorillas collected at 14 field sites in tropical Africa as part of the Pan African Programme: The Cultured Chimpanzee. The footage is accompanied by a rich set of an…

    Submitted 31 January, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted at IJCV

  4. arXiv:2312.00856  [pdf, other]

    cs.CV

    QAFE-Net: Quality Assessment of Facial Expressions with Landmark Heatmaps

    Authors: Shuchao Duan, Amirhossein Dadashzadeh, Alan Whone, Majid Mirmehdi

    Abstract: Facial expression recognition (FER) methods have made great inroads in categorising moods and feelings in humans. Beyond FER, pain estimation methods assess levels of intensity in pain expressions; however, assessing the quality of all facial expressions is of critical value in health-related applications. In this work, we address the quality of five different facial expressions in patients affecte…

    Submitted 12 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: Accepted to ELFA workshop at WACV 2024

  5. arXiv:2311.16446  [pdf, other]

    cs.CV

    Centre Stage: Centricity-based Audio-Visual Temporal Action Detection

    Authors: Hanyuan Wang, Majid Mirmehdi, Dima Damen, Toby Perrett

    Abstract: Previous one-stage action detection approaches have modelled temporal dependencies using only the visual modality. In this paper, we explore different strategies to incorporate the audio modality, using multi-scale cross-attention to fuse the two modalities. We also demonstrate the correlation between the distance from the timestep to the action centre and the accuracy of the predicted boundaries.…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Accepted to VUA workshop at BMVC 2023

  6. arXiv:2311.07603  [pdf, other]

    cs.CV

    PECoP: Parameter Efficient Continual Pretraining for Action Quality Assessment

    Authors: Amirhossein Dadashzadeh, Shuchao Duan, Alan Whone, Majid Mirmehdi

    Abstract: The limited availability of labelled data in Action Quality Assessment (AQA) has forced previous works to fine-tune their models pretrained on large-scale domain-general datasets. This common approach results in weak generalisation, particularly when there is a significant domain shift. We propose a novel, parameter efficient, continual pretraining framework, PECoP, to reduce such domain shift vi…

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: Accepted to WACV 2024 (preprint)

  7. arXiv:2304.01143  [pdf, other]

    cs.CV

    Use Your Head: Improving Long-Tail Video Recognition

    Authors: Toby Perrett, Saptarshi Sinha, Tilo Burghardt, Majid Mirmehdi, Dima Damen

    Abstract: This paper presents an investigation into long-tail video recognition. We demonstrate that, unlike naturally-collected video datasets and existing long-tail image benchmarks, current video benchmarks fall short on multiple long-tailed properties. Most critically, they lack few-shot classes in their tails. In response, we propose new video benchmarks that better assess long-tail recognition, by sam…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  8. Video-SwinUNet: Spatio-temporal Deep Learning Framework for VFSS Instance Segmentation

    Authors: Chengxi Zeng, Xinyu Yang, David Smithard, Majid Mirmehdi, Alberto M Gambaruto, Tilo Burghardt

    Abstract: This paper presents a deep learning framework for medical video segmentation. Convolutional neural network (CNN) and transformer-based methods have achieved great milestones in medical image segmentation tasks due to their incredible semantic feature encoding and global information comprehension abilities. However, most existing approaches ignore a salient aspect of medical video data - the temporal…

    Submitted 4 July, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

  9. arXiv:2301.10829  [pdf, other]

    eess.IV cs.CV

    TranSOP: Transformer-based Multimodal Classification for Stroke Treatment Outcome Prediction

    Authors: Zeynel A. Samak, Philip Clatworthy, Majid Mirmehdi

    Abstract: Acute ischaemic stroke, caused by an interruption in blood flow to brain tissue, is a leading cause of disability and mortality worldwide. The selection of patients for the most optimal ischaemic stroke treatment is a crucial step for a successful outcome, as the effect of treatment highly depends on the time to treatment. We propose a transformer-based multimodal network (TranSOP) for a classific…

    Submitted 25 January, 2023; originally announced January 2023.

    Comments: Accepted at IEEE ISBI 2023, 5 pages

  10. arXiv:2301.02642  [pdf, other]

    cs.CV cs.AI cs.LG

    Triple-stream Deep Metric Learning of Great Ape Behavioural Actions

    Authors: Otto Brookes, Majid Mirmehdi, Hjalmar Kühl, Tilo Burghardt

    Abstract: We propose the first metric learning system for the recognition of great ape behavioural actions. Our proposed triple stream embedding architecture works on camera trap videos taken directly in the wild and demonstrates that the utilisation of an explicit DensePose-C chimpanzee body part segmentation stream effectively complements traditional RGB appearance and optical flow streams. We evaluate sy…

    Submitted 6 January, 2023; originally announced January 2023.

  11. arXiv:2210.14284  [pdf, other]

    cs.CV

    Refining Action Boundaries for One-stage Detection

    Authors: Hanyuan Wang, Majid Mirmehdi, Dima Damen, Toby Perrett

    Abstract: Current one-stage action detection methods, which simultaneously predict action boundaries and the corresponding class, do not estimate or use a measure of confidence in their boundary predictions, which can lead to inaccurate boundaries. We incorporate the estimation of boundary confidence into one-stage anchor-free detection, through an additional prediction head that predicts the refined bounda…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: Accepted to AVSS 2022. Our code is available at https://github.com/hanielwang/Refining_Boundary_Head.git

  12. arXiv:2208.08315  [pdf, other]

    eess.IV cs.CV

    Video-TransUNet: Temporally Blended Vision Transformer for CT VFSS Instance Segmentation

    Authors: Chengxi Zeng, Xinyu Yang, Majid Mirmehdi, Alberto M Gambaruto, Tilo Burghardt

    Abstract: We propose Video-TransUNet, a deep architecture for instance segmentation in medical CT videos constructed by integrating temporal feature blending into the TransUNet deep learning framework. In particular, our approach amalgamates strong frame representation via a ResNet CNN backbone, multi-frame feature blending via a Temporal Context Module (TCM), non-local attention via a Vision Transformer, a…

    Submitted 22 August, 2022; v1 submitted 17 August, 2022; originally announced August 2022.

    Comments: Accepted by International Conference on Machine Vision 2022

  13. arXiv:2207.08064  [pdf, other]

    cs.CV

    Detecting Humans in RGB-D Data with CNNs

    Authors: Kaiyang Zhou, Adeline Paiement, Majid Mirmehdi

    Abstract: We address the problem of people detection in RGB-D data where we leverage depth information to develop a region-of-interest (ROI) selection method that provides proposals to two color and depth CNNs. To combine the detections produced by the two CNNs, we propose a novel fusion approach based on the characteristics of depth images. We also present a new depth-encoding scheme, which not only encode…

    Submitted 16 July, 2022; originally announced July 2022.

    Comments: An (outdated) MSc project (2016), which studied how to use CNNs to detect humans in RGBD data

  14. arXiv:2207.06789  [pdf, other]

    cs.CV

    Inertial Hallucinations -- When Wearable Inertial Devices Start Seeing Things

    Authors: Alessandro Masullo, Toby Perrett, Tilo Burghardt, Ian Craddock, Dima Damen, Majid Mirmehdi

    Abstract: We propose a novel approach to multimodal sensor fusion for Ambient Assisted Living (AAL) which takes advantage of learning using privileged information (LUPI). We address two major shortcomings of standard multimodal approaches, limited area coverage and reduced reliability. Our new framework fuses the concept of modality hallucination with triplet learning to train a model with different modalit…

    Submitted 14 July, 2022; originally announced July 2022.

  15. arXiv:2205.00275  [pdf, other]

    cs.CV

    Dynamic Curriculum Learning for Great Ape Detection in the Wild

    Authors: Xinyu Yang, Tilo Burghardt, Majid Mirmehdi

    Abstract: We propose a novel end-to-end curriculum learning approach for sparsely labelled animal datasets leveraging large volumes of unlabelled data to improve supervised species detectors. We exemplify the method in detail on the task of finding great apes in camera trap footage taken in challenging real-world jungle environments. In contrast to previous semi-supervised methods, our approach adjusts lear…

    Submitted 2 January, 2023; v1 submitted 30 April, 2022; originally announced May 2022.

    Comments: Accepted at IJCV

  16. arXiv:2201.00434  [pdf, other]

    cs.CV

    TVNet: Temporal Voting Network for Action Localization

    Authors: Hanyuan Wang, Dima Damen, Majid Mirmehdi, Toby Perrett

    Abstract: We propose a Temporal Voting Network (TVNet) for action localization in untrimmed videos. This incorporates a novel Voting Evidence Module to locate temporal boundaries, more accurately, where temporal contextual evidence is accumulated to predict frame-level probabilities of start and end action boundaries. Our action-independent evidence module is incorporated within a pipeline to calculate conf…

    Submitted 2 January, 2022; originally announced January 2022.

    Comments: 9 pages, 7 figures, 11 tables

  17. arXiv:2112.04011  [pdf, other]

    cs.CV

    Auxiliary Learning for Self-Supervised Video Representation via Similarity-based Knowledge Distillation

    Authors: Amirhossein Dadashzadeh, Alan Whone, Majid Mirmehdi

    Abstract: Despite the outstanding success of self-supervised pretraining methods for video representation learning, they generalise poorly when the unlabeled dataset for pretraining is small or the domain difference between unlabelled data in source task (pretraining) and labeled data in target task (finetuning) is significant. To mitigate these issues, we propose a novel approach to complement self-supervi…

    Submitted 25 April, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

  18. arXiv:2111.06830  [pdf, other]

    cs.CV

    Small or Far Away? Exploiting Deep Super-Resolution and Altitude Data for Aerial Animal Surveillance

    Authors: Mowen Xue, Theo Greenslade, Majid Mirmehdi, Tilo Burghardt

    Abstract: Visuals captured by high-flying aerial drones are increasingly used to assess biodiversity and animal population dynamics around the globe. Yet, challenging acquisition scenarios and tiny animal depictions in airborne imagery, despite ultra-high resolution cameras, have so far been limiting factors for applying computer vision detectors successfully with high confidence. In this paper, we address…

    Submitted 12 November, 2021; originally announced November 2021.

    Comments: 11 pages, 7 figures, 2 tables

    MSC Class: 65D19

  19. arXiv:2109.08730  [pdf, ps, other]

    cs.CV

    Unsupervised View-Invariant Human Posture Representation

    Authors: Faegheh Sardari, Björn Ommer, Majid Mirmehdi

    Abstract: Most recent view-invariant action recognition and performance assessment approaches rely on a large amount of annotated 3D skeleton data to extract view-invariant features. However, acquiring 3D skeleton data can be cumbersome, if not impractical, in in-the-wild scenarios. To overcome this problem, we present a novel unsupervised approach that learns to extract view-invariant 3D human pose represe…

    Submitted 17 September, 2021; originally announced September 2021.

  20. arXiv:2101.06184  [pdf, other]

    cs.CV

    Temporal-Relational CrossTransformers for Few-Shot Action Recognition

    Authors: Toby Perrett, Alessandro Masullo, Tilo Burghardt, Majid Mirmehdi, Dima Damen

    Abstract: We propose a novel approach to few-shot action recognition, finding temporally-corresponding frame tuples between the query and videos in the support set. Distinct from previous few-shot works, we construct class prototypes using the CrossTransformer attention mechanism to observe relevant sub-sequences of all support videos, rather than using class averages or single best matches. Video represent…

    Submitted 28 March, 2021; v1 submitted 15 January, 2021; originally announced January 2021.

    Comments: Accepted in CVPR 2021

  21. arXiv:2012.09890  [pdf, other]

    cs.CV

    Exploring Motion Boundaries in an End-to-End Network for Vision-based Parkinson's Severity Assessment

    Authors: Amirhossein Dadashzadeh, Alan Whone, Michal Rolinski, Majid Mirmehdi

    Abstract: Evaluating neurological disorders such as Parkinson's disease (PD) is a challenging task that requires the assessment of several motor and non-motor functions. In this paper, we present an end-to-end deep learning framework to measure PD severity in two important components, hand movement and gait, of the Unified Parkinson's Disease Rating Scale (UPDRS). Our method leverages on an Inflated 3D CNN…

    Submitted 24 December, 2020; v1 submitted 17 December, 2020; originally announced December 2020.

  22. arXiv:2010.07217  [pdf, other]

    cs.CV

    Back to the Future: Cycle Encoding Prediction for Self-supervised Contrastive Video Representation Learning

    Authors: Xinyu Yang, Majid Mirmehdi, Tilo Burghardt

    Abstract: In this paper we show that learning video feature spaces in which temporal cycles are maximally predictable benefits action classification. In particular, we propose a novel learning approach termed Cycle Encoding Prediction (CEP) that is able to effectively represent high-level spatio-temporal structure of unlabelled video content. CEP builds a latent space wherein the concept of closed forward-b…

    Submitted 24 October, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

    Comments: Accepted at BMVC

  23. arXiv:2008.04999  [pdf, ps, other]

    cs.CV

    VI-Net: View-Invariant Quality of Human Movement Assessment

    Authors: Faegheh Sardari, Adeline Paiement, Sion Hannuna, Majid Mirmehdi

    Abstract: We propose a view-invariant method towards the assessment of the quality of human movements which does not rely on skeleton data. Our end-to-end convolutional neural network consists of two stages, where at first a view-invariant trajectory descriptor for each body joint is generated from RGB images, and then the collection of trajectories for all joints are processed by an adapted, pre-trained 2D…

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: 13 pages, 6 figures, 7 tables

  24. arXiv:2007.14658  [pdf, other]

    cs.CV

    Meta-Learning with Context-Agnostic Initialisations

    Authors: Toby Perrett, Alessandro Masullo, Tilo Burghardt, Majid Mirmehdi, Dima Damen

    Abstract: Meta-learning approaches have addressed few-shot problems by finding initialisations suited for fine-tuning to target tasks. Often there are additional properties within training data (which we refer to as context), not relevant to the target task, which act as a distractor to meta-learning, particularly when the target task contains examples from a novel context not seen during training. We addre…

    Submitted 22 October, 2020; v1 submitted 29 July, 2020; originally announced July 2020.

    Comments: Accepted at ACCV 2020

  25. arXiv:2005.13061  [pdf, other]

    eess.IV cs.CV

    Prediction of Thrombectomy Functional Outcomes using Multimodal Data

    Authors: Zeynel A. Samak, Philip Clatworthy, Majid Mirmehdi

    Abstract: Recent randomised clinical trials have shown that patients with ischaemic stroke due to occlusion of a large intracranial blood vessel benefit from endovascular thrombectomy. However, predicting outcome of treatment in an individual patient remains a challenge. We propose a novel deep learning approach to directly exploit multimodal data (clinical metadata information, imaging data, and imaging…

    Submitted 28 May, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

    Comments: Accepted at Medical Image Understanding and Analysis (MIUA) 2020

  26. arXiv:1910.09920  [pdf, other]

    cs.CV

    Weakly-Supervised Completion Moment Detection using Temporal Attention

    Authors: Farnoosh Heidarivincheh, Majid Mirmehdi, Dima Damen

    Abstract: Monitoring the progression of an action towards completion offers fine grained insight into the actor's behaviour. In this work, we target detecting the completion moment of actions, that is the moment when the action's goal has been successfully accomplished. This has potential applications from surveillance to assistive living and human-robot interactions. Previous effort required human annotati…

    Submitted 22 October, 2019; originally announced October 2019.

  27. Sit-to-Stand Analysis in the Wild using Silhouettes for Longitudinal Health Monitoring

    Authors: Alessandro Masullo, Tilo Burghardt, Toby Perrett, Dima Damen, Majid Mirmehdi

    Abstract: We present the first fully automated Sit-to-Stand or Stand-to-Sit (StS) analysis framework for long-term monitoring of patients in free-living environments using video silhouettes. Our method adopts a coarse-to-fine time localisation approach, where a deep learning classifier identifies possible StS sequences from silhouettes, and a smart peak detection stage provides fine localisation based on 3D…

    Submitted 3 October, 2019; originally announced October 2019.

  28. arXiv:1908.11240  [pdf, other]

    cs.CV

    Great Ape Detection in Challenging Jungle Camera Trap Footage via Attention-Based Spatial and Temporal Feature Blending

    Authors: Xinyu Yang, Majid Mirmehdi, Tilo Burghardt

    Abstract: We propose the first multi-frame video object detection framework trained to detect great apes. It is applicable to challenging camera trap footage in complex jungle environments and extends a traditional feature pyramid architecture by adding self-attention driven feature blending in both the spatial as well as the temporal domain. We demonstrate that this extension can detect distinctive species…

    Submitted 29 August, 2019; originally announced August 2019.

    Comments: Accepted by ICCV workshop 2019

  29. arXiv:1806.08152  [pdf, other]

    cs.CV

    CaloriNet: From silhouettes to calorie estimation in private environments

    Authors: Alessandro Masullo, Tilo Burghardt, Dima Damen, Sion Hannuna, Victor Ponce-López, Majid Mirmehdi

    Abstract: We propose a novel deep fusion architecture, CaloriNet, for the online estimation of energy expenditure for free living monitoring in private environments, where RGB data is discarded and replaced by silhouettes. Our fused convolutional neural network architecture is trainable end-to-end, to estimate calorie expenditure, using temporal foreground silhouettes alongside accelerometer data. The netwo…

    Submitted 21 June, 2018; originally announced June 2018.

    Comments: 11 pages, 7 figures

  30. arXiv:1806.05653  [pdf, other]

    cs.CV

    HGR-Net: A Fusion Network for Hand Gesture Segmentation and Recognition

    Authors: Amirhossein Dadashzadeh, Alireza Tavakoli Targhi, Maryam Tahmasbi, Majid Mirmehdi

    Abstract: We propose a two-stage convolutional neural network (CNN) architecture for robust recognition of hand gestures, called HGR-Net, where the first stage performs accurate semantic segmentation to determine hand regions, and the second stage identifies the gesture. The segmentation stage architecture is based on the combination of fully convolutional residual network and atrous spatial pyramid pooling…

    Submitted 28 December, 2019; v1 submitted 14 June, 2018; originally announced June 2018.

  31. arXiv:1806.04074  [pdf, other]

    cs.CV

    Semantically Selective Augmentation for Deep Compact Person Re-Identification

    Authors: Víctor Ponce-López, Tilo Burghardt, Sion Hannuna, Dima Damen, Alessandro Masullo, Majid Mirmehdi

    Abstract: We present a deep person re-identification approach that combines semantically selective, deep data augmentation with clustering-based network compression to generate high performance, light and fast inference networks. In particular, we propose to augment limited training data via sampling from a deep convolutional generative adversarial network (DCGAN), whose discriminator is constrained by a se…

    Submitted 18 June, 2018; v1 submitted 11 June, 2018; originally announced June 2018.

  32. arXiv:1805.11907  [pdf, other]

    cs.OH

    A Guide to the SPHERE 100 Homes Study Dataset

    Authors: Atis Elsts, Tilo Burghardt, Dallan Byrne, Massimo Camplani, Dima Damen, Xenofon Fafoutis, Sion Hannuna, William Harwin, Michael Holmes, Balazs Janko, Victor Ponce Lopez, Alessandro Masullo, Majid Mirmehdi, George Oikonomou, Robert Piechocki, R. Simon Sherratt, Emma Tonkin, Niall Twomey, Antonis Vafeas, Przemyslaw Woznowski, Ian Craddock

    Abstract: The SPHERE project has developed a multi-modal sensor platform for health and behavior monitoring in residential environments. So far, the SPHERE platform has been deployed for data collection in approximately 50 homes for duration up to one year. This technical document describes the format and the expected content of the SPHERE dataset(s) under preparation. It includes a list of some data qualit…

    Submitted 30 October, 2018; v1 submitted 30 May, 2018; originally announced May 2018.

  33. arXiv:1805.06749  [pdf, ps, other]

    cs.CV

    Action Completion: A Temporal Model for Moment Detection

    Authors: Farnoosh Heidarivincheh, Majid Mirmehdi, Dima Damen

    Abstract: We introduce completion moment detection for actions - the problem of locating the moment of completion, when the action's goal is confidently considered achieved. The paper proposes a joint classification-regression recurrent model that predicts completion from a given frame, and then integrates frame-level contributions to detect sequence-level completion moment. We introduce a recurrent voting…

    Submitted 23 July, 2018; v1 submitted 17 May, 2018; originally announced May 2018.

  34. arXiv:1710.02310  [pdf, ps, other]

    cs.CV

    Detecting the Moment of Completion: Temporal Models for Localising Action Completion

    Authors: Farnoosh Heidarivincheh, Majid Mirmehdi, Dima Damen

    Abstract: Action completion detection is the problem of modelling the action's progression towards localising the moment of completion - when the action's goal is confidently considered achieved. In this work, we assess the ability of two temporal models, namely Hidden Markov Models (HMM) and Long-Short Term Memory (LSTM), to localise completion for six object interactions: switch, plug, open, pull, pick an…

    Submitted 6 October, 2017; originally announced October 2017.

  35. arXiv:1607.08196  [pdf, other]

    cs.CV

    Calorie Counter: RGB-Depth Visual Estimation of Energy Expenditure at Home

    Authors: Lili Tao, Tilo Burghardt, Majid Mirmehdi, Dima Damen, Ashley Cooper, Sion Hannuna, Massimo Camplani, Adeline Paiement, Ian Craddock

    Abstract: We present a new framework for vision-based estimation of calorific expenditure from RGB-D data - the first that is validated on physical gas exchange measurements and applied to daily living scenarios. Deriving a person's energy expenditure from sensors is an important tool in tracking physical activity levels for health and lifestyle monitoring. Most existing methods use metabolic lookup tables…

    Submitted 27 July, 2016; originally announced July 2016.

  36. arXiv:1606.04450  [pdf, other]

    cs.CV

    Multiple Human Tracking in RGB-D Data: A Survey

    Authors: Massimo Camplani, Adeline Paiement, Majid Mirmehdi, Dima Damen, Sion Hannuna, Tilo Burghardt, Lili Tao

    Abstract: Multiple human tracking (MHT) is a fundamental task in many computer vision applications. Appearance-based approaches, primarily formulated on RGB data, are constrained and affected by problems arising from occlusions and/or illumination variations. In recent years, the arrival of cheap RGB-Depth (RGB-D) devices has led to many new approaches to MHT, and many of these integrate color and depth c…

    Submitted 14 June, 2016; originally announced June 2016.

  37. arXiv:1512.07080  [pdf, other]

    cs.CV

    Cost-based Feature Transfer for Vehicle Occupant Classification

    Authors: Toby Perrett, Majid Mirmehdi, Eduardo Dias

    Abstract: Knowledge of human presence and interaction in a vehicle is of growing interest to vehicle manufacturers for design and safety purposes. We present a framework to perform the tasks of occupant detection and occupant classification for automatic child locks and airbag suppression. It operates for all passenger seats, using a single overhead camera. A transfer learning technique is introduced to mak…

    Submitted 22 December, 2015; originally announced December 2015.

    Comments: 9 pages, 4 figures, 5 tables

    ACM Class: I.4.9