
Showing 1–43 of 43 results for author: Jagersand, M

  1. arXiv:2407.00324  [pdf, other]

    cs.RO cs.LG

    Revisiting Constant Negative Rewards for Goal-Reaching Tasks in Robot Learning

    Authors: Gautham Vasan, Yan Wang, Fahim Shahriar, James Bergstra, Martin Jagersand, A. Rupam Mahmood

    Abstract: Many real-world robot learning problems, such as pick-and-place or arriving at a destination, can be seen as a problem of reaching a goal state as soon as possible. These problems, when formulated as episodic reinforcement learning tasks, can easily be specified to align well with our intended goal: -1 reward every time step with termination upon reaching the goal state, called minimum-time tasks.… ▽ More

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: In Proceedings of Reinforcement Learning Conference 2024. For video demo, see https://drive.google.com/file/d/1O8D3oCWq5xf2hi1JOlMBbs6W1ClrvUFb/view?usp=sharing

  2. arXiv:2310.03932  [pdf, other]

    cs.RO

    Bridging Low-level Geometry to High-level Concepts in Visual Servoing of Robot Manipulation Task Using Event Knowledge Graphs and Vision-Language Models

    Authors: Chen Jiang, Martin Jagersand

    Abstract: In this paper, we propose a framework of building knowledgeable robot control in the scope of smart human-robot interaction, by empowering a basic uncalibrated visual servoing controller with contextual knowledge through the joint usage of event knowledge graphs (EKGs) and large-scale pretrained vision-language models (VLMs). The framework is expanded in twofold: first, we interpret low-level imag… ▽ More

    Submitted 5 October, 2023; originally announced October 2023.

  3. arXiv:2309.09183  [pdf, other]

    cs.RO cs.CV

    CLIPUNetr: Assisting Human-robot Interface for Uncalibrated Visual Servoing Control with CLIP-driven Referring Expression Segmentation

    Authors: Chen Jiang, Yuchen Yang, Martin Jagersand

    Abstract: The classical human-robot interface in uncalibrated image-based visual servoing (UIBVS) relies on either human annotations or semantic segmentation with categorical labels. Both methods fail to match natural human communication and convey rich semantics in manipulation tasks as effectively as natural language expressions. In this paper, we tackle this problem by using referring expression segmenta… ▽ More

    Submitted 17 September, 2023; originally announced September 2023.

  4. arXiv:2307.05141  [pdf, other]

    cs.RO cs.LG

    Deep Probabilistic Movement Primitives with a Bayesian Aggregator

    Authors: Michael Przystupa, Faezeh Haghverd, Martin Jagersand, Samuele Tosatto

    Abstract: Movement primitives are trainable parametric models that reproduce robotic movements starting from a limited set of demonstrations. Previous works proposed simple linear models that exhibited high sample efficiency and generalization power by allowing temporal modulation of movements (reproducing movements faster or slower), blending (merging two movements into one), via-point conditioning (constr… ▽ More

    Submitted 6 June, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

  5. arXiv:2212.08235  [pdf, other]

    cs.LG cs.RO

    A Simple Decentralized Cross-Entropy Method

    Authors: Zichen Zhang, Jun Jin, Martin Jagersand, Jun Luo, Dale Schuurmans

    Abstract: Cross-Entropy Method (CEM) is commonly used for planning in model-based reinforcement learning (MBRL) where a centralized approach is typically utilized to update the sampling distribution based on only the top-$k$ operation's results on samples. In this paper, we show that such a centralized approach makes CEM vulnerable to local optima, thus impairing its sample efficiency. To tackle this issue,… ▽ More

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: NeurIPS 2022. The last two authors advised equally

  6. arXiv:2212.04407  [pdf, other]

    cs.LG cs.AI

    Dynamic Decision Frequency with Continuous Options

    Authors: Amirmohammad Karimi, Jun Jin, Jun Luo, A. Rupam Mahmood, Martin Jagersand, Samuele Tosatto

    Abstract: In classic reinforcement learning algorithms, agents make decisions at discrete and fixed time intervals. The duration between decisions becomes a crucial hyperparameter, as setting it too short may increase the problem's difficulty by requiring the agent to make numerous decisions to achieve its goal while setting it too long can result in the agent losing control over the system. However, physic… ▽ More

    Submitted 25 October, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

    Comments: Appears in the Proceedings of the 2023 International Conference on Intelligent Robots and Systems (IROS). Source code at https://github.com/amir-karimi96/continuous-time-continuous-option-policy-gradient.git

  7. arXiv:2202.13604  [pdf, other]

    cs.RO cs.AI

    Generalizable task representation learning from human demonstration videos: a geometric approach

    Authors: Jun Jin, Martin Jagersand

    Abstract: We study the problem of generalizable task learning from human demonstration videos without extra training on the robot or pre-recorded robot motions. Given a set of human demonstration videos showing a task with different objects/tools (categorical objects), we aim to learn a representation of visual observation that generalizes to categorical objects and enables efficient controller design. We p… ▽ More

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: Accepted in ICRA 2022

  8. arXiv:2106.06083  [pdf, other]

    cs.RO

    Analyzing Neural Jacobian Methods in Applications of Visual Servoing and Kinematic Control

    Authors: Michael Przystupa, Masood Dehghan, Martin Jagersand, A. Rupam Mahmood

    Abstract: Designing adaptable control laws that can transfer between different robots is a challenge because of kinematic and dynamic differences, as well as in scenarios where external sensors are used. In this work, we empirically investigate a neural network's ability to approximate the Jacobian matrix for an application in Cartesian control schemes. Specifically, we are interested in approximating the ki… ▽ More

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: 8 pages, 6 Figures, https://www.youtube.com/watch?v=mOMIIBLCL20

  9. arXiv:2105.03533  [pdf, other]

    cs.CV

    Video Class Agnostic Segmentation with Contrastive Learning for Autonomous Driving

    Authors: Mennatullah Siam, Alex Kendall, Martin Jagersand

    Abstract: Semantic segmentation in autonomous driving predominantly focuses on learning from large-scale data with a closed set of known classes without considering unknown objects. Motivated by safety reasons, we address the video class agnostic segmentation task, which considers unknown objects outside the closed set of known classes in our training data. We propose a novel auxiliary contrastive loss to l… ▽ More

    Submitted 10 May, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

  10. arXiv:2104.03892  [pdf, other]

    cs.RO cs.HC

    A Quantitative Analysis of Activities of Daily Living: Insights into Improving Functional Independence with Assistive Robotics

    Authors: Laura Petrich, Jun Jin, Masood Dehghan, Martin Jagersand

    Abstract: Human assistive robotics have the potential to help the elderly and individuals living with disabilities with their Activities of Daily Living (ADL). Robotics researchers focus on assistive tasks from the perspective of various control schemes and motion types. Health research on the other hand focuses on clinical assessment and rehabilitation, arguably leaving important differences between the tw… ▽ More

    Submitted 8 April, 2021; originally announced April 2021.

    Comments: Submitted to IROS 2021. arXiv admin note: substantial text overlap with arXiv:2101.02750

  11. arXiv:2103.11015  [pdf, other]

    cs.CV

    Video Class Agnostic Segmentation Benchmark for Autonomous Driving

    Authors: Mennatullah Siam, Alex Kendall, Martin Jagersand

    Abstract: Semantic segmentation approaches are typically trained on large-scale data with a closed finite set of known classes without considering unknown objects. In certain safety-critical robotics applications, especially autonomous driving, it is important to segment all objects, including those unknown at training time. We formalize the task of video class agnostic segmentation from monocular video seq… ▽ More

    Submitted 19 April, 2021; v1 submitted 19 March, 2021; originally announced March 2021.

    Comments: Accepted in WAD workshop, CVPR 2021

  12. arXiv:2101.04704  [pdf, other]

    cs.CV

    Boundary-Aware Segmentation Network for Mobile and Web Applications

    Authors: Xuebin Qin, Deng-Ping Fan, Chenyang Huang, Cyril Diagne, Zichen Zhang, Adrià Cabeza Sant'Anna, Albert Suàrez, Martin Jagersand, Ling Shao

    Abstract: Although deep models have greatly improved the accuracy and robustness of image segmentation, obtaining segmentation results with highly accurate boundaries and fine structures is still a challenging problem. In this paper, we propose a simple yet powerful Boundary-Aware Segmentation Network (BASNet), which comprises a predict-refine architecture and a hybrid loss, for highly accurate image segmen… ▽ More

    Submitted 11 May, 2021; v1 submitted 12 January, 2021; originally announced January 2021.

    Comments: 18 pages, 16 figures, submitted to TPAMI

  13. arXiv:2101.02750  [pdf, other]

    cs.RO

    Assistive arm and hand manipulation: How does current research intersect with actual healthcare needs?

    Authors: Laura Petrich, Jun Jin, Masood Dehghan, Martin Jagersand

    Abstract: Human assistive robotics have the potential to help the elderly and individuals living with disabilities with their Activities of Daily Living (ADL). Robotics researchers present bottom up solutions using various control methods for different types of movements. Health research on the other hand focuses on clinical assessment and rehabilitation leaving arguably important differences between the tw… ▽ More

    Submitted 7 January, 2021; originally announced January 2021.

    Comments: Submitted to ICRA 2021

  14. arXiv:2011.05857  [pdf, other]

    cs.RO cs.AI

    Offline Learning of Counterfactual Predictions for Real-World Robotic Reinforcement Learning

    Authors: Jun Jin, Daniel Graves, Cameron Haigh, Jun Luo, Martin Jagersand

    Abstract: We consider real-world reinforcement learning (RL) of robotic manipulation tasks that involve both visuomotor skills and contact-rich skills. We aim to train a policy that maps multimodal sensory observations (vision and force) to a manipulator's joint velocities under practical considerations. We propose to use offline samples to learn a set of general value functions (GVFs) that make counterfact… ▽ More

    Submitted 25 February, 2022; v1 submitted 11 November, 2020; originally announced November 2020.

    Comments: Accepted in ICRA 2022

  15. U$^2$-Net: Going Deeper with Nested U-Structure for Salient Object Detection

    Authors: Xuebin Qin, Zichen Zhang, Chenyang Huang, Masood Dehghan, Osmar R. Zaiane, Martin Jagersand

    Abstract: In this paper, we design a simple yet powerful deep network architecture, U$^2$-Net, for salient object detection (SOD). The architecture of our U$^2$-Net is a two-level nested U-structure. The design has the following advantages: (1) it is able to capture more contextual information from different scales thanks to the mixture of receptive fields of different sizes in our proposed ReSidual U-block… ▽ More

    Submitted 8 March, 2022; v1 submitted 18 May, 2020; originally announced May 2020.

    Comments: Accepted in Pattern Recognition 2020

  16. arXiv:2003.02768  [pdf, other]

    cs.RO cs.LG

    A Geometric Perspective on Visual Imitation Learning

    Authors: Jun Jin, Laura Petrich, Masood Dehghan, Martin Jagersand

    Abstract: We consider the problem of visual imitation learning without human supervision (e.g. kinesthetic teaching or teleoperation), nor access to an interactive reinforcement learning (RL) training environment. We present a geometric perspective to derive solutions to this problem. Specifically, we propose VGS-IL (Visual Geometric Skill Imitation Learning), an end-to-end geometry-parameterized task conce… ▽ More

    Submitted 5 March, 2020; originally announced March 2020.

    Comments: Submitted to IROS 2020

  17. arXiv:2003.01163  [pdf, other]

    cs.CV cs.RO

    Understanding Contexts Inside Robot and Human Manipulation Tasks through a Vision-Language Model and Ontology System in a Video Stream

    Authors: Chen Jiang, Masood Dehghan, Martin Jagersand

    Abstract: Manipulation tasks in daily life, such as pouring water, unfold intentionally under specialized manipulation contexts. Being able to process contextual knowledge in these Activities of Daily Living (ADLs) over time can help us understand manipulation intentions, which are essential for an intelligent robot to transition smoothly between various manipulation actions. In this paper, to model the int… ▽ More

    Submitted 2 March, 2020; originally announced March 2020.

  18. arXiv:2001.09540  [pdf, other]

    cs.CV

    Weakly Supervised Few-shot Object Segmentation using Co-Attention with Visual and Semantic Embeddings

    Authors: Mennatullah Siam, Naren Doraiswamy, Boris N. Oreshkin, Hengshuai Yao, Martin Jagersand

    Abstract: Significant progress has been made recently in developing few-shot object segmentation methods. Learning is shown to be successful in few-shot segmentation settings, using pixel-level, scribbles and bounding box supervision. This paper takes another approach, i.e., only requiring image-level label for few-shot object segmentation. We propose a novel multi-modal interaction module for few-shot obje… ▽ More

    Submitted 17 May, 2020; v1 submitted 26 January, 2020; originally announced January 2020.

    Comments: Accepted to IJCAI'20. The first three authors listed contributed equally

  19. arXiv:1912.08936  [pdf, other]

    cs.CV

    One-Shot Weakly Supervised Video Object Segmentation

    Authors: Mennatullah Siam, Naren Doraiswamy, Boris N. Oreshkin, Hengshuai Yao, Martin Jagersand

    Abstract: Conventional few-shot object segmentation methods learn object segmentation from a few labelled support images with strongly labelled segmentation masks. Recent work has shown to perform on par with weaker levels of supervision in terms of scribbles and bounding boxes. However, there has been limited attention given to the problem of few-shot object segmentation with image-level supervision. We pr… ▽ More

    Submitted 18 December, 2019; originally announced December 2019.

  20. Visual Geometric Skill Inference by Watching Human Demonstration

    Authors: Jun Jin, Laura Petrich, Zichen Zhang, Masood Dehghan, Martin Jagersand

    Abstract: We study the problem of learning manipulation skills from human demonstration video by inferring the association relationships between geometric features. Motivation for this work stems from the observation that humans perform eye-hand coordination tasks by using geometric primitives to define a task while a geometric control error drives the task through execution. We propose a graph based kernel… ▽ More

    Submitted 5 March, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: Accepted in ICRA 2020

  21. Mapless Navigation among Dynamics with Social-safety-awareness: a reinforcement learning approach from 2D laser scans

    Authors: Jun Jin, Nhat M. Nguyen, Nazmus Sakib, Daniel Graves, Hengshuai Yao, Martin Jagersand

    Abstract: We propose a method to tackle the problem of mapless collision-avoidance navigation where humans are present using 2D laser scans. Our proposed method uses ego-safety to measure collision from the robot's perspective while social-safety to measure the impact of our robot's actions on surrounding pedestrians. Specifically, the social-safety part predicts the intrusion impact of our robot's action i… ▽ More

    Submitted 5 March, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: Accepted in ICRA 2020

  22. arXiv:1909.07459  [pdf, other]

    cs.CV

    Bridging Visual Perception with Contextual Semantics for Understanding Robot Manipulation Tasks

    Authors: Chen Jiang, Martin Jagersand

    Abstract: Understanding manipulation scenarios allows intelligent robots to plan for appropriate actions to complete a manipulation task successfully. It is essential for intelligent robots to semantically interpret manipulation knowledge by describing entities, relations and attributes in a structural manner. In this paper, we propose an implementing framework to generate high-level conceptual dynamic know… ▽ More

    Submitted 26 July, 2020; v1 submitted 16 September, 2019; originally announced September 2019.

  23. arXiv:1903.09189  [pdf, other]

    cs.RO

    Long range teleoperation for fine manipulation tasks under time-delay network conditions

    Authors: Jun Jin, Laura Petrich, Shida He, Masood Dehghan, Martin Jagersand

    Abstract: We present a coarse-to-fine approach based semi-autonomous teleoperation system using vision guidance. The system is optimized for long range teleoperation tasks under time-delay network conditions and does not require prior knowledge of the remote scene. Our system initializes with a self exploration behavior that senses the remote surroundings through a freely mounted eye-in-hand web cam. The se… ▽ More

    Submitted 21 March, 2019; originally announced March 2019.

    Comments: Submitted to IROS 2019 with RA-L option

  24. arXiv:1903.00634  [pdf, other]

    cs.RO

    Evaluation of state representation methods in robot hand-eye coordination learning from demonstration

    Authors: Jun Jin, Masood Dehghan, Laura Petrich, Steven Weikai Lu, Martin Jagersand

    Abstract: We evaluate different state representation methods in robot hand-eye coordination learning on different aspects. Regarding state dimension reduction: we evaluate how these state representation methods capture relevant task information and how compact a state representation should be. Regarding controllability: experiments are designed to use different state representation methods in a tr… ▽ More

    Submitted 2 March, 2019; originally announced March 2019.

    Comments: submitted to IROS 2019

  25. arXiv:1902.11123  [pdf, other]

    cs.CV cs.LG stat.ML

    Adaptive Masked Proxies for Few-Shot Segmentation

    Authors: Mennatullah Siam, Boris Oreshkin, Martin Jagersand

    Abstract: Deep learning has thrived by training on large-scale datasets. However, in robotics applications sample efficiency is critical. We propose a novel adaptive masked proxies method that constructs the final segmentation layer weights from few labelled samples. It utilizes multi-resolution average pooling on base embeddings masked with the label to act as a positive proxy for the new class, while fusi… ▽ More

    Submitted 14 October, 2019; v1 submitted 19 February, 2019; originally announced February 2019.

    Comments: Accepted to ICCV'19

  26. arXiv:1810.07733  [pdf, other]

    cs.CV

    Video Object Segmentation using Teacher-Student Adaptation in a Human Robot Interaction (HRI) Setting

    Authors: Mennatullah Siam, Chen Jiang, Steven Lu, Laura Petrich, Mahmoud Gamal, Mohamed Elhoseiny, Martin Jagersand

    Abstract: Video object segmentation is an essential task in robot manipulation to facilitate grasping and learning affordances. Incremental learning is important for robotics in unstructured environments, since the total number of objects and their variations can be intractable. Inspired by the children learning process, human robot interaction (HRI) can be utilized to teach robots about the world guided by… ▽ More

    Submitted 12 March, 2019; v1 submitted 17 October, 2018; originally announced October 2018.

    Comments: Accepted in ICRA'19, https://msiam.github.io/ivos/

  27. Robot eye-hand coordination learning by watching human demonstrations: a task function approximation approach

    Authors: Jun Jin, Laura Petrich, Masood Dehghan, Zichen Zhang, Martin Jagersand

    Abstract: We present a robot eye-hand coordination learning method that can directly learn visual task specification by watching human demonstrations. Task specification is represented as a task function, which is learned using inverse reinforcement learning(IRL) by inferring differential rewards between state changes. The learned task function is then used as continuous feedbacks in an uncalibrated visual… ▽ More

    Submitted 27 February, 2019; v1 submitted 29 September, 2018; originally announced October 2018.

    Comments: Accepted in ICRA 2019

  28. arXiv:1809.08722  [pdf, other]

    cs.RO

    Online Object and Task Learning via Human Robot Interaction

    Authors: Masood Dehghan, Zichen Zhang, Mennatullah Siam, Jun Jin, Laura Petrich, Martin Jagersand

    Abstract: This work describes the development of a robotic system that acquires knowledge incrementally through human interaction where new tools and motions are taught on the fly. The robotic system developed was one of the five finalists in the KUKA Innovation Award competition and demonstrated during the Hanover Messe 2018 in Germany. The main contributions of the system are a) a novel incremental object… ▽ More

    Submitted 27 February, 2019; v1 submitted 23 September, 2018; originally announced September 2018.

    Comments: 7 pages. ICRA19

  29. arXiv:1803.02758  [pdf, other]

    cs.CV

    RTSeg: Real-time Semantic Segmentation Comparative Study

    Authors: Mennatullah Siam, Mostafa Gamal, Moemen Abdel-Razek, Senthil Yogamani, Martin Jagersand

    Abstract: Semantic segmentation benefits robotics related applications especially autonomous driving. Most of the research on semantic segmentation is only on increasing the accuracy of segmentation models with little attention to computationally efficient solutions. The few work conducted in this direction does not provide principled methods to evaluate the different design choices for segmentation. In thi… ▽ More

    Submitted 16 May, 2020; v1 submitted 7 March, 2018; originally announced March 2018.

    Comments: Accepted in IEEE ICIP 2018. IEEE Copyrights: Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses

  30. arXiv:1801.02722  [pdf, other]

    cs.CV

    End-to-end detection-segmentation network with ROI convolution

    Authors: Zichen Zhang, Min Tang, Dana Cobzas, Dornoosh Zonoobi, Martin Jagersand, Jacob L. Jaremko

    Abstract: We propose an end-to-end neural network that improves the segmentation accuracy of fully convolutional networks by incorporating a localization unit. This network performs object localization first, which is then used as a cue to guide the training of the segmentation network. We test the proposed method on a segmentation task of small objects on a clinical dataset of ultrasound images. We show th… ▽ More

    Submitted 2 December, 2019; v1 submitted 8 January, 2018; originally announced January 2018.

    Comments: ISBI 2018

  31. arXiv:1711.00139  [pdf, other]

    cs.CV

    Segmentation-by-Detection: A Cascade Network for Volumetric Medical Image Segmentation

    Authors: Min Tang, Zichen Zhang, Dana Cobzas, Martin Jagersand, Jacob L. Jaremko

    Abstract: We propose an attention mechanism for 3D medical image segmentation. The method, named segmentation-by-detection, is a cascade of a detection module followed by a segmentation module. The detection module enables a region of interest to come to attention and produces a set of object region candidates which are further used as an attention model. Rather than dealing with the entire volume, the segm… ▽ More

    Submitted 31 October, 2017; originally announced November 2017.

  32. arXiv:1709.04821  [pdf, other]

    cs.CV cs.RO

    MODNet: Moving Object Detection Network with Motion and Appearance for Autonomous Driving

    Authors: Mennatullah Siam, Heba Mahgoub, Mohamed Zahran, Senthil Yogamani, Martin Jagersand, Ahmad El-Sallab

    Abstract: We propose a novel multi-task learning system that combines appearance and motion cues for a better semantic reasoning of the environment. A unified architecture for joint vehicle detection and motion segmentation is introduced. In this architecture, a two-stream encoder is shared among both tasks. In order to evaluate our method in autonomous driving setting, KITTI annotated sequences with detect… ▽ More

    Submitted 12 November, 2017; v1 submitted 14 September, 2017; originally announced September 2017.

  33. arXiv:1708.03275  [pdf, other]

    cs.CV

    Incremental 3D Line Segment Extraction from Semi-dense SLAM

    Authors: Shida He, Xuebin Qin, Zichen Zhang, Martin Jagersand

    Abstract: Although semi-dense Simultaneous Localization and Mapping (SLAM) has been becoming more popular over the last few years, there is a lack of efficient methods for representing and processing their large scale point clouds. In this paper, we propose using 3D line segments to simplify the point clouds generated by semi-dense SLAM. Specifically, we present a novel incremental approach for 3D line segm… ▽ More

    Submitted 26 April, 2018; v1 submitted 10 August, 2017; originally announced August 2017.

    Comments: Accepted at ICPR 2018

  34. arXiv:1707.02432  [pdf, other]

    stat.ML cs.CV

    Deep Semantic Segmentation for Automated Driving: Taxonomy, Roadmap and Challenges

    Authors: Mennatullah Siam, Sara Elkerdawy, Martin Jagersand, Senthil Yogamani

    Abstract: Semantic segmentation was seen as a challenging computer vision problem a few years ago. Due to recent advancements in deep learning, relatively accurate solutions are now possible for its use in automated driving. In this paper, the semantic segmentation problem is explored from the perspective of automated driving. Most of the current semantic segmentation algorithms are designed for generic image… ▽ More

    Submitted 3 August, 2017; v1 submitted 8 July, 2017; originally announced July 2017.

    Comments: To appear in IEEE ITSC 2017

  35. arXiv:1705.00360  [pdf, other]

    cs.CV

    Real-Time Salient Closed Boundary Tracking via Line Segments Perceptual Grouping

    Authors: Xuebin Qin, Shida He, Camilo Perez Quintero, Abhineet Singh, Masood Dehghan, Martin Jagersand

    Abstract: This paper presents a novel real-time method for tracking salient closed boundaries from video image sequences. This method operates on a set of straight line segments that are produced by line detection. The tracking scheme is coherently integrated into a perceptual grouping framework in which the visual tracking problem is tackled by identifying a subset of these line segments and connecting the… ▽ More

    Submitted 9 August, 2017; v1 submitted 30 April, 2017; originally announced May 2017.

    Comments: 7 pages, 8 figures, The 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017) submission ID 1034

  36. arXiv:1703.01698  [pdf, other]

    cs.CV

    4-DoF Tracking for Robot Fine Manipulation Tasks

    Authors: Mennatullah Siam, Abhineet Singh, Camilo Perez, Martin Jagersand

    Abstract: This paper presents two visual trackers from the different paradigms of learning and registration based tracking and evaluates their application in image based visual servoing. They can track object motion with four degrees of freedom (DoF) which, as we will show here, is sufficient for many fine manipulation tasks. One of these trackers is a newly developed learning based tracker that relies on l… ▽ More

    Submitted 3 April, 2017; v1 submitted 5 March, 2017; originally announced March 2017.

    Comments: Accepted in CRV 2017

  37. arXiv:1701.04693  [pdf, other]

    cs.RO cs.HC cs.LG

    Incremental Learning for Robot Perception through HRI

    Authors: Sepehr Valipour, Camilo Perez, Martin Jagersand

    Abstract: Scene understanding and object recognition is a difficult to achieve yet crucial skill for robots. Recently, Convolutional Neural Networks (CNN), have shown success in this task. However, there is still a gap between their performance on image datasets and real-world robotics scenarios. We present a novel paradigm for incrementally improving a robot's visual perception through active human interac… ▽ More

    Submitted 17 January, 2017; originally announced January 2017.

  38. arXiv:1611.05435  [pdf, other]

    cs.CV

    Convolutional Gated Recurrent Networks for Video Segmentation

    Authors: Mennatullah Siam, Sepehr Valipour, Martin Jagersand, Nilanjan Ray

    Abstract: Semantic segmentation has recently witnessed major progress, where fully convolutional neural networks have shown to perform well. However, most of the previous work focused on improving single image segmentation. To our knowledge, no prior work has made use of temporal video information in a recurrent network. In this paper, we introduce a novel approach to implicitly utilize temporal data in vid… ▽ More

    Submitted 21 November, 2016; v1 submitted 16 November, 2016; originally announced November 2016.

    Comments: arXiv admin note: text overlap with arXiv:1606.00487

  39. arXiv:1607.04673  [pdf, ps, other]

    cs.CV

    Unifying Registration based Tracking: A Case Study with Structural Similarity

    Authors: Abhineet Singh, Mennatullah Siam, Martin Jagersand

    Abstract: This paper adapts a popular image quality measure called structural similarity for high precision registration based tracking while also introducing a simpler and faster variant of the same. Further, these are evaluated comprehensively against existing measures using a unified approach to study registration based trackers that decomposes them into three constituent sub modules - appearance model,… ▽ More

    Submitted 30 January, 2017; v1 submitted 15 July, 2016; originally announced July 2016.

    Comments: Accepted at WACV 2017. Supplementary available at: http://webdocs.cs.ualberta.ca/~vis/mtf/ssim_supplementary.pdf arXiv admin note: text overlap with arXiv:1603.01292

  40. arXiv:1606.09367  [pdf, other]

    cs.CV

    Parking Stall Vacancy Indicator System Based on Deep Convolutional Neural Networks

    Authors: Sepehr Valipour, Mennatullah Siam, Eleni Stroulia, Martin Jagersand

    Abstract: Parking management systems, and vacancy-indication services in particular, can play a valuable role in reducing traffic and energy waste in large cities. Visual detection methods represent a cost-effective option, since they can take advantage of hardware usually already available in many parking lots, namely cameras. However, visual detection methods can be fragile and not easily generalizable. I… ▽ More

    Submitted 30 June, 2016; originally announced June 2016.

  41. arXiv:1606.00487  [pdf, other]

    cs.CV

    Recurrent Fully Convolutional Networks for Video Segmentation

    Authors: Sepehr Valipour, Mennatullah Siam, Martin Jagersand, Nilanjan Ray

    Abstract: Image segmentation is an important step in most visual tasks. While convolutional neural networks have shown to perform well on single image segmentation, to our knowledge, no study has been done on leveraging recurrent gated architectures for video segmentation. Accordingly, we propose a novel method for online segmentation of video sequences that incorporates temporal data. The network is b… ▽ More

    Submitted 30 October, 2016; v1 submitted 1 June, 2016; originally announced June 2016.

  42. arXiv:1603.01292  [pdf, ps, other]

    cs.CV

    Modular Decomposition and Analysis of Registration based Trackers

    Authors: Abhineet Singh, Ankush Roy, Xi Zhang, Martin Jagersand

    Abstract: This paper presents a new way to study registration based trackers by decomposing them into three constituent sub modules: appearance model, state space model and search method. It is often the case that when a new tracker is introduced in literature, it only contributes to one or two of these sub modules while using existing methods for the rest. Since these are often selected arbitrarily by the… ▽ More

    Submitted 25 March, 2016; v1 submitted 3 March, 2016; originally announced March 2016.

  43. arXiv:1602.09130  [pdf, other]

    cs.CV cs.RO

    Modular Tracking Framework: A Unified Approach to Registration based Tracking

    Authors: Abhineet Singh, Martin Jagersand

    Abstract: This paper presents a modular, extensible and highly efficient open source framework for registration based tracking called Modular Tracking Framework (MTF). Targeted at robotics applications, it is implemented entirely in C++ and designed from the ground up to easily integrate with systems that support any of several major vision and robotics libraries including OpenCV, ROS, ViSP and Eigen. It im… ▽ More

    Submitted 17 May, 2018; v1 submitted 29 February, 2016; originally announced February 2016.

    Comments: Under consideration at Computer Vision and Image Understanding