-
Real-Time 3D Shape of Micro-Details
Authors:
Maryam Khanian,
Ali Sharifi Boroujerdi,
Michael Breuss
Abstract:
Motivated by the growing demand for interactive environments, we propose an accurate real-time 3D shape reconstruction technique. To provide a reliable 3D reconstruction, which is still a challenging task when dealing with real-world applications, we integrate several components including (i) Photometric Stereo (PS), (ii) a perspective Cook-Torrance reflectance model that enables PS to deal with a broad range of possible real-world object reflections, (iii) a realistic lighting situation, (iv) a Recurrent Optimization Network (RON) and finally (v) a heuristic Dijkstra Gaussian Mean Curvature (DGMC) initialization approach. We demonstrate the potential benefits of our hybrid model by providing 3D shape with highly-detailed information from micro-prints for the first time. All real-world images are taken by a mobile phone camera under a simple setup with consumer-level equipment. In addition, complementary synthetic experiments confirm the beneficial properties of our novel method and its superiority over the state-of-the-art approaches.
Submitted 16 February, 2018;
originally announced February 2018.
-
Photometric stereo for strong specular highlights
Authors:
Maryam Khanian,
Ali Sharifi Boroujerdi,
Michael Breuß
Abstract:
Photometric stereo (PS) is a fundamental technique in computer vision known to produce 3-D shape with high accuracy. The setting of PS is defined by using several input images of a static scene taken from one and the same camera position but under varying illumination. The vast majority of studies in this 3-D reconstruction method assume orthographic projection for the camera model. In addition, they mainly consider the Lambertian reflectance model as the way that light scatters at surfaces. So, providing reliable PS results from real world objects still remains a challenging task. We address 3-D reconstruction by PS using a more realistic set of assumptions combining for the first time the complete Blinn-Phong reflectance model and perspective projection. To this end, we will compare two different methods of incorporating the perspective projection into our model. Experiments are performed on both synthetic and real world images. Note that our real-world experiments do not benefit from laboratory conditions. The results show the high potential of our method even for complex real world applications such as medical endoscopy images which may include high amounts of specular highlights.
Submitted 5 September, 2017;
originally announced September 2017.
-
Deep Interactive Region Segmentation and Captioning
Authors:
Ali Sharifi Boroujerdi,
Maryam Khanian,
Michael Breuss
Abstract:
With recent innovations in dense image captioning, it is now possible to describe every object of the scene with a caption while objects are determined by bounding boxes. However, interpretation of such an output is not trivial due to the existence of many overlapping bounding boxes. Furthermore, in current captioning frameworks, the user is not able to express personal preferences to exclude areas that are not of interest. In this paper, we propose a novel hybrid deep learning architecture for interactive region segmentation and captioning where the user is able to specify an arbitrary region of the image that should be processed. To this end, a dedicated Fully Convolutional Network (FCN) named Lyncean FCN (LFCN) is trained using our special training data to isolate the User Intention Region (UIR) as the output of an efficient segmentation. In parallel, a dense image captioning model is utilized to provide a wide variety of captions for that region. Then, the UIR will be explained with the caption of the best match bounding box. To the best of our knowledge, this is the first work that provides such a comprehensive output. Our experiments show the superiority of the proposed approach over state-of-the-art interactive segmentation methods on several well-known datasets. In addition, replacement of the bounding boxes with the result of the interactive segmentation leads to a better understanding of the dense image captioning output as well as accuracy enhancement for the object detection in terms of Intersection over Union (IoU).
Submitted 26 July, 2017;
originally announced July 2017.
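The Intersection over Union (IoU) measure named in the abstract above can be illustrated with a minimal sketch. The box format `(x_min, y_min, x_max, y_max)` and the helper `best_match` are assumptions made for illustration only; the abstract does not say how the best-match bounding box is chosen, so picking the highest IoU is just one plausible reading.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_match(region_box, candidate_boxes):
    """Hypothetical helper: index of the candidate with the highest IoU."""
    scores = [iou(region_box, c) for c in candidate_boxes]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    region = (0, 0, 2, 2)
    candidates = [(5, 5, 6, 6), (1, 1, 3, 3), (0, 0, 2, 1)]
    print(best_match(region, candidates))
```

In this toy call the third candidate covers half of the region and nothing else, so it scores higher than the partially overlapping second box.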
-
Fast and Accurate Surface Normal Integration on Non-Rectangular Domains
Authors:
Martin Bähr,
Michael Breuß,
Yvain Quéau,
Ali Sharifi Boroujerdi,
Jean-Denis Durou
Abstract:
The integration of surface normals for the purpose of computing the shape of a surface in 3D space is a classic problem in computer vision. However, even nowadays it is still a challenging task to devise a method that combines the flexibility to work on non-trivial computational domains with high accuracy, robustness and computational efficiency. By uniting a classic approach for surface normal integration with modern computational techniques we construct a solver that fulfils these requirements. Building upon the Poisson integration model we propose to use an iterative Krylov subspace solver as a core step in tackling the task. While such a method can be very efficient, it may only show its full potential when combined with a suitable numerical preconditioning and a problem-specific initialisation. We perform a thorough numerical study in order to identify an appropriate preconditioner for our purpose. To address the issue of a suitable initialisation we propose to compute this initial state via a recently developed fast marching integrator. Detailed numerical experiments illuminate the benefits of this novel combination. In addition, we show on real-world photometric stereo datasets that the developed numerical framework is flexible enough to tackle modern computer vision applications.
Submitted 19 October, 2016;
originally announced October 2016.
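As a rough illustration of the idea in the abstract above (an iterative Krylov subspace solver applied to a Poisson-type system, started from a supplied initial state), here is a minimal sketch using plain conjugate gradients on a 1D model problem. It is not the authors' solver: the 1D matrix tridiag(-1, 2, -1), the absence of a preconditioner, and all function names are assumptions chosen to keep the example small. The `x0` argument marks where a problem-specific initialisation would enter.

```python
def apply_laplacian_1d(u):
    """Apply the matrix tridiag(-1, 2, -1), i.e. a 1D discrete Laplacian
    with zero Dirichlet boundary values (a stand-in model problem)."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * u[i] - left - right)
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(apply_a, b, x0, tol=1e-10, max_iter=1000):
    """Unpreconditioned conjugate gradients for a symmetric positive
    definite operator given as a function; x0 is the initial guess."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, apply_a(x))]  # initial residual
    p = list(r)                                     # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:  # stop once the residual norm is small
            break
        ap = apply_a(p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    u_true = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.5]
    rhs = apply_laplacian_1d(u_true)
    u = conjugate_gradient(apply_laplacian_1d, rhs, [0.0] * len(rhs))
    print(max(abs(a - b) for a, b in zip(u, u_true)))
```

A better initial guess reduces the initial residual, which is the role the abstract assigns to the fast marching integrator; the zero vector is used here only as a neutral default.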
-
Highly Robust Clustering of GPS Driver Data for Energy Efficient Driving Style Modelling
Authors:
Michael Breuß,
Laurent Hoeltgen,
Ali Sharifi Boroujerdi,
Ashkan Mansouri Yarahmadi
Abstract:
This paper presents a novel approach to distinguish driving styles with respect to their energy efficiency. A distinct property of our method is that it relies exclusively on Global Positioning System (GPS) logs of drivers. This setting is highly relevant in practice as these data can easily be acquired.
Relying on positional data alone means that all derived features will be correlated, so we strive to find a single quantity that allows us to perform the driving style analysis. To this end we consider a robust variation of the so-called jerk of a movement. We show that our feature choice outperforms other more commonly used jerk-based formulations, and we discuss the handling of noisy, inconsistent, and incomplete data, as this is a notorious problem when dealing with real-world GPS logs.
Our solving strategy relies on an agglomerative hierarchical clustering combined with an L-term heuristic to determine the relevant number of clusters. It can easily be implemented and performs fast, even on very large, real-world data sets. Experiments show that our approach is robust against noise and able to discern different driving styles.
Submitted 10 October, 2016;
originally announced October 2016.
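To make the jerk feature from the abstract above concrete: jerk is the third time derivative of position, so it can be estimated from timestamped positions by repeated finite differences. The sketch below shows only this plain jerk; the abstract does not specify the robust variation the authors use, and the planar coordinates in metres, the function names, and the handling of bad timestamps are all assumptions.

```python
import math


def finite_difference(vectors, times):
    """Difference quotients of consecutive 2D vectors with respect to time.

    Returns the derivative estimates and the midpoint times they refer to.
    Pairs with a non-increasing timestamp are skipped (an assumed, simple
    way to tolerate duplicate or out-of-order log entries).
    """
    derivs, mids = [], []
    for v0, v1, t0, t1 in zip(vectors, vectors[1:], times, times[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue
        derivs.append(((v1[0] - v0[0]) / dt, (v1[1] - v0[1]) / dt))
        mids.append(0.5 * (t0 + t1))
    return derivs, mids


def jerk_magnitudes(positions, times):
    """Magnitude of the jerk estimated from planar positions (metres)
    and timestamps (seconds) by three successive finite differences."""
    velocity, t_v = finite_difference(positions, times)
    acceleration, t_a = finite_difference(velocity, t_v)
    jerk, _ = finite_difference(acceleration, t_a)
    return [math.hypot(jx, jy) for jx, jy in jerk]


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    positions = [(t ** 3, 0.0) for t in times]  # x(t) = t^3 has constant jerk 6
    print(jerk_magnitudes(positions, times))
```

Each differencing step amplifies measurement noise, which is consistent with the abstract's emphasis on robustness when working with real-world GPS logs.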