Showing 51–88 of 88 results for author: Kannala, J

  1. arXiv:1904.06145  [pdf, other]

    cs.LG cs.CV stat.ML

    Towards Photographic Image Manipulation with Balanced Growing of Generative Autoencoders

    Authors: Ari Heljakka, Arno Solin, Juho Kannala

    Abstract: We present a generative autoencoder that provides fast encoding, faithful reconstructions (e.g. retaining the identity of a face), sharp generated/reconstructed samples in high resolutions, and a well-structured latent space that supports semantic manipulation of the inputs. There are no current autoencoder or GAN models that satisfactorily achieve all of these. We build on the progressively growin…

    Submitted 20 February, 2020; v1 submitted 12 April, 2019; originally announced April 2019.

    Comments: WACV 2020

  2. Digging Deeper into Egocentric Gaze Prediction

    Authors: Hamed R. Tavakoli, Esa Rahtu, Juho Kannala, Ali Borji

    Abstract: This paper digs deeper into factors that influence egocentric gaze. Instead of training deep models for this purpose in a blind manner, we propose to inspect factors that contribute to gaze guidance during daily tasks. Bottom-up saliency and optical flow are assessed versus strong spatial prior baselines. Task-specific cues such as vanishing point, manipulation point, and hand regions are analyzed…

    Submitted 12 April, 2019; originally announced April 2019.

    Comments: presented at WACV 2019

  3. arXiv:1904.01920  [pdf, other]

    cs.CV

    CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis

    Authors: Ahti Kalervo, Juha Ylioinas, Markus Häikiö, Antti Karhu, Juho Kannala

    Abstract: Better understanding and modelling of building interiors and the emergence of more impressive AR/VR technology have brought up the need for automatic parsing of floorplan images. However, there is a clear lack of representative datasets to investigate the problem further. To address this shortcoming, this paper presents a novel image dataset called CubiCasa5K, a large-scale floorplan image dataset…

    Submitted 3 April, 2019; originally announced April 2019.

  4. arXiv:1904.01909  [pdf, other]

    cs.CV

    ICface: Interpretable and Controllable Face Reenactment Using GANs

    Authors: Soumya Tripathy, Juho Kannala, Esa Rahtu

    Abstract: This paper presents a generic face animator that is able to control the pose and expressions of a given face image. The animation is driven by human interpretable control signals consisting of head pose angles and the Action Unit (AU) values. The control information can be obtained from multiple sources including external driving videos and manual controls. Due to the interpretable nature of the d…

    Submitted 17 January, 2020; v1 submitted 3 April, 2019; originally announced April 2019.

    Comments: Accepted in WACV-2020

  5. arXiv:1903.11981  [pdf, other]

    cs.LG cs.RO stat.ML

    Regularizing Trajectory Optimization with Denoising Autoencoders

    Authors: Rinu Boney, Norman Di Palo, Mathias Berglund, Alexander Ilin, Juho Kannala, Antti Rasmus, Harri Valpola

    Abstract: Trajectory optimization using a learned model of the environment is one of the core elements of model-based reinforcement learning. This procedure often suffers from exploiting inaccuracies of the learned model. We propose to regularize trajectory optimization by means of a denoising autoencoder that is trained on the same trajectories as the model of the environment. We show that the proposed reg…

    Submitted 25 December, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

    Comments: NeurIPS 2019

  6. arXiv:1903.03825  [pdf]

    stat.ML cs.AI cs.LG

    Interpolation Consistency Training for Semi-Supervised Learning

    Authors: Vikas Verma, Kenji Kawaguchi, Alex Lamb, Juho Kannala, Arno Solin, Yoshua Bengio, David Lopez-Paz

    Abstract: We introduce Interpolation Consistency Training (ICT), a simple and computation efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm. ICT encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points. In classification problems, ICT moves the decision boundary to low-density reg…

    Submitted 19 October, 2022; v1 submitted 9 March, 2019; originally announced March 2019.

    Comments: This is the latest version, which is published in the Journal, "Neural Networks", in 2022. All the previous results are unchanged. Keyword: Deep Learning, Semi-supervised Learning, Mixup

    Journal ref: Neural Networks, volume 145, pages 90-106 (2022)

  7. arXiv:1902.02166  [pdf, other]

    cs.CV

    Unstructured Multi-View Depth Estimation Using Mask-Based Multiplane Representation

    Authors: Yuxin Hou, Arno Solin, Juho Kannala

    Abstract: This paper presents a novel method, MaskMVS, to solve depth estimation for unstructured multi-view image-pose pairs. In the plane-sweep procedure, the depth planes are sampled by histogram matching that ensures covering the depth range of interest. Unlike other plane-sweep methods, we do not rely on a cost metric to explicitly build the cost volume, but instead infer a multiplane mask representati…

    Submitted 10 April, 2019; v1 submitted 6 February, 2019; originally announced February 2019.

  8. arXiv:1901.10170  [pdf, other]

    cs.CV

    Mask-RCNN and U-net Ensembled for Nuclei Segmentation

    Authors: Aarno Oskar Vuola, Saad Ullah Akram, Juho Kannala

    Abstract: Nuclei segmentation is both an important and in some ways ideal task for modern computer vision methods, e.g. convolutional neural networks. While recent developments in theory and open-source software have made these tools easier to implement, expert knowledge is still required to choose the right model architecture and training setup. We compare two popular segmentation frameworks, U-Net and Mas…

    Submitted 29 January, 2019; originally announced January 2019.

    Comments: To appear in IEEE International Symposium on Biomedical Imaging (ISBI) 2019

  9. arXiv:1901.08341  [pdf, other]

    cs.CV

    Semantic Matching by Weakly Supervised 2D Point Set Registration

    Authors: Zakaria Laskar, Hamed R. Tavakoli, Juho Kannala

    Abstract: In this paper we address the problem of establishing correspondences between different instances of the same object. The problem is posed as finding the geometric transformation that aligns a given image pair. We use a convolutional neural network (CNN) to directly regress the parameters of the transformation model. The alignment problem is defined in the setting where an unordered set of semantic…

    Submitted 24 January, 2019; originally announced January 2019.

    Comments: Accepted to WACV 2019

  10. arXiv:1901.08339  [pdf, other]

    cs.CV

    Semi-Supervised Semantic Matching

    Authors: Zakaria Laskar, Juho Kannala

    Abstract: Convolutional neural networks (CNNs) have been successfully applied to solve the problem of correspondence estimation between semantically related images. Due to the non-availability of large training datasets, existing methods resort to self-supervised or unsupervised training paradigms. In this paper we propose a semi-supervised learning framework that imposes cyclic consistency constraint on unlabel…

    Submitted 24 January, 2019; originally announced January 2019.

    Comments: Accepted to ECCVW (GMDL) 2018

  11. arXiv:1811.09485  [pdf, other]

    cs.CV

    LSD$_2$ -- Joint Denoising and Deblurring of Short and Long Exposure Images with CNNs

    Authors: Janne Mustaniemi, Juho Kannala, Jiri Matas, Simo Särkkä, Janne Heikkilä

    Abstract: The paper addresses the problem of acquiring high-quality photographs with handheld smartphone cameras in low-light imaging conditions. We propose an approach based on capturing pairs of short and long exposure images in rapid succession and fusing them into a single high-quality photograph. Unlike existing methods, we take advantage of both images simultaneously and perform a joint denoising and…

    Submitted 1 September, 2020; v1 submitted 23 November, 2018; originally announced November 2018.

  12. arXiv:1810.08393  [pdf, other]

    cs.CV

    DGC-Net: Dense Geometric Correspondence Network

    Authors: Iaroslav Melekhov, Aleksei Tiulpin, Torsten Sattler, Marc Pollefeys, Esa Rahtu, Juho Kannala

    Abstract: This paper addresses the challenge of dense pixel correspondence estimation between two images. This problem is closely related to optical flow estimation task where ConvNets (CNNs) have recently achieved significant progress. While optical flow methods produce very accurate results for the small pixel translation and limited appearance variation scenarios, they hardly deal with the strong geometr…

    Submitted 22 October, 2018; v1 submitted 19 October, 2018; originally announced October 2018.

    Comments: Supplementary material included; Affiliation section has been changed

  13. arXiv:1810.00986  [pdf, other]

    cs.CV

    Gyroscope-Aided Motion Deblurring with Deep Networks

    Authors: Janne Mustaniemi, Juho Kannala, Simo Särkkä, Jiri Matas, Janne Heikkilä

    Abstract: We propose a deblurring method that incorporates gyroscope measurements into a convolutional neural network (CNN). With the help of such measurements, it can handle extremely strong and spatially-variant motion blur. At the same time, the image data is used to overcome the limitations of gyro-based blur estimation. To train our network, we also introduce a novel way of generating realistic trainin…

    Submitted 23 November, 2018; v1 submitted 1 October, 2018; originally announced October 2018.

  14. arXiv:1808.04999  [pdf, other]

    cs.CV

    Scene Coordinate Regression with Angle-Based Reprojection Loss for Camera Relocalization

    Authors: Xiaotian Li, Juha Ylioinas, Jakob Verbeek, Juho Kannala

    Abstract: Image-based camera relocalization is an important problem in computer vision and robotics. Recent works utilize convolutional neural networks (CNNs) to regress for pixels in a query image their corresponding 3D world coordinates in the scene. The final pose is then solved via a RANSAC-based optimization scheme using the predicted coordinates. Usually, the CNN is trained with ground truth scene coo…

    Submitted 30 September, 2018; v1 submitted 15 August, 2018; originally announced August 2018.

    Comments: ECCV 2018 Workshop (Geometry Meets Deep Learning)

  15. arXiv:1808.03485  [pdf, other]

    cs.CV

    Deep Learning Based Speed Estimation for Constraining Strapdown Inertial Navigation on Smartphones

    Authors: Santiago Cortés, Arno Solin, Juho Kannala

    Abstract: Strapdown inertial navigation systems are sensitive to the quality of the data provided by the accelerometer and gyroscope. Low-grade IMUs in handheld smart-devices pose a problem for inertial odometry on these devices. We propose a scheme for constraining the inertial odometry problem by complementing non-linear state estimation by a CNN-based deep-learning model for inferring the momentary speed…

    Submitted 10 August, 2018; originally announced August 2018.

    Comments: To appear in IEEE International Workshop on Machine Learning for Signal Processing (MLSP) 2018

  16. arXiv:1807.11677  [pdf, other]

    cs.CV

    Leveraging Unlabeled Whole-Slide-Images for Mitosis Detection

    Authors: Saad Ullah Akram, Talha Qaiser, Simon Graham, Juho Kannala, Janne Heikkilä, Nasir Rajpoot

    Abstract: Mitosis count is an important biomarker for prognosis of various cancers. At present, pathologists typically perform manual counting on a few selected regions of interest in breast whole-slide-images (WSIs) of patient biopsies. This task is very time-consuming, tedious and subjective. Automated mitosis detection methods have made great advances in recent years. However, these methods require exhau…

    Submitted 31 July, 2018; originally announced July 2018.

    Comments: Accepted for MICCAI COMPAY 2018 Workshop

  17. arXiv:1807.09828  [pdf, other]

    cs.CV

    ADVIO: An authentic dataset for visual-inertial odometry

    Authors: Santiago Cortés, Arno Solin, Esa Rahtu, Juho Kannala

    Abstract: The lack of realistic and open benchmarking datasets for pedestrian visual-inertial odometry has made it hard to pinpoint differences in published methods. Existing datasets either lack a full six degree-of-freedom ground-truth or are limited to small spaces with optical tracking systems. We take advantage of advances in pure inertial navigation, and develop a set of versatile and challenging real…

    Submitted 25 July, 2018; originally announced July 2018.

    Comments: To appear in European Conference on Computer Vision (ECCV)

  18. arXiv:1807.07741  [pdf, other]

    cs.CL cs.LG stat.ML

    Learning Representations for Soft Skill Matching

    Authors: Luiza Sayfullina, Eric Malmi, Juho Kannala

    Abstract: Employers actively look for talents having not only specific hard skills but also various soft skills. To analyze the soft skill demands on the job market, it is important to be able to detect soft skill phrases from job advertisements automatically. However, a naive matching of soft skill phrases can lead to false positive matches when a soft skill phrase, such as friendly, is used to describe a…

    Submitted 20 July, 2018; originally announced July 2018.

    Comments: Accepted by 7th International Conference - Analysis of Images, Social networks and Texts, http://aistconf.org/ (Best Paper Award)

  19. arXiv:1807.03026  [pdf, other]

    cs.LG cs.CV stat.ML

    Pioneer Networks: Progressively Growing Generative Autoencoder

    Authors: Ari Heljakka, Arno Solin, Juho Kannala

    Abstract: We introduce a novel generative autoencoder network model that learns to encode and reconstruct images with high quality and resolution, and supports smooth random sampling from the latent space of the encoder. Generative adversarial networks (GANs) are known for their ability to simulate random high-quality images, but they cannot reconstruct existing images. Previous works have attempted to exte…

    Submitted 9 October, 2018; v1 submitted 9 July, 2018; originally announced July 2018.

    Comments: To appear in ACCV 2018

  20. arXiv:1805.12506  [pdf, other]

    cs.CV cs.RO

    Robust Gyroscope-Aided Camera Self-Calibration

    Authors: Santiago Cortés Reina, Arno Solin, Juho Kannala

    Abstract: Camera calibration for estimating the intrinsic parameters and lens distortion is a prerequisite for various monocular vision applications including feature tracking and video stabilization. This application paper proposes a model for estimating the parameters on the fly by fusing gyroscope and camera data, both readily available in modern day smartphones. The model is based on joint estimation of…

    Submitted 31 May, 2018; originally announced May 2018.

    Comments: Appearing in Proceedings of the International Conference on Information Fusion (FUSION 2018)

  21. arXiv:1805.08542  [pdf, other]

    cs.CV

    Fast Motion Deblurring for Feature Detection and Matching Using Inertial Measurements

    Authors: Janne Mustaniemi, Juho Kannala, Simo Särkkä, Jiri Matas, Janne Heikkilä

    Abstract: Many computer vision and image processing applications rely on local features. It is well-known that motion blur decreases the performance of traditional feature detectors and descriptors. We propose an inertial-based deblurring method for improving the robustness of existing feature detectors and descriptors against the motion blur. Unlike most deblurring algorithms, the method can handle spatial…

    Submitted 22 May, 2018; originally announced May 2018.

  22. arXiv:1805.03189  [pdf, other]

    cs.CV

    Learning image-to-image translation using paired and unpaired training samples

    Authors: Soumya Tripathy, Juho Kannala, Esa Rahtu

    Abstract: Image-to-image translation is a general name for a task where an image from one domain is converted to a corresponding image in another domain, given sufficient training data. Traditionally different approaches have been proposed depending on whether aligned image pairs or two sets of (unaligned) examples from both domains are available for training. While paired training samples might be difficul…

    Submitted 8 May, 2018; originally announced May 2018.

  23. arXiv:1804.08912  [pdf, other]

    cs.CV

    Accurate 3-D Reconstruction with RGB-D Cameras using Depth Map Fusion and Pose Refinement

    Authors: Markus Ylimäki, Juho Kannala, Janne Heikkilä

    Abstract: Depth map fusion is an essential part in both stereo and RGB-D based 3-D reconstruction pipelines. Whether produced with a passive stereo reconstruction or using an active depth sensor, such as Microsoft Kinect, the depth maps have noise and may have poor initial registration. In this paper, we introduce a method which is capable of handling outliers, and especially, even significant registration…

    Submitted 24 April, 2018; originally announced April 2018.

    Comments: Accepted to ICPR 2018

  24. arXiv:1802.07351  [pdf, other]

    cs.CV

    Devon: Deformable Volume Network for Learning Optical Flow

    Authors: Yao Lu, Jack Valmadre, Heng Wang, Juho Kannala, Mehrtash Harandi, Philip H. S. Torr

    Abstract: State-of-the-art neural network models estimate large displacement optical flow in multi-resolution and use warping to propagate the estimation between two resolutions. Despite their impressive results, it is known that there are two problems with the approach. First, the multi-resolution estimation of optical flow fails in situations where small objects move fast. Second, warping creates artifact…

    Submitted 4 March, 2019; v1 submitted 20 February, 2018; originally announced February 2018.

  25. arXiv:1802.05023  [pdf, other]

    cs.CV

    Recursive Chaining of Reversible Image-to-image Translators For Face Aging

    Authors: Ari Heljakka, Arno Solin, Juho Kannala

    Abstract: This paper addresses the modeling and simulation of progressive changes over time, such as human face aging. By treating the age phases as a sequence of image domains, we construct a chain of transformers that map images from one age domain to the next. Leveraging recent adversarial image translation methods, our approach requires no training samples of the same individual at different ages. Here,…

    Submitted 6 August, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

    Comments: To appear in Advanced Concepts for Intelligent Vision Systems (ACIVS) 2018

  26. arXiv:1802.03237  [pdf, other]

    cs.CV

    Full-Frame Scene Coordinate Regression for Image-Based Localization

    Authors: Xiaotian Li, Juha Ylioinas, Juho Kannala

    Abstract: Image-based localization, or camera relocalization, is a fundamental problem in computer vision and robotics, and it refers to estimating camera pose from an image. Recent state-of-the-art approaches use learning based methods, such as Random Forests (RFs) and Convolutional Neural Networks (CNNs), to regress for each pixel in the image its corresponding position in the scene's world coordinate fra…

    Submitted 25 June, 2018; v1 submitted 9 February, 2018; originally announced February 2018.

    Comments: RSS 2018

  27. arXiv:1710.11359  [pdf, other]

    cs.CV

    Image Patch Matching Using Convolutional Descriptors with Euclidean Distance

    Authors: Iaroslav Melekhov, Juho Kannala, Esa Rahtu

    Abstract: In this work we propose a neural network based image descriptor suitable for image patch matching, which is an important task in many computer vision applications. Our approach is influenced by recent success of deep convolutional neural networks (CNNs) in object detection and classification tasks. We develop a model which maps the raw input patch to a low dimensional feature vector so that the di…

    Submitted 31 October, 2017; originally announced October 2017.

    Comments: The paper was published in ACCV 2016 Workshops proceedings (Workshop on Interpretation and Visualization of Deep Neural Nets)

  28. arXiv:1708.00894  [pdf, other]

    cs.CV

    PIVO: Probabilistic Inertial-Visual Odometry for Occlusion-Robust Navigation

    Authors: Arno Solin, Santiago Cortes, Esa Rahtu, Juho Kannala

    Abstract: This paper presents a novel method for visual-inertial odometry. The method is based on an information fusion framework employing low-cost IMU sensors and the monocular camera in a standard smartphone. We formulate a sequential inference scheme, where the IMU drives the dynamical model and the camera frames are used in coupling trailing sequences of augmented poses. The novelty in the model is in…

    Submitted 23 January, 2018; v1 submitted 2 August, 2017; originally announced August 2017.

    Comments: 10 pages, 4 figures. Paper to be published in WACV 2018

  29. arXiv:1707.09733  [pdf, other]

    cs.CV

    Camera Relocalization by Computing Pairwise Relative Poses Using Convolutional Neural Network

    Authors: Zakaria Laskar, Iaroslav Melekhov, Surya Kalia, Juho Kannala

    Abstract: We propose a new deep learning based approach for camera relocalization. Our approach localizes a given query image by using a convolutional neural network (CNN) for first retrieving similar database images and then predicting the relative pose between the query and the database images, whose poses are known. The camera location for the query image is obtained via triangulation from two relative t…

    Submitted 1 August, 2017; v1 submitted 31 July, 2017; originally announced July 2017.

  30. arXiv:1705.05665  [pdf, other]

    cs.CV cs.LG

    Learning Image Relations with Contrast Association Networks

    Authors: Yao Lu, Zhirong Yang, Juho Kannala, Samuel Kaski

    Abstract: Inferring the relations between two images is an important class of tasks in computer vision. Examples of such tasks include computing optical flow and stereo disparity. We treat the relation inference tasks as a machine learning problem and tackle it with neural networks. A key to the problem is learning a representation of relations. We propose a new neural network module, contrast association u…

    Submitted 11 March, 2019; v1 submitted 16 May, 2017; originally announced May 2017.

  31. arXiv:1705.03386  [pdf, other]

    cs.CV

    Cell Tracking via Proposal Generation and Selection

    Authors: Saad Ullah Akram, Juho Kannala, Lauri Eklund, Janne Heikkilä

    Abstract: Microscopy imaging plays a vital role in understanding many biological processes in development and disease. The recent advances in automation of microscopes and development of methods and markers for live cell imaging have led to rapid growth in the amount of image data being captured. To efficiently and reliably extract useful insights from these captured sequences, automated cell tracking is ess…

    Submitted 9 May, 2017; originally announced May 2017.

  32. arXiv:1703.07971  [pdf, other]

    cs.CV

    Image-based Localization using Hourglass Networks

    Authors: Iaroslav Melekhov, Juha Ylioinas, Juho Kannala, Esa Rahtu

    Abstract: In this paper, we propose an encoder-decoder convolutional neural network (CNN) architecture for estimating camera pose (orientation and location) from a single RGB-image. The architecture has an hourglass shape consisting of a chain of convolution and up-convolution layers followed by a regression part. The up-convolution layers are introduced to preserve the fine-grained information of the input…

    Submitted 24 August, 2017; v1 submitted 23 March, 2017; originally announced March 2017.

    Comments: Camera-ready version for ICCVW 2017 (fixed glitches in abstract)

  33. arXiv:1703.01226  [pdf, other]

    cs.CV

    Context Aware Query Image Representation for Particular Object Retrieval

    Authors: Zakaria Laskar, Juho Kannala

    Abstract: The current models of image representation based on Convolutional Neural Networks (CNN) have shown tremendous performance in image retrieval. Such models are inspired by the information flow along the visual pathway in the human visual cortex. We propose that in the field of particular object retrieval, the process of extracting CNN representations from query images with a given region of interest…

    Submitted 3 March, 2017; originally announced March 2017.

    Comments: 14 pages, Extended version of a manuscript submitted to SCIA 2017

    ACM Class: I.5.4

  34. arXiv:1703.00154  [pdf, other]

    cs.CV stat.AP

    Inertial Odometry on Handheld Smartphones

    Authors: Arno Solin, Santiago Cortes, Esa Rahtu, Juho Kannala

    Abstract: Building a complete inertial navigation system using the limited quality data provided by current smartphones has been regarded as challenging, if not impossible. This paper shows that by careful crafting and accounting for the weak information in the sensor samples, smartphones are capable of pure inertial navigation. We present a probabilistic approach for orientation and use-case free inertial odo…

    Submitted 7 June, 2018; v1 submitted 1 March, 2017; originally announced March 2017.

    Comments: Appearing in Proceedings of the International Conference on Information Fusion (FUSION 2018)

  35. arXiv:1702.01381  [pdf, other]

    cs.CV

    Relative Camera Pose Estimation Using Convolutional Neural Networks

    Authors: Iaroslav Melekhov, Juha Ylioinas, Juho Kannala, Esa Rahtu

    Abstract: This paper presents a convolutional neural network based approach for estimating the relative pose between two cameras. The proposed network takes RGB images from both cameras as input and directly produces the relative rotation and translation as output. The system is trained in an end-to-end manner utilising transfer learning from a large scale classification dataset. The introduced approach is…

    Submitted 28 July, 2017; v1 submitted 5 February, 2017; originally announced February 2017.

    Comments: To be published in proceedings of Advanced Concepts for Intelligent Vision Systems (ACIVS) 2017

  36. arXiv:1611.09498  [pdf, other]

    cs.CV

    Inertial-Based Scale Estimation for Structure from Motion on Mobile Devices

    Authors: Janne Mustaniemi, Juho Kannala, Simo Särkkä, Jiri Matas, Janne Heikkilä

    Abstract: Structure from motion algorithms have an inherent limitation that the reconstruction can only be determined up to the unknown scale factor. Modern mobile devices are equipped with an inertial measurement unit (IMU), which can be used for estimating the scale of the reconstruction. We propose a method that recovers the metric scale given inertial measurements and camera poses. In the process, we al…

    Submitted 11 August, 2017; v1 submitted 29 November, 2016; originally announced November 2016.

  37. arXiv:1609.07420  [pdf, other]

    cs.CV

    Real-time Human Pose Estimation from Video with Convolutional Neural Networks

    Authors: Marko Linna, Juho Kannala, Esa Rahtu

    Abstract: In this paper, we present a method for real-time multi-person human pose estimation from video by utilizing convolutional neural networks. Our method is aimed at use-case-specific applications, where good accuracy is essential and variation of the background and poses is limited. This enables us to use a generic network architecture, which is both accurate and fast. We divide the problem into two…

    Submitted 23 September, 2016; originally announced September 2016.

    Comments: 16 pages

  38. arXiv:1306.5151  [pdf, other]

    cs.CV

    Fine-Grained Visual Classification of Aircraft

    Authors: Subhransu Maji, Esa Rahtu, Juho Kannala, Matthew Blaschko, Andrea Vedaldi

    Abstract: This paper introduces FGVC-Aircraft, a new dataset containing 10,000 images of aircraft spanning 100 aircraft models, organised in a three-level hierarchy. At the finer level, differences between models are often subtle but always visually measurable, making visual recognition challenging but possible. A benchmark is obtained by defining corresponding classification tasks and evaluation protocols,…

    Submitted 21 June, 2013; originally announced June 2013.