Showing 1–50 of 85 results for author: Kannala, J

Searching in archive cs.
  1. arXiv:2407.01726  [pdf]

    cs.CV

    Grouped Discrete Representation Guides Object-Centric Learning

    Authors: Rongzhen Zhao, Vivienne Wang, Juho Kannala, Joni Pajarinen

    Abstract: Similar to humans perceiving visual scenes as objects, Object-Centric Learning (OCL) can abstract dense images or videos into sparse object-level features. Transformer-based OCL handles complex textures well due to the decoding guidance of discrete representation, obtained by discretizing noisy features in image or video feature maps using template features from a codebook. However, treating featu…

    Submitted 1 July, 2024; originally announced July 2024.

    ACM Class: I.4.6

  2. arXiv:2404.17324  [pdf, other]

    cs.CV

    Dense Road Surface Grip Map Prediction from Multimodal Image Data

    Authors: Jyri Maanpää, Julius Pesonen, Heikki Hyyti, Iaroslav Melekhov, Juho Kannala, Petri Manninen, Antero Kukko, Juha Hyyppä

    Abstract: Slippery road weather conditions are prevalent in many regions and cause a regular risk for traffic. Still, there has been less research on how autonomous vehicles could detect slippery driving conditions on the road to drive safely. In this work, we propose a method to predict a dense grip map from the area in front of the car, based on postprocessed multimodal sensor data. We trained a convoluti…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 17 pages, 7 figures (supplementary material 1 page, 1 figure). Submitted to 27th International Conference of Pattern Recognition (ICPR 2024)

  3. arXiv:2403.17822  [pdf, other]

    cs.CV

    DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing

    Authors: Matias Turkulainen, Xuqian Ren, Iaroslav Melekhov, Otto Seiskari, Esa Rahtu, Juho Kannala

    Abstract: 3D Gaussian splatting, a novel differentiable rendering technique, has achieved state-of-the-art novel view synthesis results with high rendering speeds and relatively low training times. However, its performance on scenes commonly seen in indoor datasets is poor due to the lack of geometric constraints during optimization. We extend 3D Gaussian splatting with depth and normal cues to tackle chall…

    Submitted 26 March, 2024; originally announced March 2024.

  4. arXiv:2403.13327  [pdf, other]

    cs.CV

    Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion

    Authors: Otto Seiskari, Jerry Ylilammi, Valtteri Kaatrasalo, Pekka Rantalankila, Matias Turkulainen, Juho Kannala, Arno Solin

    Abstract: High-quality scene reconstruction and novel view synthesis based on Gaussian Splatting (3DGS) typically require steady, high-quality photographs, often impractical to capture with handheld cameras. We present a method that adapts to camera motion and allows high-quality scene reconstruction with handheld video data suffering from motion blur and rolling shutter distortion. Our approach is based on…

    Submitted 24 May, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: Source code available at https://github.com/SpectacularAI/3dgs-deblur

  5. arXiv:2311.02778  [pdf, other]

    cs.CV

    MuSHRoom: Multi-Sensor Hybrid Room Dataset for Joint 3D Reconstruction and Novel View Synthesis

    Authors: Xuqian Ren, Wenjia Wang, Dingding Cai, Tuuli Tuominen, Juho Kannala, Esa Rahtu

    Abstract: Metaverse technologies demand accurate, real-time, and immersive modeling on consumer-grade hardware for both non-human perception (e.g., drone/robot/autonomous car navigation) and immersive technologies like AR/VR, requiring both structural accuracy and photorealism. However, there exists a knowledge gap in how to apply geometric reconstruction and photorealism modeling (novel view synthesis) in…

    Submitted 19 March, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

  6. arXiv:2311.01953  [pdf, other]

    cs.LG cs.MA

    Optimistic Multi-Agent Policy Gradient

    Authors: Wenshuai Zhao, Yi Zhao, Zhiyuan Li, Juho Kannala, Joni Pajarinen

    Abstract: *Relative overgeneralization* (RO) occurs in cooperative multi-agent learning tasks when agents converge towards a suboptimal joint policy due to overfitting to suboptimal behavior of other agents. No methods have been proposed for addressing RO in multi-agent policy gradient (MAPG) methods although these methods produce state-of-the-art results. To address this gap, we propose a general, yet simp…

    Submitted 25 May, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: Published at ICML 2024, 17 pages, 10 figures

  7. arXiv:2310.15128  [pdf, other]

    cs.CV cs.LG quant-ph

    Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients

    Authors: Maximilian Krahn, Michelle Sasdelli, Fengyi Yang, Vladislav Golyanik, Juho Kannala, Tat-Jun Chin, Tolga Birdal

    Abstract: We present QP-SBGD, a novel layer-wise stochastic optimiser tailored towards training neural networks with binary weights, known as binary neural networks (BNNs), on quantum hardware. BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy. However, training them in practice remains an open challenge. Most known BNN-optimisers…

    Submitted 23 October, 2023; originally announced October 2023.

  8. arXiv:2306.12547  [pdf, other]

    cs.CV

    DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching

    Authors: Shuzhe Wang, Juho Kannala, Daniel Barath

    Abstract: Matching 2D keypoints in an image to a sparse 3D point cloud of the scene without requiring visual descriptors has garnered increased interest due to its low memory requirements, inherent privacy preservation, and reduced need for expensive 3D model maintenance compared to visual descriptor-based methods. However, existing algorithms often compromise on performance, resulting in a significant dete…

    Submitted 24 March, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: CVPR 2024

  9. arXiv:2306.09466  [pdf, other]

    cs.LG cs.RO

    Simplified Temporal Consistency Reinforcement Learning

    Authors: Yi Zhao, Wenshuai Zhao, Rinu Boney, Juho Kannala, Joni Pajarinen

    Abstract: Reinforcement learning is able to solve complex sequential decision-making tasks but is currently limited by sample efficiency and required computation. To improve sample efficiency, recent work focuses on model-based RL which interleaves model learning with planning. Recent methods further utilize policy learning, value estimation, and self-supervised learning as auxiliary objectives. In this pa…

    Submitted 15 June, 2023; originally announced June 2023.

  10. arXiv:2305.03595  [pdf, other]

    cs.CV

    HSCNet++: Hierarchical Scene Coordinate Classification and Regression for Visual Localization with Transformer

    Authors: Shuzhe Wang, Zakaria Laskar, Iaroslav Melekhov, Xiaotian Li, Yi Zhao, Giorgos Tolias, Juho Kannala

    Abstract: Visual localization is critical to many applications in computer vision and robotics. To address single-image RGB localization, state-of-the-art feature-based methods match local descriptors between a query image and a pre-built 3D model. Recently, deep neural networks have been exploited to regress the mapping between raw pixels and 3D coordinates in the scene, and thus the matching is implicitly…

    Submitted 5 May, 2023; originally announced May 2023.

  11. arXiv:2302.09825  [pdf, other]

    cs.CV

    TBPos: Dataset for Large-Scale Precision Visual Localization

    Authors: Masud Fahim, Ilona Söchting, Luca Ferranti, Juho Kannala, Jani Boutellier

    Abstract: Image based localization is a classical computer vision challenge, with several well-known datasets. Generally, datasets consist of a visual 3D database that captures the modeled scenery, as well as query images whose 3D pose is to be discovered. Usually the query images have been acquired with a camera that differs from the imaging hardware used to collect the 3D database; consequently, it is har…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: Scandinavian Conference on Image Analysis 2023

  12. arXiv:2301.01057  [pdf, other]

    cs.CV

    BS3D: Building-scale 3D Reconstruction from RGB-D Images

    Authors: Janne Mustaniemi, Juho Kannala, Esa Rahtu, Li Liu, Janne Heikkilä

    Abstract: Various datasets have been proposed for simultaneous localization and mapping (SLAM) and related problems. Existing datasets often include small environments, have incomplete ground truth, or lack important sensor data, such as depth and infrared images. We propose an easy-to-use framework for acquiring building-scale 3D reconstruction using a consumer depth camera. Unlike complex and expensive ac…

    Submitted 3 January, 2023; originally announced January 2023.

  13. arXiv:2212.13381  [pdf, other]

    cs.LG cs.CV

    MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

    Authors: Yingtian Zou, Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham, Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi

    Abstract: Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional deri…

    Submitted 15 October, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

    Comments: 16 pages, Best Student Paper Award at UAI 2023

  14. arXiv:2211.15656  [pdf, other]

    cs.CV cs.RO

    SuperFusion: Multilevel LiDAR-Camera Fusion for Long-Range HD Map Generation

    Authors: Hao Dong, Xianjing Zhang, Jintao Xu, Rui Ai, Weihao Gu, Huimin Lu, Juho Kannala, Xieyuanli Chen

    Abstract: High-definition (HD) semantic map generation of the environment is an essential component of autonomous driving. Existing methods have achieved good performance in this task by fusing different sensor modalities, such as LiDAR and camera. However, current works are based on raw data or network feature-level fusion and only consider short-range HD map generation, limiting their deployment to realis…

    Submitted 16 March, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

  15. arXiv:2211.00392  [pdf, other]

    cs.CV

    Expansion of Visual Hints for Improved Generalization in Stereo Matching

    Authors: Andrea Pilzer, Yuxin Hou, Niki Loppi, Arno Solin, Juho Kannala

    Abstract: We introduce visual hints expansion for guiding stereo matching to improve generalization. Our work is motivated by the robustness of Visual Inertial Odometry (VIO) in computer vision and robotics, where a sparse and unevenly distributed set of feature points characterizes a scene. To improve stereo matching, we propose to elevate 2D hints to 3D points. These sparse and unevenly distributed 3D vis…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: 2023 IEEE Winter Conference on Applications of Computer Vision (WACV)

  16. arXiv:2210.13846  [pdf, other]

    cs.LG cs.RO

    Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning

    Authors: Yi Zhao, Rinu Boney, Alexander Ilin, Juho Kannala, Joni Pajarinen

    Abstract: Offline reinforcement learning, by learning from a fixed dataset, makes it possible to learn agent behaviors without interacting with the environment. However, depending on the quality of the offline dataset, such pre-trained agents may have limited performance and would further need to be fine-tuned online by interacting with the environment. During online fine-tuning, the performance of the pre-…

    Submitted 25 October, 2022; originally announced October 2022.

  17. arXiv:2210.01426  [pdf, other]

    cs.AI cs.LG cs.RO

    Continuous Monte Carlo Graph Search

    Authors: Kalle Kujanpää, Amin Babadi, Yi Zhao, Juho Kannala, Alexander Ilin, Joni Pajarinen

    Abstract: Online planning is crucial for high performance in many complex sequential decision-making tasks. Monte Carlo Tree Search (MCTS) employs a principled mechanism for trading off exploration for exploitation for efficient online planning, and it outperforms comparison methods in many discrete decision-making domains such as Go, Chess, and Shogi. Subsequently, extensions of MCTS to continuous domains…

    Submitted 7 February, 2024; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: Accepted at AAMAS 2024 (full paper & oral)

  18. arXiv:2208.07591  [pdf, other]

    cs.CV cs.LG

    Uncertainty-guided Source-free Domain Adaptation

    Authors: Subhankar Roy, Martin Trapp, Andrea Pilzer, Juho Kannala, Nicu Sebe, Elisa Ricci, Arno Solin

    Abstract: Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabelled target data set by only using a pre-trained source model. However, the absence of the source data and the domain shift make the predictions on the target data unreliable. We propose quantifying the uncertainty in the source model predictions and utilizing it to guide the target adaptation. For this, we construct a pr…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: ECCV 2022

  19. arXiv:2208.06933  [pdf, other]

    cs.CV

    Visual Localization via Few-Shot Scene Region Classification

    Authors: Siyan Dong, Shuzhe Wang, Yixin Zhuang, Juho Kannala, Marc Pollefeys, Baoquan Chen

    Abstract: Visual (re)localization addresses the problem of estimating the 6-DoF (Degree of Freedom) camera pose of a query image captured in a known scene, which is a key building block of many computer vision and robotics applications. Recent advances in structure-based localization solve this problem by memorizing the mapping from image pixels to scene coordinates with neural networks to build 2D-3D corre…

    Submitted 14 August, 2022; originally announced August 2022.

    Comments: 3DV 2022

  20. arXiv:2206.08890  [pdf, other]

    cs.LG cs.CV

    Disentangling Model Multiplicity in Deep Learning

    Authors: Ari Heljakka, Martin Trapp, Juho Kannala, Arno Solin

    Abstract: Model multiplicity is a well-known but poorly understood phenomenon that undermines the generalisation guarantees of machine learning models. It appears when two models with similar training-time performance differ in their predictions and real-world performance characteristics. This observed 'predictive' multiplicity (PM) also implies elusive differences in the internals of the models, their 'rep…

    Submitted 31 January, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: 13 pages, 6 figures

  21. arXiv:2205.11299  [pdf, other]

    cs.SD eess.AS eess.SP

    Multiple Offsets Multilateration: a new paradigm for sensor network calibration with unsynchronized reference nodes

    Authors: Luca Ferranti, Kalle Åström, Magnus Oskarsson, Jani Boutellier, Juho Kannala

    Abstract: Positioning using wave signal measurements is used in several applications, such as GPS systems, structure from sound and Wifi based positioning. Mathematically, such problems require the computation of the positions of receivers and/or transmitters as well as time offsets if the devices are unsynchronized. In this paper, we expand the previous state-of-the-art on positioning formulations by intro…

    Submitted 23 May, 2022; originally announced May 2022.

    Comments: accepted to ICASSP2022

  22. Bridging the gap between paired and unpaired medical image translation

    Authors: Pauliina Paavilainen, Saad Ullah Akram, Juho Kannala

    Abstract: Medical image translation has the potential to reduce the imaging workload, by removing the need to capture some sequences, and to reduce the annotation burden for developing machine learning methods. GANs have been used successfully to translate images from one domain to another, such as MR to CT. At present, paired data (registered MR and CT images) or extra supervision (e.g. segmentation masks)…

    Submitted 15 October, 2021; originally announced October 2021.

    Comments: Deep Generative Models for MICCAI (DGM4MICCAI) workshop 2021

  23. arXiv:2110.04773  [pdf, other]

    cs.CV

    Digging Into Self-Supervised Learning of Feature Descriptors

    Authors: Iaroslav Melekhov, Zakaria Laskar, Xiaotian Li, Shuzhe Wang, Juho Kannala

    Abstract: Fully-supervised CNN-based approaches for learning local image descriptors have shown remarkable results in a wide range of geometric tasks. However, most of them require per-pixel ground-truth keypoint correspondence data which is difficult to acquire at scale. To address this challenge, recent weakly- and self-supervised methods can learn feature descriptors from relative camera poses or using o…

    Submitted 10 October, 2021; originally announced October 2021.

    Comments: Camera ready (3DV 2021)

  24. arXiv:2108.09112  [pdf, other]

    cs.CV

    Continual Learning for Image-Based Camera Localization

    Authors: Shuzhe Wang, Zakaria Laskar, Iaroslav Melekhov, Xiaotian Li, Juho Kannala

    Abstract: For several emerging technologies such as augmented reality, autonomous driving and robotics, visual localization is a critical component. Directly regressing camera pose/3D scene coordinates from the input image using deep neural networks has shown great potential. However, such methods assume a stationary data distribution with all scenes simultaneously available during training. In this paper,…

    Submitted 27 April, 2022; v1 submitted 20 August, 2021; originally announced August 2021.

    Comments: ICCV 2021

  25. HybVIO: Pushing the Limits of Real-time Visual-inertial Odometry

    Authors: Otto Seiskari, Pekka Rantalankila, Juho Kannala, Jerry Ylilammi, Esa Rahtu, Arno Solin

    Abstract: We present HybVIO, a novel hybrid approach for combining filtering-based visual-inertial odometry (VIO) with optimization-based SLAM. The core of our method is highly robust, independent VIO with improved IMU bias modeling, outlier rejection, stationarity detection, and feature track selection, which is adjustable to run on embedded hardware. Long-term consistency is achieved with a loosely-couple…

    Submitted 25 November, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: 2022 IEEE Winter Conference on Applications of Computer Vision (WACV)

  26. arXiv:2106.07995  [pdf, other]

    cs.LG cs.CV cs.RO

    Learning of feature points without additional supervision improves reinforcement learning from images

    Authors: Rinu Boney, Alexander Ilin, Juho Kannala

    Abstract: In many control problems that include vision, optimal controls can be inferred from the location of the objects in the scene. This information can be represented using feature points, which is a list of spatial locations in learned feature maps of an input image. Previous works show that feature points learned using unsupervised pre-training or human supervision can provide good features for contr…

    Submitted 4 June, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

  27. arXiv:2104.03117  [pdf, other]

    cs.CV

    Single Source One Shot Reenactment using Weighted motion From Paired Feature Points

    Authors: Soumya Tripathy, Juho Kannala, Esa Rahtu

    Abstract: Image reenactment is a task where the target object in the source image imitates the motion represented in the driving image. One of the most common reenactment tasks is face image animation. The major challenge in the current face reenactment approaches is to distinguish between facial motion and identity. For this reason, the previous models struggle to produce high-quality animations if the dri…

    Submitted 7 April, 2021; originally announced April 2021.

  28. arXiv:2101.01619  [pdf, other]

    cs.CV

    Novel View Synthesis via Depth-guided Skip Connections

    Authors: Yuxin Hou, Arno Solin, Juho Kannala

    Abstract: We introduce a principled approach for synthesizing new views of a scene given a single source image. Previous methods for novel view synthesis can be divided into image-based rendering methods (e.g. flow prediction) or pixel generation methods. Flow predictions enable the target view to re-use pixels directly, but can easily lead to distorted results. Directly regressing pixels can produce struct…

    Submitted 5 January, 2021; originally announced January 2021.

  29. arXiv:2012.12186  [pdf, other]

    cs.AI

    Learning to Play Imperfect-Information Games by Imitating an Oracle Planner

    Authors: Rinu Boney, Alexander Ilin, Juho Kannala, Jarno Seppänen

    Abstract: We consider learning to play multiplayer imperfect-information games with simultaneous moves and large state-action spaces. Previous attempts to tackle such challenging games have largely focused on model-free learning methods, often requiring hundreds of years of experience to produce competitive agents. Our approach is based on model-based planning. We tackle the problem of partial observability…

    Submitted 22 December, 2020; originally announced December 2020.

  30. arXiv:2011.04439  [pdf, other]

    cs.CV

    FACEGAN: Facial Attribute Controllable rEenactment GAN

    Authors: Soumya Tripathy, Juho Kannala, Esa Rahtu

    Abstract: Face reenactment is a popular facial animation method where the person's identity is taken from the source image and the facial motion from the driving image. Recent works have demonstrated high-quality results by combining facial-landmark-based motion representations with generative adversarial networks. These models perform best if the source and driving images depict the same person…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: Accepted to WACV-2021

  31. arXiv:2011.03085  [pdf, other]

    cs.RO cs.AI

    RealAnt: An Open-Source Low-Cost Quadruped for Education and Research in Real-World Reinforcement Learning

    Authors: Rinu Boney, Jussi Sainio, Mikko Kaivola, Arno Solin, Juho Kannala

    Abstract: Current robot platforms available for research are either very expensive or unable to handle the abuse of exploratory controls in reinforcement learning. We develop RealAnt, a minimal low-cost physical version of the popular 'Ant' benchmark used in reinforcement learning. RealAnt costs only ~350 EUR ($410) in materials and can be assembled in less than an hour. We validate the platform with…

    Submitted 4 June, 2022; v1 submitted 5 November, 2020; originally announced November 2020.

  32. arXiv:2010.09105  [pdf, other]

    cs.CV

    Movement-induced Priors for Deep Stereo

    Authors: Yuxin Hou, Muhammad Kamran Janjua, Juho Kannala, Arno Solin

    Abstract: We propose a method for fusing stereo disparity estimation with movement-induced prior information. Instead of independent inference frame-by-frame, we formulate the problem as a non-parametric learning task in terms of a temporal Gaussian process prior with a movement-driven kernel for inter-frame reasoning. We present a hierarchy of three Gaussian process kernels depending on the availability of…

    Submitted 18 October, 2020; originally announced October 2020.

  33. arXiv:2010.00347  [pdf, other]

    cs.CV

    Can You Trust Your Pose? Confidence Estimation in Visual Localization

    Authors: Luca Ferranti, Xiaotian Li, Jani Boutellier, Juho Kannala

    Abstract: Camera pose estimation in large-scale environments is still an open question and, despite recent promising results, it may still fail in some situations. The research so far has focused on improving subcomponents of estimation pipelines, to achieve more accurate poses. However, there is no guarantee for the result to be correct, even though the correctness of pose estimation is critically importan…

    Submitted 1 October, 2020; originally announced October 2020.

    Comments: To appear in ICPR 2020

  34. arXiv:2008.06959  [pdf, other]

    cs.CV

    Image Stylization for Robust Features

    Authors: Iaroslav Melekhov, Gabriel J. Brostow, Juho Kannala, Daniyar Turmukhambetov

    Abstract: Local features that are robust to both viewpoint and appearance changes are crucial for many computer vision tasks. In this work we investigate if photorealistic image stylization improves robustness of local features to not only day-night, but also weather and season variations. We show that image stylization in addition to color augmentation is a powerful method of learning robust features. We e…

    Submitted 16 August, 2020; originally announced August 2020.

    Comments: v1.1

  35. arXiv:2008.00715  [pdf, other]

    cs.RO

    Learning to Drive (L2D) as a Low-Cost Benchmark for Real-World Reinforcement Learning

    Authors: Ari Viitala, Rinu Boney, Yi Zhao, Alexander Ilin, Juho Kannala

    Abstract: We present Learning to Drive (L2D), a low-cost benchmark for real-world reinforcement learning (RL). L2D involves a simple and reproducible experimental setup where an RL agent has to learn to drive a Donkey car around three miniature tracks, given only monocular image observations and speed of the car. The agent has to learn to drive from disengagements, which occur when it drives off the track.…

    Submitted 6 November, 2020; v1 submitted 3 August, 2020; originally announced August 2020.

  36. arXiv:2007.05299  [pdf, other]

    cs.CV

    Data-Efficient Ranking Distillation for Image Retrieval

    Authors: Zakaria Laskar, Juho Kannala

    Abstract: Recent advances in deep learning have led to rapid developments in the field of image retrieval. However, the best performing architectures incur significant computational cost. Recent approaches tackle this issue using knowledge distillation to transfer knowledge from a deeper and heavier architecture to a much smaller network. In this paper we address knowledge distillation for metric learning p…

    Submitted 13 July, 2020; v1 submitted 10 July, 2020; originally announced July 2020.

    Comments: 10 pages, 2 figures. Edited figure 7

  37. arXiv:2006.02158  [pdf, other]

    cs.CV

    Interpolation-based semi-supervised learning for object detection

    Authors: Jisoo Jeong, Vikas Verma, Minsung Hyun, Juho Kannala, Nojun Kwak

    Abstract: Despite the data labeling cost for the object detection tasks being substantially more than that of the classification tasks, semi-supervised learning methods for object detection have not been studied much. In this paper, we propose an Interpolation-based Semi-supervised learning method for object Detection (ISD), which considers and solves the problems caused by applying conventional Interpolati…

    Submitted 29 December, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

  38. arXiv:2005.10298  [pdf, ps, other]

    eess.SP cs.NI

    Sensor Networks TDOA Self-Calibration: 2D Complexity Analysis and Solutions

    Authors: Luca Ferranti, Kalle Åström, Magnus Oskarsson, Jani Boutellier, Juho Kannala

    Abstract: Given a network of receivers and transmitters, the process of determining their positions from measured pseudoranges is known as network self-calibration. In this paper we consider 2D networks with synchronized receivers but unsynchronized transmitters and the corresponding calibration techniques, known as Time-Difference-Of-Arrival (TDOA) techniques. Despite previous work, TDOA self-calibration i…

    Submitted 22 October, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

  39. arXiv:1912.10321  [pdf, other]

    cs.LG cs.CV stat.ML

    Deep Automodulators

    Authors: Ari Heljakka, Yuxin Hou, Juho Kannala, Arno Solin

    Abstract: We introduce a new category of generative autoencoders called automodulators. These networks can faithfully reproduce individual real-world input images like regular autoencoders, but also generate a fused sample from an arbitrary combination of several such images, allowing instantaneous 'style-mixing' and other new applications. An automodulator decouples the data flow of decoder operations from…

    Submitted 29 October, 2020; v1 submitted 21 December, 2019; originally announced December 2019.

    Comments: To appear in Advances in Neural Information Processing Systems (NeurIPS 2020)

  40. arXiv:1910.05527  [pdf, other]

    cs.LG cs.RO stat.ML

    Regularizing Model-Based Planning with Energy-Based Models

    Authors: Rinu Boney, Juho Kannala, Alexander Ilin

    Abstract: Model-based reinforcement learning could enable sample-efficient learning by quickly acquiring rich knowledge about the world and using it to improve behaviour without additional data. Learned dynamics models can be directly used for planning actions but this has been challenging because of inaccuracies in the learned models. In this paper, we focus on planning with learned dynamics models and pro…

    Submitted 12 October, 2019; originally announced October 2019.

    Comments: Conference on Robot Learning 2019

  41. arXiv:1909.11715  [pdf, other]

    cs.LG stat.ML

    GraphMix: Improved Training of GNNs for Semi-Supervised Learning

    Authors: Vikas Verma, Meng Qu, Kenji Kawaguchi, Alex Lamb, Yoshua Bengio, Juho Kannala, Jian Tang

    Abstract: We present GraphMix, a regularization method for Graph Neural Network based semi-supervised object classification, whereby we propose to train a fully-connected network jointly with the graph neural network via parameter sharing and interpolation-based regularization. Further, we provide a theoretical analysis of how GraphMix improves the generalization bounds of the underlying graph neural networ…

    Submitted 8 October, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

    Comments: https://github.com/vikasverma1077/GraphMix

  42. arXiv:1909.06216  [pdf, other]

    cs.CV

    Hierarchical Scene Coordinate Classification and Regression for Visual Localization

    Authors: Xiaotian Li, Shuzhe Wang, Yi Zhao, Jakob Verbeek, Juho Kannala

    Abstract: Visual localization is critical to many applications in computer vision and robotics. To address single-image RGB localization, state-of-the-art feature-based methods match local descriptors between a query image and a pre-built 3D model. Recently, deep neural networks have been exploited to regress the mapping between raw pixels and 3D coordinates in the scene, and thus the matching is implicitly…

    Submitted 31 March, 2020; v1 submitted 13 September, 2019; originally announced September 2019.

    Comments: CVPR 2020

  43. Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Too Much Accuracy

    Authors: Alex Lamb, Vikas Verma, Kenji Kawaguchi, Alexander Matyasko, Savya Khosla, Juho Kannala, Yoshua Bengio

    Abstract: Adversarial robustness has become a central goal in deep learning, both in theory and in practice. However, successful methods to improve the adversarial robustness (such as adversarial training) greatly hurt generalization performance on the unperturbed data. This could have a major impact on how the adversarial robustness affects real world systems (i.e. many may opt to forego robustness if…

    Submitted 19 October, 2022; v1 submitted 16 June, 2019; originally announced June 2019.

    Comments: This is the latest version, which is published in the Journal, "Neural Networks", in 2022. All the previous results are unchanged. First two authors contributed equally

    Journal ref: Neural Networks, volume 154, pages 218-233 (2022)

  44. arXiv:1906.00360  [pdf, other]

    cs.CV

    Iterative Path Reconstruction for Large-Scale Inertial Navigation on Smartphones

    Authors: Santiago Cortés Reina, Yuxin Hou, Juho Kannala, Arno Solin

    Abstract: Modern smartphones have all the sensing capabilities required for accurate and robust navigation and tracking. In specific environments some data streams may be absent, less reliable, or flat out wrong. In particular, the GNSS signal can become flawed or silent inside buildings or in streets with tall buildings. In this application paper, we aim to advance the current state-of-the-art in motion es…

    Submitted 2 June, 2019; originally announced June 2019.

    Comments: To appear in Proceedings FUSION 2019

  45. arXiv:1905.10693  [pdf, other]

    cs.CV

    DAVE: A Deep Audio-Visual Embedding for Dynamic Saliency Prediction

    Authors: Hamed R. Tavakoli, Ali Borji, Esa Rahtu, Juho Kannala

    Abstract: This paper studies audio-visual deep saliency prediction. It introduces a conceptually simple and effective Deep Audio-Visual Embedding for dynamic saliency prediction dubbed "DAVE" in conjunction with our efforts towards building an Audio-Visual Eye-tracking corpus named "AVE". Despite existing a strong relation between auditory and visual cues for guiding gaze during perception, video saliency…

    Submitted 7 January, 2020; v1 submitted 25 May, 2019; originally announced May 2019.

  46. arXiv:1904.06882  [pdf, other]

    cs.CV

    Geometric Image Correspondence Verification by Dense Pixel Matching

    Authors: Zakaria Laskar, Iaroslav Melekhov, Hamed R. Tavakoli, Juha Ylioinas, Juho Kannala

    Abstract: This paper addresses the problem of determining dense pixel correspondences between two images and its application to geometric correspondence verification in image retrieval. The main contribution is a geometric correspondence verification approach for re-ranking a shortlist of retrieved database images based on their dense pair-wise matching with the query image at a pixel level. We determine a…

    Submitted 17 August, 2020; v1 submitted 15 April, 2019; originally announced April 2019.

    Comments: The appendix has been updated by adding some clarifications

  47. arXiv:1904.06397  [pdf, other]

    cs.CV

    Multi-View Stereo by Temporal Nonparametric Fusion

    Authors: Yuxin Hou, Juho Kannala, Arno Solin

    Abstract: We propose a novel idea for depth estimation from multi-view image-pose pairs, where the model has capability to leverage information from previous latent-space encodings of the scene. This model uses pairs of images and poses, which are passed through an encoder-decoder model for disparity estimation. The novelty lies in soft-constraining the bottleneck layer by a nonparametric Gaussian process…

    Submitted 16 August, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

    Comments: ICCV 2019

  48. arXiv:1904.06145  [pdf, other]

    cs.LG cs.CV stat.ML

    Towards Photographic Image Manipulation with Balanced Growing of Generative Autoencoders

    Authors: Ari Heljakka, Arno Solin, Juho Kannala

    Abstract: We present a generative autoencoder that provides fast encoding, faithful reconstructions (e.g. retaining the identity of a face), sharp generated/reconstructed samples in high resolutions, and a well-structured latent space that supports semantic manipulation of the inputs. There are no current autoencoder or GAN models that satisfactorily achieve all of these. We build on the progressively growin…

    Submitted 20 February, 2020; v1 submitted 12 April, 2019; originally announced April 2019.

    Comments: WACV 2020

  49. Digging Deeper into Egocentric Gaze Prediction

    Authors: Hamed R. Tavakoli, Esa Rahtu, Juho Kannala, Ali Borji

    Abstract: This paper digs deeper into factors that influence egocentric gaze. Instead of training deep models for this purpose in a blind manner, we propose to inspect factors that contribute to gaze guidance during daily tasks. Bottom-up saliency and optical flow are assessed versus strong spatial prior baselines. Task-specific cues such as vanishing point, manipulation point, and hand regions are analyzed…

    Submitted 12 April, 2019; originally announced April 2019.

    Comments: presented at WACV 2019

  50. arXiv:1904.01920  [pdf, other]

    cs.CV

    CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis

    Authors: Ahti Kalervo, Juha Ylioinas, Markus Häikiö, Antti Karhu, Juho Kannala

    Abstract: Better understanding and modelling of building interiors and the emergence of more impressive AR/VR technology has brought up the need for automatic parsing of floorplan images. However, there is a clear lack of representative datasets to investigate the problem further. To address this shortcoming, this paper presents a novel image dataset called CubiCasa5K, a large-scale floorplan image dataset…

    Submitted 3 April, 2019; originally announced April 2019.