Showing 1–16 of 16 results for author: Köpüklü, O

  1. arXiv:2106.03932  [pdf, other]

    cs.CV cs.LG cs.SD eess.AS

    How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild

    Authors: Okan Köpüklü, Maja Taseska, Gerhard Rigoll

    Abstract: Successful active speaker detection requires a three-stage pipeline: (i) audio-visual encoding for all speakers in the clip, (ii) inter-speaker relation modeling between a reference speaker and the background speakers within each frame, and (iii) temporal modeling for the reference speaker. Each stage of this pipeline plays an important role for the final performance of the created architecture. B…

    Submitted 7 September, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: Accepted to ICCV 2021

  2. arXiv:2102.12570  [pdf, other]

    cs.CV

    Deep Compact Polyhedral Conic Classifier for Open and Closed Set Recognition

    Authors: Hakan Cevikalp, Bedirhan Uzun, Okan Köpüklü, Gurkan Ozturk

    Abstract: In this paper, we propose a new deep neural network classifier that simultaneously maximizes the inter-class separation and minimizes the intra-class variation by using the polyhedral conic classification function. The proposed method has one loss term that allows the margin maximization to maximize the inter-class separation and another loss term that controls the compactness of the class accepta…

    Submitted 24 February, 2021; originally announced February 2021.

  3. TRAT: Tracking by Attention Using Spatio-Temporal Features

    Authors: Hasan Saribas, Hakan Cevikalp, Okan Köpüklü, Bedirhan Uzun

    Abstract: Robust object tracking requires knowledge of tracked objects' appearance, motion and their evolution over time. Although motion provides distinctive and complementary information especially for fast moving objects, most of the recent tracking architectures primarily focus on the objects' appearance information. In this paper, we propose a two-stream deep neural network tracker that uses both spati…

    Submitted 18 November, 2020; originally announced November 2020.

  4. arXiv:2009.14660  [pdf, other]

    cs.CV cs.LG eess.IV

    Driver Anomaly Detection: A Dataset and Contrastive Learning Approach

    Authors: Okan Köpüklü, Jiapeng Zheng, Hang Xu, Gerhard Rigoll

    Abstract: Distracted drivers are more likely to fail to anticipate hazards, which results in car accidents. Therefore, detecting anomalies in drivers' actions (i.e., any action deviating from normal driving) is of the utmost importance to reduce driver-related accidents. However, there are unboundedly many anomalous actions that a driver can do while driving, which leads to an 'open set recognition' problem…

    Submitted 30 November, 2020; v1 submitted 30 September, 2020; originally announced September 2020.

    Comments: Accepted to IEEE Winter Conference on Applications of Computer Vision (WACV 2021)

  5. arXiv:2009.14639  [pdf, other]

    cs.CV cs.LG eess.IV

    Dissected 3D CNNs: Temporal Skip Connections for Efficient Online Video Processing

    Authors: Okan Köpüklü, Stefan Hörmann, Fabian Herzog, Hakan Cevikalp, Gerhard Rigoll

    Abstract: Convolutional Neural Networks with 3D kernels (3D-CNNs) currently achieve state-of-the-art results in video recognition tasks due to their supremacy in extracting spatiotemporal features within video frames. There have been many successful 3D-CNN architectures surpassing the state-of-the-art results successively. However, nearly all of them are designed to operate offline creating several serious…

    Submitted 18 October, 2021; v1 submitted 30 September, 2020; originally announced September 2020.

  6. arXiv:2003.00951  [pdf, ps, other]

    cs.CV

    DriverMHG: A Multi-Modal Dataset for Dynamic Recognition of Driver Micro Hand Gestures and a Real-Time Recognition Framework

    Authors: Okan Köpüklü, Thomas Ledwon, Yao Rong, Neslihan Kose, Gerhard Rigoll

    Abstract: The use of hand gestures provides a natural alternative to cumbersome interface devices for Human-Computer Interaction (HCI) systems. However, real-time recognition of dynamic micro hand gestures from video streams is challenging for in-vehicle scenarios since (i) the gestures should be performed naturally without distracting the driver, (ii) micro hand gestures occur within very short time interv…

    Submitted 19 October, 2021; v1 submitted 2 March, 2020; originally announced March 2020.

    Comments: Accepted to IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020)

  7. arXiv:1912.04618  [pdf, other]

    cs.CV

    Deep Attention Based Semi-Supervised 2D-Pose Estimation for Surgical Instruments

    Authors: Mert Kayhan, Okan Köpüklü, Mhd Hasan Sarhan, Mehmet Yigitsoy, Abouzar Eslami, Gerhard Rigoll

    Abstract: For many practical problems and applications, it is not feasible to create a vast and accurately labeled dataset, which restricts the application of deep learning in many areas. Semi-supervised learning algorithms intend to improve performance by also leveraging unlabeled data. This is very valuable for the 2D-pose estimation task, where data labeling requires substantial time and is subject to noise.…

    Submitted 11 January, 2021; v1 submitted 10 December, 2019; originally announced December 2019.

  8. arXiv:1911.08995  [pdf, other]

    cs.CV

    Unsupervised Monocular Depth Prediction for Indoor Continuous Video Streams

    Authors: Yinglong Feng, Shuncheng Wu, Okan Köpüklü, Xueyang Kang, Federico Tombari

    Abstract: This paper studies the unsupervised monocular depth prediction problem. Most existing unsupervised depth prediction algorithms are developed for outdoor scenarios, while, to our knowledge, depth prediction work for indoor environments is still very scarce. Therefore, this work focuses on narrowing the gap by first evaluating existing approaches in indoor environments and then improving th…

    Submitted 20 November, 2019; originally announced November 2019.

  9. arXiv:1911.06644  [pdf, other]

    cs.CV

    You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization

    Authors: Okan Köpüklü, Xiangyu Wei, Gerhard Rigoll

    Abstract: Spatiotemporal action localization requires the incorporation of two sources of information into the designed architecture: (1) temporal information from the previous frames and (2) spatial information from the key frame. Current state-of-the-art approaches usually extract this information with separate networks and use an extra mechanism for fusion to get detections. In this work, we present YOW…

    Submitted 18 October, 2021; v1 submitted 15 November, 2019; originally announced November 2019.

  10. arXiv:1909.05165  [pdf, other]

    cs.CV eess.IV

    Comparative Analysis of CNN-based Spatiotemporal Reasoning in Videos

    Authors: Okan Köpüklü, Fabian Herzog, Gerhard Rigoll

    Abstract: Understanding actions and gestures in video streams requires temporal reasoning of the spatial content from different time instants, i.e., spatiotemporal (ST) modeling. In this survey paper, we have made a comparative analysis of different ST modeling techniques for action and gesture recognition tasks. Since Convolutional Neural Networks (CNNs) have proved to be an effective tool as a feature extr…

    Submitted 11 January, 2021; v1 submitted 11 September, 2019; originally announced September 2019.

  11. arXiv:1907.08009  [pdf, other]

    cs.CV cs.LG eess.IV

    Real-Time Driver State Monitoring Using a CNN Based Spatio-Temporal Approach

    Authors: Neslihan Kose, Okan Kopuklu, Alexander Unnervik, Gerhard Rigoll

    Abstract: Many road accidents occur due to distracted drivers. Today, driver monitoring is essential even for the latest autonomous vehicles to alert distracted drivers in order to take over control of the vehicle in case of emergency. In this paper, a spatio-temporal approach is applied to classify drivers' distraction level and movement decisions using convolutional neural networks (CNNs). We approach thi…

    Submitted 18 July, 2019; originally announced July 2019.

    Comments: Accepted for publication by the IEEE Intelligent Transportation Systems Conference (ITSC 2019)

  12. arXiv:1905.04225  [pdf, other]

    cs.CV cs.HC cs.LG

    Talking With Your Hands: Scaling Hand Gestures and Recognition With CNNs

    Authors: Okan Köpüklü, Yao Rong, Gerhard Rigoll

    Abstract: The use of hand gestures provides a natural alternative to cumbersome interface devices for Human-Computer Interaction (HCI) systems. As the technology advances and communication between humans and machines becomes more complex, HCI systems should also be scaled accordingly in order to accommodate the introduced complexities. In this paper, we propose a methodology to scale hand gestures by formin…

    Submitted 30 August, 2019; v1 submitted 10 May, 2019; originally announced May 2019.

    Comments: Accepted to ICCV 2019 workshop - Observing and Understanding Hands in Action (HANDS 2019)

  13. arXiv:1904.02422  [pdf, other]

    cs.CV

    Resource Efficient 3D Convolutional Neural Networks

    Authors: Okan Köpüklü, Neslihan Kose, Ahmet Gunduz, Gerhard Rigoll

    Abstract: Recently, convolutional neural networks with 3D kernels (3D CNNs) have been very popular in the computer vision community as a result of their superior ability to extract spatio-temporal features within video frames compared to 2D CNNs. Although there have been great advances recently in building resource efficient 2D CNN architectures considering memory and power budget, there is hardly any similar re…

    Submitted 18 October, 2021; v1 submitted 4 April, 2019; originally announced April 2019.

    Comments: Accepted to ICCV 2019 workshop - Neural Architects

  14. arXiv:1901.10323  [pdf, other]

    cs.CV cs.AI

    Real-time Hand Gesture Detection and Classification Using Convolutional Neural Networks

    Authors: Okan Köpüklü, Ahmet Gunduz, Neslihan Kose, Gerhard Rigoll

    Abstract: Real-time recognition of dynamic hand gestures from video streams is a challenging task since (i) there is no indication when a gesture starts and ends in the video, (ii) performed gestures should only be recognized once, and (iii) the entire architecture should be designed considering the memory and power budget. In this work, we address these challenges by proposing a hierarchical structure enab…

    Submitted 18 October, 2019; v1 submitted 29 January, 2019; originally announced January 2019.

    Comments: Published at IEEE International Conference on Automatic Face and Gesture Recognition (FG 2019). Best student paper award.

  15. arXiv:1901.09615  [pdf, other]

    cs.CV

    Convolutional Neural Networks with Layer Reuse

    Authors: Okan Köpüklü, Maryam Babaee, Stefan Hörmann, Gerhard Rigoll

    Abstract: A convolutional layer in a Convolutional Neural Network (CNN) consists of many filters which apply convolution operation to the input, capture some special patterns and pass the result to the next layer. If the same patterns also occur at the deeper layers of the network, why wouldn't the same convolutional filters be used also in those layers? In this paper, we propose a CNN architecture, Layer R…

    Submitted 1 February, 2019; v1 submitted 28 January, 2019; originally announced January 2019.

    Comments: Computer Vision and Pattern Recognition

  16. arXiv:1804.07187  [pdf, other]

    cs.CV

    Motion Fused Frames: Data Level Fusion Strategy for Hand Gesture Recognition

    Authors: Okan Köpüklü, Neslihan Köse, Gerhard Rigoll

    Abstract: Acquiring spatio-temporal states of an action is the most crucial step for action classification. In this paper, we propose a data level fusion strategy, Motion Fused Frames (MFFs), designed to fuse motion information into static images as better representatives of spatio-temporal states of an action. MFFs can be used as input to any deep learning architecture with very little modification on the…

    Submitted 26 April, 2018; v1 submitted 19 April, 2018; originally announced April 2018.

    Comments: Accepted to CVPR 2018 as workshop paper