Search | arXiv e-print repository

doi 10.3390/s21196537

Using Perspective-n-Point Algorithms for a Local Positioning System Based on LEDs and a QADA Receiver

Authors: Elena Aparicio-Esteve, Jesús Ureña, Álvaro Hernández, Daniel Pizarro, David Moltó

Abstract: Research interest in location-based services has increased in recent years, now that centimetre-level 3D accuracy inside intelligent environments has become a realistic target. This work proposes an indoor local positioning system based on LED lighting, transmitted from a set of beacons to a receiver. The receiver is based on a quadrant photodiode angular diversity aperture (QADA) plus an aperture placed over it. This configuration can be modelled as a perspective camera, where the image position of the transmitters can be used to recover the receiver's 3D pose. This process is known as the perspective-n-point (PnP) problem, which is well known in computer vision and photogrammetry. This work investigates the use of different state-of-the-art PnP algorithms to localize the receiver in a large space based on four co-planar transmitters and with a distance from transmitters to receiver of 3.4 m. Encoding techniques are used to permit the simultaneous emission of all the transmitted signals and their processing in the receiver. In addition, correlation techniques are used to determine the image points projected from each emitter on the QADA. This work uses Monte Carlo simulations to characterize the absolute errors for a grid of test points under noisy measurements, as well as the robustness of the system when varying the 3D location of one transmitter. The IPPE algorithm obtained the best performance in this configuration. The proposal has also been experimentally evaluated in a real setup. The estimation of the receiver's position at three points using the IPPE algorithm achieves average absolute errors of 4.33 cm, 3.51 cm and 28.90 cm in the coordinates x, y and z, respectively. These positioning results are in line with those obtained in previous work using triangulation techniques, with the added benefit that this proposal recovers the complete pose of the receiver.

Submitted 6 February, 2024; originally announced February 2024.

ACM Class: I.4

Journal ref: Sensors 2021, 21, 6537

arXiv:2007.07171 [pdf, other]

Towards Dense People Detection with Deep Learning and Depth images

Authors: David Fuentes-Jimenez, Cristina Losada-Gutierrez, David Casillas-Perez, Javier Macias-Guarasa, Roberto Martin-Lopez, Daniel Pizarro, Carlos A. Luna

Abstract: This paper proposes a DNN-based system that detects multiple people from a single depth image. Our neural network processes a depth image and outputs a likelihood map in image coordinates, where each detection corresponds to a Gaussian-shaped local distribution, centered at the person's head. The likelihood map encodes both the number of detected people and their 2D image positions, and can be used to recover the 3D position of each person using the depth image and the camera calibration parameters. Our architecture is compact, using separated convolutions to increase performance, and runs in real time with low-budget GPUs. We use simulated data for initially training the network, followed by fine-tuning with a relatively small amount of real data. We show this strategy to be effective, producing networks that generalize to work with scenes different from those used during training. We thoroughly compare our method against the existing state-of-the-art, including both classical and DNN-based solutions. Our method outperforms existing methods and can accurately detect people in scenes with significant occlusions.

Submitted 14 July, 2020; originally announced July 2020.

arXiv:2006.05447 [pdf, other]

Towards Domain Independence in CNN-based Acoustic Localization using Deep Cross Correlations

Authors: Juan Manuel Vera-Diaz, Daniel Pizarro, Javier Macias-Guarasa

Abstract: Time delay estimation is essential in Acoustic Source Localization (ASL) systems. One of the most widely used techniques for this purpose is the Generalized Cross Correlation (GCC) between a pair of signals and its use in Steered Response Power (SRP) techniques, which estimate the acoustic power at a specific location. Nowadays, Deep Learning strategies may outperform these methods. However, they are generally dependent on the geometric and sensor configuration conditions that are available during the training phases, thus having limited generalization capabilities when facing new environments if neither re-training nor adaptation is applied. In this work, we propose a method based on an encoder-decoder CNN architecture capable of outperforming the well-known SRP-PHAT algorithm, and also other Deep Learning strategies when working in mismatched training-testing conditions without requiring a model re-training. Our proposal aims to estimate a smoothed version of the correlation signals, which is then used to generate a refined acoustic power map, leading to better performance on the ASL task. Our experimental evaluation uses three publicly available realistic datasets and provides a comparison with the SRP-PHAT algorithm and other recent proposals based on Deep Learning.

Submitted 9 June, 2020; originally announced June 2020.

arXiv:1807.11094 [pdf, other]

doi 10.3390/s18103418

Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates

Authors: Juan Manuel Vera-Diaz, Daniel Pizarro, Javier Macias-Guarasa

Abstract: This paper presents a novel approach for indoor acoustic source localization using microphone arrays and based on a Convolutional Neural Network (CNN). The proposed solution is, to the best of our knowledge, the first published work in which the CNN is designed to directly estimate the three-dimensional position of an acoustic source, using the raw audio signal as the input information and avoiding the use of hand-crafted audio features. Given the limited amount of available localization data, we propose in this paper a training strategy based on two steps. We first train our network using semi-synthetic data, generated from close-talk speech recordings, in which we simulate the time delays and distortion suffered by the signal as it propagates from the source to the array of microphones. We then fine-tune this network using a small amount of real data. Our experimental results show that this strategy is able to produce networks that significantly improve on existing localization methods based on SRP-PHAT strategies. In addition, our experiments show that our CNN method exhibits better resistance against varying gender of the speaker and different window sizes compared with the other methods.

Submitted 29 July, 2018; originally announced July 2018.

Comments: 18 pages, 3 figures, 8 tables

Journal ref: Sensors 2018, 18(10), 3418

Showing 1–4 of 4 results for author: Pizarro, D