Showing 1–21 of 21 results for author: Beltran, J

Searching in archive cs.
  1. arXiv:2310.05785  [pdf, other]

    cs.CV

    Joint object detection and re-identification for 3D obstacle multi-camera systems

    Authors: Irene Cortés, Jorge Beltrán, Arturo de la Escalera, Fernando García

    Abstract: In recent years, the field of autonomous driving has witnessed remarkable advancements, driven by the integration of a multitude of sensors, including cameras and LiDAR systems, in different prototypes. However, with the proliferation of sensor data comes the pressing need for more sophisticated information processing techniques. This research paper introduces a novel modification to an object det…

    Submitted 9 October, 2023; originally announced October 2023.

  2. arXiv:2306.08510  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Permutation Invariant Recurrent Neural Networks for Sound Source Tracking Applications

    Authors: David Diaz-Guerra, Archontis Politis, Antonio Miguel, Jose R. Beltran, Tuomas Virtanen

    Abstract: Many multi-source localization and tracking models based on neural networks use one or several recurrent layers at their final stages to track the movement of the sources. Conventional recurrent neural networks (RNNs), such as the long short-term memories (LSTMs) or the gated recurrent units (GRUs), take a vector as their input and use another vector to store their state. However, this approach re…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: Accepted for publication at Forum Acusticum 2023

  3. arXiv:2303.13881  [pdf, other]

    cs.SD cs.AI eess.AS

    Symbolic Music Structure Analysis with Graph Representations and Changepoint Detection Methods

    Authors: Carlos Hernandez-Olivan, Sonia Rubio Llamas, Jose R. Beltran

    Abstract: Music Structure Analysis is an open research task in Music Information Retrieval (MIR). In the past, there have been several works that attempt to segment music in the audio and symbolic domains; however, the identification and segmentation of the music structure at different levels is still an open research problem in this area. In this work we propose three methods, two of which are novel grap…

    Submitted 24 March, 2023; originally announced March 2023.

  4. arXiv:2210.13944  [pdf, other]

    cs.AI cs.SD eess.AS

    A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives

    Authors: Carlos Hernandez-Olivan, Javier Hernandez-Olivan, Jose R. Beltran

    Abstract: Music is one of the intelligences in Gardner's theory of multiple intelligences. How humans perceive and understand music is still being studied and is crucial to developing artificial intelligence models that imitate such processes. Music generation with Artificial Intelligence is an emerging field that is gaining much attention in recent years. In this paper, we describe how humans compose…

    Submitted 3 November, 2022; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: Under review

  5. arXiv:2209.07974  [pdf, other]

    cs.SD cs.MM eess.AS

    musicaiz: A Python Library for Symbolic Music Generation, Analysis and Visualization

    Authors: Carlos Hernandez-Olivan, Jose R. Beltran

    Abstract: In this article, we present musicaiz, an object-oriented library for analyzing, generating and evaluating symbolic music. The submodules of the package allow the user to create symbolic music data from scratch, build algorithms to analyze symbolic music, encode MIDI data as tokens to train deep learning sequence models, modify existing music data and evaluate music generation systems. The evaluati…

    Submitted 16 September, 2022; originally announced September 2022.

  6. arXiv:2203.16940  [pdf]

    eess.AS cs.LG cs.SD eess.SP

    Direction of Arrival Estimation of Sound Sources Using Icosahedral CNNs

    Authors: David Diaz-Guerra, Antonio Miguel, Jose R. Beltran

    Abstract: In this paper, we present a new model for Direction of Arrival (DOA) estimation of sound sources based on an Icosahedral Convolutional Neural Network (CNN) applied over SRP-PHAT power maps computed from the signals received by a microphone array. This icosahedral CNN is equivariant to the 60 rotational symmetries of the icosahedron, which represent a good approximation of the continuous space of s…

    Submitted 6 December, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: The code to reproduce this work can be found in our GitHub repository: https://github.com/DavidDiazGuerra/icoDOA

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 313-321, 2023

  7. arXiv:2203.14641  [pdf, other]

    cs.SD cs.AI

    Subjective Evaluation of Deep Learning Models for Symbolic Music Composition

    Authors: Carlos Hernandez-Olivan, Jorge Abadias Puyuelo, Jose R. Beltran

    Abstract: Deep learning models are typically evaluated to measure and compare their performance on a given task. The metrics that are commonly used to evaluate these models are standard metrics that are used for different tasks. In the field of music composition or generation, the standard metrics used in other fields have no clear meaning in terms of music theory. In this paper, we propose a subjective met…

    Submitted 3 April, 2022; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: Workshop on Generative AI and HCI, CHI 2022

  8. arXiv:2108.12290  [pdf, other]

    cs.SD cs.AI eess.AS

    Music Composition with Deep Learning: A Review

    Authors: Carlos Hernandez-Olivan, Jose R. Beltran

    Abstract: Generating a complex work of art such as a musical composition requires exhibiting true creativity that depends on a variety of factors that are related to the hierarchy of musical language. Music generation has been approached with algorithmic methods and, recently, with Deep Learning models that are being used in other fields such as Computer Vision. In this paper we want to put into context the exis…

    Submitted 7 September, 2021; v1 submitted 27 August, 2021; originally announced August 2021.

  9. arXiv:2107.06231  [pdf, other]

    cs.SD cs.LG eess.AS

    Timbre Classification of Musical Instruments with a Deep Learning Multi-Head Attention-Based Model

    Authors: Carlos Hernandez-Olivan, Jose R. Beltran

    Abstract: The aim of this work is to define a model based on deep learning that is able to identify different instrument timbres with as few parameters as possible. For this purpose, we have worked with classical orchestral instruments played with different dynamics, which are part of a few instrument families and which play notes in the same pitch range. It has been possible to assess the ability to classi…

    Submitted 13 July, 2021; originally announced July 2021.

  10. arXiv:2104.11021  [pdf, other]

    cs.CV

    Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View

    Authors: Alejandro Barrera, Jorge Beltrán, Carlos Guindel, Jose Antonio Iglesias, Fernando García

    Abstract: The performance of object detection methods based on LiDAR information is heavily impacted by the availability of training data, usually limited to certain laser devices. As a result, the use of synthetic data is becoming popular when training neural network models, as both sensor specifications and driving scenarios can be generated ad-hoc. However, bridging the gap between virtual and real envir…

    Submitted 22 April, 2021; originally announced April 2021.

    Comments: Submitted to IEEE International Conference on Intelligent Transportation Systems (ITSC2021)

  11. Automatic Extrinsic Calibration Method for LiDAR and Camera Sensor Setups

    Authors: Jorge Beltrán, Carlos Guindel, Arturo de la Escalera, Fernando García

    Abstract: Most sensor setups for onboard autonomous perception are composed of LiDARs and vision systems, as they provide complementary information that improves the reliability of the different algorithms necessary to obtain a robust scene understanding. However, the effective use of information from different sources requires an accurate calibration between the sensors involved, which usually implies a te…

    Submitted 15 March, 2022; v1 submitted 12 January, 2021; originally announced January 2021.

    Comments: Published in IEEE Transactions on Intelligent Transportation Systems

    Journal ref: IEEE Transactions on Intelligent Transportation Systems, 2022

  12. arXiv:2008.09672  [pdf, other]

    cs.CV cs.RO

    Towards Autonomous Driving: a Multi-Modal 360° Perception Proposal

    Authors: Jorge Beltrán, Carlos Guindel, Irene Cortés, Alejandro Barrera, Armando Astudillo, Jesús Urdiales, Mario Álvarez, Farid Bekka, Vicente Milanés, Fernando García

    Abstract: In this paper, a multi-modal 360° framework for 3D object detection and tracking for autonomous vehicles is presented. The process is divided into four main stages. First, images are fed into a CNN network to obtain instance segmentation of the surrounding road participants. Second, LiDAR-to-image association is performed for the estimated mask proposals. Then, the isolated points of ever…

    Submitted 21 August, 2020; originally announced August 2020.

    Comments: Accepted for publication in IEEE ITSC 2020

  13. arXiv:2008.07527  [pdf, other]

    eess.AS cs.LG cs.SD

    Music Boundary Detection using Convolutional Neural Networks: A comparative analysis of combined input features

    Authors: Carlos Hernandez-Olivan, Jose R. Beltran, David Diaz-Guerra

    Abstract: The analysis of the structure of musical pieces is a task that remains a challenge for Artificial Intelligence, especially in the field of Deep Learning. It requires prior identification of structural boundaries of the music pieces. This structural boundary analysis has recently been studied with unsupervised methods and end-to-end techniques such as Convolutional Neural Networks (CNN) us…

    Submitted 1 December, 2021; v1 submitted 17 August, 2020; originally announced August 2020.

    Journal ref: International Journal of Interactive Multimedia & Artificial Intelligence (2021), vol. 7, no 2, p. 78-88

  14. arXiv:2006.09006  [pdf]

    eess.AS cs.LG cs.SD eess.SP

    Robust Sound Source Tracking Using SRP-PHAT and 3D Convolutional Neural Networks

    Authors: David Diaz-Guerra, Antonio Miguel, Jose R. Beltran

    Abstract: In this paper, we present a new single sound source DOA estimation and tracking system based on the well-known SRP-PHAT algorithm and a three-dimensional Convolutional Neural Network. It uses SRP-PHAT power maps as input features of a fully convolutional causal architecture that uses 3D convolutional layers to accurately perform the tracking of a sound source even in highly reverberant scenarios w…

    Submitted 16 December, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: This is a pre-print of an article published in IEEE/ACM Transactions on Audio Speech and Language Processing. The code to reproduce this work can be found in our GitHub repository: https://github.com/DavidDiazGuerra/Cross3D

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 300-311, 2021

  15. arXiv:2003.04188  [pdf, other]

    cs.CV

    BirdNet+: End-to-End 3D Object Detection in LiDAR Bird's Eye View

    Authors: Alejandro Barrera, Carlos Guindel, Jorge Beltrán, Fernando García

    Abstract: On-board 3D object detection in autonomous vehicles often relies on geometry information captured by LiDAR devices. Although image features are typically preferred for detection, numerous approaches take only spatial data as input. Exploiting this information in inference usually involves the use of compact representations such as the Bird's Eye View (BEV) projection, which entails a loss of informa…

    Submitted 9 March, 2020; originally announced March 2020.

    Comments: Submitted to IEEE International Conference on Intelligent Transportation Systems (ITSC2020)

  16. arXiv:2002.08239  [pdf, other]

    cs.CV

    siaNMS: Non-Maximum Suppression with Siamese Networks for Multi-Camera 3D Object Detection

    Authors: Irene Cortes, Jorge Beltran, Arturo de la Escalera, Fernando Garcia

    Abstract: The rapid development of embedded hardware in autonomous vehicles broadens their computational capabilities, thus bringing the possibility to mount more complete sensor setups able to handle driving scenarios of higher complexity. As a result, new challenges such as multiple detections of the same object have to be addressed. In this work, a siamese network is integrated into the pipeline of a wel…

    Submitted 19 February, 2020; originally announced February 2020.

    Comments: Submitted to IEEE Intelligent Vehicles Symposium 2020 (IV2020)

  17. gpuRIR: A Python Library for Room Impulse Response Simulation with GPU Acceleration

    Authors: David Diaz-Guerra, Antonio Miguel, Jose R. Beltran

    Abstract: The Image Source Method (ISM) is one of the most widely employed techniques to calculate acoustic Room Impulse Responses (RIRs); however, its computational complexity grows fast with the reverberation time of the room, and its computation time can be prohibitive for some applications where a huge number of RIRs are needed. In this paper, we present a new implementation that dramatically improves the compu…

    Submitted 9 October, 2020; v1 submitted 26 October, 2018; originally announced October 2018.

    Comments: This is a pre-print of an article published in Multimedia Tools and Applications (2020)

  18. arXiv:1805.01195  [pdf, other]

    cs.CV

    BirdNet: a 3D Object Detection Framework from LiDAR information

    Authors: Jorge Beltran, Carlos Guindel, Francisco Miguel Moreno, Daniel Cruzado, Fernando Garcia, Arturo de la Escalera

    Abstract: Understanding driving situations regardless of the conditions of the traffic scene is a cornerstone on the path towards autonomous vehicles; however, although common sensor setups already include complementary devices such as LiDAR or radar, most of the research on perception systems has traditionally focused on computer vision. We present a LiDAR-based 3D object detection pipeline entailing three sta…

    Submitted 3 May, 2018; originally announced May 2018.

    Comments: Submitted to IEEE International Conference on Intelligent Transportation Systems 2018 (ITSC)

  19. arXiv:1802.02548  [pdf, other]

    cs.LG cs.AI cs.CY physics.ao-ph stat.ML

    Predicting Hurricane Trajectories using a Recurrent Neural Network

    Authors: Sheila Alemany, Jonathan Beltran, Adrian Perez, Sam Ganzfried

    Abstract: Hurricanes are cyclones circulating about a defined center whose closed wind speeds exceed 75 mph originating over tropical and subtropical waters. At landfall, hurricanes can result in severe disasters. The accuracy of predicting their trajectory paths is critical to reduce economic loss and save human lives. Given the complexity and nonlinearity of weather data, a recurrent neural network (RNN)…

    Submitted 12 September, 2018; v1 submitted 1 February, 2018; originally announced February 2018.

  20. arXiv:1705.04085  [pdf, other]

    cs.CV cs.RO

    Automatic Extrinsic Calibration for Lidar-Stereo Vehicle Sensor Setups

    Authors: Carlos Guindel, Jorge Beltrán, David Martín, Fernando García

    Abstract: Sensor setups consisting of a combination of 3D range scanner lasers and stereo vision systems are becoming a popular choice for on-board perception systems in vehicles; however, the combined use of both sources of information implies a tedious calibration process. We present a method for extrinsic calibration of lidar-stereo camera pairs without user intervention. Our calibration approach is aime…

    Submitted 27 July, 2017; v1 submitted 11 May, 2017; originally announced May 2017.

    Comments: Accepted to IEEE International Conference on Intelligent Transportation Systems 2017 (ITSC)

    MSC Class: 68T45

    ACM Class: I.4.8; I.2.9; I.4.1

  21. arXiv:1512.03564  [pdf, other]

    cs.DB

    Scalable Package Queries in Relational Database Systems

    Authors: Matteo Brucato, Juan Felipe Beltran, Azza Abouzied, Alexandra Meliou

    Abstract: Traditional database queries follow a simple model: they define constraints that each tuple in the result must satisfy. This model is computationally efficient, as the database system can evaluate the query conditions on each tuple individually. However, many practical, real-world problems require a collection of result tuples to satisfy constraints collectively, rather than individually. In this…

    Submitted 15 December, 2015; v1 submitted 11 December, 2015; originally announced December 2015.

    Comments: Extended version of PVLDB 2016 submission