Showing 1–17 of 17 results for author: Fuentes, M

  1. arXiv:2403.07797 [pdf, other]

    cs.LG cs.AI

    Joint Selection: Adaptively Incorporating Public Information for Private Synthetic Data

    Authors: Miguel Fuentes, Brett Mullins, Ryan McKenna, Gerome Miklau, Daniel Sheldon

    Abstract: Mechanisms for generating differentially private synthetic data based on marginals and graphical models have been successful in a wide range of settings. However, one limitation of these methods is their inability to incorporate public data. Initializing a data generating model by pre-training on public data has been shown to improve the quality of synthetic data, but this technique is not applicable w…

    Submitted 12 March, 2024; originally announced March 2024.

  2. arXiv:2309.13343 [pdf, other]

    cs.SD eess.AS

    Two vs. Four-Channel Sound Event Localization and Detection

    Authors: Julia Wilkins, Magdalena Fuentes, Luca Bondi, Shabnam Ghaffarzadegan, Ali Abavisani, Juan Pablo Bello

    Abstract: Sound event localization and detection (SELD) systems estimate both the direction-of-arrival (DOA) and class of sound sources over time. In the DCASE 2022 SELD Challenge (Task 3), models are designed to operate in a 4-channel setting. While beneficial to further the development of SELD systems using a multichannel recording setup such as first-order Ambisonics (FOA), most consumer electronics devi…

    Submitted 23 September, 2023; originally announced September 2023.

  3. arXiv:2309.09288 [pdf, other]

    cs.SD eess.AS

    Sound Source Distance Estimation in Diverse and Dynamic Acoustic Conditions

    Authors: Saksham Singh Kushwaha, Iran R. Roman, Magdalena Fuentes, Juan Pablo Bello

    Abstract: Localizing a moving sound source in the real world involves determining its direction-of-arrival (DOA) and distance relative to a microphone. Advancements in DOA estimation have been facilitated by data-driven methods optimized with large open-source datasets with microphone array recordings in diverse environments. In contrast, estimating a sound source's distance remains understudied. Existing a…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: Accepted in WASPAA 2023

  4. arXiv:2308.09089 [pdf, other]

    cs.SD cs.CV cs.IR cs.MM eess.AS

    Bridging High-Quality Audio and Video via Language for Sound Effects Retrieval from Visual Queries

    Authors: Julia Wilkins, Justin Salamon, Magdalena Fuentes, Juan Pablo Bello, Oriol Nieto

    Abstract: Finding the right sound effects (SFX) to match moments in a video is a difficult and time-consuming task, and relies heavily on the quality and completeness of text metadata. Retrieving high-quality (HQ) SFX using a video frame directly as the query is an attractive alternative, removing the reliance on text metadata and providing a low barrier to entry for non-experts. Due to the lack of HQ audio…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: WASPAA 2023. Project page: https://juliawilkins.github.io/sound-effects-retrieval-from-video/. 4 pages, 2 figures, 2 tables

  5. arXiv:2306.12300 [pdf, other]

    cs.SD eess.AS

    A Multimodal Prototypical Approach for Unsupervised Sound Classification

    Authors: Saksham Singh Kushwaha, Magdalena Fuentes

    Abstract: In the context of environmental sound classification, the adaptability of systems is key: which sound classes are interesting depends on the context and the user's needs. Recent advances in text-to-audio retrieval allow for zero-shot audio classification, but performance compared to supervised models remains limited. This work proposes a multimodal prototypical approach that exploits local audio-t…

    Submitted 17 August, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: Accepted to INTERSPEECH 2023

  6. arXiv:2304.07186 [pdf, other]

    cs.SD eess.AS

    Adapting Meter Tracking Models to Latin American Music

    Authors: Lucas S. Maia, Martín Rocamora, Luiz W. P. Biscainho, Magdalena Fuentes

    Abstract: Beat and downbeat tracking models have improved significantly in recent years with the introduction of deep learning methods. However, despite these improvements, several challenges remain. Particularly, the adaptation of available models to underrepresented music traditions in MIR is usually synonymous with collecting and annotating large amounts of data, which is impractical and time-consuming.…

    Submitted 14 April, 2023; originally announced April 2023.

    Comments: Accepted at ISMIR 2022. This version was made after a bug fix in the code, which led to minor modifications in the results (updated in Figure 1 and Table 1). The paper's conclusions remain unchanged

  7. Tempo vs. Pitch: understanding self-supervised tempo estimation

    Authors: Giovana Morais, Matthew E. P. Davies, Marcelo Queiroz, Magdalena Fuentes

    Abstract: Self-supervision methods learn representations by solving pretext tasks that do not require human-generated labels, alleviating the need for time-consuming annotations. These methods have been applied in computer vision, natural language processing, environmental sound analysis, and recently in music information retrieval, e.g. for pitch estimation. Particularly in the context of music, there are…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: 5 pages, 3 figures, published at the 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing

  8. arXiv:2211.08367 [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    FlowGrad: Using Motion for Visual Sound Source Localization

    Authors: Rajsuryan Singh, Pablo Zinemanas, Xavier Serra, Juan Pablo Bello, Magdalena Fuentes

    Abstract: Most recent work in visual sound source localization relies on semantic audio-visual representations learned in a self-supervised manner, and by design excludes temporal information present in videos. While it proves to be effective for widely used benchmark datasets, the method falls short for challenging scenarios like urban traffic. This work introduces temporal context into the state-of-the-ar…

    Submitted 14 April, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: Accepted in ICASSP 2023

  9. arXiv:2204.05156 [pdf, other]

    cs.SD eess.AS

    How to Listen? Rethinking Visual Sound Localization

    Authors: Ho-Hsiang Wu, Magdalena Fuentes, Prem Seetharaman, Juan Pablo Bello

    Abstract: Localizing visual sounds consists of locating the position of objects that emit sound within an image. It is a growing research area with potential applications in monitoring natural and urban environments, such as wildlife migration and urban traffic. Previous works are usually evaluated with datasets having mostly a single dominant visible object, and proposed models usually require the introduc…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: Submitted to INTERSPEECH 2022

  10. arXiv:2203.10425 [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    A Study on Robustness to Perturbations for Representations of Environmental Sound

    Authors: Sangeeta Srivastava, Ho-Hsiang Wu, Joao Rulff, Magdalena Fuentes, Mark Cartwright, Claudio Silva, Anish Arora, Juan Pablo Bello

    Abstract: Audio applications involving environmental sound analysis increasingly use general-purpose audio representations, also known as embeddings, for transfer learning. Recently, Holistic Evaluation of Audio Representations (HEAR) evaluated twenty-nine embedding models on nineteen diverse tasks. However, the evaluation's effectiveness depends on the variation already captured within a given dataset. The…

    Submitted 6 July, 2022; v1 submitted 19 March, 2022; originally announced March 2022.

    Comments: Accepted in EUSIPCO 2022

  11. arXiv:2109.12690 [pdf, ps, other]

    cs.SD cs.DB cs.LG eess.AS

    Soundata: A Python library for reproducible use of audio datasets

    Authors: Magdalena Fuentes, Justin Salamon, Pablo Zinemanas, Martín Rocamora, Genís Paja, Irán R. Román, Marius Miron, Xavier Serra, Juan Pablo Bello

    Abstract: Soundata is a Python library for loading and working with audio datasets in a standardized way, removing the need for writing custom loaders in every project, and improving reproducibility by providing tools to validate data against a canonical version. It speeds up research pipelines by allowing users to quickly download a dataset, load it into memory in a standardized and reproducible way, valid…

    Submitted 4 October, 2021; v1 submitted 26 September, 2021; originally announced September 2021.

  12. arXiv:2106.01149 [pdf, other]

    cs.SD cs.IR eess.AS

    Exploring modality-agnostic representations for music classification

    Authors: Ho-Hsiang Wu, Magdalena Fuentes, Juan P. Bello

    Abstract: Music information is often conveyed or recorded across multiple data modalities including but not limited to audio, images, text and scores. However, music information retrieval research has almost exclusively focused on single modality recognition, requiring development of separate models for each modality. Some multi-modal works require multiple coexisting modalities given to the model as inputs…

    Submitted 2 June, 2021; originally announced June 2021.

  13. arXiv:2009.05188 [pdf, other]

    cs.SD cs.LG eess.AS

    SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context

    Authors: Mark Cartwright, Jason Cramer, Ana Elisa Mendez Mendez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon, Oded Nov, Juan Pablo Bello

    Abstract: We present SONYC-UST-V2, a dataset for urban sound tagging with spatiotemporal information. This dataset is aimed at the development and evaluation of machine listening systems for real-world urban noise monitoring. While datasets of urban recordings are available, this dataset provides the opportunity to investigate how spatiotemporal metadata can aid in the prediction of urban sound tags. SONYC…

    Submitted 10 September, 2020; originally announced September 2020.

  14. Pioneering Studies on LTE eMBMS: Towards 5G Point-to-Multipoint Transmissions

    Authors: Hongzhi Chen, De Mi, Manuel Fuentes, David Vargas, Eduardo Garro, Jose Luis Carcel, Belkacem Mouhouche, Pei Xiao, Rahim Tafazolli

    Abstract: The first 5G (5th generation wireless systems) New Radio Release-15 was recently completed. However, the specification only considers the use of unicast technologies and the extension to point-to-multipoint (PTM) scenarios is not yet considered. To this end, we first present in this work a technical overview of the state-of-the-art LTE (Long Term Evolution) PTM technology, i.e., eMBMS (evolved Mul…

    Submitted 29 November, 2019; originally announced January 2020.

    Comments: SAM 2018, 5 pages, 4 figs

  15. On the Performance of PDCCH in LTE and 5G New Radio

    Authors: Hongzhi Chen, De Mi, Manuel Fuentes, Eduardo Garro, Jose Luis Carcel, Belkacem Mouhouche, Pei Xiao, Rahim Tafazolli

    Abstract: 5G New Radio (NR) Release 15 was specified in June 2018. It introduces numerous changes and potential improvements for physical layer data transmissions, although only point-to-point (PTP) communications are considered. In order to use physical data channels such as the Physical Downlink Shared Channel (PDSCH), it is essential to guarantee a successful transmission of control information via…

    Submitted 29 November, 2019; originally announced January 2020.

    Comments: Globecom 2018 workshop, 6 pages, 7 figs

  16. arXiv:1709.00927   

    cs.HC

    A Fuzzy Control System for Inductive Video Games

    Authors: Carlos Lara-Alvarez, Hugo Mitre-Hernandez, Juan Flores, Maria Fuentes

    Abstract: It has been shown that the emotional state of students has an important relationship with learning; for instance, engaged concentration is positively correlated with learning. This paper proposes the Inductive Control (IC) for educational games. Unlike conventional approaches that only modify the game level, the proposed technique also induces emotions in the player for supporting the learning pro…

    Submitted 15 April, 2018; v1 submitted 4 September, 2017; originally announced September 2017.

    Comments: It needs to be reviewed

  17. arXiv:1409.7336 [pdf, other]

    physics.soc-ph cs.CL nlin.AO physics.data-an

    Does network complexity help organize Babel's library?

    Authors: Juan Pablo Cárdenas, Iván González, Gerardo Vidal, Miguel Fuentes

    Abstract: In this work, we study properties of texts from the perspective of complex network theory. Words in given texts are linked by co-occurrence and transformed into networks, and we observe that these display topological properties common to other complex systems. However, there are some properties that seem to be exclusive to texts; many of these properties depend on the frequency of words in the tex…

    Submitted 16 October, 2015; v1 submitted 23 September, 2014; originally announced September 2014.