
Showing 1–33 of 33 results for author: Davies, M E

Searching in archive eess.
  1. arXiv:2401.08902  [pdf, other]

    cs.SD cs.DL cs.IR cs.LG eess.AS

    Similar but Faster: Manipulation of Tempo in Music Audio Embeddings for Tempo Prediction and Search

    Authors: Matthew C. McCallum, Florian Henkel, Jaehun Kim, Samuel E. Sandberg, Matthew E. P. Davies

    Abstract: Audio embeddings enable large scale comparisons of the similarity of audio files for applications such as search and recommendation. Due to the subjectivity of audio similarity, it can be desirable to design systems that answer not only whether audio is similar, but similar in what way (e.g., wrt. tempo, mood or genre). Previous works have proposed disentangled embedding spaces where subspaces rep…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted to the International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

  2. arXiv:2401.08891  [pdf, other]

    cs.SD cs.LG eess.AS

    Tempo estimation as fully self-supervised binary classification

    Authors: Florian Henkel, Jaehun Kim, Matthew C. McCallum, Samuel E. Sandberg, Matthew E. P. Davies

    Abstract: This paper addresses the problem of global tempo estimation in musical audio. Given that annotating tempo is time-consuming and requires certain musical expertise, few publicly available data sources exist to train machine learning models for this task. Towards alleviating this issue, we propose a fully self-supervised approach that does not rely on any human labeled data. Our method builds on the…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted to the International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

  3. arXiv:2401.08889  [pdf, other]

    cs.SD cs.IR cs.LG cs.MM eess.AS

    On the Effect of Data-Augmentation on Local Embedding Properties in the Contrastive Learning of Music Audio Representations

    Authors: Matthew C. McCallum, Matthew E. P. Davies, Florian Henkel, Jaehun Kim, Samuel E. Sandberg

    Abstract: Audio embeddings are crucial tools in understanding large catalogs of music. Typically embeddings are evaluated on the basis of the performance they provide in a wide range of downstream tasks, however few studies have investigated the local properties of the embedding spaces themselves which are important in nearest neighbor algorithms, commonly used in music search and recommendation. In this wo…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted to the International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024

  4. arXiv:2308.10355  [pdf, other]

    eess.AS cs.SD

    Local Periodicity-Based Beat Tracking for Expressive Classical Piano Music

    Authors: Ching-Yu Chiu, Meinard Müller, Matthew E. P. Davies, Alvin Wen-Yu Su, Yi-Hsuan Yang

    Abstract: To model the periodicity of beats, state-of-the-art beat tracking systems use "post-processing trackers" (PPTs) that rely on several empirically determined global assumptions for tempo transition, which work well for music with a steady tempo. For expressive classical music, however, these assumptions can be too rigid. With two large datasets of Western classical piano music, namely the Aligned Sc…

    Submitted 20 August, 2023; originally announced August 2023.

    Comments: Accepted to IEEE/ACM Transactions on Audio, Speech, and Language Processing (July 2023)

  5. Tempo vs. Pitch: understanding self-supervised tempo estimation

    Authors: Giovana Morais, Matthew E. P. Davies, Marcelo Queiroz, Magdalena Fuentes

    Abstract: Self-supervision methods learn representations by solving pretext tasks that do not require human-generated labels, alleviating the need for time-consuming annotations. These methods have been applied in computer vision, natural language processing, environmental sound analysis, and recently in music information retrieval, e.g. for pitch estimation. Particularly in the context of music, there are…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: 5 pages, 3 figures, published on 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing

  6. arXiv:2211.14162  [pdf, other]

    eess.SP cs.RO

    A Gaussian Process Regression based Dynamical Models Learning Algorithm for Target Tracking

    Authors: Mengwei Sun, Mike E. Davies, Ian K. Proudler, James R. Hopgood

    Abstract: Maneuvering target tracking is a challenging problem for sensor systems because of the unpredictability of the targets' motions. This paper proposes a novel data-driven method for learning the dynamical motion model of a target. Non-parametric Gaussian process regression (GPR) is used to learn a target's naturally shift invariant motion (NSIM) behavior, which is translationally invariant and does…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: 11 pages, 10 figures

  7. arXiv:2210.07314  [pdf, ps, other]

    eess.IV

    Spline Sketches: An Efficient Approach for Photon Counting Lidar

    Authors: Michael Patrick Sheehan, Julian Tachella, Mike E. Davies

    Abstract: Photon counting lidar has become an invaluable tool for 3D depth imaging due to the fine-precision it can achieve over long ranges. However, high frame rate, high resolution lidar devices produce an enormous amount of time-of-flight (ToF) data which can cause a severe data processing bottleneck hindering the deployment of real-time systems. In this paper, an efficient photon counting approach is p…

    Submitted 29 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: 13 pages, 13 figures

  8. An Analysis Method for Metric-Level Switching in Beat Tracking

    Authors: Ching-Yu Chiu, Meinard Müller, Matthew E. P. Davies, Alvin Wen-Yu Su, Yi-Hsuan Yang

    Abstract: For expressive music, the tempo may change over time, posing challenges to tracking the beats by an automatic model. The model may first tap to the correct tempo, but then may fail to adapt to a tempo change, or switch between several incorrect but perceptually plausible ones (e.g., half- or double-tempo). Existing evaluation metrics for beat tracking do not reflect such behaviors, as they typical…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE Signal Processing Letters (Oct. 2022)

  9. arXiv:2203.16165  [pdf, other]

    eess.AS cs.AI cs.MM

    Symbolic music generation conditioned on continuous-valued emotions

    Authors: Serkan Sulun, Matthew E. P. Davies, Paula Viana

    Abstract: In this paper we present a new approach for the generation of multi-instrument symbolic music driven by musical emotion. The principal novelty of our approach centres on conditioning a state-of-the-art transformer based on continuous-valued valence and arousal labels. In addition, we provide a new large-scale dataset of symbolic music paired with emotion labels in terms of valence and arousal. We…

    Submitted 4 May, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: Published in IEEE Access

    Journal ref: volume:10, year:2022, pages:44617-44626

  10. Adaptive Kernel Kalman Filter

    Authors: Mengwei Sun, Mike E. Davies, Ian K. Proudler, James R. Hopgood

    Abstract: Sequential Bayesian filters in non-linear dynamic systems require the recursive estimation of the predictive and posterior distributions. This paper introduces a Bayesian filter called the adaptive kernel Kalman filter (AKKF). With this filter, the arbitrary predictive and posterior distributions of hidden states are approximated using the empirical kernel mean embeddings (KMEs) in reproducing ker…

    Submitted 27 February, 2023; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: The manuscript has been accepted for publication as a regular paper in the IEEE Transactions on Signal Processing

  11. arXiv:2203.00952  [pdf, other]

    eess.IV cs.CV eess.SP

    Sketched RT3D: How to reconstruct billions of photons per second

    Authors: Julián Tachella, Michael P. Sheehan, Mike E. Davies

    Abstract: Single-photon light detection and ranging (lidar) captures depth and intensity information of a 3D scene. Reconstructing a scene from observed photons is a challenging task due to spurious detections associated with background illumination sources. To tackle this problem, there is a plethora of 3D reconstruction algorithms which exploit spatial regularity of natural scenes to provide stable recons…

    Submitted 2 March, 2022; originally announced March 2022.

    Comments: Accepted at ICASSP 2022

  12. arXiv:2201.09375  [pdf, other]

    eess.IV eess.SP

    Deep Unrolling for Magnetic Resonance Fingerprinting

    Authors: Dongdong Chen, Mike E. Davies, Mohammad Golbabaee

    Abstract: Magnetic Resonance Fingerprinting (MRF) has emerged as a promising quantitative MR imaging approach. Deep learning methods have been proposed for MRF and demonstrated improved performance over classical compressed sensing algorithms. However many of these end-to-end models are physics-free, while consistency of the predictions with respect to the physical forward model is crucial for reliably solv…

    Submitted 25 January, 2022; v1 submitted 23 January, 2022; originally announced January 2022.

    Comments: Tech report. arXiv admin note: substantial text overlap with arXiv:2006.15271

  13. arXiv:2111.12855  [pdf, other]

    cs.CV eess.IV eess.SP

    Robust Equivariant Imaging: a fully unsupervised framework for learning to image from noisy and partial measurements

    Authors: Dongdong Chen, Julián Tachella, Mike E. Davies

    Abstract: Deep networks provide state-of-the-art performance in multiple imaging inverse problems ranging from medical imaging to computational photography. However, most existing networks are trained with clean signals which are often hard or impossible to obtain. Equivariant imaging (EI) is a recent self-supervised learning framework that exploits the group invariance present in signal distributions to le…

    Submitted 15 March, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: CVPR 2022. Code: https://github.com/edongdongchen/REI

  14. arXiv:2110.08045  [pdf, ps, other]

    stat.ML cs.LG eess.SP

    Compressive Independent Component Analysis: Theory and Algorithms

    Authors: Michael P. Sheehan, Mike E. Davies

    Abstract: Compressive learning forms the exciting intersection between compressed sensing and statistical learning where one exploits forms of sparsity and structure to reduce the memory and/or computational complexity of the learning task. In this paper, we look at the independent component analysis (ICA) model through the compressive learning lens. In particular, we show that solutions to the cumulant bas…

    Submitted 15 October, 2021; originally announced October 2021.

    Comments: 27 pages, 8 figures, under review

  15. arXiv:2105.06920  [pdf, ps, other]

    eess.SP

    Surface Detection for Sketched Single Photon Lidar

    Authors: Michael P. Sheehan, Julián Tachella, Mike E. Davies

    Abstract: Single-photon lidar devices are able to collect an ever-increasing amount of time-stamped photons in small time periods due to increasingly larger arrays, generating a memory and computational bottleneck on the data processing side. Recently, a sketching technique was introduced to overcome this bottleneck which compresses the amount of information to be stored and processed. The size of the sketc…

    Submitted 14 May, 2021; originally announced May 2021.

    Comments: 5 pages, Accepted at EUSIPCO 2021

  16. arXiv:2103.14756  [pdf, other]

    cs.CV eess.IV eess.SP

    Equivariant Imaging: Learning Beyond the Range Space

    Authors: Dongdong Chen, Julián Tachella, Mike E. Davies

    Abstract: In various imaging problems, we only have access to compressed measurements of the underlying signals, hindering most learning-based strategies which usually require pairs of signals and associated measurements for training. Learning only from compressed measurements is impossible in general, as the compressed observations do not contain information outside the range of the forward sensing operato…

    Submitted 23 August, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

    Comments: ICCV 2021. Code: https://github.com/edongdongchen/EI

  17. A Sketching Framework for Reduced Data Transfer in Photon Counting Lidar

    Authors: Michael P. Sheehan, Julián Tachella, Mike E. Davies

    Abstract: Single-photon lidar has become a prominent tool for depth imaging in recent years. At the core of the technique, the depth of a target is measured by constructing a histogram of time delays between emitted light pulses and detected photon arrivals. A major data processing bottleneck arises on the device when either the number of photons per pixel is large or the resolution of the time stamp is fin…

    Submitted 5 January, 2022; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: 16 pages, 20 figures. Figure 8 Corrected. Accepted at IEEE TCI

    Journal ref: IEEE Transactions on Computational Imaging, Volume 7, 2021, Pages 989 - 1004

  18. arXiv:2011.07274  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD

    On Filter Generalization for Music Bandwidth Extension Using Deep Neural Networks

    Authors: Serkan Sulun, Matthew E. P. Davies

    Abstract: In this paper, we address a sub-topic of the broad domain of audio enhancement, namely musical audio bandwidth extension. We formulate the bandwidth extension problem using deep neural networks, where a band-limited signal is provided as input to the network, with the goal of reconstructing a full-bandwidth output. Our main contribution centers on the impact of the choice of low pass filter when t…

    Submitted 6 January, 2021; v1 submitted 14 November, 2020; originally announced November 2020.

    Comments: Qualitative examples on https://serkansulun.com/bwe. Source code on https://github.com/serkansulun/deep-music-enhancer

  19. arXiv:2008.11529  [pdf, other]

    eess.AS cs.SD

    TIV.lib: an open-source library for the tonal description of musical audio

    Authors: António Ramires, Gilberto Bernardes, Matthew E. P. Davies, Xavier Serra

    Abstract: In this paper, we present TIV.lib, an open-source library for the content-based tonal description of musical audio signals. Its main novelty relies on the perceptually-inspired Tonal Interval Vector space based on the Discrete Fourier transform, from which multiple instantaneous and global representations, descriptors and metrics are computed - e.g., harmonic change, dissonance, diatonicity, and m…

    Submitted 26 August, 2020; originally announced August 2020.

  20. arXiv:2008.02957  [pdf, ps, other]

    eess.IV cs.LG

    Dual Convolutional Neural Networks for Breast Mass Segmentation and Diagnosis in Mammography

    Authors: Heyi Li, Dongdong Chen, William H. Nailon, Mike E. Davies, David Laurenson

    Abstract: Deep convolutional neural networks (CNNs) have emerged as a new paradigm for Mammogram diagnosis. Contemporary CNN-based computer-aided-diagnosis (CAD) for breast cancer directly extract latent features from input mammogram image and ignore the importance of morphological features. In this paper, we introduce a novel deep learning framework for mammogram image processing, which computes mass segme…

    Submitted 11 August, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

  21. arXiv:2007.14281  [pdf, ps, other]

    eess.SP cs.LG stat.ML

    DeepMP for Non-Negative Sparse Decomposition

    Authors: Konstantinos A. Voulgaris, Mike E. Davies, Mehrdad Yaghoobi

    Abstract: Non-negative signals form an important class of sparse signals. Many algorithms have already been proposed to recover such non-negative representations, where greedy and convex relaxed algorithms are among the most popular methods. The greedy techniques are low computational cost algorithms, which have also been modified to incorporate the non-negativity of the representations. One such modificatio…

    Submitted 28 July, 2020; originally announced July 2020.

  22. arXiv:2006.15271  [pdf, other]

    eess.IV cs.CV

    Compressive MR Fingerprinting reconstruction with Neural Proximal Gradient iterations

    Authors: Dongdong Chen, Mike E. Davies, Mohammad Golbabaee

    Abstract: Consistency of the predictions with respect to the physical forward model is pivotal for reliably solving inverse problems. This consistency is mostly un-controlled in the current end-to-end deep learning methodologies proposed for the Magnetic Resonance Fingerprinting (MRF) problem. To address this, we propose ProxNet, a learned proximal gradient descent framework that directly incorporates the f…

    Submitted 6 July, 2020; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: To appear in MICCAI 2020

  23. arXiv:2006.04472  [pdf, ps, other]

    eess.SP

    Accelerated Search for Non-Negative Greedy Sparse Decomposition via Dimensionality Reduction

    Authors: Konstantinos Voulgaris, Mike E. Davies, Mehrdad Yaghoobi

    Abstract: Non-negative signals form an important class of sparse signals. Many algorithms have already been proposed to recover such non-negative representations, where greedy and convex relaxed algorithms are among the most popular methods. One fast implementation is the FNNOMP algorithm that updates the non-negative coefficients in an iterative manner. Even though FNNOMP is a good approach when working on…

    Submitted 8 June, 2020; originally announced June 2020.

  24. Robust 3D reconstruction of dynamic scenes from single-photon lidar using Beta-divergences

    Authors: Quentin Legros, Julian Tachella, Rachael Tobin, Aongus McCarthy, Sylvain Meignen, Gerald S. Buller, Yoann Altmann, Stephen McLaughlin, Michael E. Davies

    Abstract: In this paper, we present a new algorithm for fast, online 3D reconstruction of dynamic scenes using times of arrival of photons recorded by single-photon detector arrays. One of the main challenges in 3D imaging using single-photon lidar in practical applications is the presence of strong ambient illumination which corrupts the data and can jeopardize the detection of peaks/surface in the signals…

    Submitted 18 December, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

    Comments: 12 pages

  25. (l1,l2)-RIP and Projected Back-Projection Reconstruction for Phase-Only Measurements

    Authors: Thomas Feuillen, Mike E. Davies, Luc Vandendorpe, Laurent Jacques

    Abstract: This letter analyzes the performances of a simple reconstruction method, namely the Projected Back-Projection (PBP), for estimating the direction of a sparse signal from its phase-only (or amplitude-less) complex Gaussian random measurements, i.e., an extension of one-bit compressive sensing to the complex field. To study the performances of this algorithm, we show that complex Gaussian random mat…

    Submitted 5 December, 2019; originally announced December 2019.

    Comments: 4 pages, 2 figures

  26. arXiv:1911.11028  [pdf, other]

    eess.IV cs.CV

    Deep Decomposition Learning for Inverse Imaging Problems

    Authors: Dongdong Chen, Mike E. Davies

    Abstract: Deep learning is emerging as a new paradigm for solving inverse imaging problems. However, the deep learning methods often lack the assurance of traditional physics-based methods due to the lack of physical information considerations in neural network training and deploying. The appropriate supervision and explicit calibration by the information of the physic model can enhance the neural network l…

    Submitted 16 July, 2020; v1 submitted 25 November, 2019; originally announced November 2019.

    Comments: To appear in ECCV 2020

  27. arXiv:1911.09846  [pdf, other]

    eess.IV

    A Fully Convolutional Network for MR Fingerprinting

    Authors: Dongdong Chen, Mohammad Golbabaee, Pedro A. Gomez, Marion I. Menzel, Mike E. Davies

    Abstract: Magnetic Resonance Fingerprinting (MRF) methods typically rely on dictionary matching to map the temporal MRF signals to quantitative tissue parameters. These methods suffer from heavy storage and computation requirements as the dictionary size grows. To address these issues, we proposed an end to end fully convolutional neural network for MRF reconstruction (MRF-FCNN), which firstly employ linear…

    Submitted 21 November, 2019; originally announced November 2019.

    Comments: The Signal Processing with Adaptive Sparse Structured Representations (SPARS'2019) workshop

  28. arXiv:1907.00300  [pdf, other]

    eess.IV cs.LG stat.ML

    Signed Laplacian Deep Learning with Adversarial Augmentation for Improved Mammography Diagnosis

    Authors: Heyi Li, Dongdong Chen, William H. Nailon, Mike E. Davies, David I. Laurenson

    Abstract: Computer-aided breast cancer diagnosis in mammography is limited by inadequate data and the similarity between benign and cancerous masses. To address this, we propose a signed graph regularized deep neural network with adversarial augmentation, named DiagNet. Firstly, we use adversarial learning to generate positive and negative mass-contained mammograms for each mass class. After that,…

    Submitted 14 September, 2019; v1 submitted 29 June, 2019; originally announced July 2019.

    Comments: To appear in MICCAI October 2019

  29. arXiv:1905.06058  [pdf, other]

    eess.IV physics.med-ph physics.optics

    Blur resolved OCT: full-range interferometric synthetic aperture microscopy through dispersion encoding

    Authors: Jonathan H. Mason, Mike E. Davies, Pierre O. Bagnaninchi

    Abstract: We present a computational method for full-range interferometric synthetic aperture microscopy (ISAM) under dispersion encoding. With this, one can effectively double the depth range of optical coherence tomography (OCT), whilst dramatically enhancing the spatial resolution away from the focal plane. To this end, we propose a model-based iterative reconstruction (MBIR) method, where ISAM is direct…

    Submitted 31 January, 2020; v1 submitted 15 May, 2019; originally announced May 2019.

    Comments: 17 pages, 7 figures. The images have been compressed for arxiv - please follow DOI for full resolution

    Journal ref: Opt. Express 28, 3879-3894 (2020)

  30. arXiv:1905.02944  [pdf, other]

    eess.IV stat.AP stat.CO

    Fast online 3D reconstruction of dynamic scenes from individual single-photon detection events

    Authors: Yoann Altmann, Stephen McLaughlin, Michael E. Davies

    Abstract: In this paper, we present an algorithm for online 3D reconstruction of dynamic scenes using individual times of arrival (ToA) of photons recorded by single-photon detector arrays. One of the main challenges in 3D imaging using single-photon Lidar is the integration time required to build ToA histograms and reconstruct reliable 3D profiles in the presence of non-negligible ambient illumination. Thi…

    Submitted 8 May, 2019; originally announced May 2019.

  31. arXiv:1811.04460  [pdf, other]

    eess.SP cs.DM

    Analysis vs Synthesis - An Investigation of (Co)sparse Signal Models on Graphs

    Authors: Madeleine S. Kotzagiannidis, Mike E. Davies

    Abstract: In this work, we present a theoretical study of signals with sparse representations in the vertex domain of a graph, which is primarily motivated by the discrepancy arising from respectively adopting a synthesis and analysis view of the graph Laplacian matrix. Sparsity on graphs and, in particular, the characterization of the subspaces of signals which are sparse with respect to the connectivity o…

    Submitted 11 November, 2018; originally announced November 2018.

    Comments: IEEE GlobalSIP 2018. An extended version of this work can be found at arXiv:1811.04493

  32. arXiv:1811.02411  [pdf, other]

    cs.SD eess.AS

    An audio-only method for advertisement detection in broadcast television content

    Authors: António Ramires, Diogo Cocharro, Matthew E. P. Davies

    Abstract: We address the task of advertisement detection in broadcast television content. While typically approached from a video-only or audio-visual perspective, we present an audio-only method. Our approach centres on the detection of short silences which exist at the boundaries between programming and advertising, as well as between the advertisements themselves. To identify advertising regions we first…

    Submitted 6 November, 2018; originally announced November 2018.

    Journal ref: Proc. of RecPad-2017, Amadora, Portugal, pp. 21-22, October, 2017

  33. arXiv:1811.02406  [pdf, other]

    cs.SD eess.AS

    User Specific Adaptation in Automatic Transcription of Vocalised Percussion

    Authors: António Ramires, Rui Penha, Matthew E. P. Davies

    Abstract: The goal of this work is to develop an application that enables music producers to use their voice to create drum patterns when composing in Digital Audio Workstations (DAWs). An easy-to-use and user-oriented system capable of automatically transcribing vocalisations of percussion sounds, called LVT - Live Vocalised Transcription, is presented. LVT is developed as a Max for Live device which follo…

    Submitted 6 November, 2018; originally announced November 2018.

    Journal ref: Proc. of RecPad-2017, Amadora, Portugal, pp. 19-20, October, 2017