
Showing 1–15 of 15 results for author: Cohen, E

  1. arXiv:2406.19363  [pdf, ps, other]

    eess.AS

    Tradition or Innovation: A Comparison of Modern ASR Methods for Forced Alignment

    Authors: Rotem Rousso, Eyal Cohen, Joseph Keshet, Eleanor Chodroff

    Abstract: Forced alignment (FA) plays a key role in speech research through the automatic time alignment of speech signals with corresponding text transcriptions. Despite the move towards end-to-end architectures for speech technology, FA is still dominantly achieved through a classic GMM-HMM acoustic model. This work directly compares alignment performance from leading automatic speech recognition (ASR) me…

    Submitted 27 June, 2024; originally announced June 2024.

    Journal ref: Interspeech 2024

  2. arXiv:2402.14023  [pdf]

    physics.med-ph eess.IV physics.optics

    25-Fold Resolution Enhancement of X-ray Microscopy Using Multipixel Ghost Imaging

    Authors: O. Sefi, A. Ben Yehuda, Y. Klein, S. Bloch, H. Schwartz, E. Cohen, S. Shwartz

    Abstract: Hard x-ray imaging is indispensable across diverse fields owing to its high penetrability. However, the resolution of traditional x-ray imaging modalities, such as computed tomography (CT) systems, is constrained by factors including beam properties, the absence of optical components, and detection resolution. As a result, typical resolution in commercial imaging systems is limited to a few hundre…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 9 pages, 4 figures

  3. arXiv:2311.18604  [pdf, other]

    cs.SD cs.IR eess.AS

    Barwise Music Structure Analysis with the Correlation Block-Matching Segmentation Algorithm

    Authors: Axel Marmoret, Jérémy E. Cohen, Frédéric Bimbot

    Abstract: Music Structure Analysis (MSA) is a Music Information Retrieval task consisting of representing a song in a simplified, organized manner by breaking it down into sections typically corresponding to "chorus", "verse", "solo", etc. In this work, we extend an MSA algorithm called the Correlation Block-Matching (CBM) algorithm introduced by (Marmoret et al., 2020, 2022b). The CBM algorithm is a…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: 19 pages, 13 figures, 11 tables, 1 algorithm, published in Transactions of the International Society for Music Information Retrieval

    ACM Class: H.5.5

    Journal ref: Transactions of the International Society for Music Information Retrieval, 6(1), 2023, 167--185

  4. arXiv:2212.11054  [pdf, other]

    cs.SD eess.AS

    Polytopic Analysis of Music

    Authors: Axel Marmoret, Jérémy E. Cohen, Frédéric Bimbot

    Abstract: Structural segmentation of music refers to the task of finding a symbolic representation of the organisation of a song, reducing the musical flow to a partition of non-overlapping segments. Under this definition, the musical structure may not be unique, and may even be ambiguous. One way to resolve that ambiguity is to see this task as a compression process, and to consider the musical structure a…

    Submitted 22 December, 2022; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: Work document

    ACM Class: H.5.5

  5. arXiv:2210.15356  [pdf, other]

    cs.SD cs.IR eess.AS

    Convolutive Block-Matching Segmentation Algorithm with Application to Music Structure Analysis

    Authors: Axel Marmoret, Jérémy E. Cohen, Frédéric Bimbot

    Abstract: Music Structure Analysis (MSA) consists of representing a song in sections (such as "chorus", "verse", "solo" etc), and can be seen as the retrieval of a simplified organization of the song. This work presents a new algorithm, called Convolutive Block-Matching (CBM) algorithm, devoted to MSA. In particular, the CBM algorithm is a dynamic programming algorithm, applying on autosimilarity matr…

    Submitted 26 September, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: 4 pages, 7 figures. Accepted for publication at WASPAA 2023. The associated toolbox is available at https://gitlab.inria.fr/amarmore/autosimilarity_segmentation/-/tree/WASPAA23

    ACM Class: H.5.5

  6. arXiv:2202.04989  [pdf, other]

    cs.SD cs.LG eess.AS

    Semi-Supervised Convolutive NMF for Automatic Piano Transcription

    Authors: Haoran Wu, Axel Marmoret, Jérémy E. Cohen

    Abstract: Automatic Music Transcription, which consists in transforming an audio recording of a musical performance into symbolic format, remains a difficult Music Information Retrieval task. In this work, which focuses on piano transcription, we propose a semi-supervised approach using low-rank matrix factorization techniques, in particular Convolutive Nonnegative Matrix Factorization. In the semi-supervis…

    Submitted 14 April, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: Published at the 2022 Sound and Music Computing (SMC) conference, 7 pages, 5 figures, 3 tables, code available at https://github.com/cohenjer/TransSSCNMF

    ACM Class: H.5.5

  7. arXiv:2202.04981  [pdf, other]

    cs.SD cs.LG eess.AS

    Barwise Compression Schemes for Audio-Based Music Structure Analysis

    Authors: Axel Marmoret, Jérémy E. Cohen, Frédéric Bimbot

    Abstract: Music Structure Analysis (MSA) consists in segmenting a music piece in several distinct sections. We approach MSA within a compression framework, under the hypothesis that the structure is more easily revealed by a simplified representation of the original content of the song. More specifically, under the hypothesis that MSA is correlated with similarities occurring at the bar scale, this article…

    Submitted 15 April, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: Published at the 2022 Sound and Music Computing (SMC) conference, 8 pages, 6 figures, 1 table, code available at https://gitlab.inria.fr/amarmore/barwisemusiccompression. arXiv admin note: substantial text overlap with arXiv:2110.14437

    ACM Class: H.5.5

  8. arXiv:2110.14437  [pdf, other]

    cs.SD cs.LG eess.AS

    Exploring single-song autoencoding schemes for audio-based music structure analysis

    Authors: Axel Marmoret, Jérémy E. Cohen, Frédéric Bimbot

    Abstract: The ability of deep neural networks to learn complex data relations and representations is established nowadays, but it generally relies on large sets of training data. This work explores a "piece-specific" autoencoding scheme, in which a low-dimensional autoencoder is trained to learn a latent/compressed representation specific to a given song, which can then be used to infer the song structure.…

    Submitted 7 March, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: 4 pages, 4 figures, 2 tables. Rejected from ICASSP 2022, an extended version is available at arXiv:2202.04981

    ACM Class: H.5.5

  9. arXiv:2110.14434  [pdf, ps, other]

    cs.SD cs.LG eess.AS math.NA

    Nonnegative Tucker Decomposition with Beta-divergence for Music Structure Analysis of Audio Signals

    Authors: Axel Marmoret, Florian Voorwinden, Valentin Leplat, Jérémy E. Cohen, Frédéric Bimbot

    Abstract: Nonnegative Tucker decomposition (NTD), a tensor decomposition model, has received increased interest in the recent years because of its ability to blindly extract meaningful patterns, in particular in Music Information Retrieval. Nevertheless, existing algorithms to compute NTD are mostly designed for the Euclidean loss. This work proposes a multiplicative updates algorithm to compute NTD with th…

    Submitted 2 August, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: 4 pages, 2 figures, 1 table, 1 algorithm. To be published in GRETSI2022. The algorithm is available at https://gitlab.inria.fr/amarmore/nonnegative-factorization

    MSC Class: 15-04

    ACM Class: G.1.6; H.5.5

  10. arXiv:2104.08580  [pdf, other]

    cs.SD cs.LG eess.AS

    Uncovering audio patterns in music with Nonnegative Tucker Decomposition for structural segmentation

    Authors: Axel Marmoret, Jérémy E. Cohen, Nancy Bertin, Frédéric Bimbot

    Abstract: Recent work has proposed the use of tensor decomposition to model repetitions and to separate tracks in loop-based electronic music. The present work investigates further on the ability of Nonnegative Tucker Decomposition (NTD) to uncover musical patterns and structure in pop songs in their audio form. Exploiting the fact that NTD tends to express the content of bars as linear combinations of a few…

    Submitted 17 April, 2021; originally announced April 2021.

    Comments: 7 pages, 6 figures; Code and experiments details available at https://gitlab.inria.fr/amarmore/musicntd/-/tree/0.1.0; Experiments details available at https://ax-le.github.io/resources/ISMIR2020/Notebooks_mainpage.html

    Report number: ISBN: 978-0-9813537-0-8

    ACM Class: H.5.5

    Journal ref: 21st International Society for Music Information Retrieval Conference (ISMIR), Montréal, Canada, 2020, 788-794

  11. arXiv:2102.01389  [pdf, other]

    eess.IV cs.CV cs.LG

    AURA-net: robust segmentation of phase-contrast microscopy images with few annotations

    Authors: Ethan Cohen, Virginie Uhlmann

    Abstract: We present AURA-net, a convolutional neural network (CNN) for the segmentation of phase-contrast microscopy images. AURA-net uses transfer learning to accelerate training and Attention mechanisms to help the network focus on relevant image features. In this way, it can be trained efficiently with a very limited amount of annotations. Our network can thus be used to automate the segmentation of dat…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: Accepted at ISBI 2021

  12. arXiv:2011.08001  [pdf, other]

    eess.IV cs.CV cs.LG

    Deep-LIBRA: Artificial intelligence method for robust quantification of breast density with independent validation in breast cancer risk assessment

    Authors: Omid Haji Maghsoudi, Aimilia Gastounioti, Christopher Scott, Lauren Pantalone, Fang-Fang Wu, Eric A. Cohen, Stacey Winham, Emily F. Conant, Celine Vachon, Despina Kontos

    Abstract: Breast density is an important risk factor for breast cancer that also affects the specificity and sensitivity of screening mammography. Current federal legislation mandates reporting of breast density for all women undergoing breast screening. Clinically, breast density is assessed visually using the American College of Radiology Breast Imaging Reporting And Data System (BI-RADS) scale. Here, we…

    Submitted 18 October, 2021; v1 submitted 13 November, 2020; originally announced November 2020.

  13. arXiv:2009.02264  [pdf, other]

    eess.IV cs.CV physics.bio-ph

    Improving axial resolution in SIM using deep learning

    Authors: Miguel Boland, Edward A. K. Cohen, Seth Flaxman, Mark A. A. Neil

    Abstract: Structured Illumination Microscopy is a widespread methodology to image live and fixed biological structures smaller than the diffraction limits of conventional optical microscopy. Using recent advances in image up-scaling through deep learning models, we demonstrate a method to reconstruct 3D SIM image stacks with twice the axial resolution attainable through conventional SIM reconstructions. We…

    Submitted 18 February, 2021; v1 submitted 4 September, 2020; originally announced September 2020.

    ACM Class: I.4.5; I.2.10

  14. arXiv:2006.07553  [pdf, other]

    cs.LG cs.CV eess.SP math.OC stat.ML

    Sparse Separable Nonnegative Matrix Factorization

    Authors: Nicolas Nadisic, Arnaud Vandaele, Jeremy E. Cohen, Nicolas Gillis

    Abstract: We propose a new variant of nonnegative matrix factorization (NMF), combining separability and sparsity assumptions. Separability requires that the columns of the first NMF factor are equal to columns of the input matrix, while sparsity requires that the columns of the second NMF factor are sparse. We call this variant sparse separable NMF (SSNMF), which we prove to be NP-complete, as opposed to s…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: 20 pages, accepted in ECML 2020

  15. arXiv:1809.07824  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Metric Learning for Phoneme Perception

    Authors: Yair Lakretz, Gal Chechik, Evan-Gary Cohen, Alessandro Treves, Naama Friedmann

    Abstract: Metric functions for phoneme perception capture the similarity structure among phonemes in a given language and therefore play a central role in phonology and psycho-linguistics. Various phenomena depend on phoneme similarity, such as spoken word recognition or serial recall from verbal working memory. This study presents a new framework for learning a metric function for perceptual distances amon…

    Submitted 20 September, 2018; originally announced September 2018.