
Showing 1–14 of 14 results for author: Balazs, P

Searching in archive cs.
  1. arXiv:2406.15856  [pdf, other]

    cs.LG

    Injectivity of ReLU-layers: Perspectives from Frame Theory

    Authors: Daniel Haider, Martin Ehler, Peter Balazs

    Abstract: Injectivity is the defining property of a mapping that ensures no information is lost and any input can be perfectly reconstructed from its output. By performing hard thresholding, the ReLU function naturally interferes with this property, making the injectivity analysis of ReLU-layers in neural networks a challenging yet intriguing task that has not yet been fully solved. This article establishes…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2309.05855  [pdf, other]

    cs.LG cs.SD eess.AS

    Instabilities in Convnets for Raw Audio

    Authors: Daniel Haider, Vincent Lostanlen, Martin Ehler, Peter Balazs

    Abstract: What makes waveform-based deep learning so hard? Despite numerous attempts at training convolutional neural networks (convnets) for filterbank design, they often fail to outperform hand-crafted baselines. These baselines are linear time-invariant systems: as such, they can be approximated by convnets with wide receptive fields. Yet, in practice, gradient-based optimization leads to suboptimal appr…

    Submitted 26 April, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: 4 pages, 5 figures, 1 page appendix with mathematical proofs

    Journal ref: IEEE Signal Processing Letters 31 (2024) 1084-1088

  3. arXiv:2307.13821  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS math.FA

    Fitting Auditory Filterbanks with Multiresolution Neural Networks

    Authors: Vincent Lostanlen, Daniel Haider, Han Han, Mathieu Lagrange, Peter Balazs, Martin Ehler

    Abstract: Waveform-based deep learning faces a dilemma between nonparametric and parametric approaches. On one hand, convolutional neural networks (convnets) may approximate any linear time-invariant system; yet, in practice, their frequency responses become more irregular as their receptive fields grow. On the other hand, a parametric model such as LEAF is guaranteed to yield Gabor filters, hence an optima…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: 4 pages, 4 figures, 1 table, conference

  4. arXiv:2307.09672  [pdf, other]

    cs.LG math.OC

    Convex Geometry of ReLU-layers, Injectivity on the Ball and Local Reconstruction

    Authors: Daniel Haider, Martin Ehler, Peter Balazs

    Abstract: The paper uses a frame-theoretic setting to study the injectivity of a ReLU-layer on the closed ball of $\mathbb{R}^n$ and its non-negative part. In particular, the interplay between the radius of the ball and the bias vector is emphasized. Together with a perspective from convex geometry, this leads to a computationally feasible method of verifying the injectivity of a ReLU-layer under reasonable…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: 10 pages main paper + 2 pages appendix, 4 figures, 2 algorithms, conference

    Journal ref: International Conference on Machine Learning 2023

  5. arXiv:2202.12380  [pdf, other]

    math.NA cs.IT cs.MS eess.SP

    Fast Matching Pursuit with Multi-Gabor Dictionaries

    Authors: Zdeněk Průša, Nicki Holighaus, Peter Balazs

    Abstract: Finding the best K-sparse approximation of a signal in a redundant dictionary is an NP-hard problem. Suboptimal greedy matching pursuit (MP) algorithms are generally used for this task. In this work, we present an acceleration technique and an implementation of the matching pursuit algorithm acting on a multi-Gabor dictionary, i.e., a concatenation of several Gabor-type time-frequency dictionaries…

    Submitted 16 February, 2022; originally announced February 2022.

  6. arXiv:2202.07484  [pdf, other]

    cs.SD eess.AS eess.SP

    Phase-Based Signal Representations for Scattering

    Authors: Daniel Haider, Peter Balazs, Nicki Holighaus

    Abstract: The scattering transform is a non-linear signal representation method based on cascaded wavelet transform magnitudes. In this paper we introduce phase scattering, a novel approach where we use phase derivatives in a scattering procedure. We first revisit phase-related concepts for representing time-frequency information of audio signals, in particular, the partial derivatives of the phase in the t…

    Submitted 15 February, 2022; originally announced February 2022.

  7. arXiv:2202.07479  [pdf, ps, other]

    cs.SD eess.AS

    Audio Inpainting via $\ell_1$-Minimization and Dictionary Learning

    Authors: Shristi Rajbamshi, Georg Tauböck, Peter Balazs, Nicki Holighaus

    Abstract: Audio inpainting refers to signal processing techniques that aim at restoring missing or corrupted consecutive samples in audio signals. Prior works have shown that $\ell_1$-minimization with appropriate weighting is capable of solving audio inpainting problems, both for the analysis and the synthesis models. These models assume that audio signals are sparse with respect to some redundant diction…

    Submitted 15 February, 2022; originally announced February 2022.

  8. Image-based material characterization of complex microarchitectured additively manufactured structures

    Authors: N. Korshunova, J. Jomo, G. Lékó, D. Reznik, P. Balázs, S. Kollmannsberger

    Abstract: Significant developments in the field of additive manufacturing (AM) allowed the fabrication of complex microarchitectured components with varying porosity across different scales. However, due to the high complexity of this process, the final parts can exhibit significant variations in the nominal geometry. Computer tomographic images of 3D printed components provide extensive information about t…

    Submitted 22 July, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

  9. arXiv:1904.10380  [pdf, other]

    cs.SD eess.AS math.SP

    Harmonic-aligned Frame Mask Based on Non-stationary Gabor Transform with Application to Content-dependent Speaker Comparison

    Authors: Feng Huang, Peter Balazs

    Abstract: We propose harmonic-aligned frame mask for speech signals using non-stationary Gabor transform (NSGT). A frame mask operates on the transfer coefficients of a signal and consequently converts the signal into a counterpart signal. It depicts the difference between the two signals. In preceding studies, frame masks based on regular Gabor transform were applied to single-note instrumental sound analy…

    Submitted 23 April, 2019; originally announced April 2019.

    Comments: Interspeech2019

  10. Frame Theory for Signal Processing in Psychoacoustics

    Authors: Peter Balazs, Nicki Holighaus, Thibaud Necciari, Diana Stoeva

    Abstract: This review chapter aims to strengthen the link between frame theory and signal processing tasks in psychoacoustics. On the one side, the basic concepts of frame theory are presented and some proofs are provided to explain those concepts in some detail. The goal is to reveal to hearing scientists how this mathematical theory could be relevant for their research. In particular, we focus on frame th…

    Submitted 3 November, 2016; originally announced November 2016.

    Journal ref: In: Balan R., Benedetto J., Czaja W., Dellatorre M., Okoudjou K. (eds) Excursions in Harmonic Analysis, Vol. 5. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham, 2017, 225-268

  11. A Non-iterative Method for (Re)Construction of Phase from STFT Magnitude

    Authors: Zdeněk Průša, Peter Balazs, Peter L. Søndergaard

    Abstract: A non-iterative method for the construction of the Short-Time Fourier Transform (STFT) phase from the magnitude is presented. The method is based on the direct relationship between the partial derivatives of the phase and the logarithm of the magnitude of the un-sampled STFT with respect to the Gaussian window. Although the theory holds in the continuous setting only, the experiments show that the…

    Submitted 1 September, 2016; originally announced September 2016.

    Comments: 10 pages, 4 figures

    Journal ref: IEEE Transactions on Audio, Speech and Language Processing, Vol. 25 (5), pp. 1154 - 1164 (2017)

  12. arXiv:1607.06667  [pdf, other]

    cs.SD cs.AI cs.MM cs.SE

    Inpainting of long audio segments with similarity graphs

    Authors: Nathanael Perraudin, Nicki Holighaus, Piotr Majdak, Peter Balazs

    Abstract: We present a novel method for the compensation of long duration data loss in audio signals, in particular music. The concealment of such signal defects is based on a graph that encodes signal structure in terms of time-persistent spectral similarity. A suitable candidate segment for the substitution of the lost content is proposed by an intuitive optimization scheme and smoothly inserted into the…

    Submitted 23 February, 2018; v1 submitted 22 July, 2016; originally announced July 2016.

  13. arXiv:1601.06652  [pdf, ps, other]

    cs.SD

    A Perceptually Motivated Filter Bank with Perfect Reconstruction for Audio Signal Processing

    Authors: Thibaud Necciari, Nicki Holighaus, Peter Balazs, Zdenek Prusa

    Abstract: Many audio applications rely on filter banks (FBs) to analyze, process, and re-synthesize sounds. To approximate the auditory frequency resolution in the signal chain, some applications rely on perceptually motivated FBs, the gammatone FB being a popular example. However, most perceptually motivated FBs only allow partial signal reconstruction at high redundancies and/or do not have good resistanc…

    Submitted 25 January, 2016; originally announced January 2016.

  14. arXiv:1109.6651  [pdf, other]

    cs.SD

    Sound Analysis and Synthesis Adaptive in Time and Two Frequency Bands

    Authors: Marco Liuni, Peter Balazs, Axel Röbel

    Abstract: We present an algorithm for sound analysis and resynthesis with local automatic adaptation of time-frequency resolution. There exist several algorithms that adapt the analysis window depending on its time or frequency location; in what follows we propose a method which selects the optimal resolution depending on both time and frequency. We consider an approach that we denote as analysis-wei…

    Submitted 29 September, 2011; originally announced September 2011.

    Journal ref: Proc. of the 14th Int. Conference on Digital Audio Effects (DAFx-11), Paris, France, September 19-23, 2011