Showing 1–25 of 25 results for author: Eghbal-Zadeh, H

  1. arXiv:2305.01281

    stat.ML cs.LG math.NA

    Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation

    Authors: Marius-Constantin Dinu, Markus Holzleitner, Maximilian Beck, Hoan Duc Nguyen, Andrea Huber, Hamid Eghbal-zadeh, Bernhard A. Moser, Sergei Pereverzyev, Sepp Hochreiter, Werner Zellinger

    Abstract: We study the problem of choosing algorithm hyper-parameters in unsupervised domain adaptation, i.e., with labeled data in a source domain and unlabeled data in a target domain, drawn from a different input distribution. We follow the strategy of computing several models using different hyper-parameters and subsequently computing a linear aggregation of the models. While several heuristics exist t…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: Oral talk (notable-top-5%) at International Conference On Learning Representations (ICLR), 2023

    Journal ref: International Conference On Learning Representations (ICLR), https://openreview.net/forum?id=M95oDwJXayG, 2023

  2. arXiv:2211.13956

    cs.SD cs.LG eess.AS

    Learning General Audio Representations with Large-Scale Training of Patchout Audio Transformers

    Authors: Khaled Koutini, Shahed Masoudian, Florian Schmid, Hamid Eghbal-zadeh, Jan Schlüter, Gerhard Widmer

    Abstract: The success of supervised deep learning methods is largely due to their ability to learn relevant features from raw data. Deep Neural Networks (DNNs) trained on large-scale datasets are capable of capturing a diverse set of features, and learning a representation that can generalize onto unseen tasks and datasets that are from the same domain. Hence, these models can be used as powerful feature ex…

    Submitted 2 March, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: Will appear in HEAR: Holistic Evaluation of Audio Representations, Proceedings of Machine Learning Research PMLR 166. Source code: https://github.com/kkoutini/passt_hear21

    Journal ref: Proceedings of Machine Learning Research v166 (2022) 65-89

  3. arXiv:2207.05742

    cs.LG cs.AI

    Reactive Exploration to Cope with Non-Stationarity in Lifelong Reinforcement Learning

    Authors: Christian Steinparz, Thomas Schmied, Fabian Paischer, Marius-Constantin Dinu, Vihang Patil, Angela Bitto-Nemling, Hamid Eghbal-zadeh, Sepp Hochreiter

    Abstract: In lifelong learning, an agent learns throughout its entire life without resets, in a constantly changing environment, as we humans do. Consequently, lifelong learning comes with a plethora of research problems such as continual domain shifts, which result in non-stationary rewards and environment dynamics. These non-stationarities are difficult to detect and cope with due to their continuous natu…

    Submitted 22 September, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

    Comments: CoLLAs 2022

  4. arXiv:2206.03483

    cs.LG

    Few-Shot Learning by Dimensionality Reduction in Gradient Space

    Authors: Martin Gauch, Maximilian Beck, Thomas Adler, Dmytro Kotsur, Stefan Fiel, Hamid Eghbal-zadeh, Johannes Brandstetter, Johannes Kofler, Markus Holzleitner, Werner Zellinger, Daniel Klotz, Sepp Hochreiter, Sebastian Lehner

    Abstract: We introduce SubGD, a novel few-shot learning method which is based on the recent finding that stochastic gradient descent updates tend to live in a low-dimensional parameter subspace. In experimental and theoretical analyses, we show that models confined to a suitable predefined subspace generalize well for few-shot learning. A suitable subspace fulfills three criteria across the given tasks: it…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: Accepted at Conference on Lifelong Learning Agents (CoLLAs) 2022. Code: https://github.com/ml-jku/subgd Blog post: https://ml-jku.github.io/subgd

    Journal ref: Proceedings of The 1st Conference on Lifelong Learning Agents, PMLR 199:1043-1064 (2022)

  5. arXiv:2205.12258

    cs.LG cs.CL stat.ML

    History Compression via Language Models in Reinforcement Learning

    Authors: Fabian Paischer, Thomas Adler, Vihang Patil, Angela Bitto-Nemling, Markus Holzleitner, Sebastian Lehner, Hamid Eghbal-zadeh, Sepp Hochreiter

    Abstract: In a partially observable Markov decision process (POMDP), an agent typically uses a representation of the past to approximate the underlying MDP. We propose to utilize a frozen Pretrained Language Transformer (PLT) for history representation and compression to improve sample efficiency. To avoid training of the Transformer, we introduce FrozenHopfield, which automatically associates observations…

    Submitted 21 February, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: ICML 2022

  6. arXiv:2111.04714

    cs.LG cs.AI

    A Dataset Perspective on Offline Reinforcement Learning

    Authors: Kajetan Schweighofer, Andreas Radler, Marius-Constantin Dinu, Markus Hofmarcher, Vihang Patil, Angela Bitto-Nemling, Hamid Eghbal-zadeh, Sepp Hochreiter

    Abstract: The application of Reinforcement Learning (RL) in real world environments can be expensive or risky due to sub-optimal policies during training. In Offline RL, this problem is avoided since interactions with an environment are prohibited. Policies are learned from a given dataset, which solely determines their performance. Despite this fact, how dataset characteristics influence Offline RL algorit…

    Submitted 12 July, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

    Comments: Code: https://github.com/ml-jku/OfflineRL

  7. Efficient Training of Audio Transformers with Patchout

    Authors: Khaled Koutini, Jan Schlüter, Hamid Eghbal-zadeh, Gerhard Widmer

    Abstract: The great success of transformer-based models in natural language processing (NLP) has led to various attempts at adapting these architectures to other domains such as vision and audio. Recent work has shown that transformers can outperform Convolutional Neural Networks (CNNs) on vision and audio tasks. However, one of the main shortcomings of transformer models, compared to the well-established C…

    Submitted 29 March, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: Submitted to Interspeech 2022. Source code: https://github.com/kkoutini/PaSST

  8. arXiv:2107.08933

    cs.SD cs.LG eess.AS stat.ML

    Over-Parameterization and Generalization in Audio Classification

    Authors: Khaled Koutini, Hamid Eghbal-zadeh, Florian Henkel, Jan Schlüter, Gerhard Widmer

    Abstract: Convolutional Neural Networks (CNNs) have been dominating classification tasks in various domains, such as machine vision, machine listening, and natural language processing. In machine listening, while generally exhibiting very good generalization capabilities, CNNs are sensitive to the specific audio recording device used, which has been recognized as a substantial problem in the acoustic scene…

    Submitted 19 July, 2021; originally announced July 2021.

    Comments: Presented at the ICML 2021 Workshop on Overparameterization: Pitfalls & Opportunities

  9. arXiv:2105.12395

    cs.SD cs.LG cs.NE eess.AS

    Receptive Field Regularization Techniques for Audio Classification and Tagging with Deep Convolutional Neural Networks

    Authors: Khaled Koutini, Hamid Eghbal-zadeh, Gerhard Widmer

    Abstract: In this paper, we study the performance of variants of well-known Convolutional Neural Network (CNN) architectures on different audio tasks. We show that tuning the Receptive Field (RF) of CNNs is crucial to their generalization. An insufficient RF limits the CNN's ability to fit the training data. In contrast, CNNs with an excessive RF tend to over-fit the training data and fail to generalize to…

    Submitted 26 May, 2021; originally announced May 2021.

    Comments: Accepted in IEEE/ACM Transactions on Audio, Speech, and Language Processing. Code available: https://github.com/kkoutini/cpjku_dcase20

  10. arXiv:2011.02955

    cs.LG cs.NE cs.SD

    Low-Complexity Models for Acoustic Scene Classification Based on Receptive Field Regularization and Frequency Damping

    Authors: Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, Gerhard Widmer

    Abstract: Deep Neural Networks are known to be very demanding in terms of computing and memory requirements. Due to the ever increasing use of embedded systems and mobile devices with a limited resource budget, designing low-complexity models without sacrificing too much of their predictive performance has gained great importance. In this work, we investigate and compare several well-known methods to reduce the…

    Submitted 5 November, 2020; originally announced November 2020.

    Comments: Proceedings of the Detection and Classification of Acoustic Scenes and Events 2020 Workshop (DCASE2020)

  11. arXiv:2007.13503

    eess.AS cs.LG cs.SD

    Receptive-Field Regularized CNNs for Music Classification and Tagging

    Authors: Khaled Koutini, Hamid Eghbal-Zadeh, Verena Haunschmid, Paul Primus, Shreyan Chowdhury, Gerhard Widmer

    Abstract: Convolutional Neural Networks (CNNs) have been successfully used in various Music Information Retrieval (MIR) tasks, both as end-to-end models and as feature extractors for more complex systems. However, the MIR field is still dominated by the classical VGG-based CNN architecture variants, often in combination with more complex modules such as attention, and/or techniques such as pre-training on l…

    Submitted 27 July, 2020; originally announced July 2020.

  12. arXiv:2007.02650

    cs.LG stat.ML

    On Data Augmentation and Adversarial Risk: An Empirical Analysis

    Authors: Hamid Eghbal-zadeh, Khaled Koutini, Paul Primus, Verena Haunschmid, Michal Lewandowski, Werner Zellinger, Bernhard A. Moser, Gerhard Widmer

    Abstract: Data augmentation techniques have become standard practice in deep learning, as they have been shown to greatly improve the generalisation abilities of models. These techniques rely on different ideas such as invariance-preserving transformations (e.g., expert-defined augmentation), statistical heuristics (e.g., Mixup), and learning the data distribution (e.g., GANs). However, in the adversarial setting…

    Submitted 6 July, 2020; originally announced July 2020.

    Comments: 21 pages, 15 figures, 3 tables

  13. arXiv:1911.05833

    cs.SD cs.LG cs.MM eess.AS

    Emotion and Theme Recognition in Music with Frequency-Aware RF-Regularized CNNs

    Authors: Khaled Koutini, Shreyan Chowdhury, Verena Haunschmid, Hamid Eghbal-zadeh, Gerhard Widmer

    Abstract: We present the CP-JKU submission to MediaEval 2019: a Receptive Field-(RF)-regularized and Frequency-Aware CNN approach for tagging music with emotion/mood labels. We perform an investigation regarding the impact of the RF of the CNNs on their performance on this dataset. We observe that ResNets with smaller receptive fields -- originally adapted for acoustic scene classification -- also perform well…

    Submitted 28 October, 2019; originally announced November 2019.

    Comments: MediaEval'19, 27-29 October 2019, Sophia Antipolis, France

  14. arXiv:1909.02869

    eess.AS cs.LG cs.SD stat.ML

    Exploiting Parallel Audio Recordings to Enforce Device Invariance in CNN-based Acoustic Scene Classification

    Authors: Paul Primus, Hamid Eghbal-zadeh, David Eitelsebner, Khaled Koutini, Andreas Arzt, Gerhard Widmer

    Abstract: Distribution mismatches between the data seen at training and at application time remain a major challenge in all application areas of machine learning. We study this problem in the context of machine listening (Task 1b of the DCASE 2019 Challenge). We propose a novel approach to learn domain-invariant classifiers in an end-to-end fashion by enforcing equal hidden layer representations for domain-…

    Submitted 4 September, 2019; originally announced September 2019.

    Comments: Published at the Workshop on Detection and Classification of Acoustic Scenes and Events, 25-26 October 2019, New York, USA

  15. arXiv:1909.02859

    eess.AS cs.LG cs.SD stat.ML

    Receptive-field-regularized CNN variants for acoustic scene classification

    Authors: Khaled Koutini, Hamid Eghbal-zadeh, Gerhard Widmer

    Abstract: Acoustic scene classification and related tasks have been dominated by Convolutional Neural Networks (CNNs). Top-performing CNNs use mainly audio spectrograms as input and borrow their architectural design primarily from computer vision. A recent study has shown that restricting the receptive field (RF) of CNNs in appropriate ways is crucial for their performance, robustness and generalization in a…

    Submitted 5 September, 2019; originally announced September 2019.

    Comments: Accepted at Detection and Classification of Acoustic Scenes and Events 2019 (DCASE Workshop 2019)

  16. arXiv:1907.01803

    cs.LG cs.SD eess.AS stat.ML

    The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification

    Authors: Khaled Koutini, Hamid Eghbal-zadeh, Matthias Dorfer, Gerhard Widmer

    Abstract: Convolutional Neural Networks (CNNs) have had great success in many machine vision as well as machine audition tasks. Many image recognition network architectures have consequently been adapted for audio processing tasks. However, despite some successes, the performance of many of these did not translate from the image to the audio domain. For example, very deep architectures such as ResNet and De…

    Submitted 3 July, 2019; originally announced July 2019.

    Comments: IEEE EUSIPCO 2019

  17. arXiv:1905.06586

    cs.LG stat.ML

    On Conditioning GANs to Hierarchical Ontologies

    Authors: Hamid Eghbal-zadeh, Lukas Fischer, Thomas Hoch

    Abstract: The recent success of Generative Adversarial Networks (GANs) is a result of their ability to generate high quality images from a latent vector space. An important application is the generation of images from a text description, where the text description is encoded and further used in the conditioning of the generated image. Thus the generative network has to additionally learn a mapping from the t…

    Submitted 16 May, 2019; originally announced May 2019.

    Comments: Under review at MLKgraphs2019: http://www.dexa.org/mlkgraphs2019

  18. arXiv:1811.00152

    cs.LG stat.ML

    Mixture Density Generative Adversarial Networks

    Authors: Hamid Eghbal-zadeh, Werner Zellinger, Gerhard Widmer

    Abstract: Generative Adversarial Networks have a surprising ability to generate sharp and realistic images, though they are known to suffer from the so-called mode collapse problem. In this paper, we propose a new GAN variant called Mixture Density GAN that, while being capable of generating high-quality images, overcomes this problem by encouraging the Discriminator to form clusters in its embedding space,…

    Submitted 29 November, 2018; v1 submitted 31 October, 2018; originally announced November 2018.

    Comments: Accepted at the third workshop on Bayesian Deep Learning (NeurIPS 2018), Montréal, Canada

  19. arXiv:1807.10501

    cs.SD eess.AS

    Large-Scale Weakly Labeled Semi-Supervised Sound Event Detection in Domestic Environments

    Authors: Romain Serizel, Nicolas Turpault, Hamid Eghbal-Zadeh, Ankit Parag Shah

    Abstract: This paper presents DCASE 2018 task 4. The task evaluates systems for the large-scale detection of sound events using weakly labeled data (without time boundaries). The target of the systems is to provide not only the event class but also the event time boundaries given that multiple events can be present in an audio recording. Another challenge of the task is to explore the possibility to exploit…

    Submitted 27 July, 2018; originally announced July 2018.

  20. arXiv:1806.08840

    q-bio.GN cs.LG

    Deep SNP: An End-to-end Deep Neural Network with Attention-based Localization for Break-point Detection in SNP Array Genomic data

    Authors: Hamid Eghbal-zadeh, Lukas Fischer, Niko Popitsch, Florian Kromp, Sabine Taschner-Mandl, Khaled Koutini, Teresa Gerber, Eva Bozsaky, Peter F. Ambros, Inge M. Ambros, Gerhard Widmer, Bernhard A. Moser

    Abstract: Diagnosis and risk stratification of cancer and many other diseases require the detection of genomic breakpoints as a prerequisite of calling copy number alterations (CNA). This, however, is still challenging and requires time-consuming manual curation. As deep-learning methods outperformed classical state-of-the-art algorithms in various domains and have also been successfully applied to life sci…

    Submitted 22 June, 2018; originally announced June 2018.

    Comments: Accepted at the Joint ICML and IJCAI 2018 Workshop on Computational Biology

  21. arXiv:1711.04022

    cs.LG cs.AI cs.SD eess.AS

    Deep Within-Class Covariance Analysis for Robust Audio Representation Learning

    Authors: Hamid Eghbal-zadeh, Matthias Dorfer, Gerhard Widmer

    Abstract: Convolutional Neural Networks (CNNs) can learn effective features, though they have been shown to suffer from a performance drop when the distribution of the data changes from training to test data. In this paper we analyze the internal representations of CNNs and observe that the representations of unseen data in each class spread more (with higher variance) in the embedding space of the CNN compared…

    Submitted 30 November, 2018; v1 submitted 10 November, 2017; originally announced November 2017.

    Comments: 11 pages, 3 tables, 4 figures

  22. arXiv:1708.01886

    cs.LG stat.ML

    Probabilistic Generative Adversarial Networks

    Authors: Hamid Eghbal-zadeh, Gerhard Widmer

    Abstract: We introduce the Probabilistic Generative Adversarial Network (PGAN), a new GAN variant based on a new kind of objective function. The central idea is to integrate a probabilistic model (a Gaussian Mixture Model, in our case) into the GAN framework which supports a new kind of loss function (based on likelihood rather than classification loss), and at the same time gives a meaningful measure of th…

    Submitted 6 August, 2017; originally announced August 2017.

    Comments: Submitted to NIPS 2017

  23. arXiv:1707.07530

    cs.LG cs.AI

    Likelihood Estimation for Generative Adversarial Networks

    Authors: Hamid Eghbal-zadeh, Gerhard Widmer

    Abstract: We present a simple method for assessing the quality of generated images in Generative Adversarial Networks (GANs). The method can be applied in any kind of GAN without interfering with the learning procedure or affecting the learning objective. The central idea is to define a likelihood function that correlates with the quality of the generated images. In particular, we derive a Gaussian likeliho…

    Submitted 24 July, 2017; originally announced July 2017.

    Comments: ICML 2017 Workshop on Implicit Models

    Report number: 1707.07530

  24. arXiv:1706.06525

    cs.SD cs.AI cs.LG

    A Hybrid Approach with Multi-channel I-Vectors and Convolutional Neural Networks for Acoustic Scene Classification

    Authors: Hamid Eghbal-zadeh, Bernhard Lehner, Matthias Dorfer, Gerhard Widmer

    Abstract: In Acoustic Scene Classification (ASC), two major approaches have been followed. While one utilizes engineered features such as mel-frequency-cepstral-coefficients (MFCCs), the other uses learned features that are the outcome of an optimization algorithm. I-vectors are the result of a modeling technique that usually takes engineered features as input. It has been shown that standard MFCCs extracte…

    Submitted 20 June, 2017; originally announced June 2017.

  25. Music Playlist Continuation by Learning from Hand-Curated Examples and Song Features: Alleviating the Cold-Start Problem for Rare and Out-of-Set Songs

    Authors: Andreu Vall, Hamid Eghbal-zadeh, Matthias Dorfer, Markus Schedl, Gerhard Widmer

    Abstract: Automated music playlist generation is a specific form of music recommendation. Generally stated, the user receives a set of song suggestions defining a coherent listening session. We hypothesize that the best way to convey such playlist coherence to new recommendations is by learning it from actual curated examples, in contrast to imposing ad hoc constraints. Collaborative filtering methods can b…

    Submitted 7 September, 2017; v1 submitted 23 May, 2017; originally announced May 2017.