Showing 1–17 of 17 results for author: Stadelmann, T

  1. MathNet: A Data-Centric Approach for Printed Mathematical Expression Recognition

    Authors: Felix M. Schmitt-Koopmann, Elaine M. Huang, Hans-Peter Hutter, Thilo Stadelmann, Alireza Darvishy

    Abstract: Printed mathematical expression recognition (MER) models are usually trained and tested using LaTeX-generated mathematical expressions (MEs) as input and the LaTeX source code as ground truth. As the same ME can be generated by various different LaTeX source codes, this leads to unwanted variations in the ground truth data that bias test performance results and hinder efficient learning. In additi…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 12 pages, 6 figures

    Journal ref: IEEE Access 12 (2024) 76963-76974

  2. arXiv:2311.08525  [pdf, other]

    cs.CV cs.AI

    Efficient Rotation Invariance in Deep Neural Networks through Artificial Mental Rotation

    Authors: Lukas Tuggener, Thilo Stadelmann, Jürgen Schmidhuber

    Abstract: Humans and animals recognize objects irrespective of the beholder's point of view, which may drastically change their appearances. Artificial pattern recognizers also strive to achieve this, e.g., through translational invariance in convolutional neural networks (CNNs). However, both CNNs and vision transformers (ViTs) perform very poorly on rotated inputs. Here we present artificial mental rotati…

    Submitted 14 November, 2023; originally announced November 2023.

  3. arXiv:2311.00489  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS

    Deep Neural Networks for Automatic Speaker Recognition Do Not Learn Supra-Segmental Temporal Features

    Authors: Daniel Neururer, Volker Dellwo, Thilo Stadelmann

    Abstract: While deep neural networks have shown impressive results in automatic speaker recognition and related tasks, it is dissatisfactory how little is understood about what exactly is responsible for these results. Part of the success has been attributed in prior work to their capability to model supra-segmental temporal information (SST), i.e., learn rhythmic-prosodic characteristics of speech in addit…

    Submitted 2 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 7 pages, 2 figures, 3 tables

  4. A Comprehensive Survey of Deep Transfer Learning for Anomaly Detection in Industrial Time Series: Methods, Applications, and Directions

    Authors: Peng Yan, Ahmed Abdulkadir, Paul-Philipp Luley, Matthias Rosenthal, Gerrit A. Schatte, Benjamin F. Grewe, Thilo Stadelmann

    Abstract: Automating the monitoring of industrial processes has the potential to enhance efficiency and optimize quality by promptly detecting abnormal events and thus facilitating timely interventions. Deep learning, with its capacity to discern non-trivial patterns within large datasets, plays a pivotal role in this process. Standard deep learning methods are suitable to solve a specific task given a spec…

    Submitted 10 January, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

    Comments: 27 pages, 8 figures, 2 tables, published in IEEE Access

    ACM Class: I.2.0; I.2.4

    Journal ref: IEEE Access 12 (2024) 3768-3789

  5. Video object detection for privacy-preserving patient monitoring in intensive care

    Authors: Raphael Emberger, Jens Michael Boss, Daniel Baumann, Marko Seric, Shufan Huo, Lukas Tuggener, Emanuela Keller, Thilo Stadelmann

    Abstract: Patient monitoring in intensive care units, although assisted by biosensors, needs continuous supervision of staff. To reduce the burden on staff members, IT infrastructures are built to record monitoring data and develop clinical decision support systems. These systems, however, are vulnerable to artifacts (e.g. muscle movement due to ongoing treatment), which are often indistinguishable from rea…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: 4 pages, 3 figures, 2023 10th Swiss Conference on Data Science (SDS), code available at https://github.com/raember/yolov5r_autodidact and https://github.com/raember/VideoProc

    ACM Class: I.2.10

  6. Trace and Detect Adversarial Attacks on CNNs using Feature Response Maps

    Authors: Mohammadreza Amirian, Friedhelm Schwenker, Thilo Stadelmann

    Abstract: The existence of adversarial attacks on convolutional neural networks (CNN) questions the fitness of such models for serious applications. The attacks manipulate an input image such that misclassification is evoked while still looking normal to a human observer -- they are thus not easily detectable. In a different context, backpropagated activations of CNN hidden layers -- "feature responses" to…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: 13 pages, 6 figures

    Report number: zhaw-3863

    Journal ref: 8th IAPR TC3 Workshop on Artificial Neural Networks in Pattern Recognition (ANNPR 2018)

  7. PrepNet: A Convolutional Auto-Encoder to Homogenize CT Scans for Cross-Dataset Medical Image Analysis

    Authors: Mohammadreza Amirian, Javier A. Montoya-Zegarra, Jonathan Gruss, Yves D. Stebler, Ahmet Selman Bozkir, Marco Calandri, Friedhelm Schwenker, Thilo Stadelmann

    Abstract: With the spread of COVID-19 over the world, the need arose for fast and precise automatic triage mechanisms to decelerate the spread of the disease by reducing human efforts e.g. for image-based diagnosis. Although the literature has shown promising efforts in this direction, reported results do not consider the variability of CT scans acquired under varying circumstances, thus rendering resulting…

    Submitted 19 August, 2022; originally announced August 2022.

    Comments: 7 pages, 4 figures; peer reviewed and published in IEEE EMBS Regional Conference on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI 2021)

    Report number: zhaw-23318

    Journal ref: IEEE EMBS Regional Conference on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI 2021)

  8. arXiv:2205.00002  [pdf, other]

    cs.AI q-bio.NC

    A Theory of Natural Intelligence

    Authors: Christoph von der Malsburg, Thilo Stadelmann, Benjamin F. Grewe

    Abstract: Introduction: In contrast to current AI technology, natural intelligence -- the kind of autonomous intelligence that is realized in the brains of animals and humans to attain in their natural environment goals defined by a repertoire of innate behavioral schemata -- is far superior in terms of learning speed, generalization capabilities, autonomy and creativity. How are these strengths, by what me…

    Submitted 22 April, 2022; originally announced May 2022.

    ACM Class: I.2

  9. Is it enough to optimize CNN architectures on ImageNet?

    Authors: Lukas Tuggener, Jürgen Schmidhuber, Thilo Stadelmann

    Abstract: Classification performance based on ImageNet is the de-facto standard metric for CNN development. In this work we challenge the notion that CNN architecture design solely based on ImageNet leads to generally effective convolutional neural network (CNN) architectures that perform well on a diverse set of datasets and application domains. To this end, we investigate and ultimately improve ImageNet a…

    Submitted 6 March, 2023; v1 submitted 16 March, 2021; originally announced March 2021.

    Journal ref: Frontiers in Computer Science, Volume 4, 2022

  10. arXiv:2004.13439  [pdf, other]

    cs.AI cs.LG cs.MA

    Improving Sample Efficiency and Multi-Agent Communication in RL-based Train Rescheduling

    Authors: Dano Roost, Ralph Meier, Stephan Huschauer, Erik Nygren, Adrian Egli, Andreas Weiler, Thilo Stadelmann

    Abstract: We present preliminary results from our sixth placed entry to the Flatland international competition for train rescheduling, including two improvements for optimized reinforcement learning (RL) training efficiency, and two hypotheses with respect to the prospect of deep RL for complex real-world control tasks: first, that current state of the art policy gradient methods seem inappropriate in the d…

    Submitted 28 April, 2020; originally announced April 2020.

    Comments: Accepted for publication at the 7th Swiss Conference on Data Science (SDS 2020)

  11. arXiv:1907.08392  [pdf, other]

    cs.LG cs.AI stat.ML

    Automated Machine Learning in Practice: State of the Art and Recent Results

    Authors: Lukas Tuggener, Mohammadreza Amirian, Katharina Rombach, Stefan Lörwald, Anastasia Varlet, Christian Westermann, Thilo Stadelmann

    Abstract: A main driver behind the digitization of industry and society is the belief that data-driven model building and decision making can contribute to higher degrees of automation and more informed decisions. Building such models from data often involves the application of some form of machine learning. Thus, there is an ever growing demand in work force with the necessary skill set to do so. This dema…

    Submitted 19 July, 2019; originally announced July 2019.

    Comments: Accepted full paper at SDS2019, the 6th Swiss Conference on Data Science

  12. arXiv:1810.05423  [pdf, other]

    cs.CV

    DeepScores and Deep Watershed Detection: current state and open issues

    Authors: Ismail Elezi, Lukas Tuggener, Marcello Pelillo, Thilo Stadelmann

    Abstract: This paper gives an overview of our current Optical Music Recognition (OMR) research. We recently released the OMR dataset DeepScores as well as the object detection method Deep Watershed Detector. We are currently taking some additional steps to improve both of them. Here we summarize current and future efforts, aimed at improving usefulness on real-world task and tackling extreme c…

    Submitted 12 October, 2018; originally announced October 2018.

    Comments: Published at the WORMS workshop (ISMIR-affiliated workshop)

  13. arXiv:1807.04950  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Deep Learning in the Wild

    Authors: Thilo Stadelmann, Mohammadreza Amirian, Ismail Arabaci, Marek Arnold, Gilbert François Duivesteijn, Ismail Elezi, Melanie Geiger, Stefan Lörwald, Benjamin Bruno Meier, Katharina Rombach, Lukas Tuggener

    Abstract: Deep learning with neural networks is applied by an increasing number of people outside of classic research environments, due to the vast success of the methodology on a wide range of machine perception tasks. While this interest is fueled by beautiful success stories, practical work in deep learning on novel tasks without existing baselines remains challenging. This paper explores the specific ch…

    Submitted 13 July, 2018; originally announced July 2018.

    Comments: Invited paper at ANNPR 2018

  14. arXiv:1807.04001  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Learning Neural Models for End-to-End Clustering

    Authors: Benjamin Bruno Meier, Ismail Elezi, Mohammadreza Amirian, Oliver Durr, Thilo Stadelmann

    Abstract: We propose a novel end-to-end neural network architecture that, once trained, directly outputs a probabilistic clustering of a batch of input examples in one pass. It estimates a distribution over the number of clusters $k$, and for each $1 \leq k \leq k_\mathrm{max}$, a distribution over the individual cluster assignment for each data point. The network is trained in advance in a supervised fashi…

    Submitted 11 July, 2018; originally announced July 2018.

    Comments: Accepted for publication at ANNPR 2018

  15. arXiv:1805.10548  [pdf, other]

    cs.CV cs.AI

    Deep Watershed Detector for Music Object Recognition

    Authors: Lukas Tuggener, Ismail Elezi, Jürgen Schmidhuber, Thilo Stadelmann

    Abstract: Optical Music Recognition (OMR) is an important and challenging area within music information retrieval, the accurate detection of music symbols in digital images is a core functionality of any OMR pipeline. In this paper, we introduce a novel object detection method, based on synthetic energy maps and the watershed transform, called Deep Watershed Detector (DWD). Our method is specifically tailor…

    Submitted 26 May, 2018; originally announced May 2018.

    Comments: Accepted at the 19th International Society for Music Information Retrieval Conference 2018

  16. arXiv:1805.08641  [pdf, other]

    cs.SD eess.AS

    Speaker Clustering Using Dominant Sets

    Authors: Feliks Hibraj, Sebastiano Vascon, Thilo Stadelmann, Marcello Pelillo

    Abstract: Speaker clustering is the task of forming speaker-specific groups based on a set of utterances. In this paper, we address this task by using Dominant Sets (DS). DS is a graph-based clustering algorithm with interesting properties that fits well to our problem and has never been applied before to speaker clustering. We report on a comprehensive set of experiments on the TIMIT dataset against standa…

    Submitted 21 May, 2018; originally announced May 2018.

    Comments: ICPR 2018

  17. arXiv:1804.00525  [pdf, other]

    cs.CV cs.LG

    DeepScores -- A Dataset for Segmentation, Detection and Classification of Tiny Objects

    Authors: Lukas Tuggener, Ismail Elezi, Jürgen Schmidhuber, Marcello Pelillo, Thilo Stadelmann

    Abstract: We present the DeepScores dataset with the goal of advancing the state-of-the-art in small objects recognition, and by placing the question of object recognition in the context of scene understanding. DeepScores contains high quality images of musical scores, partitioned into 300,000 sheets of written music that contain symbols of different shapes and sizes. With close to a hundred millions of sma…

    Submitted 26 May, 2018; v1 submitted 27 March, 2018; originally announced April 2018.

    Comments: 6 pages, accepted at the IEEE International Conference on Pattern Recognition 2018