Showing 1–50 of 66 results for author: Schmidt-Thieme, L

Searching in archive cs.
  1. arXiv:2406.07246

    cs.LG

    Marginalization Consistent Mixture of Separable Flows for Probabilistic Irregular Time Series Forecasting

    Authors: Vijaya Krishna Yalavarthi, Randolf Scholz, Kiran Madhusudhanan, Stefan Born, Lars Schmidt-Thieme

    Abstract: Probabilistic forecasting models for joint distributions of targets in irregular time series are a heavily under-researched area in machine learning with, to the best of our knowledge, only three models researched so far: GPR, the Gaussian Process Regression model~\citep{Durichen2015.Multitask}, TACTiS, the Transformer-Attentional Copulas for Time Series~\cite{Drouin2022.Tactis, ashok2024tactis} a…

    Submitted 11 June, 2024; originally announced June 2024.

  2. HMAR: Hierarchical Masked Attention for Multi-Behaviour Recommendation

    Authors: Shereen Elsayed, Ahmed Rashed, Lars Schmidt-Thieme

    Abstract: In the context of recommendation systems, addressing multi-behavioral user interactions has become vital for understanding the evolving user behavior. Recent models utilize techniques like graph neural networks and attention mechanisms for modeling diverse behaviors, but capturing sequential patterns in historical interactions remains challenging. To tackle this, we introduce Hierarchical Masked A…

    Submitted 29 April, 2024; originally announced May 2024.

  3. arXiv:2405.03582

    cs.LG

    Functional Latent Dynamics for Irregularly Sampled Time Series Forecasting

    Authors: Christian Klötergens, Vijaya Krishna Yalavarthi, Maximilian Stubbemann, Lars Schmidt-Thieme

    Abstract: Irregularly sampled time series with missing values are often observed in multiple real-world applications such as healthcare, climate and astronomy. They pose a significant challenge to standard deep learning models that operate only on fully observed and regularly sampled time series. In order to capture the continuous dynamics of the irregular time series, many models rely on solving an Ord…

    Submitted 6 May, 2024; originally announced May 2024.

  4. arXiv:2404.06966

    cs.LG eess.SP

    Are EEG Sequences Time Series? EEG Classification with Time Series Models and Joint Subject Training

    Authors: Johannes Burchert, Thorben Werner, Vijaya Krishna Yalavarthi, Diego Coello de Portugal, Maximilian Stubbemann, Lars Schmidt-Thieme

    Abstract: As with most other data domains, EEG data analysis relies on rich domain-specific preprocessing. Beyond such preprocessing, machine learners would hope to deal with such data as with any other time series data. For EEG classification many models have been developed with layer types and architectures we typically do not see in time series classification. Furthermore, typically separate models for e…

    Submitted 10 April, 2024; originally announced April 2024.

  5. arXiv:2403.04477

    cs.LG

    Hyperparameter Tuning MLPs for Probabilistic Time Series Forecasting

    Authors: Kiran Madhusudhanan, Shayan Jawed, Lars Schmidt-Thieme

    Abstract: Time series forecasting attempts to predict future events by analyzing past trends and patterns. Although well researched, certain critical aspects pertaining to the use of deep learning in time series forecasting remain ambiguous. Our research primarily focuses on examining the impact of specific hyperparameters related to time series, such as context length and validation strategy, on the perfor…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 14 pages, 5 figures, Accepted at PAKDD24

  6. arXiv:2403.03812

    cs.LG cs.AI

    ProbSAINT: Probabilistic Tabular Regression for Used Car Pricing

    Authors: Kiran Madhusudhanan, Gunnar Behrens, Maximilian Stubbemann, Lars Schmidt-Thieme

    Abstract: Used car pricing is a critical aspect of the automotive industry, influenced by many economic factors and market dynamics. With the recent surge in online marketplaces and increased demand for used cars, accurate pricing would benefit both buyers and sellers by ensuring fair transactions. However, the transition towards automated pricing algorithms using machine learning necessitates the comprehen…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 9 pages, 4 figures

  7. arXiv:2402.06293

    cs.LG stat.ML

    Probabilistic Forecasting of Irregular Time Series via Conditional Flows

    Authors: Vijaya Krishna Yalavarthi, Randolf Scholz, Stefan Born, Lars Schmidt-Thieme

    Abstract: Probabilistic forecasting of irregularly sampled multivariate time series with missing values is an important problem in many fields, including health care, astronomy, and climate. State-of-the-art methods for the task estimate only marginal distributions of observations in single channels and at single timepoints, assuming a fixed-shape parametric distribution. In this work, we propose a novel mo…

    Submitted 21 May, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

  8. arXiv:2402.04915

    cs.LG

    Moco: A Learnable Meta Optimizer for Combinatorial Optimization

    Authors: Tim Dernedde, Daniela Thyssens, Sören Dittrich, Maximilian Stubbemann, Lars Schmidt-Thieme

    Abstract: Relevant combinatorial optimization problems (COPs) are often NP-hard. While they have been tackled mainly via handcrafted heuristics in the past, advances in neural networks have motivated the development of general methods to learn heuristics from data. Many approaches utilize a neural network to directly construct a solution, but are limited in further improving based on already constructed sol…

    Submitted 9 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: 13 pages, 3 figures

  9. arXiv:2312.09684

    cs.IR

    Context-Aware Sequential Model for Multi-Behaviour Recommendation

    Authors: Shereen Elsayed, Ahmed Rashed, Lars Schmidt-Thieme

    Abstract: Sequential recommendation models are crucial for next-item recommendations in online platforms, capturing complex patterns in user interactions. However, many focus on a single behavior, overlooking valuable implicit interactions like clicks and favorites. Existing multi-behavioral models often fail to simultaneously capture sequential patterns. We propose CASM, a Context-Aware Sequential Model, l…

    Submitted 15 December, 2023; originally announced December 2023.

  10. arXiv:2311.18356

    cs.LG stat.ML

    Towards Comparable Active Learning

    Authors: Thorben Werner, Johannes Burchert, Lars Schmidt-Thieme

    Abstract: Active Learning has received significant attention in the field of machine learning for its potential in selecting the most informative samples for labeling, thereby reducing data annotation costs. However, we show that the reported lifts in recent literature generalize poorly to other domains leading to an inconclusive landscape in Active Learning research. Furthermore, we highlight overlooked pr…

    Submitted 30 November, 2023; originally announced November 2023.

  11. arXiv:2310.04140

    cs.LG

    Routing Arena: A Benchmark Suite for Neural Routing Solvers

    Authors: Daniela Thyssens, Tim Dernedde, Jonas K. Falkner, Lars Schmidt-Thieme

    Abstract: Neural Combinatorial Optimization has been researched actively in the last eight years. Even though many of the proposed Machine Learning based approaches are compared on the same datasets, the evaluation protocol exhibits essential flaws and the selection of baselines often neglects State-of-the-Art Operations Research approaches. To improve on both of these shortcomings, we propose the Routing A…

    Submitted 6 October, 2023; originally announced October 2023.

  12. arXiv:2309.17089

    cs.LG

    Too Big, so Fail? -- Enabling Neural Construction Methods to Solve Large-Scale Routing Problems

    Authors: Jonas K. Falkner, Lars Schmidt-Thieme

    Abstract: In recent years new deep learning approaches to solve combinatorial optimization problems, in particular NP-hard Vehicle Routing Problems (VRP), have been proposed. The most impactful of these methods are sequential neural construction approaches which are usually trained via reinforcement learning. Due to the high training costs of these models, they usually are trained on limited instance sizes…

    Submitted 29 September, 2023; originally announced September 2023.

  13. arXiv:2307.09796

    cs.LG

    Forecasting Early with Meta Learning

    Authors: Shayan Jawed, Kiran Madhusudhanan, Vijaya Krishna Yalavarthi, Lars Schmidt-Thieme

    Abstract: In the early observation period of a time series, there might be only a few historic observations available to learn a model. However, in cases where an existing prior set of datasets is available, Meta learning methods can be applicable. In this paper, we devise a Meta learning method that exploits samples from additional datasets and learns to augment time series through adversarial learning as…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: IJCNN 2023

  14. arXiv:2306.06068

    cs.CV cs.LG

    DeepStay: Stay Region Extraction from Location Trajectories using Weak Supervision

    Authors: Christian Löwens, Daniela Thyssens, Emma Andersson, Christina Jenkins, Lars Schmidt-Thieme

    Abstract: Nowadays, mobile devices enable constant tracking of the user's position, and location trajectories can be used to infer personal points of interest (POIs) like homes, workplaces, or stores. A common way to extract POIs is to first identify spatio-temporal regions where a user spends a significant amount of time, known as stay regions (SRs). Common approaches to SR extraction are evaluated either…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Paper under peer review

  15. arXiv:2305.12932

    cs.LG

    Forecasting Irregularly Sampled Time Series using Graphs

    Authors: Vijaya Krishna Yalavarthi, Kiran Madhusudhanan, Randolf Scholz, Nourhan Ahmed, Johannes Burchert, Shayan Jawed, Stefan Born, Lars Schmidt-Thieme

    Abstract: Forecasting irregularly sampled time series with missing values is a crucial task for numerous real-world applications such as healthcare, astronomy, and climate sciences. State-of-the-art approaches to this problem rely on Ordinary Differential Equations (ODEs) which are known to be slow and often require additional features to handle missing values. To address this issue, we propose a novel mode…

    Submitted 10 August, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

  16. arXiv:2304.07262

    cs.CV

    Phantom Embeddings: Using Embedding Space for Model Regularization in Deep Neural Networks

    Authors: Mofassir ul Islam Arif, Mohsan Jameel, Josif Grabocka, Lars Schmidt-Thieme

    Abstract: The strength of machine learning models stems from their ability to learn complex function approximations from data; however, this strength also makes training deep neural networks challenging. Notably, the complex models tend to memorize the training data, which results in poor regularization performance on test data. The regularization techniques such as L1, L2, dropout, etc. are proposed to red…

    Submitted 14 April, 2023; originally announced April 2023.

  17. arXiv:2304.07256

    cs.CV cs.LG

    Directly Optimizing IoU for Bounding Box Localization

    Authors: Mofassir ul Islam Arif, Mohsan Jameel, Lars Schmidt-Thieme

    Abstract: Object detection has seen remarkable progress in recent years with the introduction of Convolutional Neural Networks (CNN). Object detection is a multi-task learning problem where both the position of the objects in the images as well as their classes need to be correctly identified. The idea here is to maximize the overlap between the ground-truth bounding boxes and the predictions, i.e. the Inte…

    Submitted 14 April, 2023; originally announced April 2023.

  18. arXiv:2302.05134

    cs.LG

    Neural Capacitated Clustering

    Authors: Jonas K. Falkner, Lars Schmidt-Thieme

    Abstract: Recent work on deep clustering has found new promising methods also for constrained clustering problems. Their typically pairwise constraints often can be used to guide the partitioning of the data. Many problems, however, feature cluster-level constraints, e.g. the Capacitated Clustering Problem (CCP), where each point has a weight and the total weight sum of all points in each cluster is bounded…

    Submitted 19 May, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

    Comments: Accepted at the 32nd International Joint Conference on Artificial Intelligence (IJCAI) 2023

  19. arXiv:2212.11771

    cs.LG

    Few-shot human motion prediction for heterogeneous sensors

    Authors: Rafael Rego Drumond, Lukas Brinkmeyer, Lars Schmidt-Thieme

    Abstract: Human motion prediction is a complex task as it involves forecasting variables over time on a graph of connected sensors. This is especially true in the case of few-shot learning, where we strive to forecast motion sequences for previously unseen actions based on only a few examples. Despite this, almost all related approaches for few-shot motion prediction do not incorporate the underlying graph,…

    Submitted 20 March, 2023; v1 submitted 22 December, 2022; originally announced December 2022.

    MSC Class: 68; ACM Class: I.2.6

  20. arXiv:2212.02578

    cs.LG stat.ME

    Auxiliary Quantile Forecasting with Linear Networks

    Authors: Shayan Jawed, Lars Schmidt-Thieme

    Abstract: We propose a novel multi-task method for quantile forecasting with shared Linear layers. Our method is based on the Implicit quantile learning approach, where samples from the Uniform distribution $\mathcal{U}(0, 1)$ are reparameterized to quantile values of the target distribution. We combine the implicit quantile and input time series representations to directly forecast multiple quantile estima…

    Submitted 5 December, 2022; originally announced December 2022.

    Comments: Under submission

  21. arXiv:2210.10664

    cs.IR cs.AI cs.LG

    Deep Multi-Representation Model for Click-Through Rate Prediction

    Authors: Shereen Elsayed, Lars Schmidt-Thieme

    Abstract: Click-Through Rate prediction (CTR) is a crucial task in recommender systems, and it has gained considerable attention in the past few years. Recent research primarily emphasizes obtaining meaningful and powerful representations through mining low and high feature interactions using various components such as Deep Neural Networks (DNN), CrossNets, or transformer blocks. In this work, we p…

    Submitted 25 October, 2022; v1 submitted 18 October, 2022; originally announced October 2022.

  22. arXiv:2210.02091

    cs.LG

    Tripletformer for Probabilistic Interpolation of Irregularly sampled Time Series

    Authors: Vijaya Krishna Yalavarthi, Johannes Burchert, Lars Schmidt-Thieme

    Abstract: Irregularly sampled time series data with missing values is observed in many fields like healthcare, astronomy, and climate science. Interpolation of these types of time series is crucial for tasks such as root cause analysis and medical diagnosis, as well as for smoothing out irregular or noisy data. To address this challenge, we present a novel encoder-decoder architecture called "Tripletformer"…

    Submitted 12 January, 2024; v1 submitted 5 October, 2022; originally announced October 2022.

    Journal ref: IEEE International Conference on BigData, 2023

  23. arXiv:2210.00275

    cs.CV

    Offline Handwritten Amharic Character Recognition Using Few-shot Learning

    Authors: Mesay Samuel, Lars Schmidt-Thieme, DP Sharma, Abiot Sinamo, Abey Bruck

    Abstract: Few-shot learning is an important but challenging problem of machine learning aimed at learning from only a few labeled training examples. It has become an active area of research because deep learning requires huge amounts of labeled data, which is not feasible in the real world. Learning from a few examples is also an important attempt towards learning like humans. Few-shot learning has prov…

    Submitted 1 October, 2022; originally announced October 2022.

    Comments: PanAfriCon AI 2022 virtual conference paper

  24. arXiv:2209.01083

    cs.LG

    When Bioprocess Engineering Meets Machine Learning: A Survey from the Perspective of Automated Bioprocess Development

    Authors: Nghia Duong-Trung, Stefan Born, Jong Woo Kim, Marie-Therese Schermeyer, Katharina Paulick, Maxim Borisyak, Mariano Nicolas Cruz-Bournazou, Thorben Werner, Randolf Scholz, Lars Schmidt-Thieme, Peter Neubauer, Ernesto Martinez

    Abstract: Machine learning (ML) is becoming increasingly crucial in many fields of engineering but has not yet played out its full potential in bioprocess engineering. While experimentation has been accelerated by increasing levels of lab automation, experimental planning and data modeling still largely depend on human intervention. ML can be seen as a set of tools that contribute to the automation of…

    Submitted 1 November, 2022; v1 submitted 2 September, 2022; originally announced September 2022.

  25. arXiv:2208.11374

    cs.LG

    DCSF: Deep Convolutional Set Functions for Classification of Asynchronous Time Series

    Authors: Vijaya Krishna Yalavarthi, Johannes Burchert, Lars Schmidt-Thieme

    Abstract: Asynchronous Time Series is a multivariate time series where all the channels are observed asynchronously-independently, making the time series extremely sparse when aligning them. We often observe this effect in applications with complex observation processes, such as health care, climate science, and astronomy, to name a few. Because of the asynchronous nature, they pose a significant challenge…

    Submitted 24 August, 2022; originally announced August 2022.

  26. arXiv:2207.07212

    cs.LG

    Attention, Filling in The Gaps for Generalization in Routing Problems

    Authors: Ahmad Bdeir, Jonas K. Falkner, Lars Schmidt-Thieme

    Abstract: Machine Learning (ML) methods have become a useful tool for tackling vehicle routing problems, either in combination with popular heuristics or as standalone models. However, current methods suffer from poor generalization when tackling problems of different sizes or different distributions. As a result, ML in vehicle routing has witnessed an expansion phase with new methodologies being created fo…

    Submitted 14 July, 2022; originally announced July 2022.

    Comments: Accepted at ECML-PKDD 2022

  27. Solving the Traveling Salesperson Problem with Precedence Constraints by Deep Reinforcement Learning

    Authors: Christian Löwens, Inaam Ashraf, Alexander Gembus, Genesis Cuizon, Jonas K. Falkner, Lars Schmidt-Thieme

    Abstract: This work presents solutions to the Traveling Salesperson Problem with precedence constraints (TSPPC) using Deep Reinforcement Learning (DRL) by adapting recent approaches that work well for regular TSPs. Common to these approaches is the use of graph models based on multi-head attention (MHA) layers. One idea for solving the pickup and delivery problem (PDP) is using heterogeneous attentions to e…

    Submitted 19 September, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this contribution is published in KI 2022: Advances in Artificial Intelligence, and is available online at https://doi.org/10.1007/978-3-031-15791-2_14

    Journal ref: KI 2022: Advances in Artificial Intelligence pp 160-172

  28. Learning to Control Local Search for Combinatorial Optimization

    Authors: Jonas K. Falkner, Daniela Thyssens, Ahmad Bdeir, Lars Schmidt-Thieme

    Abstract: Combinatorial optimization problems are encountered in many practical contexts such as logistics and production, but exact solutions are particularly difficult to find and usually NP-hard for considerable problem sizes. To compute approximate solutions, a zoo of generic as well as problem-specific variants of local search is commonly used. However, which variant to apply to which particular proble…

    Submitted 13 July, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

    Comments: Accepted at ECML-PKDD 2022

    Journal ref: In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science, vol 13717. Springer, Cham

  29. arXiv:2206.08476

    cs.LG cs.AI cs.CV

    Zero-Shot AutoML with Pretrained Models

    Authors: Ekrem Öztürk, Fabio Ferreira, Hadi S. Jomaa, Lars Schmidt-Thieme, Josif Grabocka, Frank Hutter

    Abstract: Given a new dataset D and a low compute budget, how should we choose a pre-trained model to fine-tune to D, and set the fine-tuning hyperparameters without risking overfitting, particularly if D is small? Here, we extend automated machine learning (AutoML) to best make these choices. Our domain-independent meta-learning approach learns a zero-shot surrogate model which, at test time, allows to sel…

    Submitted 25 June, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

    Journal ref: International Conference on Machine Learning 2022

  30. arXiv:2205.02923

    cs.AI

    End-to-End Image-Based Fashion Recommendation

    Authors: Shereen Elsayed, Lukas Brinkmeyer, Lars Schmidt-Thieme

    Abstract: In fashion-based recommendation settings, incorporating the item image features is considered a crucial factor, and it has shown significant improvements to many traditional models, including but not limited to matrix factorization, auto-encoders, and nearest neighbor models. While there are numerous image-based recommender approaches that utilize dedicated deep neural networks, comparisons to att…

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: Accepted in FashionXRecsys 2021 workshop

  31. arXiv:2205.00772

    cs.LG cs.AI

    Large Neighborhood Search based on Neural Construction Heuristics

    Authors: Jonas K. Falkner, Daniela Thyssens, Lars Schmidt-Thieme

    Abstract: We propose a Large Neighborhood Search (LNS) approach utilizing a learned construction heuristic based on neural networks as repair operator to solve the vehicle routing problem with time windows (VRPTW). Our method uses graph neural networks to encode the problem and auto-regressively decodes a solution and is trained with reinforcement learning on the construction task without requiring any labe…

    Submitted 10 May, 2022; v1 submitted 2 May, 2022; originally announced May 2022.

  32. CARCA: Context and Attribute-Aware Next-Item Recommendation via Cross-Attention

    Authors: Ahmed Rashed, Shereen Elsayed, Lars Schmidt-Thieme

    Abstract: In sparse recommender settings, users' context and item attributes play a crucial role in deciding which items to recommend next. Despite that, recent works in sequential and time-aware recommendations usually either ignore both aspects or only consider one of them, limiting their predictive performance. In this paper, we address these limitations by proposing a context and attribute-aware recomme…

    Submitted 4 April, 2022; originally announced April 2022.

    Journal ref: RecSys (2022) 71-80

  33. arXiv:2204.03456

    cs.LG

    Few-Shot Forecasting of Time-Series with Heterogeneous Channels

    Authors: Lukas Brinkmeyer, Rafael Rego Drumond, Johannes Burchert, Lars Schmidt-Thieme

    Abstract: Learning complex time series forecasting models usually requires a large amount of data, as each model is trained from scratch for each task/data set. Leveraging learning experience with similar datasets is a well-established technique for classification problems called few-shot classification. However, existing approaches cannot be applied to time-series forecasting because i) multivariate time-s…

    Submitted 18 August, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: Under review. Equal contribution (Brinkmeyer and Rego Drumond)

    MSC Class: 68

  34. arXiv:2202.12687

    cs.CV

    Improving Amharic Handwritten Word Recognition Using Auxiliary Task

    Authors: Mesay Samuel Gondere, Lars Schmidt-Thieme, Durga Prasad Sharma, Abiot Sinamo Boltena

    Abstract: Amharic is one of the official languages of the Federal Democratic Republic of Ethiopia. It is one of the languages that use an Ethiopic script which is derived from Ge'ez, an ancient and currently liturgical language. Amharic is also one of the most widely used literature-rich languages of Ethiopia. There are very limited innovative and customized research works in Amharic optical character recogn…

    Submitted 25 February, 2022; originally announced February 2022.

  35. arXiv:2202.05695

    cs.LG

    Positive-Unlabeled Domain Adaptation

    Authors: Jonas Sonntag, Gunnar Behrens, Lars Schmidt-Thieme

    Abstract: Domain Adaptation methodologies have been shown to effectively generalize from a labeled source domain to a label-scarce target domain. Previous research has either focused on unlabeled domain adaptation without any target supervision or semi-supervised domain adaptation with few labeled target examples per class. On the other hand, Positive-Unlabeled (PU-) Learning has attracted increasing interest in…

    Submitted 11 February, 2022; originally announced February 2022.

  36. arXiv:2202.04411

    cs.AI

    A.I. and Data-Driven Mobility at Volkswagen Financial Services AG

    Authors: Shayan Jawed, Mofassir ul Islam Arif, Ahmed Rashed, Kiran Madhusudhanan, Shereen Elsayed, Mohsan Jameel, Alexei Volk, Andre Hintsches, Marlies Kornfeld, Katrin Lange, Lars Schmidt-Thieme

    Abstract: Machine learning is being widely adopted in industrial applications owing to the capabilities of commercially available hardware and rapidly advancing research. Volkswagen Financial Services (VWFS), as a market leader in vehicle leasing services, aims to leverage existing proprietary data and the latest research to enhance existing and derive new business processes. The collaboration between Infor…

    Submitted 9 February, 2022; originally announced February 2022.

  37. arXiv:2201.01529

    cs.LG math.OC

    Supervised Permutation Invariant Networks for Solving the CVRP with Bounded Fleet Size

    Authors: Daniela Thyssens, Jonas Falkner, Lars Schmidt-Thieme

    Abstract: Learning to solve combinatorial optimization problems, such as the vehicle routing problem, offers great computational advantages over classical operations research solvers and heuristics. The recently developed deep reinforcement learning approaches either improve an initially given solution iteratively or sequentially construct a set of individual tours. However, most of the existing learning-ba…

    Submitted 5 January, 2022; originally announced January 2022.

    Comments: 9 pages, 5 figures

  38. arXiv:2110.08255

    cs.LG

    Yformer: U-Net Inspired Transformer Architecture for Far Horizon Time Series Forecasting

    Authors: Kiran Madhusudhanan, Johannes Burchert, Nghia Duong-Trung, Stefan Born, Lars Schmidt-Thieme

    Abstract: Time series data is ubiquitous in research as well as in a wide variety of industrial applications. Effectively analyzing the available historical data and providing insights into the far future allows us to make effective decisions. Recent research has witnessed the superior performance of transformer-based architectures, especially in the regime of far horizon time series forecasting. However, t…

    Submitted 25 August, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: Accepted by the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2022)

    Journal ref: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (2022)

  39. arXiv:2110.08028

    cs.LG

    Improving Hyperparameter Optimization by Planning Ahead

    Authors: Hadi S. Jomaa, Jonas Falkner, Lars Schmidt-Thieme

    Abstract: Hyperparameter optimization (HPO) is generally treated as a bi-level optimization problem that involves fitting a (probabilistic) surrogate model to a set of observed hyperparameter responses, e.g. validation loss, and consequently maximizing an acquisition function using a surrogate model to identify good hyperparameter candidates for evaluation. The choice of a surrogate and/or acquisition funct…

    Submitted 15 October, 2021; originally announced October 2021.

  40. arXiv:2109.01569

    cs.CV cs.RO

    Deep Metric Learning for Ground Images

    Authors: Raaghav Radhakrishnan, Jan Fabian Schmid, Randolf Scholz, Lars Schmidt-Thieme

    Abstract: Ground texture based localization methods are potential prospects for low-cost, high-accuracy self-localization solutions for robots. These methods estimate the pose of a given query image, i.e. the current observation of the ground from a downward-facing camera, with respect to a set of reference images whose poses are known in the application area. In this work, we deal with the initial localizati…

    Submitted 3 September, 2021; originally announced September 2021.

  41. arXiv:2108.02842

    cs.LG

    Multimodal Meta-Learning for Time Series Regression

    Authors: Sebastian Pineda Arango, Felix Heinrich, Kiran Madhusudhanan, Lars Schmidt-Thieme

    Abstract: Recent work has shown the efficiency of deep learning models such as Fully Convolutional Networks (FCN) or Recurrent Neural Networks (RNN) to deal with Time Series Regression (TSR) problems. These models sometimes need a lot of data to be able to generalize, yet the time series are sometimes not long enough to be able to learn patterns. Therefore, it is important to make use of information across…

    Submitted 2 November, 2021; v1 submitted 5 August, 2021; originally announced August 2021.

    Comments: 16 pages

  42. Multi-script Handwritten Digit Recognition Using Multi-task Learning

    Authors: Mesay Samuel Gondere, Lars Schmidt-Thieme, Durga Prasad Sharma, Randolf Scholz

    Abstract: Handwritten digit recognition is one of the most extensively studied areas in machine learning. Apart from the wider research on handwritten digit recognition on the MNIST dataset, there are many other research works on various script recognition. However, multi-script digit recognition, which encourages the development of robust and multipurpose systems, is not very common. Additionally, working on mult…

    Submitted 15 June, 2021; originally announced June 2021.

  43. arXiv:2104.12226

    cs.LG

    RP-DQN: An application of Q-Learning to Vehicle Routing Problems

    Authors: Ahmad Bdeir, Simon Boeder, Tim Dernedde, Kirill Tkachuk, Jonas K. Falkner, Lars Schmidt-Thieme

    Abstract: In this paper we present a new approach to tackle complex routing problems with an improved state representation that utilizes the model complexity better than previous methods. We enable this by training from temporal differences. Specifically, Q-Learning is employed. We show that our approach achieves state-of-the-art performance for autoregressive policies that sequentially insert nodes to const…

    Submitted 25 April, 2021; originally announced April 2021.

    Comments: 14 pages, 4 figures

  44. arXiv:2102.03776  [pdf, other]

    cs.LG

    Hyperparameter Optimization with Differentiable Metafeatures

    Authors: Hadi S. Jomaa, Lars Schmidt-Thieme, Josif Grabocka

    Abstract: Metafeatures, or dataset characteristics, have been shown to improve the performance of hyperparameter optimization (HPO). Conventionally, metafeatures are precomputed and used to measure the similarity between datasets, leading to a better initialization of HPO models. In this paper, we propose a cross dataset surrogate model called Differentiable Metafeature-based Surrogate (DMFBS), that predict…

    Submitted 7 February, 2021; originally announced February 2021.

  45. arXiv:2101.02118  [pdf, other]

    cs.LG stat.ML

    Do We Really Need Deep Learning Models for Time Series Forecasting?

    Authors: Shereen Elsayed, Daniela Thyssens, Ahmed Rashed, Hadi Samer Jomaa, Lars Schmidt-Thieme

    Abstract: Time series forecasting is a crucial task in machine learning, as it has a wide range of applications including but not limited to forecasting electricity consumption, traffic, and air quality. Traditional forecasting models rely on rolling averages, vector auto-regression and auto-regressive integrated moving averages. On the other hand, deep learning and matrix factorization models have been rec…

    Submitted 20 October, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

    Comments: 14 pages with appendix, 1 figure

  46. arXiv:2006.09100  [pdf, other]

    cs.LG stat.ML

    Learning to Solve Vehicle Routing Problems with Time Windows through Joint Attention

    Authors: Jonas K. Falkner, Lars Schmidt-Thieme

    Abstract: Many real-world vehicle routing problems involve rich sets of constraints with respect to the capacities of the vehicles, time windows for customers, etc. While in recent years the first machine learning models have been developed to solve basic vehicle routing problems faster than optimization heuristics, complex constraints are rarely taken into consideration. Due to their general procedure to constr…

    Submitted 16 June, 2020; originally announced June 2020.

  47. arXiv:1910.12749  [pdf, other]

    cs.LG stat.ML

    HIDRA: Head Initialization across Dynamic targets for Robust Architectures

    Authors: Rafael Rego Drumond, Lukas Brinkmeyer, Josif Grabocka, Lars Schmidt-Thieme

    Abstract: The performance of gradient-based optimization strategies depends heavily on the initial weights of the parametric model. Recent works show that there exist weight initializations from which optimization procedures can find the task-specific parameters faster than from uniformly random initializations and that such a weight initialization can be learned by optimizing a specific model architecture…

    Submitted 22 January, 2020; v1 submitted 28 October, 2019; originally announced October 2019.

    MSC Class: 68

  48. arXiv:1909.13576  [pdf, other]

    cs.LG cs.AI stat.ML

    Chameleon: Learning Model Initializations Across Tasks With Different Schemas

    Authors: Lukas Brinkmeyer, Rafael Rego Drumond, Randolf Scholz, Josif Grabocka, Lars Schmidt-Thieme

    Abstract: Parametric models, and particularly neural networks, require weight initialization as a starting point for gradient-based optimization. Recent work shows that a specific initial parameter set can be learned from a population of supervised learning tasks. Using this initial parameter set enables a fast convergence for unseen classes even when only a handful of instances is available (model-agnostic…

    Submitted 11 June, 2020; v1 submitted 30 September, 2019; originally announced September 2019.

    Comments: 18 pages, 7 figures

    MSC Class: 68

  49. arXiv:1909.12943  [pdf, other]

    cs.CV cs.LG stat.ML

    Handwritten Amharic Character Recognition Using a Convolutional Neural Network

    Authors: Mesay Samuel Gondere, Lars Schmidt-Thieme, Abiot Sinamo Boltena, Hadi Samer Jomaa

    Abstract: Amharic is the official language of the Federal Democratic Republic of Ethiopia. There are many historic Amharic and Ethiopic handwritten documents addressing various relevant issues including governance, science, religion, social rules, culture and art works, which are a very rich source of indigenous knowledge. The Amharic language has its own alphabet derived from Ge'ez which is currently the liturgic…

    Submitted 23 September, 2019; originally announced September 2019.

    Comments: ECDA2019 Conference Oral Presentation

  50. arXiv:1906.11527  [pdf, other]

    cs.LG stat.ML

    Hyp-RL : Hyperparameter Optimization by Reinforcement Learning

    Authors: Hadi S. Jomaa, Josif Grabocka, Lars Schmidt-Thieme

    Abstract: Hyperparameter tuning is an omnipresent problem in machine learning, as it is an integral aspect of obtaining state-of-the-art performance for any model. Most often, hyperparameters are optimized just by training a model on a grid of possible hyperparameter values and taking the one that performs best on a validation sample (grid search). More recently, methods have been introduced that build a…

    Submitted 27 June, 2019; originally announced June 2019.