Showing 1–12 of 12 results for author: Yıldız, Ç

  1. arXiv:2406.03337  [pdf, other]

    cs.LG stat.ML

    Identifying latent state transition in non-linear dynamical systems

    Authors: Çağlar Hızlı, Çağatay Yıldız, Matthias Bethge, ST John, Pekka Marttinen

    Abstract: This work aims to improve generalization and interpretability of dynamical systems by recovering the underlying lower-dimensional latent states and their time evolutions. Previous work on disentangled representation learning within the realm of dynamical systems focused on the latent states, possibly with linear transition approximations. As such, they cannot identify nonlinear transition dynamics…

    Submitted 6 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  2. arXiv:2402.17400  [pdf, other]

    cs.CL

    Investigating Continual Pretraining in Large Language Models: Insights and Implications

    Authors: Çağatay Yıldız, Nishaanth Kanna Ravichandran, Prishruit Punia, Matthias Bethge, Beyza Ermis

    Abstract: This paper studies the evolving domain of Continual Learning (CL) in large language models (LLMs), with a focus on developing strategies for efficient and sustainable training. Our primary emphasis is on continual domain-adaptive pretraining, a process designed to equip LLMs with the ability to integrate new information from various domains while retaining previously learned knowledge and enhancin…

    Submitted 27 February, 2024; originally announced February 2024.

  3. arXiv:2312.16731  [pdf, other]

    cs.LG cs.CV

    Infinite dSprites for Disentangled Continual Learning: Separating Memory Edits from Generalization

    Authors: Sebastian Dziadzio, Çağatay Yıldız, Gido M. van de Ven, Tomasz Trzciński, Tinne Tuytelaars, Matthias Bethge

    Abstract: The ability of machine learning systems to learn continually is hindered by catastrophic forgetting, the tendency of neural networks to overwrite existing knowledge when learning a new task. Continual learning methods alleviate this problem through regularization, parameter isolation, or rehearsal, but they are typically evaluated on benchmarks comprising only a handful of tasks. In contrast, huma…

    Submitted 29 February, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

    Comments: 10 pages, 10 figures

  4. arXiv:2302.13262  [pdf, other]

    cs.LG stat.ML

    Modulated Neural ODEs

    Authors: Ilze Amanda Auzina, Çağatay Yıldız, Sara Magliacane, Matthias Bethge, Efstratios Gavves

    Abstract: Neural ordinary differential equations (NODEs) have been proven useful for learning non-linear dynamics of arbitrary trajectories. However, current NODE methods capture variations across trajectories only via the initial state value or by auto-regressive encoder updates. In this work, we introduce Modulated Neural ODEs (MoNODEs), a novel framework that sets apart dynamics states from underlying st…

    Submitted 13 November, 2023; v1 submitted 26 February, 2023; originally announced February 2023.

  5. arXiv:2210.03466  [pdf, other]

    cs.LG

    Latent Neural ODEs with Sparse Bayesian Multiple Shooting

    Authors: Valerii Iakovlev, Cagatay Yildiz, Markus Heinonen, Harri Lähdesmäki

    Abstract: Training dynamic models, such as neural ODEs, on long trajectories is a hard problem that requires using various tricks, such as trajectory splitting, to make model training work in practice. These methods are often heuristics with poor theoretical justifications, and require iterative manual tuning. We propose a principled multiple shooting technique for neural ODEs that splits the trajectories i…

    Submitted 8 February, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

  6. VIDI: A Video Dataset of Incidents

    Authors: Duygu Sesver, Alp Eren Gençoğlu, Çağrı Emre Yıldız, Zehra Günindi, Faeze Habibi, Ziya Ata Yazıcı, Hazım Kemal Ekenel

    Abstract: Automatic detection of natural disasters and incidents has become more important as a tool for fast response. There have been many studies to detect incidents using still images and text. However, the number of approaches that exploit temporal information is rather limited. One of the main reasons for this is that a diverse video dataset with various incident types does not exist. To address this…

    Submitted 26 May, 2022; originally announced May 2022.

    Journal ref: 2022 IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP)

  7. arXiv:2205.11894  [pdf, other]

    cs.LG stat.ML

    Learning Interacting Dynamical Systems with Latent Gaussian Process ODEs

    Authors: Çağatay Yıldız, Melih Kandemir, Barbara Rakitsch

    Abstract: We study time uncertainty-aware modeling of continuous-time dynamics of interacting objects. We introduce a new model that decomposes independent dynamics of single objects accurately from their interactions. By employing latent Gaussian process ordinary differential equations, our model infers both independent dynamics and their interactions with reliable uncertainty estimates. In our formulation…

    Submitted 12 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

  8. arXiv:2106.10905  [pdf, other]

    cs.LG stat.ML

    Variational multiple shooting for Bayesian ODEs with Gaussian processes

    Authors: Pashupati Hegde, Çağatay Yıldız, Harri Lähdesmäki, Samuel Kaski, Markus Heinonen

    Abstract: Recent machine learning advances have proposed black-box estimation of unknown continuous-time system dynamics directly from data. However, earlier works are based on approximative ODE solutions or point estimates. We propose a novel Bayesian nonparametric model that uses Gaussian processes to infer posteriors of unknown ODE systems directly from data. We derive sparse variational inference with d…

    Submitted 17 July, 2022; v1 submitted 21 June, 2021; originally announced June 2021.

    Comments: Camera-ready version at UAI 2022

  9. arXiv:2102.04764  [pdf, other]

    cs.LG stat.ML

    Continuous-Time Model-Based Reinforcement Learning

    Authors: Çağatay Yıldız, Markus Heinonen, Harri Lähdesmäki

    Abstract: Model-based reinforcement learning (MBRL) approaches rely on discrete-time state transition models whereas physical systems and the vast majority of control tasks operate in continuous-time. To avoid time-discretization approximation of the underlying process, we propose a continuous-time MBRL framework based on a novel actor-critic method. Our approach also infers the unknown state evolution diff…

    Submitted 11 June, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

  10. arXiv:1905.10994  [pdf, other]

    stat.ML cs.LG

    ODE$^2$VAE: Deep generative second order ODEs with Bayesian neural networks

    Authors: Çağatay Yıldız, Markus Heinonen, Harri Lähdesmäki

    Abstract: We present Ordinary Differential Equation Variational Auto-Encoder (ODE$^2$VAE), a latent second order ODE model for high-dimensional sequential data. Leveraging the advances in deep generative models, ODE$^2$VAE can simultaneously learn the embedding of high dimensional trajectories and infer arbitrarily complex continuous-time latent dynamics. Our model explicitly decomposes the latent space int…

    Submitted 24 October, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

  11. arXiv:1807.05748  [pdf, other]

    stat.ML cs.LG

    Learning Stochastic Differential Equations With Gaussian Processes Without Gradient Matching

    Authors: Cagatay Yildiz, Markus Heinonen, Jukka Intosalmi, Henrik Mannerström, Harri Lähdesmäki

    Abstract: We introduce a novel paradigm for learning non-parametric drift and diffusion functions for stochastic differential equation (SDE). The proposed model learns to simulate path distributions that match observations with non-uniform time increments and arbitrary sparseness, which is in contrast with gradient matching that does not optimize simulated responses. We formulate sensitivity equations for l…

    Submitted 31 July, 2018; v1 submitted 16 July, 2018; originally announced July 2018.

    Comments: The accepted version of the paper to be presented in 2018 IEEE International Workshop on Machine Learning for Signal Processing

  12. arXiv:1806.02617  [pdf, other]

    stat.ML cs.LG

    Asynchronous Stochastic Quasi-Newton MCMC for Non-Convex Optimization

    Authors: Umut Şimşekli, Çağatay Yıldız, Thanh Huy Nguyen, Gaël Richard, A. Taylan Cemgil

    Abstract: Recent studies have illustrated that stochastic gradient Markov Chain Monte Carlo techniques have a strong potential in non-convex optimization, where local and global convergence guarantees can be shown under certain conditions. By building up on this recent theory, in this study, we develop an asynchronous-parallel stochastic L-BFGS algorithm for non-convex optimization. The proposed algorithm i…

    Submitted 7 June, 2018; originally announced June 2018.

    Comments: Published in the International Conference on Machine Learning (ICML 2018)