Showing 1–11 of 11 results for author: Stanić, A

  1. arXiv:2405.17283  [pdf, other]

    cs.LG cs.NE

    Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery

    Authors: Anand Gopalakrishnan, Aleksandar Stanić, Jürgen Schmidhuber, Michael Curtis Mozer

    Abstract: Current state-of-the-art synchrony-based models encode object bindings with complex-valued activations and compute with real-valued weights in feedforward architectures. We argue for the computational advantages of a recurrent architecture with complex-valued weights. We propose a fully convolutional autoencoder, SynCx, that performs iterative constraint satisfaction: at each iteration, a hidden l…

    Submitted 28 May, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: minor typo fixed

  2. arXiv:2401.01974  [pdf, other]

    cs.CV cs.AI cs.LG

    Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers

    Authors: Aleksandar Stanić, Sergi Caelles, Michael Tschannen

    Abstract: Visual reasoning is dominated by end-to-end neural networks scaled to billions of model parameters and training examples. However, even the largest models struggle with compositional reasoning, generalization, fine-grained spatial and temporal reasoning, and counting. Visual reasoning with large language models (LLMs) as controllers can, in principle, address these limitations by decomposing the t…

    Submitted 14 May, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

  3. arXiv:2309.11197  [pdf, other]

    cs.LG cs.CL

    The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute

    Authors: Aleksandar Stanić, Dylan Ashley, Oleg Serikov, Louis Kirsch, Francesco Faccio, Jürgen Schmidhuber, Thomas Hofmann, Imanol Schlag

    Abstract: The Languini Kitchen serves as both a research collective and codebase designed to empower researchers with limited computational resources to contribute meaningfully to the field of language modelling. We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours. The number of tokens on which a model is trained is defined by the m…

    Submitted 20 September, 2023; originally announced September 2023.

  4. arXiv:2305.17066  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG cs.MA

    Mindstorms in Natural Language-Based Societies of Mind

    Authors: Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, et al. (1 additional author not shown)

    Abstract: Both Minsky's "society of mind" and Schmidhuber's "learning to think" inspire diverse societies of large multimodal neural networks (NNs) that solve problems by interviewing each other in a "mindstorm." Recent implementations of NN-based societies of minds consist of large language models (LLMs) and other NN-based experts communicating through a natural language interface. In doing so, they overco…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 9 pages in main text + 7 pages of references + 38 pages of appendices, 14 figures in main text + 13 in appendices, 7 tables in appendices

    MSC Class: 68T07 ACM Class: I.2.6; I.2.11

  5. arXiv:2305.15001  [pdf, other]

    cs.LG cs.AI cs.CV

    Contrastive Training of Complex-Valued Autoencoders for Object Discovery

    Authors: Aleksandar Stanić, Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber

    Abstract: Current state-of-the-art object-centric models use slots and attention-based routing for binding. However, this class of models has several conceptual limitations: the number of slots is hardwired; all slots have equal capacity; training has high computational cost; there are no object-level relational factors within slots. Synchrony-based models in principle can address these limitations by using…

    Submitted 9 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: accepted to NeurIPS 2023

  6. arXiv:2208.03374  [pdf, other]

    cs.LG cs.AI

    Learning to Generalize with Object-centric Agents in the Open World Survival Game Crafter

    Authors: Aleksandar Stanić, Yujin Tang, David Ha, Jürgen Schmidhuber

    Abstract: Reinforcement learning agents must generalize beyond their training experience. Prior work has focused mostly on identical training and evaluation environments. Starting from the recently introduced Crafter benchmark, a 2D open world survival game, we introduce a new set of environments suitable for evaluating some agent's ability to generalize on previously unseen (numbers of) objects and to adap…

    Submitted 5 August, 2022; originally announced August 2022.

    ACM Class: I.2.6

  7. arXiv:2103.08877  [pdf, other]

    cs.CV cs.AI cs.LG

    Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling

    Authors: Đorđe Miladinović, Aleksandar Stanić, Stefan Bauer, Jürgen Schmidhuber, Joachim M. Buhmann

    Abstract: How to improve generative modeling by better exploiting spatial regularities and coherence in images? We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs). In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way, using a sequential gating-based mechani…

    Submitted 16 March, 2021; originally announced March 2021.

    Journal ref: International Conference on Learning Representations (2021)

  8. Crack propagation simulation without crack tracking algorithm: embedded discontinuity formulation with incompatible modes

    Authors: A. Stanic, B. Brank, A. Ibrahimbegovic, H. G. Matthies

    Abstract: We show that for the simulation of crack propagation in quasi-brittle, two-dimensional solids, very good results can be obtained with an embedded strong discontinuity quadrilateral finite element that has incompatible modes. Even more importantly, we demonstrate that these results can be obtained without using a crack tracking algorithm. Therefore, the simulation of crack patterns with several cra…

    Submitted 6 August, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

    Comments: 53 pages, 43 figures, research paper

  9. arXiv:2010.03635  [pdf, other]

    cs.LG cs.AI stat.ML

    Hierarchical Relational Inference

    Authors: Aleksandar Stanić, Sjoerd van Steenkiste, Jürgen Schmidhuber

    Abstract: Common-sense physical reasoning in the real world requires learning about the interactions of objects and their dynamics. The notion of an abstract object, however, encompasses a wide variety of physical objects that differ greatly in terms of the complex behaviors they support. To address this, we propose a novel approach to physical reasoning that models objects as hierarchies of parts that may…

    Submitted 14 December, 2020; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: Accepted to AAAI 2021

    ACM Class: I.2.6

  10. arXiv:1910.05231  [pdf, other]

    cs.LG stat.ML

    R-SQAIR: Relational Sequential Attend, Infer, Repeat

    Authors: Aleksandar Stanić, Jürgen Schmidhuber

    Abstract: Traditional sequential multi-object attention models rely on a recurrent mechanism to infer object relations. We propose a relational extension (R-SQAIR) of one such attention model (SQAIR) by endowing it with a module with strong relational inductive bias that computes in parallel pairwise interactions between inferred objects. Two recently proposed relational modules are studied on tasks of unsu…

    Submitted 11 October, 2019; originally announced October 2019.

    Comments: 4 page workshop paper accepted at the NeurIPS 2019 Workshop on Perception as Generative Reasoning: Structure, Causality, Probability

    ACM Class: I.2.6

  11. arXiv:1605.08283  [pdf, other]

    cs.LG cs.CV cs.IT cs.NE stat.ML

    Discrete Deep Feature Extraction: A Theory and New Architectures

    Authors: Thomas Wiatowski, Michael Tschannen, Aleksandar Stanić, Philipp Grohs, Helmut Bölcskei

    Abstract: First steps towards a mathematical theory of deep convolutional neural networks for feature extraction were made---for the continuous-time case---in Mallat, 2012, and Wiatowski and Bölcskei, 2015. This paper considers the discrete case, introduces new convolutional neural network architectures, and proposes a mathematical framework for their analysis. Specifically, we establish deformation and tra…

    Submitted 26 May, 2016; originally announced May 2016.

    Comments: Proc. of International Conference on Machine Learning (ICML), New York, USA, June 2016, to appear

    Journal ref: Proc. of International Conference on Machine Learning (ICML), New York, USA, pp. 2149-2158, June 2016