
Showing 1–3 of 3 results for author: Höppe, T

  1. arXiv:2206.07696  [pdf, other]

    cs.CV cs.LG stat.ML

    Diffusion Models for Video Prediction and Infilling

    Authors: Tobias Höppe, Arash Mehrjou, Stefan Bauer, Didrik Nielsen, Andrea Dittadi

    Abstract: Predicting and anticipating future outcomes or reasoning about missing information in a sequence are critical skills for agents to be able to make intelligent decisions. This requires strong, temporally coherent generative capabilities. Diffusion models have shown remarkable success in several generative tasks, but have not been extensively explored in the video domain. We present Random-Mask Vide…

    Submitted 14 November, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: Published in TMLR (11/2022)

  2. arXiv:2101.00027  [pdf, other]

    cs.CL

    The Pile: An 800GB Dataset of Diverse Text for Language Modeling

    Authors: Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, Charles Foster, Jason Phang, Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy

    Abstract: Recent work has demonstrated that increased training dataset diversity improves general cross-domain knowledge and downstream generalization capability for large-scale language models. With this in mind, we present the Pile: an 825 GiB English text corpus targeted at training large-scale language models. The Pile is constructed from 22 diverse high-quality subsets -- both existing and new…

    Submitted 31 December, 2020; originally announced January 2021.

  3. arXiv:2004.12195  [pdf, other]

    cs.DL cs.CL cs.HC

    QURATOR: Innovative Technologies for Content and Data Curation

    Authors: Georg Rehm, Peter Bourgonje, Stefanie Hegele, Florian Kintzel, Julián Moreno Schneider, Malte Ostendorff, Karolina Zaczynska, Armin Berger, Stefan Grill, Sören Räuchle, Jens Rauenbusch, Lisa Rutenburg, André Schmidt, Mikka Wild, Henry Hoffmann, Julian Fink, Sarah Schulz, Jurica Seva, Joachim Quantz, Joachim Böttger, Josefine Matthey, Rolf Fricke, Jan Thomsen, Adrian Paschke, Jamal Al Qundus, et al. (15 additional authors not shown)

    Abstract: In all domains and sectors, the demand for intelligent systems to support the processing and generation of digital content is rapidly increasing. The availability of vast amounts of content and the pressure to publish new content quickly and in rapid succession requires faster, more efficient and smarter processing and generation methods. With a consortium of ten partners from research and industr…

    Submitted 25 April, 2020; originally announced April 2020.

    Comments: Proceedings of QURATOR 2020: The conference for intelligent content solutions, Berlin, Germany, February 2020