Showing 1–4 of 4 results for author: Bayazit, D

  1. arXiv:2311.16079

    cs.CL cs.AI cs.LG

    MEDITRON-70B: Scaling Medical Pretraining for Large Language Models

    Authors: Zeming Chen, Alejandro Hernández Cano, Angelika Romanou, Antoine Bonnet, Kyle Matoba, Francesco Salvi, Matteo Pagliardini, Simin Fan, Andreas Köpf, Amirkeivan Mohtashami, Alexandre Sallinen, Alireza Sakhaeirad, Vinitra Swamy, Igor Krawczuk, Deniz Bayazit, Axel Marmet, Syrielle Montariol, Mary-Anne Hartley, Martin Jaggi, Antoine Bosselut

    Abstract: Large language models (LLMs) can potentially democratize access to medical knowledge. While many efforts have been made to harness and improve LLMs' medical knowledge and reasoning capacities, the resulting models are either closed-source (e.g., PaLM, GPT-4) or limited in scale (<= 13B parameters), which restricts their abilities. In this work, we improve access to large-scale medical LLMs by rele…

    Submitted 27 November, 2023; originally announced November 2023.

  2. arXiv:2310.03084

    cs.CL cs.AI cs.LG

    Discovering Knowledge-Critical Subnetworks in Pretrained Language Models

    Authors: Deniz Bayazit, Negar Foroutan, Zeming Chen, Gail Weiss, Antoine Bosselut

    Abstract: Pretrained language models (LMs) encode implicit representations of knowledge in their parameters. However, localizing these representations and disentangling them from each other remains an open problem. In this work, we investigate whether pretrained language models contain various knowledge-critical subnetworks: particular sparse computational subgraphs responsible for encoding specific knowled…

    Submitted 4 October, 2023; originally announced October 2023.

  3. arXiv:2305.02364

    cs.CL

    PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives

    Authors: Silin Gao, Beatriz Borges, Soyoung Oh, Deniz Bayazit, Saya Kanno, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut

    Abstract: Sustaining coherent and engaging narratives requires dialogue or storytelling agents to understand how the personas of speakers or listeners ground the narrative. Specifically, these agents must infer personas of their listeners to produce statements that cater to their interests. They must also learn to maintain consistent speaker personas for themselves throughout the narrative, so that their co…

    Submitted 26 May, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: ACL 2023, long paper

  4. arXiv:2012.02705

    cs.RO cs.CL

    Spatial Language Understanding for Object Search in Partially Observed City-scale Environments

    Authors: Kaiyu Zheng, Deniz Bayazit, Rebecca Mathew, Ellie Pavlick, Stefanie Tellex

    Abstract: Humans use spatial language to naturally describe object locations and their relations. Interpreting spatial language not only adds a perceptual modality for robots, but also reduces the barrier of interfacing with humans. Previous work primarily considers spatial language as goal specification for instruction following tasks in fully observable domains, often paired with reference paths for rewar…

    Submitted 31 July, 2021; v1 submitted 4 December, 2020; originally announced December 2020.

    Comments: 11 pages, 12 figures, 3 tables; added acknowledgements. 30th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2021