Showing 1–11 of 11 results for author: Coenen, A

Searching in archive cs.
  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2211.05030  [pdf, other]

    cs.HC cs.CL

    Creative Writing with an AI-Powered Writing Assistant: Perspectives from Professional Writers

    Authors: Daphne Ippolito, Ann Yuan, Andy Coenen, Sehmon Burnam

    Abstract: Recent developments in natural language generation (NLG) using neural language models have brought us closer than ever to the goal of building AI-powered creative writing tools. However, most prior work on human-AI collaboration in the creative writing domain has evaluated new systems with amateur writers, typically in contrived user studies of limited scope. In this work, we commissioned 13 profe…

    Submitted 9 November, 2022; originally announced November 2022.

  3. LaMPost: Design and Evaluation of an AI-assisted Email Writing Prototype for Adults with Dyslexia

    Authors: Steven M. Goodman, Erin Buehler, Patrick Clary, Andy Coenen, Aaron Donsbach, Tiffanie N. Horne, Michal Lahav, Robert Macdonald, Rain Breaw Michaels, Ajit Narayanan, Mahima Pushkarna, Joel Riley, Alex Santana, Lei Shi, Rachel Sweeney, Phil Weaver, Ann Yuan, Meredith Ringel Morris

    Abstract: Prior work has explored the writing challenges experienced by people with dyslexia, and the potential for new spelling, grammar, and word retrieval technologies to address these challenges. However, the capabilities for natural language generation demonstrated by the latest class of large language models (LLMs) highlight an opportunity to explore new forms of human-AI writing support tools. In thi…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: To appear at The 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '22), October 23-26, 2022, Athens, Greece. 26 pages

  4. arXiv:2206.04812  [pdf, other]

    cs.CL

    The Case for a Single Model that can Both Generate Continuations and Fill in the Blank

    Authors: Daphne Ippolito, Liam Dugan, Emily Reif, Ann Yuan, Andy Coenen, Chris Callison-Burch

    Abstract: The task of inserting text into a specified position in a passage, known as fill in the blank (FitB), is useful for a variety of applications where writers interact with a natural language generation (NLG) system to craft text. While previous work has tackled this problem with models trained specifically to do the fill-in-the-blank task, a more useful model is one that can effectively perform _bot…

    Submitted 30 June, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: This version: fixed bug in the headers of Table 2

    Journal ref: NAACL 2022 Findings

  5. arXiv:2111.06467  [pdf, other]

    cs.CL cs.AI cs.LG

    SynthBio: A Case Study in Human-AI Collaborative Curation of Text Datasets

    Authors: Ann Yuan, Daphne Ippolito, Vitaly Nikolaev, Chris Callison-Burch, Andy Coenen, Sebastian Gehrmann

    Abstract: NLP researchers need more, higher-quality text datasets. Human-labeled datasets are expensive to collect, while datasets collected via automatic retrieval from the web such as WikiBio are noisy and can include undesired biases. Moreover, data sourced from the web is often included in datasets used to pretrain models, leading to inadvertent cross-contamination of training and test sets. In this wor…

    Submitted 12 January, 2022; v1 submitted 11 November, 2021; originally announced November 2021.

    Comments: 10 pages, 2 figures, accepted to NeurIPS 2021 Datasets and Benchmarks Track

  6. arXiv:2109.03910  [pdf, other]

    cs.CL

    A Recipe For Arbitrary Text Style Transfer with Large Language Models

    Authors: Emily Reif, Daphne Ippolito, Ann Yuan, Andy Coenen, Chris Callison-Burch, Jason Wei

    Abstract: In this paper, we leverage large language models (LMs) to perform zero-shot text style transfer. We present a prompting method that we call augmented zero-shot learning, which frames style transfer as a sentence rewriting task and requires only a natural language instruction, without model fine-tuning or exemplars in the target style. Augmented zero-shot learning is simple and demonstrates promisi…

    Submitted 31 March, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

  7. arXiv:2107.07430  [pdf, other]

    cs.CL

    Wordcraft: a Human-AI Collaborative Editor for Story Writing

    Authors: Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan

    Abstract: As neural language models grow in effectiveness, they are increasingly being applied in real-world settings. However these applications tend to be limited in the modes of interaction they support. In this extended abstract, we propose Wordcraft, an AI-assisted editor for story writing in which a writer and a dialog system collaborate to write a story. Our novel interface uses few-shot learning and…

    Submitted 15 July, 2021; originally announced July 2021.

    Journal ref: First Workshop on Bridging Human-Computer Interaction and Natural Language Processing at EACL 2021

  8. arXiv:2104.07143  [pdf, other]

    cs.CL cs.LG

    An Interpretability Illusion for BERT

    Authors: Tolga Bolukbasi, Adam Pearce, Ann Yuan, Andy Coenen, Emily Reif, Fernanda Viégas, Martin Wattenberg

    Abstract: We describe an "interpretability illusion" that arises when analyzing the BERT model. Activations of individual neurons in the network may spuriously appear to encode a single, simple concept, when in fact they are encoding something far more complex. The same effect holds for linear combinations of activations. We trace the source of this illusion to geometric properties of BERT's embedding space…

    Submitted 14 April, 2021; originally announced April 2021.

  9. arXiv:2008.05122  [pdf, other]

    cs.CL

    The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models

    Authors: Ian Tenney, James Wexler, Jasmijn Bastings, Tolga Bolukbasi, Andy Coenen, Sebastian Gehrmann, Ellen Jiang, Mahima Pushkarna, Carey Radebaugh, Emily Reif, Ann Yuan

    Abstract: We present the Language Interpretability Tool (LIT), an open-source platform for visualization and understanding of NLP models. We focus on core questions about model behavior: Why did my model make this prediction? When does it perform poorly? What happens under a controlled change in the input? LIT integrates local explanations, aggregate analysis, and counterfactual generation into a streamline…

    Submitted 12 August, 2020; originally announced August 2020.

  10. arXiv:1906.02715  [pdf, other]

    cs.LG cs.CL stat.ML

    Visualizing and Measuring the Geometry of BERT

    Authors: Andy Coenen, Emily Reif, Ann Yuan, Been Kim, Adam Pearce, Fernanda Viégas, Martin Wattenberg

    Abstract: Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of on…

    Submitted 28 October, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: 8 pages, 5 figures

  11. arXiv:1807.01068  [pdf, other]

    cs.CV cs.AI

    HAMLET: Hierarchical Harmonic Filters for Learning Tracts from Diffusion MRI

    Authors: Marco Reisert, Volker A. Coenen, Christoph Kaller, Karl Egger, Henrik Skibbe

    Abstract: In this work we propose HAMLET, a novel tract learning algorithm, which, after training, maps raw diffusion weighted MRI directly onto an image which simultaneously indicates tract direction and tract presence. The automatic learning of fiber tracts based on diffusion MRI data is a rather new idea, which tries to overcome limitations of atlas-based techniques. HAMLET takes such an approach. Unli…

    Submitted 3 July, 2018; originally announced July 2018.