Showing 1–18 of 18 results for author: Contractor, D

  1. arXiv:2402.05979  [pdf, other]

    cs.SE cs.AI

    On the Standardization of Behavioral Use Clauses and Their Adoption for Responsible Licensing of AI

    Authors: Daniel McDuff, Tim Korjakow, Scott Cambo, Jesse Josua Benjamin, Jenny Lee, Yacine Jernite, Carlos Muñoz Ferrandis, Aaron Gokaslan, Alek Tarkowski, Joseph Lindley, A. Feder Cooper, Danish Contractor

    Abstract: Growing concerns over negligent or malicious uses of AI have increased the appetite for tools that help manage the risks of the technology. In 2018, licenses with behavioral-use clauses (commonly referred to as Responsible AI Licenses) were proposed to give developers a framework for releasing AI assets while specifying their users to mitigate negative applications. As of the end of 2023, on the…

    Submitted 7 February, 2024; originally announced February 2024.

  2. arXiv:2305.11790  [pdf, other]

    cs.CL

    Prompting with Pseudo-Code Instructions

    Authors: Mayank Mishra, Prince Kumar, Riyaz Bhat, Rudra Murthy V, Danish Contractor, Srikanth Tamilselvam

    Abstract: Prompting with natural language instructions has recently emerged as a popular method of harnessing the capabilities of large language models. Given the inherent ambiguity present in natural language, it is intuitive to consider the possible advantages of prompting with less ambiguous prompt styles, such as the use of pseudo-code. In this paper we explore if prompting via pseudo-code instruction…

    Submitted 19 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Published in EMNLP 2023 main track

  3. arXiv:2305.06161  [pdf, other]

    cs.CL cs.AI cs.PL cs.SE

    StarCoder: may the source be with you!

    Authors: Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, et al. (42 additional authors not shown)

    Abstract: The BigCode community, an open-scientific collaboration working on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder and StarCoderBase: 15.5B parameter models with 8K context length, infilling capabilities and fast large-batch inference enabled by multi-query attention. StarCoderBase is trained on 1 trillion tokens sourced from The Stack, a large colle…

    Submitted 13 December, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

  4. arXiv:2301.03988  [pdf, other]

    cs.SE cs.AI cs.LG

    SantaCoder: don't reach for the stars!

    Authors: Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, et al. (16 additional authors not shown)

    Abstract: The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code. This tech report describes the progress of the collaboration until December 2022, outlining the current state of the Personally Identifiable Information (PII) redaction pipeline, the experiments conducted to de-risk the model architecture, and the experiments investigat…

    Submitted 24 February, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

  5. arXiv:2301.01015  [pdf, other]

    cs.CV cs.AI cs.CL

    Semi-Structured Object Sequence Encoders

    Authors: Rudra Murthy V, Riyaz Bhat, Chulaka Gunasekara, Siva Sankalp Patel, Hui Wan, Tejas Indulal Dhamecha, Danish Contractor, Marina Danilevsky

    Abstract: In this paper we explore the task of modeling semi-structured object sequences; in particular, we focus our attention on the problem of developing a structure-aware input representation for such sequences. Examples of such data include user activity on websites, machine logs, and many others. This type of data is often represented as a sequence of sets of key-value pairs over time and can present…

    Submitted 22 May, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

  6. arXiv:2211.05100  [pdf, other]

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access…

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  7. arXiv:2210.07295  [pdf, other]

    cs.CL cs.LG

    Joint Reasoning on Hybrid-knowledge sources for Task-Oriented Dialog

    Authors: Mayank Mishra, Danish Contractor, Dinesh Raghu

    Abstract: Traditional systems designed for task oriented dialog utilize knowledge present only in structured knowledge sources to generate responses. However, relevant information required to generate responses may also reside in unstructured sources, such as documents. Recent state of the art models such as HyKnow and SeKnow aimed at overcoming these challenges make limiting assumptions about the knowledge…

    Submitted 7 February, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

  8. arXiv:2204.02710  [pdf, other]

    cs.CL cs.AI

    Mix-and-Match: Scalable Dialog Response Retrieval using Gaussian Mixture Embeddings

    Authors: Gaurav Pandey, Danish Contractor, Sachindra Joshi

    Abstract: Embedding-based approaches for dialog response retrieval embed the context-response pairs as points in the embedding space. These approaches are scalable, but fail to account for the complex, many-to-many relationships that exist between context-response pairs. On the other end of the spectrum, there are approaches that feed the context-response pairs jointly through multiple layers of neural netw…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: 10 pages, 2 figures

  9. arXiv:2112.09544  [pdf]

    cs.CY

    It's Time to Do Something: Mitigating the Negative Impacts of Computing Through a Change to the Peer Review Process

    Authors: Brent Hecht, Lauren Wilcox, Jeffrey P. Bigham, Johannes Schöning, Ehsan Hoque, Jason Ernst, Yonatan Bisk, Luigi De Russis, Lana Yarosh, Bushra Anjum, Danish Contractor, Cathy Wu

    Abstract: The computing research community needs to work much harder to address the downsides of our innovations. Between the erosion of privacy, threats to democracy, and automation's effect on employment (among many other issues), we can no longer simply assume that our research will have a net positive impact on the world. While bending the arc of computing innovation towards societal benefit may at firs…

    Submitted 17 December, 2021; originally announced December 2021.

    Comments: First published on the ACM Future of Computing Academy blog on March 29, 2018. This is the archival version

  10. Variational Learning for Unsupervised Knowledge Grounded Dialogs

    Authors: Mayank Mishra, Dhiraj Madan, Gaurav Pandey, Danish Contractor

    Abstract: Recent methods for knowledge grounded dialogs generate responses by incorporating information from an external textual document. These methods do not require the exact document to be known during training and rely on the use of a retrieval system to fetch relevant documents from a large index. The documents used to generate the responses are modeled as latent variables whose prior probabilities ne…

    Submitted 28 April, 2022; v1 submitted 23 November, 2021; originally announced December 2021.

  11. Behavioral Use Licensing for Responsible AI

    Authors: Danish Contractor, Daniel McDuff, Julia Haines, Jenny Lee, Christopher Hines, Brent Hecht, Nicholas Vincent, Hanlin Li

    Abstract: With the growing reliance on artificial intelligence (AI) for many different applications, the sharing of code, data, and models is important to ensure the replicability and democratization of scientific knowledge. Many high-profile academic publishing venues expect code and models to be submitted and released with papers. Furthermore, developers often want to release these assets to encourage dev…

    Submitted 20 October, 2022; v1 submitted 4 November, 2020; originally announced November 2020.

    Comments: Paper published at ACM FAccT 2022

  12. arXiv:2010.10216  [pdf, other]

    cs.CL cs.AI

    Simulated Chats for Building Dialog Systems: Learning to Generate Conversations from Instructions

    Authors: Biswesh Mohapatra, Gaurav Pandey, Danish Contractor, Sachindra Joshi

    Abstract: Popular dialog datasets such as MultiWOZ are created by providing crowd workers an instruction, expressed in natural language, that describes the task to be accomplished. Crowd workers play the role of a user and an agent to generate dialogs to accomplish tasks involving booking restaurant tables, calling a taxi etc. In this paper, we present a data creation strategy that uses the pre-trained lang…

    Submitted 20 October, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted in the Findings of EMNLP 2021

  13. Joint Spatio-Textual Reasoning for Answering Tourism Questions

    Authors: Danish Contractor, Shashank Goel, Mausam, Parag Singla

    Abstract: Our goal is to answer real-world tourism questions that seek Points-of-Interest (POI) recommendations. Such questions express various kinds of spatial and non-spatial constraints, necessitating a combination of textual and spatial reasoning. In response, we develop the first joint spatio-textual reasoning model, which combines geo-spatial knowledge with information in textual corpora to answer que…

    Submitted 19 October, 2020; v1 submitted 28 September, 2020; originally announced September 2020.

    Comments: Updated version

  14. arXiv:1909.03759  [pdf, other]

    cs.CL cs.AI

    Neural Conversational QA: Learning to Reason v.s. Exploiting Patterns

    Authors: Nikhil Verma, Abhishek Sharma, Dhiraj Madan, Danish Contractor, Harshit Kumar, Sachindra Joshi

    Abstract: Neural Conversational QA tasks like ShARC require systems to answer questions based on the contents of a given passage. On studying recent state-of-the-art models on the ShARCQA task, we found indications that the models learn spurious clues/patterns in the dataset. Furthermore, we show that a heuristic-based program designed to exploit these patterns can have performance comparable to that of the…

    Submitted 9 October, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: Accepted at EMNLP 2020. NOTE: An older version of this paper presented a model called 'UrcaNet'. Please view the v1 version of this paper on arxiv for details on that model. This version does not contain UrcaNet

  15. arXiv:1909.03527  [pdf, other]

    cs.CL cs.AI cs.IR

    Large Scale Question Answering using Tourism Data

    Authors: Danish Contractor, Krunal Shah, Aditi Partap, Mausam, Parag Singla

    Abstract: We introduce the novel task of answering entity-seeking recommendation questions using a collection of reviews that describe candidate answer entities. We harvest a QA dataset that contains 47,124 paragraph-sized real user questions from travelers seeking recommendations for hotels, attractions and restaurants. Each question can have thousands of candidate answers to choose from and each candidate…

    Submitted 27 April, 2020; v1 submitted 8 September, 2019; originally announced September 2019.

    Comments: 20 pages with supplementary notes

  16. Multi-level Memory for Task Oriented Dialogs

    Authors: Revanth Reddy, Danish Contractor, Dinesh Raghu, Sachindra Joshi

    Abstract: Recent end-to-end task oriented dialog systems use memory architectures to incorporate external knowledge in their dialogs. Current work makes simplifying assumptions about the structure of the knowledge base, such as the use of triples to represent knowledge, and combines dialog utterances (context) as well as knowledge base (KB) results as part of the same memory. This causes an explosion in the…

    Submitted 11 May, 2019; v1 submitted 24 October, 2018; originally announced October 2018.

    Comments: Accepted as full paper at NAACL 2019

  17. arXiv:1806.01351  [pdf, other]

    cs.CL cs.CY cs.IR

    Document Chunking and Learning Objective Generation for Instruction Design

    Authors: Khoi-Nguyen Tran, Jey Han Lau, Danish Contractor, Utkarsh Gupta, Bikram Sengupta, Christopher J. Butler, Mukesh Mohania

    Abstract: Instructional Systems Design is the practice of creating instructional experiences that make the acquisition of knowledge and skill more efficient, effective, and appealing. Specifically in designing courses, an hour of training material can require between 30 and 500 hours of effort in sourcing and organizing reference data for use in just the preparation of course material. In this paper, we p…

    Submitted 5 August, 2018; v1 submitted 1 June, 2018; originally announced June 2018.

    Comments: Proceedings of the 11th International Conference on Education Data Mining (EDM 2018)

  18. Towards Understanding and Answering Multi-Sentence Recommendation Questions on Tourism

    Authors: Danish Contractor, Barun Patra, Mausam Singla, Parag Singla

    Abstract: We introduce the first system towards the novel task of answering complex multisentence recommendation questions in the tourism domain. Our solution uses a pipeline of two modules: question understanding and answering. For question understanding, we define an SQL-like query language that captures the semantic intent of a question; it supports operators like subset, negation, preference and similar…

    Submitted 5 January, 2018; originally announced January 2018.