Showing 1–12 of 12 results for author: Miculicich, L

  1. arXiv:2406.05365  [pdf, other]

    cs.CL cs.AI cs.LG

    CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation

    Authors: I-Hung Hsu, Zifeng Wang, Long T. Le, Lesly Miculicich, Nanyun Peng, Chen-Yu Lee, Tomas Pfister

    Abstract: Grounded generation aims to equip language models (LMs) with the ability to produce more credible and accountable responses by accurately citing verifiable sources. However, existing methods, by either feeding LMs with raw or preprocessed materials, remain prone to errors. To address this, we introduce CaLM, a novel verification framework. CaLM leverages the insight that a robust grounded response…

    Submitted 24 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Camera Ready Version

  2. arXiv:2401.04398  [pdf, other]

    cs.CL

    Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding

    Authors: Zilong Wang, Hao Zhang, Chun-Liang Li, Julian Martin Eisenschlos, Vincent Perot, Zifeng Wang, Lesly Miculicich, Yasuhisa Fujii, Jingbo Shang, Chen-Yu Lee, Tomas Pfister

    Abstract: Table-based reasoning with large language models (LLMs) is a promising direction to tackle many table understanding tasks, such as table-based question answering and fact verification. Compared with generic reasoning, table-based reasoning requires the extraction of underlying semantics from both free-form questions and semi-structured tabular data. Chain-of-Thought and its similar approaches inco…

    Submitted 18 January, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted to ICLR 2024

  3. arXiv:2310.17936  [pdf, other]

    cs.CL cs.AI cs.LG

    Transformers as Graph-to-Graph Models

    Authors: James Henderson, Alireza Mohammadshahi, Andrei C. Coman, Lesly Miculicich

    Abstract: We argue that Transformers are essentially graph-to-graph models, with sequences just being a special case. Attention weights are functionally equivalent to graph edges. Our Graph-to-Graph Transformer architecture makes this ability explicit, by inputting graph edges into the attention weight computations and predicting graph edges with attention-like functions, thereby integrating explicit graphs…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted to Big Picture workshop at EMNLP 2023

  4. arXiv:2306.00739  [pdf, other]

    cs.CL cs.AI cs.DB

    SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended)

    Authors: Ruoxi Sun, Sercan Ö. Arik, Alex Muzio, Lesly Miculicich, Satya Gundabathula, Pengcheng Yin, Hanjun Dai, Hootan Nakhost, Rajarishi Sinha, Zifeng Wang, Tomas Pfister

    Abstract: Text-to-SQL, the process of translating natural language into Structured Query Language (SQL), represents a transformative application of large language models (LLMs), potentially revolutionizing how humans interact with data. This paper introduces the SQL-PaLM framework, a comprehensive solution for understanding and enhancing Text-to-SQL using LLMs, using in the learning regimes of few-shot prom…

    Submitted 30 March, 2024; v1 submitted 26 May, 2023; originally announced June 2023.

  5. arXiv:2305.05171  [pdf, other]

    cs.CL

    Summarization with Precise Length Control

    Authors: Lesly Miculicich, Yujia Xie, Song Wang, Pengcheng He

    Abstract: Many applications of text generation such as summarization benefit from accurately controlling the text length. Existing approaches on length-controlled summarization either result in degraded performance or can only control the length approximately. In this work, we present a framework to generate summaries with precisely the specified number of tokens or sentences, while maintaining or even impr…

    Submitted 9 May, 2023; originally announced May 2023.

  6. arXiv:2301.08817  [pdf, other]

    cs.CL

    Document Summarization with Text Segmentation

    Authors: Lesly Miculicich, Benjamin Han

    Abstract: In this paper, we exploit the innate document segment structure for improving the extractive summarization task. We build two text segmentation models and find the most optimal strategy to introduce their output predictions in an extractive summarization model. Experimental results on a corpus of scientific articles show that extractive summarization benefits from using a highly accurate segmentat…

    Submitted 20 January, 2023; originally announced January 2023.

  7. arXiv:2212.10819  [pdf, other]

    cs.CL

    Attend to the Right Context: A Plug-and-Play Module for Content-Controllable Summarization

    Authors: Wen Xiao, Lesly Miculicich, Yang Liu, Pengcheng He, Giuseppe Carenini

    Abstract: Content-Controllable Summarization generates summaries focused on the given controlling signals. Due to the lack of large-scale training corpora for the task, we propose a plug-and-play module RelAttn to adapt any general summarizers to the content-controllable summarization task. RelAttn first identifies the relevant content in the source documents, and then makes the model attend to the right co…

    Submitted 21 December, 2022; originally announced December 2022.

  8. arXiv:2203.16574  [pdf, other]

    cs.CL cs.LG

    Graph Refinement for Coreference Resolution

    Authors: Lesly Miculicich, James Henderson

    Abstract: The state-of-the-art models for coreference resolution are based on independent mention pair-wise decisions. We propose a modelling approach that learns coreference at the document-level and takes global decisions. For this purpose, we model coreference links in a graph structure where the nodes are tokens in the text, and the edges represent the relationship between them. Our model predicts the g…

    Submitted 30 March, 2022; originally announced March 2022.

  9. arXiv:1911.08332  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD stat.ML

    Neural Network based End-to-End Query by Example Spoken Term Detection

    Authors: Dhananjay Ram, Lesly Miculicich, Hervé Bourlard

    Abstract: This paper focuses on the problem of query by example spoken term detection (QbE-STD) in zero-resource scenario. State-of-the-art approaches primarily rely on dynamic time warping (DTW) based template matching techniques using phone posterior or bottleneck features extracted from a deep neural network (DNN). We use both monolingual and multilingual bottleneck features, and show that multilingual f…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: Submitted to IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING

  10. arXiv:1908.09507  [pdf, other]

    cs.CL cs.LG

    Partially-supervised Mention Detection

    Authors: Lesly Miculicich, James Henderson

    Abstract: Learning to detect entity mentions without using syntactic information can be useful for integration and joint optimization with other tasks. However, it is common to have partially annotated data for this problem. Here, we investigate two approaches to deal with partial annotation of mentions: weighted loss and soft-target classification. We also propose two neural mention detection approaches: a…

    Submitted 26 August, 2019; originally announced August 2019.

  11. arXiv:1907.00443  [pdf, other]

    cs.CL cs.HC cs.LG cs.SD eess.AS

    Multilingual Bottleneck Features for Query by Example Spoken Term Detection

    Authors: Dhananjay Ram, Lesly Miculicich, Hervé Bourlard

    Abstract: State of the art solutions to query by example spoken term detection (QbE-STD) usually rely on bottleneck feature representation of the query and audio document to perform dynamic time warping (DTW) based template matching. Here, we present a study on QbE-STD performance using several monolingual as well as multilingual bottleneck features extracted from feed forward networks. Then, we propose to…

    Submitted 30 June, 2019; originally announced July 2019.

  12. arXiv:1809.01576  [pdf, other]

    cs.CL

    Document-Level Neural Machine Translation with Hierarchical Attention Networks

    Authors: Lesly Miculicich, Dhananjay Ram, Nikolaos Pappas, James Henderson

    Abstract: Neural Machine Translation (NMT) can be improved by including document-level contextual information. For this purpose, we propose a hierarchical attention model to capture the context in a structured and dynamic manner. The model is integrated in the original NMT architecture as another level of abstraction, conditioning on the NMT model's own previous hidden states. Experiments show that hierarch…

    Submitted 1 October, 2018; v1 submitted 5 September, 2018; originally announced September 2018.

    Comments: EMNLP 2018