Showing 1–50 of 63 results for author: Lin, V

  1. arXiv:2405.19325  [pdf, other]

    cs.CL

    Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

    Authors: Minghan Li, Xilun Chen, Ari Holtzman, Beidi Chen, Jimmy Lin, Wen-tau Yih, Xi Victoria Lin

    Abstract: Large language models (LLMs) often hallucinate and lack the ability to provide attribution for their generations. Semi-parametric LMs, such as kNN-LM, approach these limitations by refining the output of an LM for a given prompt using its nearest neighbor matches in a non-parametric data store. However, these models often exhibit slow inference speeds and produce non-fluent texts. In this paper, w…

    Submitted 30 May, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  2. arXiv:2403.07816  [pdf, other]

    cs.CL cs.AI

    Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

    Authors: Sainbayar Sukhbaatar, Olga Golovneva, Vasu Sharma, Hu Xu, Xi Victoria Lin, Baptiste Rozière, Jacob Kahn, Daniel Li, Wen-tau Yih, Jason Weston, Xian Li

    Abstract: We investigate efficient methods for training Large Language Models (LLMs) to possess capabilities in multiple specialized domains, such as coding, math reasoning and world knowledge. Our method, named Branch-Train-MiX (BTX), starts from a seed model, which is branched to train experts in embarrassingly parallel fashion with high throughput and reduced communication cost. After individual experts…

    Submitted 12 March, 2024; originally announced March 2024.

  3. arXiv:2402.14979  [pdf, other]

    cs.LG cs.CL stat.ME

    Optimizing Language Models for Human Preferences is a Causal Inference Problem

    Authors: Victoria Lin, Eli Ben-Michael, Louis-Philippe Morency

    Abstract: As large language models (LLMs) see greater use in academic and commercial settings, there is increasing interest in methods that allow language models to generate texts aligned with human preferences. In this paper, we present an initial exploration of language model optimization for human preferences from direct outcome datasets, where each sample consists of a text and an associated numerical o…

    Submitted 5 June, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: UAI 2024

  4. arXiv:2402.12847  [pdf, other]

    cs.CL cs.AI cs.LG

    Instruction-tuned Language Models are Better Knowledge Learners

    Authors: Zhengbao Jiang, Zhiqing Sun, Weijia Shi, Pedro Rodriguez, Chunting Zhou, Graham Neubig, Xi Victoria Lin, Wen-tau Yih, Srinivasan Iyer

    Abstract: In order for large language model (LLM)-based assistants to effectively adapt to evolving information needs, it must be possible to update their factual knowledge through continued training on new data. The standard recipe for doing so involves continued pre-training on new documents followed by instruction-tuning on question-answer (QA) pairs. However, we find that LLMs trained with this recipe s…

    Submitted 25 May, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: ACL 2024. The reproduced data for this paper is available at https://github.com/Edward-Sun/PIT

  5. arXiv:2311.16067  [pdf, other]

    math.GT

    Mosaic number and Tile number of Corner Connection Tiles

    Authors: Vincent Lin

    Abstract: Lomonaco and Kauffman introduced knot mosaics in 2008 to model physical quantum states. These mosaics use a set of tiles to represent knots on $n \times n$ grids. In 2023 Heap introduced a new set of tiles that can represent knots on a smaller board for small knots. Completing an exhaustive search of all knots or links, $K$, on different board sizes and types is the most common way to determine invaria…

    Submitted 27 November, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: 24 pages, 14 figures

    MSC Class: 57K10

  6. arXiv:2310.20697  [pdf, other]

    cs.CL stat.ME

    Text-Transport: Toward Learning Causal Effects of Natural Language

    Authors: Victoria Lin, Louis-Philippe Morency, Eli Ben-Michael

    Abstract: As language technologies gain prominence in real-world settings, it is important to understand how changes to language affect reader perceptions. This can be formalized as the causal effect of varying a linguistic attribute (e.g., sentiment) on a reader's response to the text. In this paper, we introduce Text-Transport, a method for estimation of causal effects from natural language under any text…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023

  7. arXiv:2310.10638  [pdf, other]

    cs.CL cs.AI cs.LG

    In-context Pretraining: Language Modeling Beyond Document Boundaries

    Authors: Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Gergely Szilvasy, Rich James, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis

    Abstract: Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion. Existing pretraining pipelines train LMs by concatenating random sets of short documents to create input contexts but the prior documents provide no signal for predicting the next d…

    Submitted 24 June, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  8. arXiv:2310.01352  [pdf, other]

    cs.CL cs.AI

    RA-DIT: Retrieval-Augmented Dual Instruction Tuning

    Authors: Xi Victoria Lin, Xilun Chen, Mingda Chen, Weijia Shi, Maria Lomeli, Rich James, Pedro Rodriguez, Jacob Kahn, Gergely Szilvasy, Mike Lewis, Luke Zettlemoyer, Scott Yih

    Abstract: Retrieval-augmented language models (RALMs) improve performance by accessing long-tail and up-to-date knowledge from external data stores, but are challenging to build. Existing approaches require either expensive retrieval-specific modifications to LM pre-training or use post-hoc integration of the data store that leads to suboptimal performance. We introduce Retrieval-Augmented Dual Instruction…

    Submitted 6 May, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: v4: ICLR 2024 camera-ready version

  9. arXiv:2308.14815  [pdf, other]

    cs.AI cs.LG cs.RO

    Distributionally Robust Statistical Verification with Imprecise Neural Networks

    Authors: Souradeep Dutta, Michele Caprio, Vivian Lin, Matthew Cleaveland, Kuk Jin Jang, Ivan Ruchkin, Oleg Sokolsky, Insup Lee

    Abstract: A particularly challenging problem in AI safety is providing guarantees on the behavior of high-dimensional autonomous systems. Verification approaches centered around reachability analysis fail to scale, and purely statistical approaches are constrained by the distributional assumptions about the sampling process. Instead, we pose a distributionally robust version of the statistical verification…

    Submitted 11 December, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

  10. arXiv:2306.01061  [pdf, other]

    cs.CL cs.DB

    Reimagining Retrieval Augmented Language Models for Answering Queries

    Authors: Wang-Chiew Tan, Yuliang Li, Pedro Rodriguez, Richard James, Xi Victoria Lin, Alon Halevy, Scott Yih

    Abstract: We present a reality check on large language models and inspect the promise of retrieval augmented language models in comparison. Such language models are semi-parametric, where models integrate model parameters and knowledge from external data sources to make their predictions, as opposed to the parametric nature of vanilla large language models. We give initial experimental findings that semi-pa…

    Submitted 1 June, 2023; originally announced June 2023.

  11. arXiv:2305.14728  [pdf, other]

    cs.CL

    SenteCon: Leveraging Lexicons to Learn Human-Interpretable Language Representations

    Authors: Victoria Lin, Louis-Philippe Morency

    Abstract: Although deep language representations have become the dominant form of language featurization in recent years, in many settings it is important to understand a model's decision-making process. This necessitates not only an interpretable model but also interpretable features. In particular, language must be featurized in a way that is interpretable while still characterizing the original text well…

    Submitted 1 June, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL 2023

  12. arXiv:2305.14083  [pdf, other]

    cs.LG cs.CL

    Counterfactual Augmentation for Multimodal Learning Under Presentation Bias

    Authors: Victoria Lin, Louis-Philippe Morency, Dimitrios Dimitriadis, Srinagesh Sharma

    Abstract: In real-world machine learning systems, labels are often derived from user behaviors that the system wishes to encourage. Over time, new models must be trained as new training examples and features become available. However, feedback loops between users and models can bias future user behavior, inducing a presentation bias in the labels that compromises the ability to train new models. In this pap…

    Submitted 30 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of EMNLP 2023

  13. arXiv:2305.13999  [pdf, other]

    cs.CL cs.LG

    Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model

    Authors: Zeyu Leo Liu, Tim Dettmers, Xi Victoria Lin, Veselin Stoyanov, Xian Li

    Abstract: Large and sparse feed-forward layers (S-FFN) such as Mixture-of-Experts (MoE) have proven effective in scaling up Transformers model size for \textit{pretraining} large language models. By only activating part of the FFN parameters conditioning on input, S-FFN improves generalization performance while keeping training and inference costs (in FLOPs) fixed. In this work, we analyzed two major design…

    Submitted 23 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023

  14. arXiv:2304.12470  [pdf, other]

    cs.CV

    Vision-based Estimation of Fatigue and Engagement in Cognitive Training Sessions

    Authors: Yanchen Wang, Adam Turnbull, Yunlong Xu, Kathi Heffner, Feng Vankee Lin, Ehsan Adeli

    Abstract: Computerized cognitive training (CCT) is a scalable, well-tolerated intervention that has promise for slowing cognitive decline. Outcomes from CCT are limited by a lack of effective engagement, which is decreased by factors such as mental fatigue, particularly in older adults at risk for dementia. There is a need for scalable, automated measures that can monitor mental fatigue during CCT. Here, we…

    Submitted 15 November, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: 23 pages, 6 figures

  15. arXiv:2302.10341  [pdf, other]

    cs.LG cs.CV

    DC4L: Distribution Shift Recovery via Data-Driven Control for Deep Learning Models

    Authors: Vivian Lin, Kuk Jin Jang, Souradeep Dutta, Michele Caprio, Oleg Sokolsky, Insup Lee

    Abstract: Deep neural networks have repeatedly been shown to be non-robust to the uncertainties of the real world, even to naturally occurring ones. A vast majority of current approaches have focused on data-augmentation methods to expand the range of perturbations that the classifier is exposed to while training. A relatively unexplored avenue that is equally promising involves sanitizing an image as a pre…

    Submitted 15 May, 2024; v1 submitted 20 February, 2023; originally announced February 2023.

  16. arXiv:2302.09656  [pdf, other]

    cs.LG stat.ML

    Credal Bayesian Deep Learning

    Authors: Michele Caprio, Souradeep Dutta, Kuk Jin Jang, Vivian Lin, Radoslav Ivanov, Oleg Sokolsky, Insup Lee

    Abstract: Uncertainty quantification and robustness to distribution shifts are important goals in machine learning and artificial intelligence. Although Bayesian Neural Networks (BNNs) allow for uncertainty in the predictions to be assessed, different sources of uncertainty are indistinguishable. We present Credal Bayesian Deep Learning (CBDL). Heuristically, CBDL allows to train an (uncountably) infinite e…

    Submitted 22 February, 2024; v1 submitted 19 February, 2023; originally announced February 2023.

    MSC Class: Primary: 68T37; Secondary: 68T05; 68W25

  17. arXiv:2302.08468  [pdf, other]

    cs.LG cs.CL cs.PL cs.SE

    LEVER: Learning to Verify Language-to-Code Generation with Execution

    Authors: Ansong Ni, Srini Iyer, Dragomir Radev, Ves Stoyanov, Wen-tau Yih, Sida I. Wang, Xi Victoria Lin

    Abstract: The advent of large language models trained on code (code LLMs) has led to significant progress in language-to-code generation. State-of-the-art approaches in this area combine LLM decoding with sample pruning and reranking using test cases or heuristics based on the execution results. However, it is challenging to obtain test cases for many real-world language-to-code applications, and heuristics…

    Submitted 1 September, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: ICML'23; code available at https://github.com/niansong1996/lever

  18. arXiv:2212.12017  [pdf, other]

    cs.CL

    OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

    Authors: Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov

    Abstract: Recent work has shown that fine-tuning large pre-trained language models on a collection of tasks described via instructions, a.k.a. instruction-tuning, improves their zero and few-shot generalization to unseen tasks. However, there is a limited understanding of the performance trade-offs of different decisions made during the instruction-tuning process. These decisions include the scale and diver…

    Submitted 30 January, 2023; v1 submitted 22 December, 2022; originally announced December 2022.

    Comments: 56 pages. v2->v3: fix OPT-30B evaluation results across benchmarks (previously we reported lower performance of this model due to an evaluation pipeline bug)

  19. arXiv:2212.09803  [pdf, other]

    cs.CL cs.AI cs.LG

    Training Trajectories of Language Models Across Scales

    Authors: Mengzhou Xia, Mikel Artetxe, Chunting Zhou, Xi Victoria Lin, Ramakanth Pasunuru, Danqi Chen, Luke Zettlemoyer, Ves Stoyanov

    Abstract: Scaling up language models has led to unprecedented performance gains, but little is understood about how the training dynamics change as models get larger. How do language models of different sizes learn during pre-training? Why do larger language models demonstrate more desirable behaviors? In this paper, we analyze the intermediate training checkpoints of differently sized OPT models (Zhang et…

    Submitted 29 May, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: Accepted to ACL 2023; The code and analysis results are available at https://github.com/xiamengzhou/training_trajectory_analysis

  20. arXiv:2211.13196  [pdf, other]

    cs.LG cs.CL

    SeedBERT: Recovering Annotator Rating Distributions from an Aggregated Label

    Authors: Aneesha Sampath, Victoria Lin, Louis-Philippe Morency

    Abstract: Many machine learning tasks -- particularly those in affective computing -- are inherently subjective. When asked to classify facial expressions or to rate an individual's attractiveness, humans may disagree with one another, and no single answer may be objectively correct. However, machine learning datasets commonly have just one "ground truth" label for each sample, so models trained on these la…

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: To be published in AAAI-23 Workshop on Uncertainty Reasoning and Quantification in Decision Making

  21. arXiv:2209.00840  [pdf, other]

    cs.CL

    FOLIO: Natural Language Reasoning with First-Order Logic

    Authors: Simeng Han, Hailey Schoelkopf, Yilun Zhao, Zhenting Qi, Martin Riddell, Wenfei Zhou, James Coady, David Peng, Yujie Qiao, Luke Benson, Lucy Sun, Alex Wardle-Solano, Hannah Szabo, Ekaterina Zubova, Matthew Burtell, Jonathan Fan, Yixin Liu, Brian Wong, Malcolm Sailor, Ansong Ni, Linyong Nan, Jungo Kasai, Tao Yu, Rui Zhang, Alexander R. Fabbri, et al. (10 additional authors not shown)

    Abstract: Large language models (LLMs) have achieved remarkable performance on a variety of natural language understanding tasks. However, existing benchmarks are inadequate in measuring the complex logical reasoning capabilities of a model. We present FOLIO, a human-annotated, logically complex and diverse dataset for reasoning in natural language (NL), equipped with first-order logic (FOL) annotations. FO…

    Submitted 17 May, 2024; v1 submitted 2 September, 2022; originally announced September 2022.

  22. arXiv:2205.06266  [pdf, other]

    cs.CL

    Lifting the Curse of Multilinguality by Pre-training Modular Transformers

    Authors: Jonas Pfeiffer, Naman Goyal, Xi Victoria Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe

    Abstract: Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages. We address this issue by introducing language-specific modules, which allows us to grow the total capacity of the model, while keeping the total number of trainable parameters per language constant. In contrast with prior work that learn…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: NAACL 2022

  23. arXiv:2205.02014  [pdf, other]

    cs.CL cs.AI cs.LG

    On Continual Model Refinement in Out-of-Distribution Data Streams

    Authors: Bill Yuchen Lin, Sida Wang, Xi Victoria Lin, Robin Jia, Lin Xiao, Xiang Ren, Wen-tau Yih

    Abstract: Real-world natural language processing (NLP) models need to be continually updated to fix the prediction errors in out-of-distribution (OOD) data streams while overcoming catastrophic forgetting. However, existing continual learning (CL) problem setups cannot cover such a realistic and complex scenario. In response to this, we propose a new CL problem formulation dubbed continual model refinement…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: Accepted to ACL 2022; Project website: https://cmr-nlp.github.io/

  24. arXiv:2205.01068  [pdf, other]

    cs.CL cs.LG

    OPT: Open Pre-trained Transformer Language Models

    Authors: Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer

    Abstract: Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning. Given their computational cost, these models are difficult to replicate without significant capital. For the few that are available through APIs, no access is granted to the full model weights, making them difficult to study. We present Open…

    Submitted 21 June, 2022; v1 submitted 2 May, 2022; originally announced May 2022.

  25. arXiv:2112.13980  [pdf, other]

    cs.HC

    Pretty Princess vs. Successful Leader: Gender Roles in Greeting Card Messages

    Authors: Jiao Sun, Tongshuang Wu, Yue Jiang, Ronil Awalegaonkar, Xi Victoria Lin, Diyi Yang

    Abstract: People write personalized greeting cards on various occasions. While prior work has studied gender roles in greeting card messages, systematic analysis at scale and tools for raising the awareness of gender stereotyping remain under-investigated. To this end, we collect a large greeting card message corpus covering three different occasions (birthday, Valentine's Day and wedding) from three source…

    Submitted 27 December, 2021; originally announced December 2021.

    Comments: CHI 2022

  26. arXiv:2112.10684  [pdf, other]

    cs.CL cs.AI cs.LG

    Efficient Large Scale Language Modeling with Mixtures of Experts

    Authors: Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov

    Abstract: Mixture of Experts layers (MoEs) enable efficient scaling of language models through conditional computation. This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning. With the exception of fine-tuning, we…

    Submitted 26 October, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

    Comments: EMNLP 2022

  27. arXiv:2112.10668  [pdf, other]

    cs.CL cs.AI

    Few-shot Learning with Multilingual Language Models

    Authors: Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona Diab, Veselin Stoyanov, Xian Li

    Abstract: Large-scale generative language models such as GPT-3 are competitive few-shot learners. While these models are known to be able to jointly represent many different languages, their training data is dominated by English, potentially limiting their cross-lingual generalization. In this work, we train multilingual generative language models on a corpus covering a diverse set of languages, and study t…

    Submitted 10 November, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

    Comments: Accepted to EMNLP 2022; 34 pages

  28. arXiv:2111.09856  [pdf, other]

    math.DS

    Long and Short Periodic Billiard Trajectories in the Regular Pentagon

    Authors: Samuel Everett, Vanessa Lin, Aidan Mager

    Abstract: In any periodic direction on the regular pentagon billiard table, there exists two combinatorially different billiard paths, with one longer than the other. For each periodic direction, McMullen asked if one could determine whether the periodic trajectory through a given point is long, short, or a saddle connection. In this paper we present an algorithm resolving this question for trajectories ema…

    Submitted 18 November, 2021; originally announced November 2021.

    Comments: 11 pages, 12 figures

    MSC Class: 37C83

  29. Salt-based autopeering for DLT-networks

    Authors: Sebastian Müller, Angelo Capossele, Bartosz Kuśmierz, Vivian Lin, Hans Moog, Andreas Penzkofer, Olivia Saa, William Sanders, Wolfgang Welz

    Abstract: The security of any Distributed Ledger Technology (DLT) depends on the safety of the network layer. Much effort has been put into understanding the consensus layer of DLTs. However, many network layer designs seem ad-hoc and lack a careful analysis of the influence of the design decisions on the whole DLT system. We propose a salt-based automated neighbor selection protocol that shows the inherent…

    Submitted 3 November, 2021; originally announced November 2021.

    Comments: 4 pages

    Journal ref: 2021 3rd Conference on Blockchain Research Applications for Innovative Networks and Services (BRAINS)

  30. arXiv:2110.11473  [pdf, ps, other]

    cond-mat.dis-nn cond-mat.stat-mech cond-mat.str-el quant-ph

    Screening the Coulomb interaction leads to a prethermal regime in two-dimensional bad conductors

    Authors: L. J. Stanley, Ping V. Lin, J. Jaroszyński, Dragana Popović

    Abstract: The absence of thermalization in certain isolated many-body systems is of great fundamental interest. Many-body localization (MBL) is a widely studied mechanism for thermalization to fail in strongly disordered quantum systems, but it is still not understood precisely how the range of interactions affects the dynamical behavior and the existence of MBL, especially in dimensions $D>1$. By investiga…

    Submitted 3 November, 2023; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: 27 pages, incl. 4 figs + Supplementary (9 figs, 3 Suppl. Notes, Suppl. References)

    Journal ref: Nature Communications 14, 7004 (2023)

  31. arXiv:2107.01713  [pdf, other]

    cs.SI math.DS nlin.AO physics.soc-ph q-bio.PE

    A Multilayer Network Model of the Coevolution of the Spread of a Disease and Competing Opinions

    Authors: Kaiyan Peng, Zheng Lu, Vanessa Lin, Michael R. Lindstrom, Christian Parkinson, Chuntian Wang, Andrea L. Bertozzi, Mason A. Porter

    Abstract: During the COVID-19 pandemic, conflicting opinions on physical distancing swept across social media, affecting both human behavior and the spread of COVID-19. Inspired by such phenomena, we construct a two-layer multiplex network for the coupled spread of a disease and conflicting opinions. We model each process as a contagion. On one layer, we consider the concurrent evolution of two opinions --…

    Submitted 4 July, 2021; originally announced July 2021.

    MSC Class: 91D30; 92D30; 37N25

  32. arXiv:2105.08021  [pdf, other]

    cs.CL cs.AI

    Stage-wise Fine-tuning for Graph-to-Text Generation

    Authors: Qingyun Wang, Semih Yavuz, Victoria Lin, Heng Ji, Nazneen Rajani

    Abstract: Graph-to-text generation has benefited from pre-trained language models (PLMs) in achieving better performance than structured graph encoders. However, they fail to fully utilize the structure information of the input graph. In this paper, we aim to further improve the performance of the pre-trained language model by proposing a structured graph-to-text model with a two-step fine-tuning mechanism…

    Submitted 30 May, 2021; v1 submitted 17 May, 2021; originally announced May 2021.

    Comments: 10 pages, Accepted by Proceedings of ACL-IJCNLP 2021 Student Research Workshop, Code and Resources at https://github.com/EagleW/Stage-wise-Fine-tuning

  33. arXiv:2104.05827  [pdf, other]

    cs.CL

    Learning to Synthesize Data for Semantic Parsing

    Authors: Bailin Wang, Wenpeng Yin, Xi Victoria Lin, Caiming Xiong

    Abstract: Synthesizing data for semantic parsing has gained increasing attention recently. However, most methods require handcrafted (high-precision) rules in their generative process, hindering the exploration of diverse unseen data. In this work, we propose a generative model which features a (non-neural) PCFG that models the composition of programs (e.g., SQL), and a BART-based translation model that map…

    Submitted 27 April, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: NAACL 2021 short paper, fixed citation issue

  34. arXiv:2104.00369  [pdf, other]

    cs.CL

    FeTaQA: Free-form Table Question Answering

    Authors: Linyong Nan, Chiachun Hsieh, Ziming Mao, Xi Victoria Lin, Neha Verma, Rui Zhang, Wojciech Kryściński, Nick Schoelkopf, Riley Kong, Xiangru Tang, Murori Mutuma, Ben Rosand, Isabel Trindade, Renusree Bandaru, Jacob Cunningham, Caiming Xiong, Dragomir Radev

    Abstract: Existing table question answering datasets contain abundant factual questions that primarily evaluate the query and schema comprehension capability of a system, but they fail to include questions that require complex reasoning and integration of information due to the constraint of the associated short-form answers. To address these issues and to demonstrate the full challenge of table question an…

    Submitted 1 April, 2021; originally announced April 2021.

  35. arXiv:2103.02523  [pdf, other]

    cs.CL

    NeurIPS 2020 NLC2CMD Competition: Translating Natural Language to Bash Commands

    Authors: Mayank Agarwal, Tathagata Chakraborti, Quchen Fu, David Gros, Xi Victoria Lin, Jaron Maene, Kartik Talamadupula, Zhongwei Teng, Jules White

    Abstract: The NLC2CMD Competition hosted at NeurIPS 2020 aimed to bring the power of natural language processing to the command line. Participants were tasked with building models that can transform descriptions of command line tasks in English to their Bash syntax. This is a report on the competition with details of the task, metrics, data, attempted solutions, and lessons learned.

    Submitted 8 August, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: Appears in PMLR Volume 133: NeurIPS 2020 Competition and Demonstration Track. Competition URL: http://ibm.biz/nlc2cmd

  36. arXiv:2012.12627  [pdf, other]

    cs.CL cs.AI cs.DB cs.LG

    Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing

    Authors: Xi Victoria Lin, Richard Socher, Caiming Xiong

    Abstract: We present BRIDGE, a powerful sequential architecture for modeling dependencies between natural language questions and relational databases in cross-DB semantic parsing. BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question. The hybrid sequence is encoded by BERT with minimal subsequent layers and the…

    Submitted 30 December, 2020; v1 submitted 23 December, 2020; originally announced December 2020.

    Comments: EMNLP Findings 2020 long paper extended; 23 pages

  37. arXiv:2010.09927  [pdf, other]

    cs.CL cs.AI cs.DB cs.IR

    ColloQL: Robust Cross-Domain Text-to-SQL Over Search Queries

    Authors: Karthik Radhakrishnan, Arvind Srikantan, Xi Victoria Lin

    Abstract: Translating natural language utterances to executable queries is a helpful technique in making the vast amount of data stored in relational databases accessible to a wider range of non-tech-savvy end users. Prior work in this area has largely focused on textual input that is linguistically correct and semantically unambiguous. However, real-world user queries are often succinct, colloquial, and no…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: IntEx-SemPar Workshop at EMNLP 2020, 12 pages, 3 figures

  38. arXiv:2009.13845  [pdf, other]

    cs.CL cs.AI

    GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing

    Authors: Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, Richard Socher, Caiming Xiong

    Abstract: We present GraPPa, an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data. We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar (SCFG) induced from existing text-to-SQL datasets. We pre-train our model on the synthetic data using a novel te…

    Submitted 28 May, 2021; v1 submitted 29 September, 2020; originally announced September 2020.

    Comments: 16 pages; Accepted to ICLR 2021

  39. arXiv:2009.00001  [pdf, other]

    cs.HC stat.AP

    Toward Multimodal Modeling of Emotional Expressiveness

    Authors: Victoria Lin, Jeffrey M. Girard, Michael A. Sayette, Louis-Philippe Morency

    Abstract: Emotional expressiveness captures the extent to which a person tends to outwardly display their emotions through behavior. Due to the close relationship between emotional expressiveness and behavioral health, as well as the crucial role that it plays in social interaction, the ability to automatically predict emotional expressiveness stands to spur advances in science, medicine, and industry. In t…

    Submitted 31 August, 2020; originally announced September 2020.

    Comments: V. Lin and J.M. Girard contributed equally to this research. This paper was accepted to ICMI 2020

  40. arXiv:2007.15280  [pdf, other]

    cs.CL cs.AI cs.DB

    Photon: A Robust Cross-Domain Text-to-SQL System

    Authors: Jichuan Zeng, Xi Victoria Lin, Caiming Xiong, Richard Socher, Michael R. Lyu, Irwin King, Steven C. H. Hoi

    Abstract: Natural language interfaces to databases (NLIDB) democratize end user access to relational data. Due to fundamental differences between natural language communication and programming, it is common for end users to issue questions that are ambiguous to the system or fall outside the semantic scope of its underlying query language. We present Photon, a robust, modular, cross-domain NLIDB that can fl…

    Submitted 3 August, 2020; v1 submitted 30 July, 2020; originally announced July 2020.

    Comments: ACL 2020 system demonstration paper extended. The first two authors contributed equally to this work

  41. arXiv:2007.02871  [pdf, other]

    cs.CL

    DART: Open-Domain Structured Data Record to Text Generation

    Authors: Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, Yangxiaokang Liu, Nadia Irwanto, Jessica Pan, Faiaz Rahman, Ahmad Zaidi, Mutethia Mutuma, Yasin Tarabar, Ankit Gupta, Tao Yu, Yi Chern Tan, Xi Victoria Lin, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani

    Abstract: We present DART, an open domain structured DAta Record to Text generation dataset with over 82k instances (DARTs). Data-to-Text annotations can be a costly process, especially when dealing with tables which are the major source of structured data and contain nontrivial structures. To this end, we propose a procedure of extracting semantic triples from tables that encodes their structures by exploi…

    Submitted 12 April, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: NAACL 2021

  42. arXiv:2005.00965  [pdf, other]

    cs.CL cs.LG

    Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation

    Authors: Tianlu Wang, Xi Victoria Lin, Nazneen Fatema Rajani, Bryan McCann, Vicente Ordonez, Caiming Xiong

    Abstract: Word embeddings derived from human-generated corpora inherit strong gender bias which can be further amplified by downstream models. Some commonly adopted debiasing approaches, including the seminal Hard Debias algorithm, apply post-processing procedures that project pre-trained word embeddings into a subspace orthogonal to an inferred gender subspace. We discover that semantic-agnostic corpus reg…

    Submitted 2 May, 2020; originally announced May 2020.

    Comments: Accepted to ACL 2020

  43. arXiv:2003.02236  [pdf, other]

    physics.comp-ph cs.LG math.DS

    Forecasting Sequential Data using Consistent Koopman Autoencoders

    Authors: Omri Azencot, N. Benjamin Erichson, Vanessa Lin, Michael W. Mahoney

    Abstract: Recurrent neural networks are widely used on time series data, yet such models often ignore the underlying physical structures in such sequences. A new class of physics-based methods related to Koopman theory has been introduced, offering an alternative for processing nonlinear dynamical systems. In this work, we propose a novel Consistent Koopman Autoencoder model which, unlike the majority of ex…

    Submitted 30 June, 2020; v1 submitted 4 March, 2020; originally announced March 2020.

  44. arXiv:1912.04523  [pdf, other]

    cs.CV stat.AP

    Context-Dependent Models for Predicting and Characterizing Facial Expressiveness

    Authors: Victoria Lin, Jeffrey M. Girard, Louis-Philippe Morency

    Abstract: In recent years, extensive research has emerged in affective computing on topics like automatic emotion recognition and determining the signals that characterize individual emotions. Much less studied, however, is expressiveness, or the extent to which someone shows any feeling or emotion. Expressiveness is related to personality and mental health and plays a crucial role in social interaction. As…

    Submitted 10 December, 2019; originally announced December 2019.

  45. arXiv:1909.05378  [pdf, other]

    cs.CL cs.AI

    CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases

    Authors: Tao Yu, Rui Zhang, He Yang Er, Suyi Li, Eric Xue, Bo Pang, Xi Victoria Lin, Yi Chern Tan, Tianze Shi, Zihan Li, Youxuan Jiang, Michihiro Yasunaga, Sungrok Shim, Tao Chen, Alexander Fabbri, Zifan Li, Luyao Chen, Yuwen Zhang, Shreya Dixit, Vincent Zhang, Caiming Xiong, Richard Socher, Walter S Lasecki, Dragomir Radev

    Abstract: We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems. It consists of 30k+ turns plus 10k+ annotated SQL queries, obtained from a Wizard-of-Oz (WOZ) collection of 3k dialogues querying 200 complex DBs spanning 138 domains. Each dialogue simulates a real-world DB query scenario with a crowd worker as a user exploring the DB and a SQL expert re…

    Submitted 11 September, 2019; originally announced September 2019.

    Comments: Accepted to EMNLP 2019, long paper

  46. arXiv:1909.00786  [pdf, other]

    cs.CL

    Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions

    Authors: Rui Zhang, Tao Yu, He Yang Er, Sungrok Shim, Eric Xue, Xi Victoria Lin, Tianze Shi, Caiming Xiong, Richard Socher, Dragomir Radev

    Abstract: We focus on the cross-domain context-dependent text-to-SQL generation task. Based on the observation that adjacent natural language questions are often linguistically dependent and their corresponding SQL queries tend to overlap, we utilize the interaction history by editing the previous predicted query to improve the generation quality. Our editing mechanism views SQL as sequences and reuses gene…

    Submitted 9 September, 2019; v1 submitted 2 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  47. gfoRmula: An R package for estimating effects of general time-varying treatment interventions via the parametric g-formula

    Authors: Victoria Lin, Sean McGrath, Zilu Zhang, Lucia C. Petito, Roger W. Logan, Miguel A. Hernán, Jessica G. Young

    Abstract: Researchers are often interested in using longitudinal data to estimate the causal effects of hypothetical time-varying treatment interventions on the mean or risk of a future outcome. Standard regression/conditioning methods for confounding control generally fail to recover causal effects when time-varying confounders are themselves affected by past treatment. In such settings, estimators derived…

    Submitted 29 October, 2019; v1 submitted 19 August, 2019; originally announced August 2019.

    Comments: V. Lin and S. McGrath made equal contributions. M.A. Hernan and J.G. Young made equal contributions

    Journal ref: Patterns 1 (2020) 100008

  48. arXiv:1906.02285  [pdf, other]

    cs.CL cs.AI

    SParC: Cross-Domain Semantic Parsing in Context

    Authors: Tao Yu, Rui Zhang, Michihiro Yasunaga, Yi Chern Tan, Xi Victoria Lin, Suyi Li, Heyang Er, Irene Li, Bo Pang, Tao Chen, Emily Ji, Shreya Dixit, David Proctor, Sungrok Shim, Jonathan Kraft, Vincent Zhang, Caiming Xiong, Richard Socher, Dragomir Radev

    Abstract: We present SParC, a dataset for cross-domain Semantic Parsing in Context that consists of 4,298 coherent question sequences (12k+ individual questions annotated with SQL queries). It is obtained from controlled user interactions with 200 complex databases over 138 domains. We provide an in-depth analysis of SParC and show that it introduces new challenges compared to existing datasets. SParC demonstr…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: Accepted to ACL 2019, long paper

  49. arXiv:1808.10568  [pdf, other]

    cs.AI cs.CL cs.LG

    Multi-Hop Knowledge Graph Reasoning with Reward Shaping

    Authors: Xi Victoria Lin, Richard Socher, Caiming Xiong

    Abstract: Multi-hop reasoning is an effective approach for query answering (QA) over incomplete knowledge graphs (KGs). The problem can be formulated in a reinforcement learning (RL) setup, where a policy-based agent sequentially extends its inference path until it reaches a target. However, in an incomplete KG environment, the agent receives low-quality rewards corrupted by false negatives in the training…

    Submitted 11 September, 2018; v1 submitted 30 August, 2018; originally announced August 2018.

    Comments: Accepted to EMNLP 2018, 12 pages

  50. arXiv:1808.03366  [pdf, ps, other]

    math.AP math.AG math.SP

    Polynomial-like elements in vector spaces with group actions

    Authors: Minh Kha, Vladimir Lin

    Abstract: In this paper, we study polynomial-like elements in vector spaces equipped with group actions. We first define these elements via iterated difference operators. In the case of a full rank lattice acting on a Euclidean space, these polynomial-like elements are exactly polynomials with periodic coefficients, which are closely related to solutions of periodic differential equations. Our main theorem…

    Submitted 9 August, 2018; originally announced August 2018.

    Comments: To appear in Contemporary Mathematics