Showing 1–8 of 8 results for author: Adams, V

  1. arXiv:2311.09528  [pdf, other]

    cs.CL cs.AI cs.LG

    HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM

    Authors: Zhilin Wang, Yi Dong, Jiaqi Zeng, Virginia Adams, Makesh Narsimhan Sreedhar, Daniel Egert, Olivier Delalleau, Jane Polak Scowcroft, Neel Kant, Aidan Swope, Oleksii Kuchaiev

    Abstract: Existing open-source helpfulness preference datasets do not specify what makes some responses more helpful and others less so. Models trained on these datasets can incidentally learn to model dataset artifacts (e.g. preferring longer but unhelpful responses only due to their length). To alleviate this problem, we collect HelpSteer, a multi-attribute helpfulness dataset annotated for the various as…

    Submitted 15 November, 2023; originally announced November 2023.

  2. arXiv:2210.13673  [pdf, other]

    cs.CL

    Evaluating Parameter Efficient Learning for Generation

    Authors: Peng Xu, Mostofa Patwary, Shrimai Prabhumoye, Virginia Adams, Ryan J. Prenger, Wei Ping, Nayeon Lee, Mohammad Shoeybi, Bryan Catanzaro

    Abstract: Parameter efficient learning methods (PERMs) have recently gained significant attention as they provide an efficient way for pre-trained language models (PLMs) to adapt to a downstream task. However, these conclusions are mostly drawn from in-domain evaluations over the full training set. In this paper, we present comparisons between PERMs and finetuning from three new perspectives: (1) the effect…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022 main conference

  3. arXiv:2206.01137  [pdf, other]

    cs.CL cs.LG

    Finding the Right Recipe for Low Resource Domain Adaptation in Neural Machine Translation

    Authors: Virginia Adams, Sandeep Subramanian, Mike Chrzanowski, Oleksii Hrinchuk, Oleksii Kuchaiev

    Abstract: General translation models often still struggle to generate accurate translations in specialized domains. To guide machine translation practitioners and characterize the effectiveness of domain adaptation methods under different data availability scenarios, we conduct an in-depth empirical exploration of monolingual and parallel data approaches to domain adaptation of pre-trained, third-party, NMT…

    Submitted 2 June, 2022; originally announced June 2022.

  4. arXiv:2111.15641  [pdf, ps, other]

    cs.CL

    Automatic Extraction of Medication Names in Tweets as Named Entity Recognition

    Authors: Carol Anderson, Bo Liu, Anas Abidin, Hoo-Chang Shin, Virginia Adams

    Abstract: Social media posts contain potentially valuable information about medical conditions and health-related behavior. Biocreative VII Task 3 focuses on mining this information by recognizing mentions of medications and dietary supplements in tweets. We approach this task by fine-tuning multiple BERT-style language models to perform token-level classification, and combining them into ensembles to gener…

    Submitted 30 November, 2021; originally announced November 2021.

    Comments: Submission to the BioCreative VII challenge - Track-3

  5. arXiv:2111.15622  [pdf, other]

    cs.CL

    Chemical Identification and Indexing in PubMed Articles via BERT and Text-to-Text Approaches

    Authors: Virginia Adams, Hoo-Chang Shin, Carol Anderson, Bo Liu, Anas Abidin

    Abstract: The Biocreative VII Track-2 challenge consists of named entity recognition, entity-linking (or entity-normalization), and topic indexing tasks -- with entities and topics limited to chemicals for this challenge. Named entity recognition is a well-established problem and we achieve our best performance with BERT-based BioMegatron models. We extend our BERT-based approach to the entity linking task.…

    Submitted 30 November, 2021; originally announced November 2021.

    Comments: Submission to the BioCreative VII challenge - Track-2

  6. arXiv:2111.15617  [pdf, other]

    cs.CL

    Text Mining Drug/Chemical-Protein Interactions using an Ensemble of BERT and T5 Based Models

    Authors: Virginia Adams, Hoo-Chang Shin, Carol Anderson, Bo Liu, Anas Abidin

    Abstract: In Track-1 of the BioCreative VII Challenge participants are asked to identify interactions between drugs/chemicals and proteins. In-context named entity annotations for each drug/chemical and protein are provided and one of fourteen different interactions must be automatically predicted. For this relation extraction task, we attempt both a BERT-based sentence classification approach, and a more n…

    Submitted 30 November, 2021; originally announced November 2021.

    Comments: Submission to the BioCreative VII challenge, Track-1

  7. arXiv:2111.08634  [pdf, other]

    cs.CL cs.LG

    NVIDIA NeMo Neural Machine Translation Systems for English-German and English-Russian News and Biomedical Tasks at WMT21

    Authors: Sandeep Subramanian, Oleksii Hrinchuk, Virginia Adams, Oleksii Kuchaiev

    Abstract: This paper provides an overview of NVIDIA NeMo's neural machine translation systems for the constrained data track of the WMT21 News and Biomedical Shared Translation Tasks. Our news task submissions for English-German (En-De) and English-Russian (En-Ru) are built on top of a baseline transformer-based sequence-to-sequence model. Specifically, we use a combination of 1) checkpoint averaging 2) mod…

    Submitted 16 November, 2021; originally announced November 2021.

    Comments: WMT'21 news and biomedical shared task submission

  8. LFRic: Meeting the challenges of scalability and performance portability in Weather and Climate models

    Authors: S. V. Adams, R. W. Ford, M. Hambley, J. M. Hobson, I. Kavcic, C. M. Maynard, T. Melvin, E. H Mueller, S. Mullerworth, A. R. Porter, M. Rezny, B. J. Shipway, R. Wong

    Abstract: This paper describes LFRic: the new weather and climate modelling system being developed by the UK Met Office to replace the existing Unified Model in preparation for exascale computing in the 2020s. LFRic uses the GungHo dynamical core and runs on a semi-structured cubed-sphere mesh. The design of the supporting infrastructure follows object orientated principles to facilitate modularity and the…

    Submitted 12 July, 2019; v1 submitted 19 September, 2018; originally announced September 2018.

    Comments: 41 pages, 10 figures. EASC2018

    Journal ref: Journal of Parallel and Distributed Computing, 132 (2019), 383–396