Showing 1–7 of 7 results for author: Malvar, S

  1. arXiv:2404.00213  [pdf, other]

    cs.CL

    Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning

    Authors: Nick Mecklenburg, Yiyou Lin, Xiaoxiao Li, Daniel Holstein, Leonardo Nunes, Sara Malvar, Bruno Silva, Ranveer Chandra, Vijay Aski, Pavan Kumar Reddy Yannam, Tolga Aktas, Todd Hendry

    Abstract: In recent years, Large Language Models (LLMs) have shown remarkable performance in generating human-like text, proving to be a valuable asset across various applications. However, adapting these models to incorporate new, out-of-domain knowledge remains a challenge, particularly for facts and events that occur after the model's knowledge cutoff date. This paper investigates the effectiveness of Su…

    Submitted 2 April, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

    Comments: 16 pages; 7 figures. updated authors list

  2. arXiv:2401.08406  [pdf, other]

    cs.CL cs.LG

    RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture

    Authors: Angels Balaguer, Vinamra Benara, Renato Luiz de Freitas Cunha, Roberto de M. Estevão Filho, Todd Hendry, Daniel Holstein, Jennifer Marsman, Nick Mecklenburg, Sara Malvar, Leonardo O. Nunes, Rafael Padilha, Morris Sharp, Bruno Silva, Swati Sharma, Vijay Aski, Ranveer Chandra

    Abstract: There are two common ways in which developers are incorporating proprietary and domain-specific data when building applications of Large Language Models (LLMs): Retrieval-Augmented Generation (RAG) and Fine-Tuning. RAG augments the prompt with the external data, while Fine-Tuning incorporates the additional knowledge into the model itself. However, the pros and cons of both approaches are not well…

    Submitted 30 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

  3. arXiv:2211.05986  [pdf, other]

    cs.LG cs.CY

    DeepG2P: Fusing Multi-Modal Data to Improve Crop Production

    Authors: Swati Sharma, Aditi Partap, Maria Angels de Luis Balaguer, Sara Malvar, Ranveer Chandra

    Abstract: Agriculture is at the heart of the solution to achieve sustainability in feeding the world population, but advancing our understanding on how agricultural output responds to climatic variability is still needed. Precision Agriculture (PA), which is a management strategy that uses technology such as remote sensing, Geographical Information System (GIS), and machine learning for decision making in t…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: Under review in AISTATS2023

  4. arXiv:2211.05675  [pdf, other]

    cs.LG cs.CY

    Causal Modeling of Soil Processes for Improved Generalization

    Authors: Somya Sharma, Swati Sharma, Andy Neal, Sara Malvar, Eduardo Rodrigues, John Crawford, Emre Kiciman, Ranveer Chandra

    Abstract: Measuring and monitoring soil organic carbon is critical for agricultural productivity and for addressing critical environmental problems. Soil organic carbon not only enriches nutrition in soil, but also has a gamut of co-benefits such as improving water storage and limiting physical erosion. Despite a litany of work in soil organic carbon estimation, current approaches do not generalize well acr…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022 Workshop Tackling Climate Change with Machine Learning

  5. arXiv:2211.00625  [pdf, other]

    q-bio.QM cs.LG

    Machine learning can guide experimental approaches for protein digestibility estimations

    Authors: Sara Malvar, Anvita Bhagavathula, Maria Angels de Luis Balaguer, Swati Sharma, Ranveer Chandra

    Abstract: Food protein digestibility and bioavailability are critical aspects in addressing human nutritional demands, particularly when seeking sustainable alternatives to animal-based proteins. In this study, we propose a machine learning approach to predict the true ileal digestibility coefficient of food items. The model makes use of a unique curated dataset that combines nutritional information from di…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: 50 pages, submitted to Nature Food

  6. arXiv:2201.00715  [pdf, other]

    cs.LG cs.CY

    Machine learning approaches for localized lockdown during COVID-19: a case study analysis

    Authors: Sara Malvar, Julio Romano Meneghini

    Abstract: At the end of 2019, the latest novel coronavirus SARS-CoV-2 emerged as a significant acute respiratory disease that has become a global pandemic. Countries like Brazil have had difficulty in dealing with the virus due to the high socioeconomic difference of states and municipalities. Therefore, this study presents a new approach using different machine learning and deep learning algorithms applied…

    Submitted 3 January, 2022; originally announced January 2022.

    Comments: The code is available at https://github.com/smalvar/COVID-clustering

  7. arXiv:2001.07886  [pdf, other]

    cs.MM cs.CR eess.SY

    AMP: Authentication of Media via Provenance

    Authors: Paul England, Henrique S. Malvar, Eric Horvitz, Jack W. Stokes, Cédric Fournet, Rebecca Burke-Aguero, Amaury Chamayou, Sylvan Clebsch, Manuel Costa, John Deutscher, Shabnam Erfani, Matt Gaylor, Andrew Jenks, Kevin Kane, Elissa Redmiles, Alex Shamis, Isha Sharma, Sam Wenker, Anika Zaman

    Abstract: Advances in graphics and machine learning have led to the general availability of easy-to-use tools for modifying and synthesizing media. The proliferation of these tools threatens to cast doubt on the veracity of all media. One approach to thwarting the flow of fake media is to detect modified or synthesized media through machine learning methods. While detection may help in the short term, we be…

    Submitted 20 June, 2020; v1 submitted 22 January, 2020; originally announced January 2020.

    Comments: Add detailed manifest description, Add provenance, Improve text