Showing 1–2 of 2 results for author: Ngu, N

  1. arXiv:2308.11189  [pdf, other]

    cs.CL cs.AI cs.LG

    Diversity Measures: Domain-Independent Proxies for Failure in Language Model Queries

    Authors: Noel Ngu, Nathaniel Lee, Paulo Shakarian

    Abstract: Error prediction in large language models often relies on domain-specific information. In this paper, we present measures for quantification of error in the response of a large language model based on the diversity of responses to a given prompt - hence independent of the underlying application. We describe how three such measures - based on entropy, Gini impurity, and centroid distance - can be e…

    Submitted 22 August, 2023; originally announced August 2023.

    Report number: Accepted to IEEE ICSC '24

  2. arXiv:2302.13814  [pdf, other]

    cs.CL cs.AI cs.LG

    An Independent Evaluation of ChatGPT on Mathematical Word Problems (MWP)

    Authors: Paulo Shakarian, Abhinav Koyyalamudi, Noel Ngu, Lakshmivihari Mareedu

    Abstract: We study the performance of a commercially available large language model (LLM) known as ChatGPT on math word problems (MWPs) from the dataset DRAW-1K. To our knowledge, this is the first independent evaluation of ChatGPT. We found that ChatGPT's performance changes dramatically based on the requirement to show its work, failing 20% of the time when it provides work compared with 84% when it does…

    Submitted 27 February, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Journal ref: AAAI Spring Symposium 2023 (MAKE)