Grading Massive Open Online Courses Using Large Language Models

Authors: Shahriar Golchin, Nikhil Garuda, Christopher Impey, Matthew Wenger

Abstract: Massive open online courses (MOOCs) offer free education globally to anyone with a computer and internet access. Despite this democratization of learning, the massive enrollment in these courses makes it impractical for one instructor to assess every student's writing assignment. As a result, peer grading, often guided by a straightforward rubric, is the method of choice. While convenient, peer grading often falls short in terms of reliability and validity. In this study, we explore the feasibility of using large language models (LLMs) to replace peer grading in MOOCs. Specifically, we use two LLMs, GPT-4 and GPT-3.5, across three MOOCs: Introductory Astronomy, Astrobiology, and the History and Philosophy of Astronomy. To instruct LLMs, we use three different prompts based on the zero-shot chain-of-thought (ZCoT) prompting technique: (1) ZCoT with instructor-provided correct answers, (2) ZCoT with both instructor-provided correct answers and rubrics, and (3) ZCoT with instructor-provided correct answers and LLM-generated rubrics. Tested on 18 settings, our results show that ZCoT, when augmented with instructor-provided correct answers and rubrics, produces grades that are more aligned with those assigned by instructors compared to peer grading. Finally, our findings indicate a promising potential for automated grading systems in MOOCs, especially in subjects with well-defined rubrics, to improve the learning experience for millions of online learners worldwide.

Submitted 16 June, 2024; originally announced June 2024.

Comments: v1. arXiv admin note: substantial text overlap with arXiv:2402.03776

arXiv:2402.03776 [pdf, ps, other]

Large Language Models As MOOCs Graders

Authors: Shahriar Golchin, Nikhil Garuda, Christopher Impey, Matthew Wenger

Abstract: Massive open online courses (MOOCs) unlock the doors to free education for anyone around the globe with access to a computer and the internet. Despite this democratization of learning, the massive enrollment in these courses means it is almost impossible for one instructor to assess every student's writing assignment. As a result, peer grading, often guided by a straightforward rubric, is the method of choice. While convenient, peer grading often falls short in terms of reliability and validity. In this study, using 18 distinct settings, we explore the feasibility of leveraging large language models (LLMs) to replace peer grading in MOOCs. Specifically, we focus on two state-of-the-art LLMs: GPT-4 and GPT-3.5, across three distinct courses: Introductory Astronomy, Astrobiology, and the History and Philosophy of Astronomy. To instruct LLMs, we use three different prompts based on a variant of the zero-shot chain-of-thought (Zero-shot-CoT) prompting technique: Zero-shot-CoT combined with instructor-provided correct answers; Zero-shot-CoT in conjunction with both instructor-formulated answers and rubrics; and Zero-shot-CoT with instructor-offered correct answers and LLM-generated rubrics. Our results show that Zero-shot-CoT, when integrated with instructor-provided answers and rubrics, produces grades that are more aligned with those assigned by instructors compared to peer grading. However, the History and Philosophy of Astronomy course proves to be more challenging in terms of grading as opposed to other courses. Finally, our study reveals a promising direction for automating grading systems for MOOCs, especially in subjects with well-defined rubrics.

Submitted 29 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

Comments: v1.3 preprint

arXiv:2311.18382 [pdf, other]

Catalogue of BRITE-Constellation targets I. Fields 1 to 14 (November 2013 - April 2016)

Authors: K. Zwintz, A. Pigulski, R. Kuschnig, G. A. Wade, G. Doherty, M. Earl, C. Lovekin, M. Muellner, S. Piché-Perrier, T. Steindl, P. G. Beck, K. Bicz, D. M. Bowman, G. Handler, B. Pablo, A. Popowicz, T. Rozanski, P. Mikołajczyk, D. Baade, O. Koudelka, A. F. J. Moffat, C. Neiner, P. Orleanski, R. Smolec, N. St. Louis, et al. (3 additional authors not shown)

Abstract: The BRIght Target Explorer (BRITE) mission collects photometric time series in two passbands aiming to investigate stellar structure and evolution. Since their launches in the years 2013 and 2014, the constellation of five BRITE nano-satellites has observed a total of more than 700 individual bright stars in 64 fields. Some targets have been observed multiple times. Thus, the total time base of the data sets acquired for those stars can be as long as nine years. Our aim is to provide a complete description of ready-to-use BRITE data, to show the scientific potential of the BRITE-Constellation data by identifying the most interesting targets, and to demonstrate and encourage how scientists can use these data in their research. We apply a decorrelation process to the automatically reduced BRITE-Constellation data to correct for instrumental effects. We perform a statistical analysis of the light curves obtained for the 300 stars observed in the first 14 fields during the first ~2.5 years of the mission. We also perform cross-identification with the International Variable Star Index. We present the data obtained by the BRITE-Constellation mission in the first 14 fields it observed from November 2013 to April 2016. We also describe the properties of the data for these fields and the 300 stars observed in them. Using these data, we detected variability in 64% of the presented sample of stars. Sixty-four stars or 21.3% of the sample have not yet been identified as variable in the literature and their data have not been analysed in detail. They can therefore provide valuable scientific material for further research. All data are made publicly available through the BRITE Public Data Archive and the Canadian Astronomy Data Centre.

Submitted 30 November, 2023; originally announced November 2023.

Comments: accepted by Astronomy & Astrophysics, 13 pages main text, 22 pages of appendix

arXiv:2306.14924 [pdf, other]

LLM-Assisted Content Analysis: Using Large Language Models to Support Deductive Coding

Authors: Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim

Abstract: Deductive coding is a widely used qualitative research method for determining the prevalence of themes across documents. While useful, deductive coding is often burdensome and time consuming since it requires researchers to read, interpret, and reliably categorize a large body of unstructured text documents. Large language models (LLMs), like ChatGPT, are a class of quickly evolving AI tools that can perform a range of natural language processing and reasoning tasks. In this study, we explore the use of LLMs to reduce the time it takes for deductive coding while retaining the flexibility of a traditional content analysis. We outline the proposed approach, called LLM-assisted content analysis (LACA), along with an in-depth case study using GPT-3.5 for LACA on a publicly available deductive coding data set. Additionally, we conduct an empirical benchmark using LACA on 4 publicly available data sets to assess the broader question of how well GPT-3.5 performs across a range of deductive coding tasks. Overall, we find that GPT-3.5 can often perform deductive coding at levels of agreement comparable to human coders. Additionally, we demonstrate that LACA can help refine prompts for deductive coding, identify codes for which an LLM is randomly guessing, and help assess when to use LLMs vs. human coders for deductive coding. We conclude with several implications for future practice of deductive coding and related research methods.

Submitted 23 June, 2023; originally announced June 2023.

arXiv:2211.09862 [pdf, other]

Knowledge distillation for fast and accurate DNA sequence correction

Authors: Anastasiya Belyaeva, Joel Shor, Daniel E. Cook, Kishwar Shafin, Daniel Liu, Armin Töpfer, Aaron M. Wenger, William J. Rowell, Howard Yang, Alexey Kolesnikov, Cory Y. McLean, Maria Nattestad, Andrew Carroll, Pi-Chuan Chang

Abstract: Accurate genome sequencing can improve our understanding of biology and the genetic basis of disease. The standard approach for generating DNA sequences from PacBio instruments relies on HMM-based models. Here, we introduce Distilled DeepConsensus - a distilled transformer-encoder model for sequence correction, which improves upon the HMM-based methods with runtime constraints in mind. Distilled DeepConsensus is 1.3x faster and 1.5x smaller than its larger counterpart while improving the yield of high quality reads (Q30) over the HMM-based method by 1.69x (vs. 1.73x for larger model). With improved accuracy of genomic sequences, Distilled DeepConsensus improves downstream applications of genomic sequence analysis such as reducing variant calling errors by 39% (34% for larger model) and improving genome assembly quality by 3.8% (4.2% for larger model). We show that the representations learned by Distilled DeepConsensus are similar between faster and slower models.

Submitted 17 November, 2022; originally announced November 2022.

Journal ref: Learning Meaningful Representations of Life, NeurIPS 2022 workshop oral paper

arXiv:2110.12383 [pdf, other]

Automated Extraction of Sentencing Decisions from Court Cases in the Hebrew Language

Authors: Mohr Wenger, Tom Kalir, Noga Berger, Carmit Chalamish, Renana Keydar, Gabriel Stanovsky

Abstract: We present the task of Automated Punishment Extraction (APE) in sentencing decisions from criminal court cases in Hebrew. Addressing APE will enable the identification of sentencing patterns and constitute an important stepping stone for many follow up legal NLP applications in Hebrew, including the prediction of sentencing decisions. We curate a dataset of sexual assault sentencing decisions and a manually-annotated evaluation dataset, and implement rule-based and supervised models. We find that while supervised models can identify the sentence containing the punishment with good accuracy, rule-based approaches outperform them on the full APE task. We conclude by presenting a first analysis of sentencing patterns in our dataset and analyze common models' errors, indicating avenues for future work, such as distinguishing between probation and actual imprisonment punishment. We will make all our resources available upon request, including data, annotation, and first benchmark models.

Submitted 24 October, 2021; originally announced October 2021.

Comments: Accepted to the Natural Legal Language Processing workshop (NLLP 2021), colocated with EMNLP 2021

arXiv:1812.06591 [pdf, other]

SMART: An Open Source Data Labeling Platform for Supervised Learning

Authors: Rob Chew, Michael Wenger, Caroline Kery, Jason Nance, Keith Richards, Emily Hadley, Peter Baumgartner

Abstract: SMART is an open source web application designed to help data scientists and research teams efficiently build labeled training data sets for supervised machine learning tasks. SMART provides users with an intuitive interface for creating labeled data sets, supports active learning to help reduce the required amount of labeled data, and incorporates inter-rater reliability statistics to provide insight into label quality. SMART is designed to be platform agnostic and easily deployable to meet the needs of as many different research teams as possible. The project website contains links to the code repository and extensive user documentation.

Submitted 11 December, 2018; originally announced December 2018.

Comments: 5 pages, 1 figure

Journal ref: The Journal of Machine Learning Research, 20(1), 2999-3003 (2019)

arXiv:1410.7453 [pdf, other]

GMWB Riders in a Binomial Framework - Pricing, Hedging, and Diversification of Mortality Risk

Authors: Cody B. Hyndman, Menachem Wenger

Abstract: We construct a binomial model for a guaranteed minimum withdrawal benefit (GMWB) rider to a variable annuity (VA) under optimal policyholder behaviour. The binomial model results in explicitly formulated perfect hedging strategies funded using only periodic fee income. We consider the separate perspectives of the insurer and policyholder and introduce a unifying relationship. Decompositions of the VA and GMWB contract into term-certain payments and options representing the guarantee and early surrender features are extended to the binomial framework. We incorporate an approximation algorithm for Asian options that significantly improves efficiency of the binomial model while retaining accuracy. Several numerical examples are provided which illustrate both the accuracy and the tractability of the binomial model. We extend the binomial model to include policy holder mortality and death benefits. Pricing, hedging, and the decompositions of the contract are extended to incorporate mortality risk. We prove limiting results for the hedging strategies and demonstrate mortality risk diversification. Numerical examples are provided which illustrate the effectiveness of hedging and the diversification of mortality risk under capacity constraints with finite pools.

Submitted 6 July, 2016; v1 submitted 27 October, 2014; originally announced October 2014.

Comments: 41 pages, 11 figures; This paper combines a previous version titled "Pricing and Hedging GMWB Riders in a Binomial Framework" (arXiv:1410.7453v1) and the working paper titled "Diversification of mortality risk in GMWB rider pricing and hedging"

MSC Class: 91G20; 91G60; 91B30; 60G40

arXiv:1307.2562 [pdf, ps, other]

doi 10.1016/j.insmatheco.2014.02.004

Valuation Perspectives and Decompositions for Variable Annuities with GMWB riders

Authors: Cody B. Hyndman, Menachem Wenger

Abstract: The guaranteed minimum withdrawal benefit (GMWB) rider, as an add on to a variable annuity (VA), guarantees the return of premiums in the form of periodic withdrawals while allowing policyholders to participate fully in any market gains. GMWB riders represent an embedded option on the account value with a fee structure that is different from typical financial derivatives. We consider fair pricing of the GMWB rider from a financial economic perspective. Particular focus is placed on the distinct perspectives of the insurer and policyholder and the unifying relationship. We extend a decomposition of the VA contract into components that reflect term-certain payments and embedded derivatives to the case where the policyholder has the option to surrender, or lapse, the contract early.

Submitted 27 December, 2013; v1 submitted 9 July, 2013; originally announced July 2013.

Comments: 18 pages, proof of Lemma A.1 expanded for clarity

MSC Class: 91G20; 60G40

arXiv:astro-ph/0002110 [pdf, ps, other]

doi 10.1051/aas:2000332

The SIMBAD astronomical database

Authors: Marc Wenger, Francois Ochsenbein, Daniel Egret, Pascal Dubois, Francois Bonnarel, Suzanne Borde, Francoise Genova, Gerard Jasniewicz, Suzanne Laloe, Soizick Lesteven, Richard Monier

Abstract: Simbad is the reference database for identification and bibliography of astronomical objects. It contains identifications, `basic data', bibliography, and selected observational measurements for several million astronomical objects. Simbad is developed and maintained by CDS, Strasbourg. Building the database contents is achieved with the help of several contributing institutes. Scanning the bibliography is the result of the collaboration of CDS with bibliographers in Observatoire de Paris (DASGAL), Institut d'Astrophysique de Paris, and Observatoire de Bordeaux. When selecting catalogues and tables for inclusion, priority is given to optimal multi-wavelength coverage of the database, and to support of research developments linked to large projects. In parallel, the systematic scanning of the bibliography reflects the diversity and general trends of astronomical research. A WWW interface to Simbad is available at: http://simbad.u-strasbg.fr/Simbad

Submitted 4 February, 2000; originally announced February 2000.

Comments: 14 pages, 5 Postscript figures; to be published in A&AS

arXiv:astro-ph/0002109 [pdf, ps, other]

The ALADIN Interactive Sky Atlas

Authors: Francois Bonnarel, Pierre Fernique, Olivier Bienayme, Daniel Egret, Francoise Genova, Mireille Louys, Francois Ochsenbein, Marc Wenger, James G. Bartlett

Abstract: The Aladin interactive sky atlas, developed at CDS, is a service providing simultaneous access to digitized images of the sky, astronomical catalogues, and databases. The driving motivation is to facilitate direct, visual comparison of observational data at any wavelength with images of the optical sky, and with reference catalogues. The set of available sky images consists of the STScI Digitized Sky Surveys, completed with high resolution images of crowded regions scanned at the MAMA facility in Paris. A Java WWW interface to the system is available at: http://aladin.u-strasbg.fr/

Submitted 4 February, 2000; originally announced February 2000.

Comments: 8 pages, 3 Postscript figures; to be published in A&A

arXiv:astro-ph/0002095 [pdf, ps, other]

The CDS information hub

Authors: Francoise Genova, Daniel Egret, Olivier Bienayme, Francois Bonnarel, Pascal Dubois, Pierre Fernique, Gerard Jasniewicz, Soizick Lesteven, Richard Monier, Francois Ochsenbein, Marc Wenger

Abstract: The Centre de Donnees astronomiques de Strasbourg (CDS) provides homogeneous access to heterogeneous information of various origins: information about astronomical objects in Simbad; catalogs and observation logs in VizieR and in the catalogue service; reference images and overlays in Aladin; nomenclature in the Dictionary of Nomenclature; Yellow Page services; the AstroGLU resource discovery tool; mirror copies of other reference services; and documentation. With the implementation of links between the CDS services, and with other on-line reference information, CDS has become a major hub in the rapidly evolving world of information retrieval in astronomy, developing efficient tools to help astronomers to navigate in the world-wide `Virtual Observatory' under construction, from data in the observatory archives to results published in journals. The WWW interface to the CDS services is available at: http://cdsweb.u-strasbg.fr/

Submitted 4 February, 2000; originally announced February 2000.

Comments: 7 pages, 2 Postscript figures; to be published in A&AS

Showing 1–12 of 12 results for author: Wenger, M