-
Grading Massive Open Online Courses Using Large Language Models
Authors:
Shahriar Golchin,
Nikhil Garuda,
Christopher Impey,
Matthew Wenger
Abstract:
Massive open online courses (MOOCs) offer free education globally to anyone with a computer and internet access. Despite this democratization of learning, the massive enrollment in these courses makes it impractical for one instructor to assess every student's writing assignment. As a result, peer grading, often guided by a straightforward rubric, is the method of choice. While convenient, peer grading often falls short in terms of reliability and validity. In this study, we explore the feasibility of using large language models (LLMs) to replace peer grading in MOOCs. Specifically, we use two LLMs, GPT-4 and GPT-3.5, across three MOOCs: Introductory Astronomy, Astrobiology, and the History and Philosophy of Astronomy. To instruct LLMs, we use three different prompts based on the zero-shot chain-of-thought (ZCoT) prompting technique: (1) ZCoT with instructor-provided correct answers, (2) ZCoT with both instructor-provided correct answers and rubrics, and (3) ZCoT with instructor-provided correct answers and LLM-generated rubrics. Tested on 18 settings, our results show that ZCoT, when augmented with instructor-provided correct answers and rubrics, produces grades that are more aligned with those assigned by instructors compared to peer grading. Finally, our findings indicate a promising potential for automated grading systems in MOOCs, especially in subjects with well-defined rubrics, to improve the learning experience for millions of online learners worldwide.
Submitted 16 June, 2024;
originally announced June 2024.
-
Large Language Models As MOOCs Graders
Authors:
Shahriar Golchin,
Nikhil Garuda,
Christopher Impey,
Matthew Wenger
Abstract:
Massive open online courses (MOOCs) unlock the doors to free education for anyone around the globe with access to a computer and the internet. Despite this democratization of learning, the massive enrollment in these courses means it is almost impossible for one instructor to assess every student's writing assignment. As a result, peer grading, often guided by a straightforward rubric, is the method of choice. While convenient, peer grading often falls short in terms of reliability and validity. In this study, using 18 distinct settings, we explore the feasibility of leveraging large language models (LLMs) to replace peer grading in MOOCs. Specifically, we focus on two state-of-the-art LLMs: GPT-4 and GPT-3.5, across three distinct courses: Introductory Astronomy, Astrobiology, and the History and Philosophy of Astronomy. To instruct LLMs, we use three different prompts based on a variant of the zero-shot chain-of-thought (Zero-shot-CoT) prompting technique: Zero-shot-CoT combined with instructor-provided correct answers; Zero-shot-CoT in conjunction with both instructor-formulated answers and rubrics; and Zero-shot-CoT with instructor-offered correct answers and LLM-generated rubrics. Our results show that Zero-shot-CoT, when integrated with instructor-provided answers and rubrics, produces grades that are more aligned with those assigned by instructors compared to peer grading. However, the History and Philosophy of Astronomy course proves to be more challenging in terms of grading as opposed to other courses. Finally, our study reveals a promising direction for automating grading systems for MOOCs, especially in subjects with well-defined rubrics.
Submitted 29 February, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Catalogue of BRITE-Constellation targets I. Fields 1 to 14 (November 2013 - April 2016)
Authors:
K. Zwintz,
A. Pigulski,
R. Kuschnig,
G. A. Wade,
G. Doherty,
M. Earl,
C. Lovekin,
M. Muellner,
S. Piché-Perrier,
T. Steindl,
P. G. Beck,
K. Bicz,
D. M. Bowman,
G. Handler,
B. Pablo,
A. Popowicz,
T. Rozanski,
P. Mikołajczyk,
D. Baade,
O. Koudelka,
A. F. J. Moffat,
C. Neiner,
P. Orleanski,
R. Smolec,
N. St. Louis, et al. (3 additional authors not shown)
Abstract:
The BRIght Target Explorer (BRITE) mission collects photometric time series in two passbands aiming to investigate stellar structure and evolution. Since their launches in the years 2013 and 2014, the constellation of five BRITE nano-satellites has observed a total of more than 700 individual bright stars in 64 fields. Some targets have been observed multiple times. Thus, the total time base of the data sets acquired for those stars can be as long as nine years. Our aim is to provide a complete description of ready-to-use BRITE data, to show the scientific potential of the BRITE-Constellation data by identifying the most interesting targets, and to demonstrate and encourage how scientists can use these data in their research. We apply a decorrelation process to the automatically reduced BRITE-Constellation data to correct for instrumental effects. We perform a statistical analysis of the light curves obtained for the 300 stars observed in the first 14 fields during the first ~2.5 years of the mission. We also perform cross-identification with the International Variable Star Index. We present the data obtained by the BRITE-Constellation mission in the first 14 fields it observed from November 2013 to April 2016. We also describe the properties of the data for these fields and the 300 stars observed in them. Using these data, we detected variability in 64% of the presented sample of stars. Sixty-four stars or 21.3% of the sample have not yet been identified as variable in the literature and their data have not been analysed in detail. They can therefore provide valuable scientific material for further research. All data are made publicly available through the BRITE Public Data Archive and the Canadian Astronomy Data Centre.
Submitted 30 November, 2023;
originally announced November 2023.
-
LLM-Assisted Content Analysis: Using Large Language Models to Support Deductive Coding
Authors:
Robert Chew,
John Bollenbacher,
Michael Wenger,
Jessica Speer,
Annice Kim
Abstract:
Deductive coding is a widely used qualitative research method for determining the prevalence of themes across documents. While useful, deductive coding is often burdensome and time consuming since it requires researchers to read, interpret, and reliably categorize a large body of unstructured text documents. Large language models (LLMs), like ChatGPT, are a class of quickly evolving AI tools that can perform a range of natural language processing and reasoning tasks. In this study, we explore the use of LLMs to reduce the time it takes for deductive coding while retaining the flexibility of a traditional content analysis. We outline the proposed approach, called LLM-assisted content analysis (LACA), along with an in-depth case study using GPT-3.5 for LACA on a publicly available deductive coding data set. Additionally, we conduct an empirical benchmark using LACA on 4 publicly available data sets to assess the broader question of how well GPT-3.5 performs across a range of deductive coding tasks. Overall, we find that GPT-3.5 can often perform deductive coding at levels of agreement comparable to human coders. Additionally, we demonstrate that LACA can help refine prompts for deductive coding, identify codes for which an LLM is randomly guessing, and help assess when to use LLMs vs. human coders for deductive coding. We conclude with several implications for future practice of deductive coding and related research methods.
Submitted 23 June, 2023;
originally announced June 2023.
-
Knowledge distillation for fast and accurate DNA sequence correction
Authors:
Anastasiya Belyaeva,
Joel Shor,
Daniel E. Cook,
Kishwar Shafin,
Daniel Liu,
Armin Töpfer,
Aaron M. Wenger,
William J. Rowell,
Howard Yang,
Alexey Kolesnikov,
Cory Y. McLean,
Maria Nattestad,
Andrew Carroll,
Pi-Chuan Chang
Abstract:
Accurate genome sequencing can improve our understanding of biology and the genetic basis of disease. The standard approach for generating DNA sequences from PacBio instruments relies on HMM-based models. Here, we introduce Distilled DeepConsensus - a distilled transformer-encoder model for sequence correction, which improves upon the HMM-based methods with runtime constraints in mind. Distilled DeepConsensus is 1.3x faster and 1.5x smaller than its larger counterpart while improving the yield of high quality reads (Q30) over the HMM-based method by 1.69x (vs. 1.73x for larger model). With improved accuracy of genomic sequences, Distilled DeepConsensus improves downstream applications of genomic sequence analysis such as reducing variant calling errors by 39% (34% for larger model) and improving genome assembly quality by 3.8% (4.2% for larger model). We show that the representations learned by Distilled DeepConsensus are similar between faster and slower models.
Submitted 17 November, 2022;
originally announced November 2022.
-
Automated Extraction of Sentencing Decisions from Court Cases in the Hebrew Language
Authors:
Mohr Wenger,
Tom Kalir,
Noga Berger,
Carmit Chalamish,
Renana Keydar,
Gabriel Stanovsky
Abstract:
We present the task of Automated Punishment Extraction (APE) in sentencing decisions from criminal court cases in Hebrew. Addressing APE will enable the identification of sentencing patterns and constitute an important stepping stone for many follow-up legal NLP applications in Hebrew, including the prediction of sentencing decisions. We curate a dataset of sexual assault sentencing decisions and a manually-annotated evaluation dataset, and implement rule-based and supervised models. We find that while supervised models can identify the sentence containing the punishment with good accuracy, rule-based approaches outperform them on the full APE task. We conclude by presenting a first analysis of sentencing patterns in our dataset and analyzing common model errors, indicating avenues for future work, such as distinguishing between probation and actual imprisonment punishment. We will make all our resources available upon request, including data, annotation, and first benchmark models.
Submitted 24 October, 2021;
originally announced October 2021.
-
SMART: An Open Source Data Labeling Platform for Supervised Learning
Authors:
Rob Chew,
Michael Wenger,
Caroline Kery,
Jason Nance,
Keith Richards,
Emily Hadley,
Peter Baumgartner
Abstract:
SMART is an open source web application designed to help data scientists and research teams efficiently build labeled training data sets for supervised machine learning tasks. SMART provides users with an intuitive interface for creating labeled data sets, supports active learning to help reduce the required amount of labeled data, and incorporates inter-rater reliability statistics to provide insight into label quality. SMART is designed to be platform agnostic and easily deployable to meet the needs of as many different research teams as possible. The project website contains links to the code repository and extensive user documentation.
Submitted 11 December, 2018;
originally announced December 2018.
-
GMWB Riders in a Binomial Framework - Pricing, Hedging, and Diversification of Mortality Risk
Authors:
Cody B. Hyndman,
Menachem Wenger
Abstract:
We construct a binomial model for a guaranteed minimum withdrawal benefit (GMWB) rider to a variable annuity (VA) under optimal policyholder behaviour. The binomial model results in explicitly formulated perfect hedging strategies funded using only periodic fee income. We consider the separate perspectives of the insurer and policyholder and introduce a unifying relationship. Decompositions of the VA and GMWB contract into term-certain payments and options representing the guarantee and early surrender features are extended to the binomial framework. We incorporate an approximation algorithm for Asian options that significantly improves efficiency of the binomial model while retaining accuracy. Several numerical examples are provided which illustrate both the accuracy and the tractability of the binomial model. We extend the binomial model to include policyholder mortality and death benefits. Pricing, hedging, and the decompositions of the contract are extended to incorporate mortality risk. We prove limiting results for the hedging strategies and demonstrate mortality risk diversification. Numerical examples are provided which illustrate the effectiveness of hedging and the diversification of mortality risk under capacity constraints with finite pools.
Submitted 6 July, 2016; v1 submitted 27 October, 2014;
originally announced October 2014.
-
Valuation Perspectives and Decompositions for Variable Annuities with GMWB riders
Authors:
Cody B. Hyndman,
Menachem Wenger
Abstract:
The guaranteed minimum withdrawal benefit (GMWB) rider, as an add-on to a variable annuity (VA), guarantees the return of premiums in the form of periodic withdrawals while allowing policyholders to participate fully in any market gains. GMWB riders represent an embedded option on the account value with a fee structure that is different from typical financial derivatives. We consider fair pricing of the GMWB rider from a financial economic perspective. Particular focus is placed on the distinct perspectives of the insurer and policyholder and the unifying relationship. We extend a decomposition of the VA contract into components that reflect term-certain payments and embedded derivatives to the case where the policyholder has the option to surrender, or lapse, the contract early.
Submitted 27 December, 2013; v1 submitted 9 July, 2013;
originally announced July 2013.
-
The SIMBAD astronomical database
Authors:
Marc Wenger,
Francois Ochsenbein,
Daniel Egret,
Pascal Dubois,
Francois Bonnarel,
Suzanne Borde,
Francoise Genova,
Gerard Jasniewicz,
Suzanne Laloe,
Soizick Lesteven,
Richard Monier
Abstract:
Simbad is the reference database for identification and bibliography of astronomical objects. It contains identifications, 'basic data', bibliography, and selected observational measurements for several million astronomical objects. Simbad is developed and maintained by CDS, Strasbourg. Building the database contents is achieved with the help of several contributing institutes. Scanning the bibliography is the result of the collaboration of CDS with bibliographers in Observatoire de Paris (DASGAL), Institut d'Astrophysique de Paris, and Observatoire de Bordeaux. When selecting catalogues and tables for inclusion, priority is given to optimal multi-wavelength coverage of the database, and to support of research developments linked to large projects. In parallel, the systematic scanning of the bibliography reflects the diversity and general trends of astronomical research.
A WWW interface to Simbad is available at: http://simbad.u-strasbg.fr/Simbad
Submitted 4 February, 2000;
originally announced February 2000.
-
The ALADIN Interactive Sky Atlas
Authors:
Francois Bonnarel,
Pierre Fernique,
Olivier Bienayme,
Daniel Egret,
Francoise Genova,
Mireille Louys,
Francois Ochsenbein,
Marc Wenger,
James G. Bartlett
Abstract:
The Aladin interactive sky atlas, developed at CDS, is a service providing simultaneous access to digitized images of the sky, astronomical catalogues, and databases.
The driving motivation is to facilitate direct, visual comparison of observational data at any wavelength with images of the optical sky, and with reference catalogues.
The set of available sky images consists of the STScI Digitized Sky Surveys, completed with high resolution images of crowded regions scanned at the MAMA facility in Paris.
A Java WWW interface to the system is available at: http://aladin.u-strasbg.fr/
Submitted 4 February, 2000;
originally announced February 2000.
-
The CDS information hub
Authors:
Francoise Genova,
Daniel Egret,
Olivier Bienayme,
Francois Bonnarel,
Pascal Dubois,
Pierre Fernique,
Gerard Jasniewicz,
Soizick Lesteven,
Richard Monier,
Francois Ochsenbein,
Marc Wenger
Abstract:
The Centre de Donnees astronomiques de Strasbourg (CDS) provides homogeneous access to heterogeneous information of various origins: information about astronomical objects in Simbad; catalogs and observation logs in VizieR and in the catalogue service; reference images and overlays in Aladin; nomenclature in the Dictionary of Nomenclature; Yellow Page services; the AstroGLU resource discovery tool; mirror copies of other reference services; and documentation. With the implementation of links between the CDS services, and with other on-line reference information, CDS has become a major hub in the rapidly evolving world of information retrieval in astronomy, developing efficient tools to help astronomers to navigate in the world-wide 'Virtual Observatory' under construction, from data in the observatory archives to results published in journals. The WWW interface to the CDS services is available at: http://cdsweb.u-strasbg.fr/
Submitted 4 February, 2000;
originally announced February 2000.