Towards Modeling Learner Performance with Large Language Models

Authors: Seyed Parsa Neshaei, Richard Lee Davis, Adam Hazimeh, Bojan Lazarevski, Pierre Dillenbourg, Tanja Käser

Abstract: Recent work exploring the capabilities of pre-trained large language models (LLMs) has demonstrated their ability to act as general pattern machines by completing complex token sequences representing a wide array of tasks, including time-series prediction and robot control. This paper investigates whether the pattern recognition and sequence modeling capabilities of LLMs can be extended to the domain of knowledge tracing, a critical component in the development of intelligent tutoring systems (ITSs) that tailor educational experiences by predicting learner performance over time. In an empirical evaluation across multiple real-world datasets, we compare two approaches to using LLMs for this task, zero-shot prompting and model fine-tuning, with existing, non-LLM approaches to knowledge tracing. While LLM-based approaches do not achieve state-of-the-art performance, fine-tuned LLMs surpass the performance of naive baseline models and perform on par with standard Bayesian Knowledge Tracing approaches across multiple metrics. These findings suggest that the pattern recognition capabilities of LLMs can be used to model complex learning trajectories, opening a novel avenue for applying LLMs to educational contexts. The paper concludes with a discussion of the implications of these findings for future research, suggesting that further refinements and a deeper understanding of LLMs' predictive mechanisms could lead to enhanced performance in knowledge tracing tasks.

Submitted 29 February, 2024; originally announced March 2024.

Comments: 12 pages, 4 figures

arXiv:2306.02206 [pdf]

Mitigating Molecular Aggregation in Drug Discovery with Predictive Insights from Explainable AI

Authors: Hunter Sturm, Jonas Teufel, Kaitlin A. Isfeld, Pascal Friederich, Rebecca L. Davis

Abstract: As the importance of high-throughput screening (HTS) continues to grow due to its value in early stage drug discovery and data generation for training machine learning models, there is a growing need for robust methods for pre-screening compounds to identify and prevent false-positive hits. Small, colloidally aggregating molecules are one of the primary sources of false-positive hits in high-throughput screens, making them an ideal candidate to target for removal from libraries using predictive pre-screening tools. However, a lack of understanding of the causes of molecular aggregation introduces difficulty in the development of predictive tools for detecting aggregating molecules. Herein, we present an examination of the molecular features differentiating datasets of aggregating and non-aggregating molecules, as well as a machine learning approach to predicting molecular aggregation. Our method uses explainable graph neural networks and counterfactuals to reliably predict and explain aggregation, giving additional insights and design rules for future screening. The integration of this method in HTS approaches will help combat false positives, providing better lead molecules more rapidly and thus accelerating drug discovery cycles.

Submitted 3 June, 2023; originally announced June 2023.

Comments: 17 pages, plus SI

arXiv:2304.11120 [pdf, other]

What is missing in autonomous discovery: Open challenges for the community

Authors: Phillip M. Maffettone, Pascal Friederich, Sterling G. Baird, Ben Blaiszik, Keith A. Brown, Stuart I. Campbell, Orion A. Cohen, Tantum Collins, Rebecca L. Davis, Ian T. Foster, Navid Haghmoradi, Mark Hereld, Nicole Jung, Ha-Kyung Kwon, Gabriella Pizzuto, Jacob Rintamaki, Casper Steinmann, Luca Torresi, Shijing Sun

Abstract: Self-driving labs (SDLs) leverage combinations of artificial intelligence, automation, and advanced computing to accelerate scientific discovery. The promise of this field has given rise to a rich community of passionate scientists, engineers, and social scientists, as evidenced by the development of the Acceleration Consortium and recent Accelerate Conference. Despite its strengths, this rapidly developing field presents numerous opportunities for growth, challenges to overcome, and potential risks of which to remain aware. This community perspective builds on a discourse instantiated during the first Accelerate Conference, and looks to the future of self-driving labs with a tempered optimism. Incorporating input from academia, government, and industry, we briefly describe the current status of self-driving labs, then turn our attention to barriers, opportunities, and a vision for what is possible. Our field is delivering solutions in technology and infrastructure, artificial intelligence and knowledge generation, and education and workforce development. In the spirit of community, we intend for this work to foster discussion and drive best practices as our field grows.

Submitted 2 May, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

arXiv:2303.16759 [pdf]

doi 10.1136/bmjhci-2022-100665

Exploring celebrity influence on public attitude towards the COVID-19 pandemic: social media shared sentiment analysis

Authors: Brianna M White, Chad A Melton, Parya Zareie, Robert L Davis, Robert A Bednarczyk, Arash Shaban-Nejad

Abstract: The COVID-19 pandemic has introduced new opportunities for health communication, including an increase in the public use of online outlets for health-related emotions. People have turned to social media networks to share sentiments related to the impacts of the COVID-19 pandemic. In this paper we examine the role of social messaging shared by Persons in the Public Eye (i.e. athletes, politicians, news personnel) in determining overall public discourse direction. We harvested approximately 13 million tweets ranging from 1 January 2020 to 1 March 2022. The sentiment was calculated for each tweet using a fine-tuned DistilRoBERTa model, which was used to compare COVID-19 vaccine-related Twitter posts (tweets) that co-occurred with mentions of People in the Public Eye. Our findings suggest the presence of consistent patterns of emotional content co-occurring with messaging shared by Persons in the Public Eye for the first two years of the COVID-19 pandemic influenced public opinion and largely stimulated online public discourse. We demonstrate that as the pandemic progressed, public sentiment shared on social networks was shaped by risk perceptions, political ideologies and health-protective behaviours shared by Persons in the Public Eye, often in a negative light.

Submitted 23 February, 2023; originally announced March 2023.

Comments: 7 Pages, 4 Figures

ACM Class: I.2.7

Journal ref: BMJ Health & Care Informatics 2023;30:e100665

arXiv:2211.15407 [pdf]

doi 10.2196/40408

Fine-tuned Sentiment Analysis of COVID-19 Vaccine-Related Social Media Data: Comparative Study

Authors: Chad A Melton, Brianna M White, Robert L Davis, Robert A Bednarczyk, Arash Shaban-Nejad

Abstract: This study investigated and compared public sentiment related to COVID-19 vaccines expressed on two popular social media platforms, Reddit and Twitter, harvested from January 1, 2020, to March 1, 2022. To accomplish this task, we created a fine-tuned DistilRoBERTa model to predict sentiments of approximately 9.5 million Tweets and 70 thousand Reddit comments. To fine-tune our model, our team manually labeled the sentiment of 3600 Tweets and then augmented our dataset by the method of back-translation. Text sentiment for each social media platform was then classified with our fine-tuned model using Python and the Huggingface sentiment analysis pipeline. Our results determined that the average sentiment expressed on Twitter was more negative (52% positive) than positive and the sentiment expressed on Reddit was more positive than negative (53% positive). Though average sentiment was found to vary between these social media platforms, both displayed similar behavior related to sentiment shared at key vaccine-related developments during the pandemic. Considering this similar trend in shared sentiment demonstrated across social media platforms, Twitter and Reddit continue to be valuable data sources that public health officials can utilize to strengthen vaccine confidence and combat misinformation. As the spread of misinformation poses a range of psychological and psychosocial risks (anxiety, fear, etc.), there is an urgency in understanding the public perspective and attitude toward shared falsities. Comprehensive educational delivery systems tailored to the population's expressed sentiments that facilitate digital literacy, health information-seeking behavior, and precision health promotion could aid in clarifying such misinformation.

Submitted 17 October, 2022; originally announced November 2022.

Comments: 11 Pages, 5 Figures, and 1 Table

MSC Class: 92-11 ACM Class: I.2.7

Journal ref: Journal of Medical Internet Research (JMIR) 2022;24(10):e40408

arXiv:2108.11579 [pdf, other]

Modeling Item Response Theory with Stochastic Variational Inference

Authors: Mike Wu, Richard L. Davis, Benjamin W. Domingue, Chris Piech, Noah Goodman

Abstract: Item Response Theory (IRT) is a ubiquitous model for understanding human behaviors and attitudes based on their responses to questions. Large modern datasets offer opportunities to capture more nuances in human behavior, potentially improving psychometric modeling leading to improved scientific understanding and public policy. However, while larger datasets allow for more flexible approaches, many contemporary algorithms for fitting IRT models may also have massive computational demands that forbid real-world application. To address this bottleneck, we introduce a variational Bayesian inference algorithm for IRT, and show that it is fast and scalable without sacrificing accuracy. Applying this method to five large-scale item response datasets from cognitive science and education yields higher log likelihoods and higher accuracy in imputing missing data than alternative inference algorithms. Using this new inference approach we then generalize IRT with expressive Bayesian models of responses, leveraging recent advances in deep learning to capture nonlinear item characteristic curves (ICC) with neural networks. Using an eighth-grade mathematics test from TIMSS, we show our nonlinear IRT models can capture interesting asymmetric ICCs. The algorithm implementation is open-source, and easily usable.

Submitted 28 July, 2022; v1 submitted 26 August, 2021; originally announced August 2021.

Comments: version two includes added experiments; 33 pages of content; 6 pages appendix; figures at the bottom. arXiv admin note: text overlap with arXiv:2002.00276

arXiv:2103.09311 [pdf]

doi 10.2196/24738

Using a Personal Health Library-Enabled mHealth Recommender System for Self-Management of Diabetes Among Underserved Populations: Use Case for Knowledge Graphs and Linked Data

Authors: Nariman Ammar, James E Bailey, Robert L Davis, Arash Shaban-Nejad

Abstract: Personal health libraries (PHLs) provide a single point of secure access to patients' digital health data and enable the integration of knowledge stored in their digital health profiles with other sources of global knowledge. PHLs can help empower caregivers and health care providers to make informed decisions about patients' health by understanding medical events in the context of their lives. This paper reports the implementation of a mobile health digital intervention that incorporates both digital health data stored in patients' PHLs and other sources of contextual knowledge to deliver tailored recommendations for improving self-care behaviors in diabetic adults. We conducted a thematic assessment of patient functional and nonfunctional requirements that are missing from current EHRs based on evidence from the literature. We used the results to identify the technologies needed to address those requirements. We describe the technological infrastructures used to construct, manage, and integrate the types of knowledge stored in the PHL. We leverage the Social Linked Data (Solid) platform to design a fully decentralized and privacy-aware platform that supports interoperability and care integration. We provided an initial prototype design of a PHL and drafted a use case scenario that involves four actors to demonstrate how the proposed prototype can be used to address user requirements, including the construction and management of the PHL and its utilization for developing a mobile app that queries the knowledge stored and integrated into the PHL in a private and fully decentralized manner to provide better recommendations. The proposed PHL helps patients and their caregivers take a central role in making decisions regarding their health and equips their health care providers with informatics tools that support the collection and interpretation of the collected knowledge.

Submitted 16 March, 2021; originally announced March 2021.

Comments: 21 Pages, 13 Figures

ACM Class: I.2.4; J.3

Journal ref: JMIR Form Res. 2021 March 16;5(3):e24738

arXiv:2002.00276 [pdf, other]

Variational Item Response Theory: Fast, Accurate, and Expressive

Authors: Mike Wu, Richard L. Davis, Benjamin W. Domingue, Chris Piech, Noah Goodman

Abstract: Item Response Theory (IRT) is a ubiquitous model for understanding humans based on their responses to questions, used in fields as diverse as education, medicine and psychology. Large modern datasets offer opportunities to capture more nuances in human behavior, potentially improving test scoring and better informing public policy. Yet larger datasets pose a difficult speed / accuracy challenge to contemporary algorithms for fitting IRT models. We introduce a variational Bayesian inference algorithm for IRT, and show that it is fast and scalable without sacrificing accuracy. Using this inference approach we then extend classic IRT with expressive Bayesian models of responses. Applying this method to five large-scale item response datasets from cognitive science and education yields higher log likelihoods and improvements in imputing missing data. The algorithm implementation is open-source, and easily usable.

Submitted 16 March, 2020; v1 submitted 1 February, 2020; originally announced February 2020.

Comments: 10 pages of content

arXiv:astro-ph/9402041 [pdf, ps, other]

How Well Can Cosmological Parameters Be Estimated from CMB Observations?

Authors: J. Richard Bond, Richard L. Davis, Paul J. Steinhardt

Abstract: The CMB anisotropy depends sensitively upon the slope and amplitude of primordial density and gravitational wave fluctuations, the baryon density, the Hubble constant, the cosmological constant, the ionization history, etc. We report on recent work showing how well multi-scale measurements of the anisotropy power spectrum can resolve these factors. We identify a hypersurface in cosmic parameter space that can be accurately localized by observations, but along which the likelihood will hardly vary. Other cosmic observations will be needed to break this degeneracy.

Submitted 16 February, 1994; originally announced February 1994.

Comments: 9 pages, Penn Preprint UPR-0604T

Journal ref: Astrophys.Lett.Commun.32:53-62,1995

arXiv:astro-ph/9309041 [pdf, ps, other]

doi 10.1103/PhysRevLett.72.13

Measuring Cosmological Parameters with Cosmic Microwave Background Experiments

Authors: J. R. Bond, Robert Crittenden, Richard L. Davis, George Efstathiou, Paul J. Steinhardt

Abstract: The cosmic microwave background anisotropy is sensitive to the slope and amplitude of primordial energy density and gravitational wave fluctuations, the baryon density, the Hubble constant, the cosmological constant, the ionization history, etc. In this Letter, we examine the degree to which these factors can be separately resolved from combined small- and large-angular scale anisotropy observations. We isolate directions of degeneracy in this cosmic parameter space, but note that other cosmic observations can break the degeneracy. (Correction to Eq. 4)

Submitted 6 October, 1993; v1 submitted 28 September, 1993; originally announced September 1993.

Comments: 10 pages, Penn Preprint

Journal ref: Phys.Rev.Lett.72:13-16,1994

arXiv:astro-ph/9306027 [pdf, ps, other]

doi 10.1086/187082

Polarization of the Microwave Background due to Primordial Gravitational Waves

Authors: Robert Crittenden, Richard L. Davis, Paul J. Steinhardt

Abstract: The contribution of gravitational wave (tensor metric) and energy density (scalar metric) fluctuations to the cosmic microwave background polarization is computed by numerically solving the relativistic radiation transfer equations. We find that the tensor contribution is significant only at large angular scales (multipoles $\ell \lesssim 40$). For standard recombination, the tensor contribution can dominate at $\ell \lesssim 40$; however, the effect would be difficult to detect since the total (scalar plus tensor) polarization is $< 1\%$. For models with late reionization, the total large angular scale polarization is large ($\sim 7$--$9\%$), but the tensor fraction is negligibly small. Hence, polarization may be useful for discriminating ionization history, but is much less promising as a means for detecting tensor fluctuations.

Submitted 25 June, 1993; originally announced June 1993.

Comments: 9 pages, Penn Preprint UPR-0575T

Journal ref: Astrophys.J.417:L13-L16,1993

arXiv:astro-ph/9303014 [pdf, ps, other]

doi 10.1103/PhysRevLett.71.324

The Imprint of Gravitational Waves on the Cosmic Microwave Background

Authors: R. Crittenden, J. R. Bond, R. L. Davis, G. Efstathiou, P. J. Steinhardt

Abstract: Long-wavelength gravitational waves can induce significant temperature anisotropy in the cosmic microwave background. Distinguishing this from anisotropy induced by energy density fluctuations is critical for testing inflationary cosmology and theories of large-scale structure formation. We describe full radiative transport calculations of the two contributions and show that they differ dramatically at angular scales below a few degrees. We show how anisotropy experiments probing large- and small-angular scales can combine to distinguish the imprint due to gravitational waves.

Submitted 25 March, 1993; originally announced March 1993.

Comments: 11 pages, Penn Preprint-UPR-T

Journal ref: Phys.Rev.Lett.71:324-327,1993

arXiv:astro-ph/9207001 [pdf, ps, other]

doi 10.1103/PhysRevLett.69.1856

Cosmic Microwave Background Probes Models of Inflation

Authors: R. L. Davis, H. M. Hodges, G. F. Smoot, P. J. Steinhardt, M. S. Turner

Abstract: Inflation creates both scalar (density) and tensor (gravity wave) metric perturbations. We find that the tensor mode contribution to the CMB anisotropy on large-angular scales can only exceed that of the scalar mode in models where the spectrum of perturbations deviates significantly from scale invariance (e.g., extended and power-law inflation models and extreme versions of chaotic inflation). If the tensor mode dominates at large-angular scales, then the value of $\Delta T/T$ predicted on $1^\circ$ is less than if the scalar mode dominates, and, for cold dark matter models, $b>1$ can be made consistent with the COBE DMR results.

Submitted 13 July, 1992; originally announced July 1992.

Comments: 12 pages, FERMILAB-Pub-92/168-A

Journal ref: Phys.Rev.Lett.69:1856-1859,1992; ERRATUM-ibid.70:1733,1993

arXiv:hep-th/9205060 [pdf, ps, other]

doi 10.1016/0370-2693(92)91327-6

Cosmological Implications of Domain Walls due to Duality Invariant Moduli Sector of Superstring Vacua

Authors: M. Cvetič, R. L. Davis

Abstract: We study cosmological implications of the duality ($PSL(2,\mathbf{Z})$) invariant potential for the compactification radius $T$, arising in a class of superstring vacua. We show that in spite of having only one minimum in the fundamental domain of the $T$ field there are two types of non-supersymmetric domain walls: one is associated with the discrete Peccei-Quinn symmetry $T\to T+i$, analogous to the axionic domain wall, and another one associated with the noncompact symmetry $T\to 1/T$, analogous to the $Z_2$ domain walls. The first one is bound by stringy cosmic strings. The scale of such domain walls is governed by the scale of gaugino condensation ($\mathcal{O}(10^{16}\ \mathrm{GeV})$ in the case of hidden $E_8$ gauge group), while the separation between minima is of order $M_{pl}$. We discuss the formation of walls and their cosmological implications: the walls must be gotten rid of, either by chopping by stringy cosmic strings and/or inflation. Since there is no usual Kibble mechanism to create strings, either one must assume they exist ab initio, or one must conclude that string cosmologies require inflation. The non-perturbative potential dealt with here appears not to give the needed inflationary epoch.

Submitted 1 June, 1992; v1 submitted 18 May, 1992; originally announced May 1992.

Comments: 10p., 3 figures, not included, minor wording changes

Journal ref: Phys.Lett.B296:316-322,1992

Showing 1–14 of 14 results for author: Davis, R L