-
Lessons Learned from Designing an Open-Source Automated Feedback System for STEM Education
Authors:
Steffen Steinert,
Lars Krupp,
Karina E. Avila,
Anke S. Janssen,
Verena Ruf,
David Dzsotjan,
Christian De Schryver,
Jakob Karolus,
Stefan Ruzika,
Karen Joisten,
Paul Lukowicz,
Jochen Kuhn,
Norbert Wehn,
Stefan Küchemann
Abstract:
As distance learning becomes increasingly important and artificial intelligence tools continue to advance, automated systems for individual learning have attracted significant attention. However, the scarcity of open-source online tools that are capable of providing personalized feedback has restricted the widespread implementation of research-based feedback systems. In this work, we present RATsApp, an open-source automated feedback system (AFS) that incorporates research-based features such as formative feedback. The system focuses on core STEM competencies such as mathematical competence, representational competence, and data literacy. It also allows lecturers to monitor students' progress. We conducted a survey based on the technology acceptance model (TAM2) among a set of students (N=64). Our findings confirm the applicability of the TAM2 framework, revealing that factors such as the relevance of the studies, output quality, and ease of use significantly influence the perceived usefulness. We also found a linear relation between the perceived usefulness and the intention to use, which in turn is a significant predictor of the frequency of use. Moreover, the formative feedback feature of RATsApp received positive feedback, indicating its potential as an educational tool. Furthermore, as an open-source platform, RATsApp encourages public contributions to its ongoing development, fostering a collaborative approach to improve educational tools.
Submitted 19 January, 2024;
originally announced January 2024.
-
Harnessing Large Language Models to Enhance Self-Regulated Learning via Formative Feedback
Authors:
Steffen Steinert,
Karina E. Avila,
Stefan Ruzika,
Jochen Kuhn,
Stefan Küchemann
Abstract:
Effectively supporting students in mastering all facets of self-regulated learning (SRL) is a central aim of teachers and educational researchers. Prior research has demonstrated that formative feedback is an effective way to support students during SRL. However, for formative feedback to be effective, it needs to be tailored to the learners, requiring information about their learning progress. In this work, we introduce LEAP, a novel platform that utilizes advanced large language models (LLMs), such as ChatGPT, to provide formative feedback to students. LEAP enables teachers to effectively pre-prompt and assign tasks to the LLM, thereby stimulating students' cognitive and metacognitive processes and promoting self-regulated learning. We demonstrate that a systematic prompt design based on theoretical principles can provide a wide range of types of scaffolds to students, including sense-making, elaboration, self-explanation, partial task-solution scaffolds, as well as metacognitive and motivational scaffolds. In this way, we emphasize the critical importance of synchronizing educational technological advances with empirical research and theoretical frameworks.
Submitted 30 November, 2023; v1 submitted 23 November, 2023;
originally announced November 2023.
-
Unreflected Acceptance -- Investigating the Negative Consequences of ChatGPT-Assisted Problem Solving in Physics Education
Authors:
Lars Krupp,
Steffen Steinert,
Maximilian Kiefer-Emmanouilidis,
Karina E. Avila,
Paul Lukowicz,
Jochen Kuhn,
Stefan Küchemann,
Jakob Karolus
Abstract:
Large language models (LLMs) have recently gained popularity. However, the impact of their general availability through ChatGPT on sensitive areas of everyday life, such as education, remains unclear. Nevertheless, the societal impact on established educational methods is already being experienced by both students and educators. Our work focuses on higher physics education and examines problem-solving strategies. In a study, students with a background in physics were assigned to solve physics exercises, with one group having access to an internet search engine (N=12) and the other group being allowed to use ChatGPT (N=27). We evaluated their performance, strategies, and interaction with the provided tools. Our results showed that nearly half of the solutions provided with the support of ChatGPT were mistakenly assumed to be correct by the students, indicating that they overly trusted ChatGPT even in their field of expertise. Likewise, in 42% of cases, students used copy & paste to query ChatGPT -- an approach used in only 4% of search engine queries -- highlighting the stark differences in interaction behavior between the groups and indicating limited reflection when using ChatGPT. Our work demonstrates a need to (1) guide students on how to interact with LLMs and (2) create awareness of potential shortcomings for users.
Submitted 21 August, 2023;
originally announced September 2023.
-
Educational data augmentation in physics education research using ChatGPT
Authors:
Fabian Kieser,
Peter Wulff,
Jochen Kuhn,
Stefan Küchemann
Abstract:
Generative AI technologies such as large language models show novel potential to enhance educational research. For example, generative large language models were shown to be capable of solving quantitative reasoning tasks in physics and concept tests such as the Force Concept Inventory (FCI). Given the importance of such concept inventories for physics education research, and the challenges in developing them, such as field testing with representative populations, this study seeks to examine to what extent a generative large language model could be utilized to generate a synthetic data set for the FCI that exhibits content-related variability in responses. We use the recently introduced ChatGPT based on the GPT-4 generative large language model and investigate to what extent ChatGPT could solve the FCI accurately (RQ1) and could be prompted to solve the FCI as if it were a student belonging to a different cohort (RQ2). Furthermore, we study to what extent ChatGPT could be prompted to solve the FCI as if it were a student having a different force- and mechanics-related misconception (RQ3). In alignment with other research, we found that ChatGPT could accurately solve the FCI. We furthermore found that prompting ChatGPT to respond to the inventory as if it belonged to a different cohort yielded no variance in responses; however, responding as if it had a certain misconception introduced much variance in responses that approximate real human responses on the FCI in some regards.
Submitted 2 October, 2023; v1 submitted 26 July, 2023;
originally announced July 2023.
-
Physics task development of prospective physics teachers using ChatGPT
Authors:
Stefan Küchemann,
Steffen Steinert,
Natalia Revenga,
Matthias Schweinberger,
Yavuz Dinc,
Karina E. Avila,
Jochen Kuhn
Abstract:
The recent advancement of large language models presents numerous opportunities for teaching and learning. Despite widespread public debate regarding the use of large language models, empirical research on their opportunities and risks in education remains limited. In this work, we demonstrate the qualities and shortcomings of using ChatGPT 3.5 for physics task development by prospective teachers. In a randomized controlled trial, 26 prospective physics teacher students were divided into two groups: the first group used ChatGPT 3.5 to develop text-based physics tasks for four different concepts in the field of kinematics for 10th grade high school students, while the second group used a classical textbook to create tasks for the same concepts and target group. The results indicate no difference in task correctness, but students using the textbook achieved higher clarity and more frequently embedded their questions in a meaningful context. Both groups adapted the level of task difficulty easily to the target group but struggled considerably with task specificity, i.e., information needed to solve the tasks was missing. Students using ChatGPT for problem posing rated system usability as high but experienced difficulties with output quality. These results provide insights into the opportunities and pitfalls of using large language models in education.
Submitted 19 April, 2023;
originally announced April 2023.
-
The impact of an interactive visualization and simulation tool on learning quantum physics: Results of an eye-tracking study
Authors:
Stefan Küchemann,
Malte Ubben,
David Dzsotjan,
Sergey Mukhametov,
Carrie A. Weidner,
Linda Qerimi,
Jochen Kuhn,
Stefan Heusler,
Jacob F. Sherson
Abstract:
Employing scientific practices to obtain and use information is one of the central facets of next generation science standards. Especially in quantum technology education, the ability to employ such practices is an essential skill to foster both academic success and technological development. In order to help educators design effective instruction, the comparison between novices' and experts' eye movements allows the identification of efficient information extraction and integration strategies. In this work, we compare the gaze behavior of experts and novices while solving problems in quantum physics using an interactive simulation tool, Quantum Composer, which displays information via multiple external representations (numerical values, equations, graphs). During two reasoning tasks, we found that metarepresentational competences were crucial for successful engagement with the simulation tool. The analysis of the gaze behavior revealed that visual attention on a graph plays a major role and redundant numerical information is ignored. Furthermore, the total dwell time on relevant and irrelevant areas is predictive of the score in the second task. Therefore, the results demonstrate which difficulties novices encounter when using simulation tools and provide insights into how to design effective instruction in quantum technology education guided by experts' gaze behavior.
Submitted 13 February, 2023;
originally announced February 2023.
-
Development and validation of the Converging Lenses Concept Inventory for middle school physics education
Authors:
Salome Wörner,
Sebastian Becker,
Stefan Küchemann,
Katharina Scheiter,
Jochen Kuhn
Abstract:
Optics is a core field in the curricula of secondary physics education. In this study, we present the development and validation of a test instrument in the field of optics, the Converging Lenses Concept Inventory (CLCI). It can be used as a formative or a summative assessment of middle school students' conceptual understanding of image formation by converging lenses. The CLCI assesses: (1) the overall understanding of fundamental concepts related to converging lenses, (2) the understanding of specific concepts, and (3) students' propensity for difficulties within this topic. The initial CLCI consists of 16 multiple-choice items; however, one item was removed based on various quality checks. We validated the CLCI thoroughly with distractor analyses, classical test theory, item response theory, structural analyses, and analyses of students' total scores at different measurement points as quantitative approaches, as well as student interviews and an expert survey as qualitative approaches. The quantitative analyses are mostly based on a dataset of N = 318 middle school students who took the CLCI as a posttest. The student interviews were conducted with seven middle school students after they were taught the concepts of converging lenses. The expert survey included five experts who evaluated both individual items and the test as a whole. The analyses showed good to excellent results for the test instrument, corroborating the 15-item CLCI's validity and its compliance with the three foci outlined above.
Submitted 19 May, 2022;
originally announced May 2022.
-
Studying physics during the COVID-19 pandemic: Student assessments of learning achievement, perceived effectiveness of online recitations, and online laboratories
Authors:
Pascal Klein,
Lana Ivanjek,
Merten Nikolay Dahlkemper,
Katarina Jeličić,
Marie-Annette Geyer,
Stefan Küchemann,
Ana Susac
Abstract:
The COVID-19 pandemic has significantly affected education systems worldwide, which responded with a sudden shift to distance learning. Various physics courses such as lectures, tutorials, and the laboratories had to be transferred into online formats rapidly, resulting in a variety of simultaneous, asynchronous, and mixed activities. To investigate how physics students perceived the sudden shift to online learning, we developed a questionnaire and gathered data from N = 578 physics students from five universities in Germany, Austria, and Croatia. In this article, we report how the problem-solving sessions (recitations) and laboratories were adapted, how students judge different formats of the courses, and how useful and effective they perceive them to be. The results are correlated with the students' self-efficacy ratings and other behavioral measures (such as self-regulated learning skills) and demographics. In a related article, we focus on the online physics lectures and compare simultaneous vs. asynchronous teaching and learning methods (n.n.). We find that good communication abilities and self-organization skills are positively correlated with perceived learning achievement. Furthermore, the previous duration of studies had a significant impact on the students' perceived overall learning achievement, on the students' acquisition of experimental skills during the physics laboratories, and on their assessment of the recitations' effectiveness. That is, students in their first academic year show consistently lower scores than more advanced students. For the physics laboratories, it was found that gathering real data was crucial to the acquisition of experimental skills and the reinforcement of content. For the physics recitations, handing in their own solutions for feedback was correlated with perceived effectiveness. We draw conclusions and implications for future online classes.
Submitted 12 October, 2020;
originally announced October 2020.
-
Visual attention while solving the test of understanding graphs in kinematics: An eye-tracking analysis
Authors:
Pascal Klein,
Andreas Lichtenberger,
Stefan Küchemann,
Sebastian Becker,
Martina Kekule,
Jouni Viiri,
Christiane Baadte,
Andreas Vaterlaus,
Jochen Kuhn
Abstract:
This study used eye-tracking to capture the students' visual attention while taking the test of understanding graphs in kinematics (TUG-K). A total of N = 115 upper-secondary-level students from Germany and Switzerland took the 26-item multiple-choice instrument after learning about kinematics graphs in the regular classroom. Besides choosing the correct alternative among research-based distractors, the students were required to judge their response confidence for each question. The items were presented sequentially on a computer screen equipped with a remote eye tracker, resulting in a set of approx. 3000 paired responses (accuracy and confidence) and about 40 hours of eye movement data (approx. 500,000 fixations). The analysis of students' visual attention related to the item stems (questions) and the item options reveals that high response confidence is correlated with shorter visit duration on both elements of the items. While the students' response accuracy and their response confidence are highly correlated on the score level, r(115) = 0.63, p < 0.001, the eye-tracking measures do not sufficiently discriminate between correct and incorrect responses. However, a more fine-grained analysis of visual attention based on different answer options reveals a significant discrimination between correct and incorrect answers in terms of an interaction effect: Incorrect responses are associated with longer visit durations on strong distractors and less time spent on correct options, while correct responses show the opposite trend. Outcomes of this study provide new insights into the validation of concept inventories based on students' behavioural level.
Submitted 31 August, 2019;
originally announced September 2019.
-
Structure and size of the plastic zone formed during nanoindentation of a metallic glass
Authors:
Karina E. Avila,
Stefan Küchemann,
Herbert M. Urbassek
Abstract:
Using molecular dynamics simulation, we study the plastic zone created during nanoindentation of a large CuZr glass system. The plastic zone consists of a core region, in which virtually every atom undergoes plastic rearrangement, and a tail, where the density distribution of the plastically active atoms decays to zero. The plastic zone in metallic glasses is significantly smaller than in crystalline substrates. The so-called plastic-zone size factor, which relates the radius of the plastic zone to the contact radius of the indenter with the substrate, assumes values around 1, while in crystals -- depending on the crystal structure -- values of 2--3 are common. The small plastic zone in metallic glasses is caused by the essentially homogeneous deformation in the amorphous matrix, while in crystals heterogeneous dislocations prevail, whose growth leads to a marked extension of the plastic zone.
Submitted 28 August, 2019;
originally announced August 2019.
-
Nanoscratching of metallic glasses -- An atomistic study
Authors:
Karina E. Avila,
Stefan Küchemann,
Iyad Alabd Alhafez,
Herbert M. Urbassek
Abstract:
Tribological properties of materials play an important role in engineering applications. Up to now, a number of experimental studies have identified correlations between tribological parameters and the mechanical response. Using molecular dynamics simulations, we study abrasive wear behavior via nanoscratching of a Cu$_{64.5}$Zr$_{35.5}$ metallic glass. The evolution of the normal and transverse forces and hardness values follows the behavior well known for crystalline substrates. In particular, the generation of the frontal pileup weakens the response of the material to the scratching tip and leads to a decrease of the transverse hardness as compared to the normal hardness. However, metallic glasses soften with increasing temperature, particularly above the glass transition temperature, thus showing a higher tendency to structurally relax an applied stress. This plastic response is analyzed focusing on local regions of atoms which underwent strong von-Mises strains, since these are the basis of shear-transformation zones and shear bands. The volume occupied by these atoms increases with temperature, but large increases are only observed above the glass transition temperature. We quantify the generation of plasticity by the concept of plastic efficiency, which relates the generation of plastic volume inside the sample with the formation of external damage, viz. the scratch groove. In comparison to nanoindentation, the generation rate of the plastic volume during nanoscratching is significantly temperature dependent, making the glass inside more damage-tolerant at lower temperatures but more damage-susceptible at elevated temperatures.
Submitted 28 August, 2019;
originally announced August 2019.
-
Improving students' understanding of rotating frames of reference using videos from different perspectives
Authors:
Stefan Küchemann,
Pascal Klein,
Henning Fouckhardt,
Sebastian Gröber,
Jochen Kuhn
Abstract:
The concepts of the Coriolis and the centrifugal force are essential in various scientific fields and they are standard components of introductory physics lectures. In this paper we explore how students understand and apply concepts of rotating frames of reference in the context of an exemplary lecture demonstration experiment. We found in a Predict-Observe-Explain setting that, after predicting the outcome prior to the demonstration, only one out of five physics students correctly reported the observation of the trajectory of a sphere rolling over a rotating disc. Despite this low score, a detailed analysis of distractors revealed significant conceptual learning during the observation of the experiment. In this context, we identified three main misconceptions and learning difficulties. First, the centrifugal force is only required to describe the trajectory if the object is coupled to the rotating system. Second, inertial forces cause a reaction of an object on which they act. And third, students systematically mix up the trajectories in the stationary and the rotating frame of reference. Furthermore, we captured students' eye movements during the Predict task and found that physics students with low confidence ratings focused longer on relevant task areas than confident students despite having a comparable score. Consequently, this metric is a helpful tool for the identification of misconceptions using eye tracking. Overall, the results help to understand the complexity of concept learning from demonstration experiments and provide important implications for instructional design of introductions to rotating frames of reference.
Submitted 26 February, 2019;
originally announced February 2019.