
arXiv:2401.15201

Automatically Detecting Confusion and Conflict During Collaborative Learning Using Linguistic, Prosodic, and Facial Cues

Authors: Yingbo Ma, Yukyeong Song, Mehmet Celepkolu, Kristy Elizabeth Boyer, Eric Wiebe, Collin F. Lynch, Maya Israel

Abstract: During collaborative learning, confusion and conflict emerge naturally. However, persistent confusion or conflict have the potential to generate frustration and significantly impede learners' performance. Early automatic detection of confusion and conflict would allow us to support early interventions which can in turn improve students' experience with and outcomes from collaborative learning. Despite the extensive studies modeling confusion during solo learning, there is a need for further work in collaborative learning. This paper presents a multimodal machine-learning framework that automatically detects confusion and conflict during collaborative learning. We used data from 38 elementary school learners who collaborated on a series of programming tasks in classrooms. We trained deep multimodal learning models to detect confusion and conflict using features that were automatically extracted from learners' collaborative dialogues, including (1) language-derived features including TF-IDF, lexical semantics, and sentiment, (2) audio-derived features including acoustic-prosodic features, and (3) video-derived features including eye gaze, head pose, and facial expressions. Our results show that multimodal models that combine semantics, pitch, and facial expressions detected confusion and conflict with the highest accuracy, outperforming all unimodal models. We also found that prosodic cues are more predictive of conflict, and facial cues are more predictive of confusion. This study contributes to the automated modeling of collaborative learning processes and the development of real-time adaptive support to enhance learners' collaborative learning experience in classroom contexts.

Submitted 26 January, 2024; originally announced January 2024.

Comments: 27 pages, 7 figures, 7 tables

arXiv:2112.04465

doi 10.1145/3478431.3499377

Designing a Dashboard for Student Teamwork Analysis

Authors: Niki Gitinabard, Sarah Heckman, Tiffany Barnes, Collin F. Lynch

Abstract: Classroom dashboards are designed to help instructors effectively orchestrate classrooms by providing summary statistics, activity tracking, and other information. Existing dashboards are generally specific to an LMS or platform and they generally summarize individual work, not group behaviors. However, CS courses typically involve constellations of tools and mix on- and offline collaboration. Thus, cross-platform monitoring of individuals and teams is important to develop a full picture of the class. In this work, we describe our work on Concert, a data integration platform that collects data about student activities from several sources such as Piazza, My Digital Hand, and GitHub and uses it to support classroom monitoring through analysis and visualizations. We discuss team visualizations that we have developed to support effective group management and to help instructors identify teams in need of intervention.

Submitted 6 December, 2021; originally announced December 2021.

Comments: In press: SIGCSE 2022

arXiv:1905.00928

What will you do next? A sequence analysis on the student transitions between online platforms in blended courses

Authors: Niki Gitinabard, Sarah Heckman, Tiffany Barnes, Collin F. Lynch

Abstract: Students' interactions with online tools can provide us with insights into their study and work habits. Prior research has shown that these habits, even as simple as the number of actions or the time spent on online platforms can distinguish between the higher performing students and low-performers. These habits are also often used to predict students' performance in classes. One key feature of these actions that is often overlooked is how and when the students transition between different online platforms. In this work, we study sequences of student transitions between online tools in blended courses and identify which habits make the most difference between the higher and lower performing groups. While our results showed that most of the time students focus on a single tool, we were able to find patterns in their transitions to differentiate high and low performing groups. These findings can help instructors to provide procedural guidance to the students, as well as to identify harmful habits and make timely interventions.

Submitted 2 May, 2019; originally announced May 2019.

Journal ref: In International Conference on Educational Data Mining 2019

arXiv:1904.07331

Predicting Student Performance Based on Online Study Habits: A Study of Blended Courses

Authors: Adithya Sheshadri, Niki Gitinabard, Collin F. Lynch, Tiffany Barnes, Sarah Heckman

Abstract: Online tools provide unique access to research students' study habits and problem-solving behavior. In MOOCs, this online data can be used to inform instructors and to provide automatic guidance to students. However, these techniques may not apply in blended courses with face to face and online components. We report on a study of integrated user-system interaction logs from 3 computer science courses using four online systems: LMS, forum, version control, and homework system. Our results show that students rarely work across platforms in a single session, and that final class performance can be predicted from students' system use.

Submitted 15 April, 2019; originally announced April 2019.

Comments: Published in the International Conference on Educational Data Mining (EDM 2018)

arXiv:1904.07328

doi 10.1109/TLT.2019.2911832

How Widely Can Prediction Models be Generalized? Performance Prediction in Blended Courses

Authors: Niki Gitinabard, Yiqiao Xu, Sarah Heckman, Tiffany Barnes, Collin F. Lynch

Abstract: Blended courses that mix in-person instruction with online platforms are increasingly popular in secondary education. These tools record a rich amount of data on students' study habits and social interactions. Prior research has shown that these metrics are correlated with students' performance in face to face classes. However, predictive models for blended courses are still limited and have not yet succeeded at early prediction or cross-class predictions even for repeated offerings of the same course. In this work, we use data from two offerings of two different undergraduate courses to train and evaluate predictive models on student performance based upon persistent student characteristics including study habits and social interactions. We analyze the performance of these models on the same offering, on different offerings of the same course, and across courses to see how well they generalize. We also evaluate the models on different segments of the courses to determine how early reliable predictions can be made. This work tells us in part how much data is required to make robust predictions and how cross-class data may be used, or not, to boost model performance. The results of this study will help us better understand how similar the study habits, social activities, and the teamwork styles are across semesters for students in each performance category. These trained models also provide an avenue to improve our existing support platforms to better support struggling students early in the semester with the goal of providing timely intervention.

Submitted 21 June, 2019; v1 submitted 15 April, 2019; originally announced April 2019.

Journal ref: IEEE TLT, Special Issue on Early Prediction 2019

arXiv:1812.00843

Early Prediction of Course Grades: Models and Feature Selection

Authors: Hengxuan Li, Collin F. Lynch, Tiffany Barnes

Abstract: In this paper, we compare predictive models for students' final performance in a blended course using a set of generic features collected from the first six weeks of class. These features were extracted from students' online homework submission logs as well as other online actions. We compare the effectiveness of 5 different ML algorithms (SVMs, Support Vector Regression, Decision Tree, Naive Bayes and K-Nearest Neighbor). We found that SVMs outperform other models and improve when compared to the baseline. This study demonstrates feasible implementations for predictive models that rely on common data from blended courses that can be used to monitor students' progress and to tailor instruction.

Submitted 3 December, 2018; originally announced December 2018.

Journal ref: The Proceedings of the 11th International Conference on Educational Data Mining (EDM 2018). 492-495

arXiv:1809.03323

Deriving Enhanced Geographical Representations via Similarity-based Spectral Analysis: Predicting Colorectal Cancer Survival Curves in Iowa

Authors: Michael T. Lash, Min Zhang, Xun Zhou, W. Nick Street, Charles F. Lynch

Abstract: Neural networks are capable of learning rich, nonlinear feature representations shown to be beneficial in many predictive tasks. In this work, we use such models to explore different geographical feature representations in the context of predicting colorectal cancer survival curves for patients in the state of Iowa, spanning the years 1989 to 2013. Specifically, we compare model performance using "area between the curves" (ABC) to assess (a) whether survival curves can be reasonably predicted for colorectal cancer patients in the state of Iowa, (b) whether geographical features improve predictive performance, (c) whether a simple binary representation, or a richer, spectral analysis-elicited representation perform better, and (d) whether spectral analysis-based representations can be improved upon by leveraging geographically-descriptive features. In exploring (d), we devise a similarity-based spectral analysis procedure, which allows for the combination of geographically relational and geographically descriptive features. Our findings suggest that survival curves can be reasonably estimated on average, with predictive performance deviating at the five-year survival mark among all models. We also find that geographical features improve predictive performance, and that better performance is obtained using richer, spectral analysis-elicited features. Furthermore, we find that similarity-based spectral analysis-elicited representations improve upon the original spectral analysis results by approximately 40%.

Submitted 6 September, 2018; originally announced September 2018.

Comments: arXiv admin note: substantial text overlap with arXiv:1708.04714

arXiv:1809.00052

Your Actions or Your Associates? Predicting Certification and Dropout in MOOCs with Behavioral and Social Features

Authors: Niki Gitinabard, Farzaneh Khoshnevisan, Collin F. Lynch, Elle Yuan Wang

Abstract: The high level of attrition and low rate of certification in Massive Open Online Courses (MOOCs) has prompted a great deal of research. Prior researchers have focused on predicting dropout based upon behavioral features such as student confusion, click-stream patterns, and social interactions. However, few studies have focused on combining student logs with forum data. In this work, we use data from two different offerings of the same MOOC. We conduct a survival analysis to identify likely dropouts. We then examine two classes of features, social and behavioral, and apply a combination of modeling and feature-selection methods to identify the most relevant features to predict both dropout and certification. We examine the utility of three different model types and we consider the impact of different definitions of dropout on the predictors. Finally, we assess the reliability of the models over time by evaluating whether or not models from week 1 can predict dropout in week 2, and so on. The outcomes of this study will help instructors identify students likely to fail or dropout as soon as the first two weeks and provide them with more support.

Submitted 31 August, 2018; originally announced September 2018.

Comments: Published at the 11th International Conference on Educational Data Mining (EDM 2018)

arXiv:1710.04129

Identifying Student Communities in Blended Courses

Authors: Niki Gitinabard, Collin F. Lynch, Sarah Heckman, Tiffany Barnes

Abstract: Blended courses have become the norm in post-secondary education. Universities use large-scale learning management systems to manage class content. Instructors deliver readings, lectures, and office hours online; students use intelligent tutors, web forums, and online submission systems; and classes communicate via web forums. These online tools allow students to form new social networks or bring social relationships online. They also allow us to collect data on students' social relationships. In this paper we report on our research on community formation in blended courses based on online forum interactions. We found that it was possible to group students into communities using standard community detection algorithms via their posts and reply structure and that the students' grades are significantly correlated with their closest peers.

Submitted 28 September, 2017; originally announced October 2017.

Journal ref: Proceedings of the 10th International Conference on Educational Data Mining (p. 378-379). 2017. Wuhan, China

arXiv:1709.10215

A Social Network Analysis on Blended Courses

Authors: Niki Gitinabard, Linting Xue, Collin F. Lynch, Sarah Heckman, Tiffany Barnes

Abstract: The large-scale online management systems (e.g. Moodle), online web forums (e.g. Piazza), and online homework systems (e.g. WebAssign) have been widely used in the blended courses recently. Instructors can use these systems to deliver class content and materials. Students can communicate with the classmates, share the course materials, and discuss the course questions via the online forums. With the increased use of the online systems, a large amount of students' interaction data has been collected. This data can be used to analyze students' learning behaviors and predict students' learning outcomes. In this work, we collected students' interaction data in three different blended courses. We represented the data as directed graphs and investigated the correlation between the social graph properties and students' final grades. Our results showed that in all these classes, students who asked more answers and received more feedbacks on the forum tend to obtain higher grades. The significance of this work is that we can use the results to encourage students to participate more in forums to learn the class materials better; we can also build a predictive model based on the social metrics to show us low performing students early in the semester.

Submitted 28 September, 2017; originally announced September 2017.

Comments: In: EDM 2017 Extended Proceedings: Workshop Proceedings of the 10th International Conference on Educational Data Mining. Wuhan (China)

Journal ref: GEDM 2017 proceedings(p. 22-26)

arXiv:1708.04714

Learning Rich Geographical Representations: Predicting Colorectal Cancer Survival in the State of Iowa

Authors: Michael T. Lash, Yuqi Sun, Xun Zhou, Charles F. Lynch, W. Nick Street

Abstract: Neural networks are capable of learning rich, nonlinear feature representations shown to be beneficial in many predictive tasks. In this work, we use these models to explore the use of geographical features in predicting colorectal cancer survival curves for patients in the state of Iowa, spanning the years 1989 to 2012. Specifically, we compare model performance using a newly defined metric -- area between the curves (ABC) -- to assess (a) whether survival curves can be reasonably predicted for colorectal cancer patients in the state of Iowa, (b) whether geographical features improve predictive performance, and (c) whether a simple binary representation or richer, spectral clustering-based representation perform better. Our findings suggest that survival curves can be reasonably estimated on average, with predictive performance deviating at the five-year survival mark. We also find that geographical features improve predictive performance, and that the best performance is obtained using richer, spectral analysis-elicited features.

Submitted 15 August, 2017; originally announced August 2017.

Comments: 8 pages

Showing 1–11 of 11 results for author: Lynch, C F