-
Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling
Authors:
Hang Jiang,
Xiajie Zhang,
Robert Mahari,
Daniel Kessler,
Eric Ma,
Tal August,
Irene Li,
Alex 'Sandy' Pentland,
Yoon Kim,
Deb Roy,
Jad Kabbara
Abstract:
Making legal knowledge accessible to non-experts is crucial for enhancing general legal literacy and encouraging civic participation in democracy. However, legal documents are often challenging to understand for people without legal backgrounds. In this paper, we present a novel application of large language models (LLMs) in legal education to help non-experts learn intricate legal concepts through storytelling, an effective pedagogical tool in conveying complex and abstract concepts. We also introduce a new dataset LegalStories, which consists of 294 complex legal doctrines, each accompanied by a story and a set of multiple-choice questions generated by LLMs. To construct the dataset, we experiment with various LLMs to generate legal stories explaining these concepts. Furthermore, we use an expert-in-the-loop approach to iteratively design multiple-choice questions. Then, we evaluate the effectiveness of storytelling with LLMs through randomized controlled trials (RCTs) with legal novices on 10 samples from the dataset. We find that LLM-generated stories enhance comprehension of legal concepts and interest in law among non-native speakers compared to only definitions. Moreover, stories consistently help participants relate legal concepts to their lives. Finally, we find that learning with stories shows a higher retention rate for non-native speakers in the follow-up assessment. Our work has strong implications for using LLMs in promoting teaching and learning in the legal field and beyond.
Submitted 2 July, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
Higher-dimensional subdiagram matching
Authors:
Amar Hadzihasanovic,
Diana Kessler
Abstract:
Higher-dimensional rewriting is founded on a duality of rewrite systems and cell complexes, connecting computational mathematics to higher categories and homotopy theory: the two sides of a rewrite rule are two halves of the boundary of an (n+1)-cell, which are diagrams of n-cells. We study higher-dimensional diagram rewriting as a mechanism of computation, focussing on the matching problem for rewritable subdiagrams within the combinatorial framework of diagrammatic sets. We provide an algorithm for subdiagram matching in arbitrary dimensions, based on new results on layerings of diagrams, and derive upper bounds on its time complexity. We show that these superpolynomial bounds can be improved to polynomial bounds under certain acyclicity conditions, and that these conditions hold in general for diagrams up to dimension 3. We discuss the challenges that arise in dimension 4.
Submitted 18 April, 2023;
originally announced April 2023.
-
Data Structures for Topologically Sound Higher-Dimensional Diagram Rewriting
Authors:
Amar Hadzihasanovic,
Diana Kessler
Abstract:
We present a computational implementation of diagrammatic sets, a model of higher-dimensional diagram rewriting that is "topologically sound": diagrams admit a functorial interpretation as homotopies in cell complexes. This has potential applications both in the formalisation of higher algebra and category theory and in computational algebraic topology. We describe data structures for well-formed shapes of diagrams of arbitrary dimensions and provide a solution to their isomorphism problem in time O(n^3 log n). On top of this, we define a type theory for rewriting in diagrammatic sets and provide a semantic characterisation of its syntactic category. All data structures and algorithms are implemented in the Python library rewalt, which also supports various visualisations of diagrams.
Submitted 31 July, 2023; v1 submitted 20 September, 2022;
originally announced September 2022.
-
Ten Quick Tips for Deep Learning in Biology
Authors:
Benjamin D. Lee,
Anthony Gitter,
Casey S. Greene,
Sebastian Raschka,
Finlay Maguire,
Alexander J. Titus,
Michael D. Kessler,
Alexandra J. Lee,
Marc G. Chevrette,
Paul Allen Stewart,
Thiago Britto-Borges,
Evan M. Cofer,
Kun-Hsing Yu,
Juan Jose Carmona,
Elana J. Fertig,
Alexandr A. Kalinin,
Beth Signal,
Benjamin J. Lengerich,
Timothy J. Triche Jr,
Simina M. Boca
Abstract:
Machine learning is a modern approach to problem-solving and task automation. In particular, machine learning is concerned with the development and applications of algorithms that can recognize patterns in data and use them for predictive modeling. Artificial neural networks are a particular class of machine learning algorithms and models that evolved into what is now described as deep learning. Given the computational advances made in the last decade, deep learning can now be applied to massive data sets and in innumerable contexts. Therefore, deep learning has become its own subfield of machine learning. In the context of biological research, it has been increasingly used to derive novel insights from high-dimensional biological data. To make the biological applications of deep learning more accessible to scientists who have some experience with machine learning, we solicited input from a community of researchers with varied biological and deep learning interests. These individuals collaboratively contributed to this manuscript's writing using the GitHub version control platform and the Manubot manuscript generation toolset. The goal was to articulate a practical, accessible, and concise set of guidelines and suggestions to follow when using deep learning. In the course of our discussions, several themes became clear: the importance of understanding and applying machine learning fundamentals as a baseline for utilizing deep learning, the necessity for extensive model comparisons with careful evaluation, and the need for critical thought in interpreting results generated by deep learning, among others.
Submitted 29 May, 2021;
originally announced May 2021.
-
Supervised PCA: A Multiobjective Approach
Authors:
Alexander Ritchie,
Laura Balzano,
Daniel Kessler,
Chandra S. Sripada,
Clayton Scott
Abstract:
Methods for supervised principal component analysis (SPCA) aim to incorporate label information into principal component analysis (PCA), so that the extracted features are more useful for a prediction task of interest. Prior work on SPCA has focused primarily on optimizing prediction error, and has neglected the value of maximizing variance explained by the extracted features. We propose a new method for SPCA that addresses both of these objectives jointly, and demonstrate empirically that our approach dominates existing approaches, i.e., outperforms them with respect to both prediction error and variation explained. Our approach accommodates arbitrary supervised learning losses and, through a statistical reformulation, provides a novel low-rank extension of generalized linear models.
Submitted 16 August, 2022; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Learning Densities Conditional on Many Interacting Features
Authors:
David C. Kessler,
Jack Taylor,
David B. Dunson
Abstract:
Learning a distribution conditional on a set of discrete-valued features is a commonly encountered task. This becomes more challenging with a high-dimensional feature set when there is the possibility of interaction between the features. In addition, many frequently applied techniques consider only prediction of the mean, but the complete conditional density is needed to answer more complex questions. We demonstrate a novel nonparametric Bayes method based upon a tensor factorization of feature-dependent weights for Gaussian kernels. The method makes use of multistage feature selection for dimension reduction. The resulting conditional density morphs flexibly with the selected features.
Submitted 29 April, 2013; v1 submitted 26 April, 2013;
originally announced April 2013.