
arXiv:2308.00107 [pdf, other]

Validation of a Zero-Shot Learning Natural Language Processing Tool for Data Abstraction from Unstructured Healthcare Data

Authors: Basil Kaufmann, Dallin Busby, Chandan Krushna Das, Neeraja Tillu, Mani Menon, Ashutosh K. Tewari, Michael A. Gorin

Abstract: Objectives: To describe the development and validation of a zero-shot learning natural language processing (NLP) tool for abstracting data from unstructured text contained within PDF documents, such as those found within electronic health records. Materials and Methods: A data abstraction tool based on the GPT-3.5 model from OpenAI was developed and compared to three physician human abstractors in terms of time to task completion and accuracy for abstracting data on 14 unique variables from a set of 199 de-identified radical prostatectomy pathology reports. The reports were processed by the software tool in vectorized and scanned formats to establish the impact of optical character recognition on data abstraction. The tool was assessed for superiority for data abstraction speed and non-inferiority for accuracy. Results: The human abstractors required a mean of 101 s per report for data abstraction, with times varying from 15 to 284 s. In comparison, the software tool required a mean of 12.8 s to process the vectorized reports and a mean of 15.8 s to process the scanned reports (P < 0.001). The overall accuracies of the three human abstractors were 94.7%, 97.8%, and 96.4% for the combined set of 2786 datapoints. The software tool had an overall accuracy of 94.2% for the vectorized reports, proving to be non-inferior to the human abstractors at a margin of -10% ($α$=0.025). The tool had a slightly lower accuracy of 88.7% using the scanned reports, proving to be non-inferior to 2 out of 3 human abstractors.
Conclusion: The developed zero-shot learning NLP tool affords researchers levels of accuracy comparable to those of human abstractors, with significant time savings. Because no task-specific model training is needed, the developed tool is highly generalizable and can be used for a wide variety of data abstraction tasks, even outside the field of medicine.

Submitted 23 July, 2023; originally announced August 2023.

Comments: 10 pages, 3 figures, 1 table, 3 supplementary figures

MSC Class: 68T50 ACM Class: I.2.7

arXiv:2303.02523 [pdf, ps, other]

Requirements for Mass Adoption of Assistive Listening Technology by the General Public

Authors: Thomas B. Kaufmann, Mehdi Foroogozar, Julie Liss, Visar Berisha

Abstract: Assistive listening systems (ALSs) dramatically increase speech intelligibility and reduce listening effort. It is very likely that essentially everyone, not only individuals with hearing loss, would benefit from the increased signal-to-noise ratio an ALS provides in almost any listening scenario. However, ALSs are rarely used by anyone other than people with severe to profound hearing losses. To date, the reasons for this poor adoption have not been systematically investigated. The authors hypothesize that the reasons for poor adoption of assistive listening technology include (1) an inability to use personally owned receiving devices, (2) a lack of high-fidelity stereo sound, (3) receiving devices not providing an unoccluded listening experience, (4) distortion from alignment delay and (5) a lack of automatic connectivity to an available assistive listening audio signal. We propose solutions to each of these problems in an effort to pave the way for mass adoption of assistive listening technology by the general public.

Submitted 3 May, 2023; v1 submitted 4 March, 2023; originally announced March 2023.

Comments: Accepted to ICASSP 2023

arXiv:2206.08353 [pdf, other]

Towards Understanding How Machines Can Learn Causal Overhypotheses

Authors: Eliza Kosoy, David M. Chan, Adrian Liu, Jasmine Collins, Bryanna Kaufmann, Sandy Han Huang, Jessica B. Hamrick, John Canny, Nan Rosemary Ke, Alison Gopnik

Abstract: Recent work in machine learning and cognitive science has suggested that understanding causal information is essential to the development of intelligence. The extensive literature in cognitive science using the "blicket detector" environment shows that children are adept at many kinds of causal inference and learning. We propose to adapt that environment for machine learning agents. One of the key challenges for current machine learning algorithms is modeling and understanding causal overhypotheses: transferable abstract hypotheses about sets of causal relationships. In contrast, even young children spontaneously learn and use causal overhypotheses. In this work, we present a new benchmark -- a flexible environment which allows for the evaluation of existing techniques under variable causal overhypotheses -- and demonstrate that many existing state-of-the-art methods have trouble generalizing in this environment. The code and resources for this benchmark are available at https://github.com/CannyLab/casual_overhypotheses.

Submitted 16 June, 2022; originally announced June 2022.

arXiv:2202.10430 [pdf, other]

Learning Causal Overhypotheses through Exploration in Children and Computational Models

Authors: Eliza Kosoy, Adrian Liu, Jasmine Collins, David M Chan, Jessica B Hamrick, Nan Rosemary Ke, Sandy H Huang, Bryanna Kaufmann, John Canny, Alison Gopnik

Abstract: Despite recent progress in reinforcement learning (RL), RL algorithms for exploration remain an active area of research. Existing methods often focus on state-based metrics, which do not consider the underlying causal structures of the environment, and while recent research has begun to explore RL environments for causal learning, these environments primarily leverage causal information through causal inference or induction rather than exploration. In contrast, human children - some of the most proficient explorers - have been shown to use causal information to great benefit. In this work, we introduce a novel RL environment designed with a controllable causal structure, which allows us to evaluate exploration strategies used by both agents and children in a unified environment. In addition, through experimentation on both computational models and children, we demonstrate that there are significant differences between information-gain optimal RL exploration in causal environments and the exploration of children in the same environments. We conclude with a discussion of how these findings may inspire new directions of research into efficient exploration and disambiguation of causal structures for RL algorithms.

Submitted 21 February, 2022; originally announced February 2022.

arXiv:1705.09811 [pdf, other]

Multi-shot ASP solving with clingo

Authors: Martin Gebser, Roland Kaminski, Benjamin Kaufmann, Torsten Schaub

Abstract: We introduce a new flexible paradigm of grounding and solving in Answer Set Programming (ASP), which we refer to as multi-shot ASP solving, and present its implementation in the ASP system clingo. Multi-shot ASP solving features grounding and solving processes that deal with continuously changing logic programs. In doing so, they remain operative and accommodate changes in a seamless way. For instance, such processes allow for advanced forms of search, as in optimization or theory solving, or interaction with an environment, as in robotics or query-answering. Common to them is that the problem specification evolves during the reasoning process, either because data or constraints are added, deleted, or replaced. This evolutionary aspect adds another dimension to ASP since it brings about state changing operations. We address this issue by providing an operational semantics that characterizes grounding and solving processes in multi-shot ASP solving. This characterization provides a semantic account of grounder and solver states along with the operations manipulating them. The operative nature of multi-shot solving avoids redundancies in relaunching grounder and solver programs and benefits from the solver's learning capacities. clingo accomplishes this by complementing ASP's declarative input language with control capacities. On the declarative side, a new directive allows for structuring logic programs into named and parameterizable subprograms.
The grounding and integration of these subprograms into the solving process is completely modular and fully controllable from the procedural side. To this end, clingo offers a new application programming interface that is conveniently accessible via scripting languages.

Submitted 20 March, 2018; v1 submitted 27 May, 2017; originally announced May 2017.

Comments: Under consideration for publication in Theory and Practice of Logic Programming (TPLP)

ACM Class: D.1.6

arXiv:1705.04569 [pdf, ps, other]

Clingcon: The Next Generation

Authors: Mutsunori Banbara, Benjamin Kaufmann, Max Ostrowski, Torsten Schaub

Abstract: We present the third generation of the constraint answer set system clingcon, combining Answer Set Programming (ASP) with finite domain constraint processing (CP). While its predecessors rely on a black-box approach to hybrid solving by integrating the CP solver gecode, the new clingcon system pursues a lazy approach using dedicated constraint propagators to extend propagation in the underlying ASP solver clasp. No extension is needed for parsing and grounding clingcon's hybrid modeling language since both can be accommodated by the new generic theory handling capabilities of the ASP grounder gringo. As a whole, clingcon 3 is thus an extension of the ASP system clingo 5, which itself relies on the grounder gringo and the solver clasp. The new approach of clingcon offers a seamless integration of CP propagation into ASP solving that benefits from the whole spectrum of clasp's reasoning modes, including for instance multi-shot solving and advanced optimization techniques. This is accomplished by a lazy approach that unfolds the representation of constraints and adds it to that of the logic program only when needed. Although the unfolding is usually dictated by the constraint propagators during solving, it can already be partially (or even totally) done during preprocessing. Moreover, clingcon's constraint preprocessing and propagation incorporate several well established CP techniques that greatly improve its performance.
We demonstrate this via an extensive empirical evaluation contrasting, first, the various techniques in the context of CSP solving and, second, the new clingcon system with other hybrid ASP systems.

Submitted 12 May, 2017; originally announced May 2017.

Comments: Under consideration in Theory and Practice of Logic Programming (TPLP)

arXiv:1405.3694 [pdf, ps, other]

Clingo = ASP + Control: Preliminary Report

Authors: Martin Gebser, Roland Kaminski, Benjamin Kaufmann, Torsten Schaub

Abstract: We present the new ASP system clingo 4. Unlike its predecessors, being mere monolithic combinations of the grounder gringo with the solver clasp, the new clingo 4 series offers high-level constructs for realizing complex reasoning processes. Among others, such processes feature advanced forms of search, as in optimization or theory solving, or even interact with an environment, as in robotics or query-answering. Common to them is that the problem specification evolves during the reasoning process, either because data or constraints are added, deleted, or replaced. In fact, clingo 4 carries out such complex reasoning within a single integrated ASP grounding and solving process. This avoids redundancies in relaunching grounder and solver programs and benefits from the solver's learning capacities. clingo 4 accomplishes this by complementing ASP's declarative input language by control capacities expressed via the embedded scripting languages lua and python. On the declarative side, clingo 4 offers a new directive that allows for structuring logic programs into named and parameterizable subprograms. The grounding and integration of these subprograms into the solving process is completely modular and fully controllable from the procedural side, viz. the scripting languages. By strictly separating logic and control programs, clingo 4 also abolishes the need for dedicated systems for incremental and reactive reasoning, like iclingo and oclingo, respectively, and its flexibility goes well beyond the advanced yet still rigid solving processes of the latter.

Submitted 14 May, 2014; originally announced May 2014.

arXiv:1210.3265 [pdf, other]

Multi-threaded ASP Solving with clasp

Authors: Martin Gebser, Benjamin Kaufmann, Torsten Schaub

Abstract: We present the new multi-threaded version of the state-of-the-art answer set solver clasp. We detail its component and communication architecture and illustrate how they support the principal functionalities of clasp. Also, we provide some insights into the data representation used for different constraint types handled by clasp. All this is accompanied by an extensive experimental analysis of the major features related to multi-threading in clasp.

Submitted 11 October, 2012; originally announced October 2012.

Comments: 19 pages, 5 figures, to appear in Theory and Practice of Logic Programming

arXiv:1005.1716 [pdf, other]

Heuristics in Conflict Resolution

Authors: Christian Drescher, Martin Gebser, Benjamin Kaufmann, Torsten Schaub

Abstract: Modern solvers for Boolean Satisfiability (SAT) and Answer Set Programming (ASP) are based on sophisticated Boolean constraint solving techniques. In both areas, conflict-driven learning and related techniques constitute key features whose application is enabled by conflict analysis. Although various conflict analysis schemes have been proposed, implemented, and studied both theoretically and practically in the SAT area, the heuristic aspects involved in conflict analysis have not yet received much attention. Assuming a fixed conflict analysis scheme, we address the open question of how to identify "good" reasons for conflicts, and we investigate several heuristics for conflict analysis in ASP solving. To our knowledge, a systematic study like ours has not yet been performed in the SAT area; thus, it might be beneficial for both ASP and SAT solving.

Submitted 11 May, 2010; originally announced May 2010.

Journal ref: Proceedings of the Twelfth International Workshop on Nonmonotonic Reasoning (2008) 141-149

Showing 1–9 of 9 results for author: Kaufmann, B