-
Standing on FURM ground -- A framework for evaluating Fair, Useful, and Reliable AI Models in healthcare systems
Authors:
Alison Callahan,
Duncan McElfresh,
Juan M. Banda,
Gabrielle Bunney,
Danton Char,
Jonathan Chen,
Conor K. Corbin,
Debadutta Dash,
Norman L. Downing,
Sneha S. Jain,
Nikesh Kotecha,
Jonathan Masterson,
Michelle M. Mello,
Keith Morse,
Srikar Nallan,
Abby Pandya,
Anurang Revri,
Aditya Sharma,
Christopher Sharp,
Rahul Thapa,
Michael Wornow,
Alaa Youssef,
Michael A. Pfeffer,
Nigam H. Shah
Abstract:
The impact of using artificial intelligence (AI) to guide patient care or operational processes is an interplay of the AI model's output, the decision-making protocol based on that output, and the capacity of the stakeholders involved to take the necessary subsequent action. Estimating the effects of this interplay before deployment, and studying it in real time afterwards, are essential to bridge…
▽ More
The impact of using artificial intelligence (AI) to guide patient care or operational processes is an interplay of the AI model's output, the decision-making protocol based on that output, and the capacity of the stakeholders involved to take the necessary subsequent action. Estimating the effects of this interplay before deployment, and studying it in real time afterwards, are essential to bridge the chasm between AI model development and achievable benefit. To accomplish this, the Data Science team at Stanford Health Care has developed a Testing and Evaluation (T&E) mechanism to identify fair, useful and reliable AI models (FURM) by conducting an ethical review to identify potential value mismatches, simulations to estimate usefulness, financial projections to assess sustainability, as well as analyses to determine IT feasibility, design a deployment strategy, and recommend a prospective monitoring and evaluation plan. We report on FURM assessments done to evaluate six AI guided solutions for potential adoption, spanning clinical and operational settings, each with the potential to impact from several dozen to tens of thousands of patients each year. We describe the assessment process, summarize the six assessments, and share our framework to enable others to conduct similar assessments. Of the six solutions we assessed, two have moved into a planning and implementation phase. Our novel contributions - usefulness estimates by simulation, financial projections to quantify sustainability, and a process to do ethical assessments - as well as their underlying methods and open source tools, are available for other healthcare systems to conduct actionable evaluations of candidate AI solutions.
△ Less
Submitted 14 March, 2024; v1 submitted 26 February, 2024;
originally announced March 2024.
-
Opinion Mining Using Population-tuned Generative Language Models
Authors:
Allmin Susaiyah,
Abhinay Pandya,
Aki Härmä
Abstract:
We present a novel method for mining opinions from text collections using generative language models trained on data collected from different populations. We describe the basic definitions, methodology and a generic algorithm for opinion insight mining. We demonstrate the performance of our method in an experiment where a pre-trained generative model is fine-tuned using specifically tailored conte…
▽ More
We present a novel method for mining opinions from text collections using generative language models trained on data collected from different populations. We describe the basic definitions, methodology and a generic algorithm for opinion insight mining. We demonstrate the performance of our method in an experiment where a pre-trained generative model is fine-tuned using specifically tailored content with unnatural and fully annotated opinions. We show that our approach can learn and transfer the opinions to the semantic classes while maintaining the proportion of polarisation. Finally, we demonstrate the usage of an insight mining system to scale up the discovery of opinion insights from a real text corpus.
△ Less
Submitted 24 July, 2023;
originally announced July 2023.
-
nuReality: A VR environment for research of pedestrian and autonomous vehicle interactions
Authors:
Paul Schmitt,
Nicholas Britten,
JiHyun Jeong,
Amelia Coffey,
Kevin Clark,
Shweta Sunil Kothawade,
Elena Corina Grigore,
Adam Khaw,
Christopher Konopka,
Linh Pham,
Kim Ryan,
Christopher Schmitt,
Aryaman Pandya,
Emilio Frazzoli
Abstract:
We present nuReality, a virtual reality 'VR' environment designed to test the efficacy of vehicular behaviors to communicate intent during interactions between autonomous vehicles 'AVs' and pedestrians at urban intersections. In this project we focus on expressive behaviors as a means for pedestrians to readily recognize the underlying intent of the AV's movements. VR is an ideal tool to use to te…
▽ More
We present nuReality, a virtual reality 'VR' environment designed to test the efficacy of vehicular behaviors to communicate intent during interactions between autonomous vehicles 'AVs' and pedestrians at urban intersections. In this project we focus on expressive behaviors as a means for pedestrians to readily recognize the underlying intent of the AV's movements. VR is an ideal tool to use to test these situations as it can be immersive and place subjects into these potentially dangerous scenarios without risk. nuReality provides a novel and immersive virtual reality environment that includes numerous visual details (road and building texturing, parked cars, swaying tree limbs) as well as auditory details (birds chir**, cars honking in the distance, people talking). In these files we present the nuReality environment, its 10 unique vehicle behavior scenarios, and the Unreal Engine and Autodesk Maya source files for each scenario. The files are publicly released as open source at www.nuReality.org, to support the academic community studying the critical AV-pedestrian interaction.
△ Less
Submitted 12 January, 2022;
originally announced January 2022.
-
Cascading Adaptors to Leverage English Data to Improve Performance of Question Answering for Low-Resource Languages
Authors:
Hariom A. Pandya,
Bhavik Ardeshna,
Dr. Brijesh S. Bhatt
Abstract:
Transformer based architectures have shown notable results on many down streaming tasks including question answering. The availability of data, on the other hand, impedes obtaining legitimate performance for low-resource languages. In this paper, we investigate the applicability of pre-trained multilingual models to improve the performance of question answering in low-resource languages. We tested…
▽ More
Transformer based architectures have shown notable results on many down streaming tasks including question answering. The availability of data, on the other hand, impedes obtaining legitimate performance for low-resource languages. In this paper, we investigate the applicability of pre-trained multilingual models to improve the performance of question answering in low-resource languages. We tested four combinations of language and task adapters using multilingual transformer architectures on seven languages similar to MLQA dataset. Additionally, we have also proposed zero-shot transfer learning of low-resource question answering using language and task adapters. We observed that stacking the language and the task adapters improves the multilingual transformer models' performance significantly for low-resource languages.
△ Less
Submitted 18 December, 2021;
originally announced December 2021.
-
Question Answering Survey: Directions, Challenges, Datasets, Evaluation Matrices
Authors:
Hariom A. Pandya,
Brijesh S. Bhatt
Abstract:
The usage and amount of information available on the internet increase over the past decade. This digitization leads to the need for automated answering system to extract fruitful information from redundant and transitional knowledge sources. Such systems are designed to cater the most prominent answer from this giant knowledge source to the user query using natural language understanding (NLU) an…
▽ More
The usage and amount of information available on the internet increase over the past decade. This digitization leads to the need for automated answering system to extract fruitful information from redundant and transitional knowledge sources. Such systems are designed to cater the most prominent answer from this giant knowledge source to the user query using natural language understanding (NLU) and thus eminently depends on the Question-answering(QA) field.
Question answering involves but not limited to the steps like map** of user question to pertinent query, retrieval of relevant information, finding the best suitable answer from the retrieved information etc. The current improvement of deep learning models evince compelling performance improvement in all these tasks.
In this review work, the research directions of QA field are analyzed based on the type of question, answer type, source of evidence-answer, and modeling approach. This detailing followed by open challenges of the field like automatic question generation, similarity detection and, low resource availability for a language. In the end, a survey of available datasets and evaluation measures is presented.
△ Less
Submitted 7 December, 2021;
originally announced December 2021.
-
Multi-channel MRI Embedding: An EffectiveStrategy for Enhancement of Human Brain WholeTumor Segmentation
Authors:
Apurva Pandya,
Catherine Samuel,
Nisargkumar Patel,
Vaibhavkumar Patel,
Thangarajah Akilan
Abstract:
One of the most important tasks in medical image processing is the brain's whole tumor segmentation. It assists in quicker clinical assessment and early detection of brain tumors, which is crucial for lifesaving treatment procedures of patients. Because, brain tumors often can be malignant or benign, if they are detected at an early stage. A brain tumor is a collection or a mass of abnormal cells…
▽ More
One of the most important tasks in medical image processing is the brain's whole tumor segmentation. It assists in quicker clinical assessment and early detection of brain tumors, which is crucial for lifesaving treatment procedures of patients. Because, brain tumors often can be malignant or benign, if they are detected at an early stage. A brain tumor is a collection or a mass of abnormal cells in the brain. The human skull encloses the brain very rigidly and any growth inside this restricted place can cause severe health issues. The detection of brain tumors requires careful and intricate analysis for surgical planning and treatment. Most physicians employ Magnetic Resonance Imaging (MRI) to diagnose such tumors. A manual diagnosis of the tumors using MRI is known to be time-consuming; approximately, it takes up to eighteen hours per sample. Thus, the automatic segmentation of tumors has become an optimal solution for this problem. Studies have shown that this technique provides better accuracy and it is faster than manual analysis resulting in patients receiving the treatment at the right time. Our research introduces an efficient strategy called Multi-channel MRI embedding to improve the result of deep learning-based tumor segmentation. The experimental analysis on the Brats-2019 dataset wrt the U-Net encoder-decoder (EnDec) model shows significant improvement. The embedding strategy surmounts the state-of-the-art approaches with an improvement of 2% without any timing overheads.
△ Less
Submitted 13 September, 2020;
originally announced September 2020.
-
Machine Learning Approach for Skill Evaluation in Robotic-Assisted Surgery
Authors:
Mahtab J. Fard,
Sattar Ameri,
Ratna B. Chinnam,
Abhilash K. Pandya,
Michael D. Klein,
R. Darin Ellis
Abstract:
Evaluating surgeon skill has predominantly been a subjective task. Development of objective methods for surgical skill assessment are of increased interest. Recently, with technological advances such as robotic-assisted minimally invasive surgery (RMIS), new opportunities for objective and automated assessment frameworks have arisen. In this paper, we applied machine learning methods to automatica…
▽ More
Evaluating surgeon skill has predominantly been a subjective task. Development of objective methods for surgical skill assessment are of increased interest. Recently, with technological advances such as robotic-assisted minimally invasive surgery (RMIS), new opportunities for objective and automated assessment frameworks have arisen. In this paper, we applied machine learning methods to automatically evaluate performance of the surgeon in RMIS. Six important movement features were used in the evaluation including completion time, path length, depth perception, speed, smoothness and curvature. Different classification methods applied to discriminate expert and novice surgeons. We test our method on real surgical data for suturing task and compare the classification result with the ground truth data (obtained by manual labeling). The experimental results show that the proposed framework can classify surgical skill level with relatively high accuracy of 85.7%. This study demonstrates the ability of machine learning methods to automatically classify expert and novice surgeons using movement features for different RMIS tasks. Due to the simplicity and generalizability of the introduced classification method, it is easy to implement in existing trainers.
△ Less
Submitted 15 November, 2016;
originally announced November 2016.