-
Hidden Flaws Behind Expert-Level Accuracy of GPT-4 Vision in Medicine
Authors:
Qiao Jin,
Fangyuan Chen,
Yiliang Zhou,
Ziyang Xu,
Justin M. Cheung,
Robert Chen,
Ronald M. Summers,
Justin F. Rousseau,
Peiyun Ni,
Marc J Landsman,
Sally L. Baxter,
Subhi J. Al'Aref,
Yijia Li,
Alex Chen,
Josef A. Brejt,
Michael F. Chiang,
Yifan Peng,
Zhiyong Lu
Abstract:
Recent studies indicate that Generative Pre-trained Transformer 4 with Vision (GPT-4V) outperforms human physicians in medical challenge tasks. However, these evaluations primarily focused on the accuracy of multi-choice questions alone. Our study extends the current scope by conducting a comprehensive analysis of GPT-4V's rationales of image comprehension, recall of medical knowledge, and step-by-step multimodal reasoning when solving New England Journal of Medicine (NEJM) Image Challenges, an imaging quiz designed to test the knowledge and diagnostic capabilities of medical professionals. Evaluation results confirmed that GPT-4V performs comparably to human physicians regarding multi-choice accuracy (81.6% vs. 77.8%). GPT-4V also performs well in cases where physicians answer incorrectly, with over 78% accuracy. However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (35.5%), most prominently in image comprehension (27.2%). Regardless of GPT-4V's high accuracy in multi-choice questions, our findings emphasize the necessity for further in-depth evaluations of its rationales before integrating such multimodal AI models into clinical workflows.
Submitted 22 April, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
Artificial Intelligence for Interstellar Travel
Authors:
Andreas M. Hein,
Stephen Baxter
Abstract:
The large distances involved in interstellar travel require a high degree of spacecraft autonomy, realized by artificial intelligence. The breadth of tasks artificial intelligence could perform on such spacecraft includes maintenance, data collection, and designing and constructing infrastructure using in-situ resources. Despite its importance, existing publications on artificial intelligence and interstellar travel are limited to cursory descriptions where little detail is given about the nature of the artificial intelligence. This article explores the role of artificial intelligence for interstellar travel by compiling use cases, exploring capabilities, and proposing typologies, system and mission architectures. Estimations for the required intelligence level for specific types of interstellar probes are given, along with potential system and mission architectures, covering those proposed in the literature but also presenting novel ones. Finally, a generic design for interstellar probes with an AI payload is proposed. Given current levels of increase in computational power, a spacecraft with computational power similar to that of the human brain would have a mass from dozens to hundreds of tons in a 2050-2060 timeframe. Given that the advent of the first interstellar missions and artificial general intelligence are estimated to be by the mid-21st century, a more in-depth exploration of the relationship between the two should be attempted, focusing on neglected areas such as protecting the artificial intelligence payload from radiation in interstellar space and the role of artificial intelligence in self-replication.
Submitted 19 November, 2018; v1 submitted 15 November, 2018;
originally announced November 2018.
-
Putting in All the Stops: Execution Control for JavaScript
Authors:
Samuel Baxter,
Rachit Nigam,
Joe Gibbs Politz,
Shriram Krishnamurthi,
Arjun Guha
Abstract:
Scores of compilers produce JavaScript, enabling programmers to use many languages on the Web, reuse existing code, and even use Web IDEs. Unfortunately, most compilers inherit the browser's compromised execution model, so long-running programs freeze the browser tab, infinite loops crash IDEs, and so on. The few compilers that avoid these problems suffer poor performance and are difficult to engineer.
This paper presents Stopify, a source-to-source compiler that extends JavaScript with debugging abstractions and blocking operations, and easily integrates with existing compilers. We apply Stopify to 10 programming languages and develop a Web IDE that supports stopping, single-stepping, breakpointing, and long-running computations. For nine languages, Stopify requires no or trivial compiler changes. For eight, our IDE is the first that provides these features. Two of our subject languages have compilers with similar features. Stopify's performance is competitive with these compilers and it makes them dramatically simpler.
Stopify's abstractions rely on first-class continuations, which it provides by compiling JavaScript to JavaScript. We also identify sub-languages of JavaScript that compilers implicitly use, and exploit these to improve performance. Finally, Stopify needs to repeatedly interrupt and resume program execution. We use a sampling-based technique to estimate program speed that outperforms other systems.
Submitted 15 April, 2018; v1 submitted 8 February, 2018;
originally announced February 2018.
-
A Continuous Max-Flow Approach to Cyclic Field Reconstruction
Authors:
John S. H. Baxter,
Jonathan McLeod,
Terry M. Peters
Abstract:
Reconstruction of an image from noisy data using Markov Random Field theory has been explored by both the graph-cuts and continuous max-flow community in the form of the Potts and Ishikawa models. However, neither model takes into account the particular cyclic topology of specific intensity types such as the hue in natural colour images, or the phase in complex-valued MRI. This paper presents cyclic continuous max-flow image reconstruction, which models the intensity being reconstructed as having a fundamentally cyclic topology. This model complements the Ishikawa model in that it is designed with image reconstruction in mind, having the topology of the intensity space inherent in the model while being readily extendable to an arbitrary intensity resolution.
Submitted 11 November, 2015;
originally announced November 2015.
-
Shape Complexes in Continuous Max-Flow Hierarchical Multi-Labeling Problems
Authors:
John S. H. Baxter,
Jing Yuan,
Terry M. Peters
Abstract:
Although topological considerations amongst multiple labels have been previously investigated in the context of continuous max-flow image segmentation, similar investigations have yet to be made about shape considerations in a general and extendable manner. This paper presents shape complexes for segmentation, which capture more complex shapes by combining multiple labels and super-labels constrained by geodesic star convexity. Shape complexes combine geodesic star convexity constraints with hierarchical label organization, which together allow for more complex shapes to be represented. This framework avoids the use of co-ordinate system warping techniques to convert shape constraints into topological constraints, which may be ambiguous or ill-defined for certain segmentation problems.
Submitted 15 October, 2015;
originally announced October 2015.
-
A Proximal Bregman Projection Approach to Continuous Max-Flow Problems Using Entropic Distances
Authors:
John S. H. Baxter,
Martin Rajchl,
Jing Yuan,
Terry M. Peters
Abstract:
One issue limiting the adoption of large-scale multi-region segmentation is the sometimes prohibitive memory requirements. This is especially troubling considering advances in massively parallel computing and commercial graphics processing units, whose memory is already limited compared to the random access memory used in more traditional computation. To address this issue in the field of continuous max-flow segmentation, we have developed a pseudo-flow framework using the theory of Bregman proximal projections and entropic distances which implicitly represents flow variables between labels and designated source and sink nodes. This reduces the memory requirements for max-flow segmentation by approximately 20% for Potts models and approximately 30% for hierarchical max-flow (HMF) and directed acyclic graph max-flow (DAGMF) models. This represents a great improvement in the state-of-the-art in max-flow segmentation, allowing for much larger problems to be addressed and accelerated using commercially available graphics processing hardware.
Submitted 30 January, 2015;
originally announced January 2015.
-
A Continuous Max-Flow Approach to Multi-Labeling Problems under Arbitrary Region Regularization
Authors:
John S. H. Baxter,
Martin Rajchl,
Jing Yuan,
Terry M. Peters
Abstract:
The incorporation of region regularization into max-flow segmentation has traditionally focused on ordering and part-whole relationships. A side effect of the development of such models is that it constrained regularization only to those cases, rather than allowing for arbitrary region regularization. Directed Acyclic Graphical Max-Flow (DAGMF) segmentation overcomes these limitations by allowing for the algorithm designer to specify an arbitrary directed acyclic graph to structure a max-flow segmentation. This allows for individual 'parts' to be a member of multiple distinct 'wholes.'
Submitted 5 June, 2014; v1 submitted 5 May, 2014;
originally announced May 2014.
-
RANCOR: Non-Linear Image Registration with Total Variation Regularization
Authors:
Martin Rajchl,
John S. H. Baxter,
Wu Qiu,
Ali R. Khan,
Aaron Fenster,
Terry M. Peters,
Jing Yuan
Abstract:
Optimization techniques have been widely used in deformable registration, allowing for the incorporation of similarity metrics with regularization mechanisms. These regularization mechanisms are designed to mitigate the effects of trivial solutions to ill-posed registration problems and to otherwise ensure the resulting deformation fields are well-behaved. This paper introduces a novel deformable registration algorithm, RANCOR, which uses iterative convexification to address deformable registration problems under total-variation regularization. Initial comparative results against four state-of-the-art registration algorithms are presented using the Internet Brain Segmentation Repository (IBSR) database.
Submitted 9 April, 2014;
originally announced April 2014.
-
A Continuous Max-Flow Approach to General Hierarchical Multi-Labeling Problems
Authors:
John S. H. Baxter,
Martin Rajchl,
Jing Yuan,
Terry M. Peters
Abstract:
Multi-region segmentation algorithms often have the onus of incorporating complex anatomical knowledge representing spatial or geometric relationships between objects, and general-purpose methods of addressing this knowledge in an optimization-based manner have thus been lacking. This paper presents Generalized Hierarchical Max-Flow (GHMF) segmentation, which captures simple anatomical part-whole relationships in the form of an unconstrained hierarchy. Regularization can then be applied to both parts and wholes independently, allowing for spatial grouping and clustering of labels in a globally optimal convex optimization framework. For the purposes of ready integration into a variety of segmentation tasks, the hierarchies can be presented at run-time, allowing for the segmentation problem to be readily specified and alternatives explored without undue programming effort or recompilation.
Submitted 5 June, 2014; v1 submitted 1 April, 2014;
originally announced April 2014.
-
Crossing the Dripline to 11N Using Elastic Resonance Scattering
Authors:
K. Markenroth,
L. Axelsson,
S. Baxter,
M. J. G. Borge,
C. Donzaud,
S. Fayans,
H. O. U. Fynbo,
V. Z. Goldberg,
S. Grevy,
D. Guillemaud-Mueller,
B. Jonson,
K.-M. Kallman,
S. Leenhardt,
M. Lewitowicz,
T. Lonnroth,
P. Manngard,
I. Martel,
A. C. Mueller,
I. Mukha,
T. Nilsson,
G. Nyman,
N. A. Orr,
K. Riisager,
G. V. Rogachev,
M.-G. Saint-Laurent,
et al. (11 additional authors not shown)
Abstract:
The level structure of the unbound nucleus 11N has been studied by 10C+p elastic resonance scattering in inverse geometry with the LISE3 spectrometer at GANIL, using a 10C beam with an energy of 9.0 MeV/u. An additional measurement was done at the A1200 spectrometer at MSU. The excitation function above the 10C+p threshold has been determined up to 5 MeV. A potential-model analysis revealed three resonance states at energies 1.27 (+0.18-0.05) MeV (Gamma=1.44 +-0.2 MeV), 2.01 (+0.15-0.05) MeV (Gamma=0.84 +-0.2 MeV) and 3.75 (+-0.05) MeV (Gamma=0.60 +-0.05 MeV) with the spin-parity assignments I(pi) = 1/2+, 1/2- and 5/2+, respectively. Hence, 11N is shown to have a ground state parity inversion completely analogous to its mirror partner, 11Be. A narrow resonance in the excitation function at 4.33 (+-0.05) MeV was also observed and assigned spin-parity 3/2-.
Submitted 21 June, 2000;
originally announced June 2000.