Search | arXiv e-print repository

Performance of large language models in numerical vs. semantic medical knowledge: Benchmarking on evidence-based Q&As

Authors: Eden Avnat, Michal Levy, Daniel Herstain, Elia Yanko, Daniel Ben Joya, Michal Tzuchman Katz, Dafna Eshel, Sahar Laros, Yael Dagan, Shahar Barami, Joseph Mermelstein, Shahar Ovadia, Noam Shomron, Varda Shalev, Raja-Elie E. Abdulnour

Abstract: Clinical problem-solving requires processing of semantic medical knowledge such as illness scripts and numerical medical knowledge of diagnostic tests for evidence-based decision-making. As large language models (LLMs) show promising results in many aspects of language-based clinical practice, their ability to generate non-language evidence-based answers to clinical questions is inherently limited by tokenization. Therefore, we evaluated LLMs' performance on two question types: numeric (correlating findings) and semantic (differentiating entities) while examining differences within and between LLMs in medical aspects and comparing their performance to humans. To generate straightforward multi-choice questions and answers (QAs) based on evidence-based medicine (EBM), we used a comprehensive medical knowledge graph (encompassed data from more than 50,00 peer-reviewed articles) and created the "EBMQA". EBMQA contains 105,000 QAs labeled with medical and non-medical topics and classified into numerical or semantic questions. We benchmarked this dataset using more than 24,500 QAs on two state-of-the-art LLMs: Chat-GPT4 and Claude3-Opus. We evaluated the LLMs accuracy on semantic and numerical question types and according to sub-labeled topics. For validation, six medical experts were tested on 100 numerical EBMQA questions. We found that both LLMs excelled more in semantic than numerical QAs, with Claude3 surpassing GPT4 in numerical QAs. However, both LLMs showed inter and intra gaps in different medical aspects and remained inferior to humans. Thus, their medical advice should be addressed carefully.

Submitted 1 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

arXiv:2204.04995 [pdf, other]

doi 10.1126/sciadv.ade03

A simple catch: thermal fluctuations enable hydrodynamic trapping of microrollers by obstacles

Authors: Ernest B. van der Wee, Brendan C. Blackwell, Florencio Balboa Usabiaga, Andrey V. Sokolov, Isaiah T. Katz, Blaise Delmotte, Michelle M. Driscoll

Abstract: It is known that obstacles can hydrodynamically trap bacteria and synthetic microswimmers in orbits, where the trapping time heavily depends on the swimmer flow field and noise is needed to escape the trap. Here, we use experiments and simulations to investigate the trapping of microrollers by obstacles. Microrollers are rotating particles close to a bottom surface, which have a prescribed propulsion direction imposed by an external rotating magnetic field. The flow field that drives their motion is quite different from previously studied swimmers. We found that the trapping time can be controlled by modifying the obstacle size or the colloid-obstacle repulsive potential. We detail the mechanisms of the trapping and find two remarkable features: The microroller is confined in the wake of the obstacle, and it can only enter the trap with Brownian motion. While noise is usually needed to escape traps in dynamical systems, here, we show that it is the only means to reach the hydrodynamic attractor.

Submitted 9 March, 2023; v1 submitted 11 April, 2022; originally announced April 2022.

Comments: Supplementary Files can be found at: https://doi.org/10.6084/m9.figshare.19772950

Journal ref: Sci. Adv. 9, eade0320(2023)

arXiv:1905.06015 [pdf]

doi 10.1103/PhysRevLett.84.79

Quasi-Phase-Matching in Chiral Materials

Authors: Bertrand Busson, Martti Kauranen, Colin Nuckolls, Thomas Katz, André Persoons

Abstract: The second-order nonlinear optical coefficients associated with chirality differ in sign for the two mirror-image forms (enantiomers) of a chiral material. Structures comprised of alternating stacks of the enantiomers can therefore be used for quasi-phase-matched frequency conversion, as we demonstrate here by second-harmonic generation from Langmuir-Blodgett films of a helicenebisquinone. Such structures could lead to new types of frequency converters in which both the second-order nonlinear response and quasi-phase-matching arise from the chirality of a material rather than its polar order. PACS numbers: 42.65.Ky, 42.70.Nq, 78.66.Qn

Submitted 15 May, 2019; originally announced May 2019.

Journal ref: Physical Review Letters, American Physical Society, 2000, 84 (1), pp.79-82

arXiv:1508.06232 [pdf, other]

Large-scale EM Analysis of the Drosophila Antennal Lobe with Automatically Computed Synapse Point Clouds

Authors: Ting Zhao, Shin-ya Takemura, Gary B. Huang, Jane Anne Horne, William T. Katz, Kazunori Shinomiya, Louis K. Scheffer, Ian A. Meinertzhagen, Patricia K. Rivlin, Stephen M. Plaza

Abstract: The promise of extracting connectomes and performing useful analysis on large electron microscopy (EM) datasets has been an elusive dream for many years. Tracing in even the smallest portions of neuropil requires copious human annotation, the rate-limiting step for generating a connectome. While a combination of improved imaging and automatic segmentation will lead to the analysis of increasingly large volumes, machines still fail to reach the quality of human tracers. Unfortunately, small errors in image segmentation can lead to catastrophic distortions of the connectome. In this paper, to analyze very large datasets, we explore different mechanisms that are less sensitive to errors in automation. Namely, we advocate and deploy extensive synapse detection on the entire antennal lobe (AL) neuropil in the brain of the fruit fly Drosophila, a region much larger than any densely annotated to date. The resulting synapse point cloud produced is invaluable for determining compartment boundaries in the AL and choosing specific regions for subsequent analysis. We introduce our methodology in this paper for region selection and show both manual and automatic synapse annotation results. Finally, we note the correspondence between image datasets obtained using the synaptic marker, antibody nc82, and our datasets enabling registration between light and EM image modalities.

Submitted 25 August, 2015; originally announced August 2015.

Showing 1–4 of 4 results for author: Katz, T