-
Human-Algorithm Collaborative Bayesian Optimization for Engineering Systems
Authors:
Tom Savage,
Ehecatl Antonio del Rio Chanona
Abstract:
Bayesian optimization has been successfully applied throughout Chemical Engineering for the optimization of functions that are expensive-to-evaluate, or where gradients are not easily obtainable. However, domain experts often possess valuable physical insights that are overlooked in fully automated decision-making approaches, necessitating the inclusion of human input. In this article we re-introduce the human back into the data-driven decision making loop by outlining an approach for collaborative Bayesian optimization. Our methodology exploits the hypothesis that humans are more efficient at making discrete choices rather than continuous ones and enables experts to influence critical early decisions. We apply high-throughput (batch) Bayesian optimization alongside discrete decision theory to enable domain experts to influence the selection of experiments. At every iteration we apply a multi-objective approach that results in a set of alternate solutions that have both high utility and are reasonably distinct. The expert then selects the desired solution for evaluation from this set, allowing for the inclusion of expert knowledge and improving accountability, whilst maintaining the advantages of Bayesian optimization. We demonstrate our approach across a number of applied and numerical case studies including bioprocess optimization and reactor geometry design, demonstrating that even in the case of an uninformed practitioner our algorithm recovers the regret of standard Bayesian optimization. Through the inclusion of continuous expert opinion, our approach enables faster convergence, and improved accountability for Bayesian optimization in engineering systems.
Submitted 16 April, 2024;
originally announced April 2024.
-
Methods to Estimate Large Language Model Confidence
Authors:
Maia Kotelanski,
Robert Gallo,
Ashwin Nayak,
Thomas Savage
Abstract:
Large Language Models have difficulty communicating uncertainty, which is a significant obstacle to applying LLMs to complex medical tasks. This study evaluates methods to measure LLM confidence when suggesting a diagnosis for challenging clinical vignettes. GPT4 was asked a series of challenging case questions using Chain of Thought and Self Consistency prompting. Multiple methods were investigated to assess model confidence and evaluated on their ability to predict the model's observed accuracy. The methods evaluated were Intrinsic Confidence, SC Agreement Frequency and CoT Response Length. SC Agreement Frequency correlated with observed accuracy, yielding a higher Area under the Receiver Operating Characteristic Curve compared to Intrinsic Confidence and CoT Length analysis. SC agreement is the most useful proxy for model confidence, especially for medical diagnosis. Model Intrinsic Confidence and CoT Response Length exhibit a weaker ability to differentiate between correct and incorrect answers, preventing them from being reliable and interpretable markers for model confidence. We conclude GPT4 has a limited ability to assess its own diagnostic accuracy. SC Agreement Frequency is the most useful method to measure GPT4 confidence.
Submitted 8 December, 2023; v1 submitted 28 November, 2023;
originally announced December 2023.
-
Expert-guided Bayesian Optimisation for Human-in-the-loop Experimental Design of Known Systems
Authors:
Tom Savage,
Ehecatl Antonio del Rio Chanona
Abstract:
Domain experts often possess valuable physical insights that are overlooked in fully automated decision-making processes such as Bayesian optimisation. In this article we apply high-throughput (batch) Bayesian optimisation alongside anthropological decision theory to enable domain experts to influence the selection of optimal experiments. Our methodology exploits the hypothesis that humans are better at making discrete choices than continuous ones and enables experts to influence critical early decisions. At each iteration we solve an augmented multi-objective optimisation problem across a number of alternate solutions, maximising both the sum of their utility function values and the determinant of their covariance matrix, equivalent to their total variability. By taking the solution at the knee point of the Pareto front, we return a set of alternate solutions at each iteration that have both high utility values and are reasonably distinct, from which the expert selects one for evaluation. We demonstrate that even in the case of an uninformed practitioner, our algorithm recovers the regret of standard Bayesian optimisation.
Submitted 5 December, 2023;
originally announced December 2023.
-
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
Authors:
Scott L. Fleming,
Alejandro Lozano,
William J. Haberkorn,
Jenelle A. Jindal,
Eduardo P. Reis,
Rahul Thapa,
Louis Blankemeier,
Julian Z. Genkins,
Ethan Steinberg,
Ashwin Nayak,
Birju S. Patel,
Chia-Chun Chiang,
Alison Callahan,
Zepeng Huo,
Sergios Gatidis,
Scott J. Adams,
Oluseyi Fayanju,
Shreya J. Shah,
Thomas Savage,
Ethan Goh,
Akshay S. Chaudhari,
Nima Aghaeepour,
Christopher Sharp,
Michael A. Pfeffer,
Percy Liang
, et al. (5 additional authors not shown)
Abstract:
The ability of large language models (LLMs) to follow natural language instructions with human-level fluency suggests many opportunities in healthcare to reduce administrative burden and improve quality of care. However, evaluating LLMs on realistic text generation tasks for healthcare remains challenging. Existing question answering datasets for electronic health record (EHR) data fail to capture the complexity of information needs and documentation burdens experienced by clinicians. To address these challenges, we introduce MedAlign, a benchmark dataset of 983 natural language instructions for EHR data. MedAlign is curated by 15 clinicians (7 specialities), includes clinician-written reference responses for 303 instructions, and provides 276 longitudinal EHRs for grounding instruction-response pairs. We used MedAlign to evaluate 6 general domain LLMs, having clinicians rank the accuracy and quality of each LLM response. We found high error rates, ranging from 35% (GPT-4) to 68% (MPT-7B-Instruct), and an 8.3% drop in accuracy moving from 32k to 2k context lengths for GPT-4. Finally, we report correlations between clinician rankings and automated natural language generation metrics as a way to rank LLMs without human review. We make MedAlign available under a research data use agreement to enable LLM evaluations on tasks aligned with clinician needs and preferences.
Submitted 24 December, 2023; v1 submitted 27 August, 2023;
originally announced August 2023.
-
Machine Learning-Assisted Discovery of Flow Reactor Designs
Authors:
Tom Savage,
Nausheen Basha,
Jonathan McDonough,
James Krassowski,
Omar K Matar,
Ehecatl Antonio del Rio Chanona
Abstract:
Additive manufacturing has enabled the fabrication of advanced reactor geometries, permitting larger, more complex design spaces. Identifying promising configurations within such spaces presents a significant challenge for current approaches. Furthermore, existing parameterisations of reactor geometries are low-dimensional with expensive optimisation limiting more complex solutions. To address this challenge, we establish a machine learning-assisted approach for the design of the next generation of chemical reactors, combining the application of high-dimensional parameterisations, computational fluid dynamics, and multi-fidelity Bayesian optimisation. We associate the development of mixing-enhancing vortical flow structures in novel coiled reactors with performance, and use our approach to identify key characteristics of optimal designs. By appealing to the principles of flow dynamics, we rationalise the selection of novel design features that lead to experimental plug flow performance improvements of 60% over conventional designs. Our results demonstrate that coupling advanced manufacturing techniques with 'augmented-intelligence' approaches can lead to superior design performance and, consequently, emissions-reduction and sustainability.
Submitted 6 June, 2024; v1 submitted 17 August, 2023;
originally announced August 2023.
-
Diagnostic Reasoning Prompts Reveal the Potential for Large Language Model Interpretability in Medicine
Authors:
Thomas Savage,
Ashwin Nayak,
Robert Gallo,
Ekanath Rangan,
Jonathan H Chen
Abstract:
One of the major barriers to using large language models (LLMs) in medicine is the perception they use uninterpretable methods to make clinical decisions that are inherently different from the cognitive processes of clinicians. In this manuscript we develop novel diagnostic reasoning prompts to study whether LLMs can perform clinical reasoning to accurately form a diagnosis. We find that GPT4 can be prompted to mimic the common clinical reasoning processes of clinicians without sacrificing diagnostic accuracy. This is significant because an LLM that can use clinical reasoning to provide an interpretable rationale offers physicians a means to evaluate whether LLMs can be trusted for patient care. Novel prompting methods have the potential to expose the black box of LLMs, bringing them one step closer to safe and effective use in medicine.
Submitted 13 August, 2023;
originally announced August 2023.
-
Discovery of mixing characteristics for enhancing coiled reactor performance through a Bayesian Optimisation-CFD approach
Authors:
Nausheen Basha,
Thomas Savage,
Jonathan McDonough,
Ehecatl Antonio Del-Rio Chanona,
Omar K. Matar
Abstract:
Processes involving the manufacture of fine/bulk chemicals, pharmaceuticals, biofuels, and waste treatment require plug flow characteristics to minimise their energy consumption and costs, and maximise product quality. One such versatile flow chemistry platform is the coiled tube reactor subjected to oscillatory motion, producing excellent plug flow qualities equivalent to well-mixed tanks-in-series 'N'. In this study, we discover the critical features of these flows that result in high plug flow performance using a data-driven approach. This is done by integrating Bayesian optimisation, a surrogate model approach, with Computational fluid dynamics that we treat as a black-box function to explore the parameter space of the operating conditions, oscillation amplitude and frequency, and net flow rate. Here, we correlate the flow characteristics as a function of the dimensionless Strouhal, oscillatory Dean, and Reynolds numbers to the reactor plug flow performance value 'N'. Under conditions of optimal performance (specific examples are provided herein), the oscillatory flow is just sufficient to limit axial dispersion through flow reversal and redirection, and to promote Dean vortices. This automated, open-source, integrated method can be easily adapted to identify the flow characteristics that produce an optimised performance for other chemical reactors and processes.
Submitted 26 May, 2023;
originally announced May 2023.
-
Multi-Fidelity Data-Driven Design and Analysis of Reactor and Tube Simulations
Authors:
Tom Savage,
Nausheen Basha,
Jonathan McDonough,
Omar K Matar,
Ehecatl Antonio del Rio Chanona
Abstract:
The development of new manufacturing techniques such as 3D printing has enabled the creation of previously infeasible chemical reactor designs. Systematically optimizing the highly parameterized geometries involved in these new classes of reactor is vital to ensure enhanced mixing characteristics and feasible manufacturability. Here we present a framework to rapidly solve this nonlinear, computationally expensive, and derivative-free problem, enabling the fast prototyping of novel reactor parameterizations. We take advantage of Gaussian processes to adaptively learn a multi-fidelity model of reactor simulations across a number of different continuous mesh fidelities. The search space of reactor geometries is explored through an amalgam of different, potentially lower, fidelity simulations which are chosen for evaluation based on a weighted acquisition function, trading off information gain with cost of simulation. Within our framework we derive a novel criterion for monitoring the progress and dictating the termination of multi-fidelity Bayesian optimization, ensuring a high fidelity solution is returned before experimental budget is exhausted. The reactors we investigate are helical-tube reactors under pulsed-flow conditions, which have demonstrated outstanding mixing characteristics, have the potential to be highly parameterized, and are easily manufactured using 3D printing. To validate our results, we 3D print and experimentally validate the optimal reactor geometry, confirming its mixing performance. In doing so we demonstrate our design framework to be extensible to a broad variety of expensive simulation-based optimization problems, supporting the design of the next generation of highly parameterized chemical reactors.
Submitted 7 July, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Robust Market Potential Assessment: Designing optimal policies for low-carbon technology adoption in an increasingly uncertain world
Authors:
Tom Savage,
Antonio del Rio Chanona,
Gbemi Oluleye
Abstract:
Increasing the adoption of alternative technologies is vital to ensure a successful transition to net-zero emissions in the manufacturing sector. Yet there is no model to analyse technology adoption and the impact of policy interventions in generating sufficient demand to reduce cost. Such a model is vital for assessing policy-instruments for the implementation of future energy scenarios. The design of successful policies for technology uptake becomes increasingly difficult when associated market forces/factors are uncertain, such as energy prices or technology efficiencies. In this paper we formulate a novel robust market potential assessment problem under uncertainty, resulting in policies that are immune to uncertain factors. We demonstrate two case studies: the potential use of carbon capture and storage for iron and steel production across the EU, and the transition to hydrogen from natural gas in steam boilers across the chemicals industry in the UK. Each robust optimisation problem is solved using an iterative cutting planes algorithm which enables existing models to be solved under uncertainty. By taking advantage of parallelisation we are able to solve the nonlinear robust market assessment problem for technology adoption in times within the same order of magnitude as the nominal problem. Policy makers often wish to trade-off certainty with effectiveness of a solution. Therefore, we apply an approximation to chance constraints, varying the amount of uncertainty to locate less certain but more effective solutions. Our results demonstrate the possibility of locating robust policies for the implementation of low-carbon technologies, as well as providing direct insights for policy-makers into the decrease in policy effectiveness resulting from increasing robustness. The approach we present is extensible to a large number of policy design and alternative technology adoption problems.
Submitted 20 April, 2023;
originally announced April 2023.
-
Deep Gaussian Process-based Multi-fidelity Bayesian Optimization for Simulated Chemical Reactors
Authors:
Tom Savage,
Nausheen Basha,
Omar Matar,
Ehecatl Antonio Del-Rio Chanona
Abstract:
New manufacturing techniques such as 3D printing have recently enabled the creation of previously infeasible chemical reactor designs. Optimizing the geometry of the next generation of chemical reactors is important to understand the underlying physics and to ensure reactor feasibility in the real world. This optimization problem is computationally expensive, nonlinear, and derivative-free making it challenging to solve. In this work, we apply deep Gaussian processes (DGPs) to model multi-fidelity coiled-tube reactor simulations in a Bayesian optimization setting. By applying a multi-fidelity Bayesian optimization method, the search space of reactor geometries is explored through an amalgam of different fidelity simulations which are chosen based on prediction uncertainty and simulation cost, maximizing the use of computational budget. The use of DGPs provides an end-to-end model for five discrete mesh fidelities, enabling less computational effort to gain good solutions during optimization. The accuracy of simulations for these five fidelities is determined against experimental data obtained from a 3D printed reactor configuration, providing insights into appropriate hyper-parameters. We hope this work provides interesting insight into the practical use of DGP-based multi-fidelity Bayesian optimization for engineering discovery.
Submitted 31 October, 2022;
originally announced October 2022.
-
The "Hot Spots" Conjecture on the Vicsek Set
Authors:
Marius Ionescu,
Thomas L. Savage
Abstract:
We prove the Hot Spot conjecture on the Vicsek set. Specifically, we show that every eigenfunction of the second smallest eigenvalue of the Neumann Laplacian on the Vicsek set attains its maximum and minimum on the boundary.
Submitted 3 January, 2019; v1 submitted 13 June, 2018;
originally announced June 2018.
-
General monotonicity, interpolation of operators, and applications
Authors:
S. M. Grigoriev,
Y. Sagher,
T. R. Savage
Abstract:
We continue the work of S. Tikhonov, E. Liflyand, B. Booton, and others, proving the equivalence of L(p,q)-norms of general monotone functions and of their Fourier transforms. The main tool in this work is the interpolation properties of cones of general monotone functions in L(p,q)-norms.
Submitted 24 October, 2014;
originally announced October 2014.