-
Health Text Simplification: An Annotated Corpus for Digestive Cancer Education and Novel Strategies for Reinforcement Learning
Authors:
Md Mushfiqur Rahman,
Mohammad Sabik Irbaz,
Kai North,
Michelle S. Williams,
Marcos Zampieri,
Kevin Lybarger
Abstract:
Objective: The reading level of health educational materials significantly influences the understandability and accessibility of the information, particularly for minoritized populations. Many patient educational resources surpass the reading level and complexity of widely accepted standards. There is a critical need for high-performing text simplification models in health information to enhance dissemination and literacy. This need is particularly acute in cancer education, where effective prevention and screening education can substantially reduce morbidity and mortality.
Methods: We introduce Simplified Digestive Cancer (SimpleDC), a parallel corpus of cancer education materials tailored for health text simplification research, comprising educational content from the American Cancer Society, Centers for Disease Control and Prevention, and National Cancer Institute. Utilizing SimpleDC alongside the existing Med-EASi corpus, we explore Large Language Model (LLM)-based simplification methods, including fine-tuning, reinforcement learning (RL), reinforcement learning with human feedback (RLHF), domain adaptation, and prompt-based approaches. Our experimentation encompasses Llama 2 and GPT-4. A novel RLHF reward function is introduced, featuring a lightweight model adept at distinguishing between original and simplified texts, thereby enhancing the model's effectiveness with unlabeled data.
Results: Fine-tuned Llama 2 models demonstrated high performance across various metrics. Our innovative RLHF reward function surpassed existing RL text simplification reward functions in effectiveness. The results underscore that RL/RLHF can augment fine-tuning, facilitating model training on unlabeled text and improving performance.
Submitted 29 March, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Growth, catalysis and faceting of $α$-Ga$_2$O$_3$ and $α$-(In$_x$Ga$_{1-x}$)$_2$O$_3$ on $m$-plane $α$-Al$_2$O$_3$ by molecular beam epitaxy
Authors:
Martin S. Williams,
Manuel Alonso-Orts,
Marco Schowalter,
Alexander Karg,
Sushma Raghuvansy,
Jon P. McCandless,
Debdeep Jena,
Andreas Rosenauer,
Martin Eickhoff,
Patrick Vogt
Abstract:
The growth of $α$-Ga$_2$O$_3$ and $α$-(In$_x$Ga$_{1-x}$)$_2$O$_3$ on $m$-plane $α$-Al$_2$O$_3$(10$\bar{1}$0) by molecular beam epitaxy (MBE) and metal-oxide-catalyzed epitaxy (MOCATAXY) is investigated. By systematically exploring the parameter space accessed by MBE and MOCATAXY, phase-pure $α$-Ga$_2$O$_3$(10$\bar{1}$0) and $α$-(In$_x$Ga$_{1-x}$)$_2$O$_3$(10$\bar{1}$0) thin films are realized. The presence of In on the $α$-Ga$_2$O$_3$ growth surface remarkably expands its growth window far into the metal-rich flux regime and to higher growth temperatures. With increasing O-to-Ga flux ratio ($R_{\text{O}}$), In incorporates into $α$-(In$_x$Ga$_{1-x}$)$_2$O$_3$ up to $x \leq 0.08$. Above a critical thickness, $β$-(In$_x$Ga$_{1-x}$)$_2$O$_3$ nucleates and subsequently grows heteroepitaxially on top of $α$-(In$_x$Ga$_{1-x}$)$_2$O$_3$ facets. Metal-rich MOCATAXY growth conditions, where $α$-Ga$_2$O$_3$ would not conventionally stabilize, lead to single-crystalline $α$-Ga$_2$O$_3$ with negligible In incorporation and improved surface morphology. Higher $T_{\text{G}}$ further results in single-crystalline $α$-Ga$_2$O$_3$ with well-defined terraces and step edges at their surfaces. For $R_{\text{O}} \leq 0.53$, In acts as a surfactant on the $α$-Ga$_2$O$_3$ growth surface by favoring step edges, while for $R_{\text{O}} \geq 0.8$, In incorporates and leads to $a$-plane $α$-(In$_x$Ga$_{1-x}$)$_2$O$_3$ faceting and the subsequent ($\bar{2}$01) $β$-(In$_x$Ga$_{1-x}$)$_2$O$_3$ growth on top. Thin film analysis by STEM reveals highly crystalline $α$-Ga$_2$O$_3$ layers and interfaces. We provide a phase diagram to guide the MBE and MOCATAXY growth of single-crystalline $α$-Ga$_2$O$_3$ on $α$-Al$_2$O$_3$(10$\bar{1}$0).
Submitted 21 November, 2023;
originally announced November 2023.
-
Growth of $α-Ga_2O_3$ on $Al_2O_3$ by conventional molecular-beam epitaxy and metal-oxide-catalyzed epitaxy
Authors:
J. P. McCandless,
D. Rowe,
N. Pieczulewski,
V. Protasenko,
M. Alonso-Orts,
M. S. Williams,
M. Eickhoff,
H. G. Xing,
D. A. Muller,
D. Jena,
P. Vogt
Abstract:
We report the growth of $α-Ga_2O_3$ on $m$-plane $Al_2O_3$ by conventional plasma-assisted molecular-beam epitaxy (MBE) and In-mediated metal-oxide-catalyzed epitaxy (MOCATAXY). We report a growth-rate diagram for $α-Ga_2O_3$ (10-10), and observe (i) a growth rate increase, (ii) an expanded growth window, and (iii) reduced out-of-plane mosaic spread when MOCATAXY is employed for the growth of $α-Ga_2O_3$. Through the use of In-mediated catalysis, growth rates over $0.2\,μ\text{m}\,\text{hr}^{-1}$ and rocking curves with full width at half maxima of $Δω\approx 0.45^{\circ}$ are achieved. Faceting is observed along the $α-Ga_2O_3$ film surface and is explored through scanning transmission electron microscopy.
Submitted 30 January, 2023;
originally announced January 2023.