-
Automated Annotation of Scientific Texts for ML-based Keyphrase Extraction and Validation
Authors:
Oluwamayowa O. Amusat,
Harshad Hegde,
Christopher J. Mungall,
Anna Giannakou,
Neil P. Byers,
Dan Gunter,
Kjiersten Fagnan,
Lavanya Ramakrishnan
Abstract:
Advanced omics technologies and facilities generate a wealth of valuable data daily; however, the data often lack the essential metadata required for researchers to find and search them effectively. The lack of metadata poses a significant challenge in the utilization of these datasets. Machine learning-based metadata extraction techniques have emerged as a potentially viable approach to automatically annotating scientific datasets with the metadata necessary for enabling effective search. Text labeling, usually performed manually, plays a crucial role in validating machine-extracted metadata. However, manual labeling is time-consuming; thus, there is a need to develop automated text labeling techniques in order to accelerate the process of scientific innovation. This need is particularly urgent in fields such as environmental genomics and microbiome science, which have historically received less attention in terms of metadata curation and creation of gold-standard text mining datasets.
In this paper, we present two novel automated text labeling approaches for the validation of ML-generated metadata for unlabeled texts, with specific applications in environmental genomics. Our techniques show the potential of two new ways to leverage existing information about the unlabeled texts and the scientific domain. The first technique exploits relationships between different types of data sources related to the same research study, such as publications and proposals. The second technique takes advantage of domain-specific controlled vocabularies or ontologies. In this paper, we detail applying these approaches for ML-generated metadata validation. Our results show that the proposed label assignment approaches can generate both generic and highly specific text labels for the unlabeled texts, with up to 44% of the labels matching those suggested by an ML keyword extraction algorithm.
Submitted 8 November, 2023;
originally announced November 2023.
-
In-situ quantification of gamma-ray and beta-only emitting radionuclides
Authors:
Kai Vetter,
Donald Gunter,
Paul Luke,
Victor Negut,
Ryan Pavlovsky,
Brian Plimley,
Joanna Szornel
Abstract:
The prompt and in-situ assessment of non-gamma-ray-emitting radionuclides such as Sr-90 remains an outstanding challenge, particularly in radiological emergency response and consequence management situations. We have developed a new concept to quantitatively assess a wide range of radionuclides, including beta-only emitters, using coplanar-grid CdZnTe detectors that provide depth-of-interaction sensing. By combining measurements with and without an electron absorber, we demonstrate the feasibility of detecting and identifying Sr-90 and other radionuclides with a sensitivity of about 1 $μ$Ci/m$^2$ or $3.7 \times 10^4$ Bq/m$^2$, which is 10% of the Derived Response Level, in less than 60 minutes. The new compact instrument can be used in the field or mobile laboratories to quickly assess a wide range of samples with sufficient sensitivity and specificity to provide critical guidance in the response after radiological incidents.
Submitted 9 April, 2023;
originally announced April 2023.
-
Coded Aperture and Compton Imaging for the Development of $^{225}$Ac-based Radiopharmaceuticals
Authors:
Emily A. Frame,
Kondapa N. Bobba,
Donald L. Gunter,
Lucian Mihailescu,
Anil P. Bidkar,
Robert R. Flavell,
Kai Vetter
Abstract:
Targeted alpha-particle therapy (TAT) has great promise as a cancer treatment. Arguably the most promising TAT radionuclide that has been proposed is $^{225}$Ac. The development of $^{225}$Ac-based radiopharmaceuticals has been hampered due to the lack of effective means to study the daughter redistribution of these agents in small animals at the preclinical stage. The ability to directly image the daughters, namely $^{221}$Fr and $^{213}$Bi, via their gamma-ray emissions would be a boon for preclinical studies. That said, conventional medical imaging modalities, including single photon emission computed tomography (SPECT) based on pinhole collimation, cannot be employed due to sensitivity limitations. As an alternative, we propose the use of both coded aperture and Compton imaging with the former modality suited to the 218-keV gamma-ray emission of $^{221}$Fr and the latter suited to the 440-keV gamma-ray emission of $^{213}$Bi. This work includes coded aperture images of $^{221}$Fr and Compton images of $^{213}$Bi in tumor-bearing mice injected with $^{225}$Ac-based radiopharmaceuticals. These results are the first demonstration of visualizing and quantifying the $^{225}$Ac daughters in small animals via coded aperture and Compton imaging and serve as a stepping stone for future radiopharmaceutical studies.
Submitted 5 April, 2023; v1 submitted 15 December, 2022;
originally announced December 2022.
-
Free-moving Quantitative Gamma-ray Imaging
Authors:
Daniel Hellfeld,
Mark S. Bandstra,
Jayson R. Vavrek,
Donald L. Gunter,
Joseph C. Curtis,
Marco Salathe,
Ryan Pavlovsky,
Victor Negut,
Paul J. Barton,
Joshua W. Cates,
Brian J. Quiter,
Reynold J. Cooper,
Kai Vetter,
Tenzing H. Y. Joshi
Abstract:
The ability to map and estimate the activity of radiological source distributions in unknown three-dimensional environments has applications in the prevention and response to radiological accidents or threats as well as the enforcement and verification of international nuclear non-proliferation agreements. Such a capability requires well-characterized detector response functions, accurate time-dependent detector position and orientation data, a digitized representation of the surrounding 3D environment, and appropriate image reconstruction and uncertainty quantification methods. We have previously demonstrated 3D mapping of gamma-ray emitters with free-moving detector systems on a relative intensity scale using a technique called Scene Data Fusion (SDF). Here we characterize the detector response of a multi-element gamma-ray imaging system using experimentally benchmarked Monte Carlo simulations and perform 3D mapping on an absolute intensity scale. We present experimental reconstruction results from hand-carried and airborne measurements with point-like and distributed sources in known configurations, demonstrating quantitative SDF in complex 3D environments.
Submitted 19 October, 2021; v1 submitted 8 July, 2021;
originally announced July 2021.
-
Towards An Implementation of the Subset-sum Problem on the IBM Quantum Experience
Authors:
David Gunter,
Toks Adedoyin
Abstract:
In seeking out an algorithm to test out the capability of the IBM Quantum Experience quantum computer, we were given a review paper covering various algorithms for solving the subset-sum problem, including both classical and quantum algorithms. The paper went on to present a novel algorithm that beat the previous best algorithm known at the time. The complex nature of the algorithm made it difficult to see a path for implementation on the Quantum Experience machine and the exponential cost - only slightly better than the best classical algorithm - left us looking for a different approach for solving this problem. We present here a new quantum algorithm for solving the subset-sum problem that for many cases should lead to O(poly(n))-time to solution. The work is reminiscent of the verification procedure used in a polynomial-time algorithm for the quantum Arthur-Merlin games presented elsewhere, where the use of a quantum binary search to find a maximum eigenvalue in the final output stage has been adapted to the subset-sum problem as in another paper.
Submitted 4 December, 2019;
originally announced December 2019.
-
Quantum Algorithm Implementations for Beginners
Authors:
Abhijith J.,
Adetokunbo Adedoyin,
John Ambrosiano,
Petr Anisimov,
William Casper,
Gopinath Chennupati,
Carleton Coffrin,
Hristo Djidjev,
David Gunter,
Satish Karra,
Nathan Lemons,
Shizeng Lin,
Alexander Malyzhenkov,
David Mascarenas,
Susan Mniszewski,
Balu Nadiga,
Daniel O'Malley,
Diane Oyen,
Scott Pakin,
Lakshman Prasad,
Randy Roberts,
Phillip Romero,
Nandakishore Santhi,
Nikolai Sinitsyn,
Pieter J. Swart, et al. (9 additional authors not shown)
Abstract:
As quantum computers become available to the general public, the need has arisen to train a cohort of quantum programmers, many of whom have been developing classical computer programs for most of their careers. While currently available quantum computers have fewer than 100 qubits, quantum computing hardware is widely expected to grow in terms of qubit count, quality, and connectivity. This review aims to explain the principles of quantum programming, which are quite different from classical programming, with straightforward algebra that makes understanding of the underlying fascinating quantum mechanical principles optional. We give an introduction to quantum computing algorithms and their implementation on real quantum hardware. We survey 20 different quantum algorithms, attempting to describe each in a succinct and self-contained fashion. We show how these algorithms can be implemented on IBM's quantum computer, and in each case, we discuss the results of the implementation with respect to differences between the simulator and the actual hardware runs. This article introduces computer scientists, physicists, and engineers to quantum algorithms and provides a blueprint for their implementations.
Submitted 26 June, 2022; v1 submitted 10 April, 2018;
originally announced April 2018.
-
Report on the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3)
Authors:
Daniel S. Katz,
Sou-Cheng T. Choi,
Kyle E. Niemeyer,
James Hetherington,
Frank Löffler,
Dan Gunter,
Ray Idaszak,
Steven R. Brandt,
Mark A. Miller,
Sandra Gesing,
Nick D. Jones,
Nic Weber,
Suresh Marru,
Gabrielle Allen,
Birgit Penzenstadler,
Colin C. Venters,
Ethan Davis,
Lorraine Hwang,
Ilian Todorov,
Abani Patra,
Miguel de Val-Borro
Abstract:
This report records and discusses the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3). The report includes a description of the keynote presentation of the workshop, which served as an overview of sustainable scientific software. It also summarizes a set of lightning talks in which speakers highlighted to-the-point lessons and challenges pertaining to sustaining scientific software. The final and main contribution of the report is a summary of the discussions, future steps, and future organization for a set of self-organized working groups on topics including developing pathways to funding scientific software; constructing useful common metrics for crediting software stakeholders; identifying principles for sustainable software engineering design; reaching out to research software organizations around the world; and building communities for software sustainability. For each group, we include a point of contact and a landing page that can be used by those who want to join that group's future activities. The main challenge left by the workshop is to see if the groups will execute these activities that they have scheduled, and how the WSSSPE community can encourage this to happen.
Submitted 6 February, 2016;
originally announced February 2016.
-
User Applications Driven by the Community Contribution Framework MPContribs in the Materials Project
Authors:
Patrick Huck,
Dan Gunter,
Shreyas Cholia,
Donald Winston,
Alpha N'Diaye,
Kristin Persson
Abstract:
This work discusses how the MPContribs framework in the Materials Project (MP) allows user-contributed data to be shown and analyzed alongside the core MP database. The Materials Project is a searchable database of electronic structure properties of over 65,000 bulk solid materials that is accessible through a web-based science-gateway. We describe the motivation for enabling user contributions to the materials data and present the framework's features and challenges in the context of two real applications. These use-cases illustrate how scientific collaborations can build applications with their own "user-contributed" data using MPContribs. The Nanoporous Materials Explorer application provides a unique search interface to a novel dataset of hundreds of thousands of materials, each with tables of user-contributed values related to material adsorption and density at varying temperature and pressure. The Unified Theoretical and Experimental x-ray Spectroscopy application discusses a full workflow for the association, dissemination and combined analyses of experimental data from the Advanced Light Source with MP's theoretical core data, using MPContribs tools for data formatting, management and exploration. The capabilities being developed for these collaborations are serving as the model for how new materials data can be incorporated into the Materials Project website with minimal staff overhead while giving powerful tools for data search and display to the user community.
Submitted 19 October, 2015;
originally announced October 2015.
-
A Community Contribution Framework for Sharing Materials Data with Materials Project
Authors:
Patrick Huck,
Anubhav Jain,
Dan Gunter,
Donald Winston,
Kristin Persson
Abstract:
As scientific discovery becomes increasingly data-driven, software platforms are needed to efficiently organize and disseminate data from disparate sources. This is certainly the case in the field of materials science. For example, Materials Project has generated computational data on over 60,000 chemical compounds and has made that data available through a web portal and REST interface. However, such portals must seek to incorporate community submissions to expand the scope of scientific data sharing. In this paper, we describe MPContribs, a computing/software infrastructure to integrate and organize contributions of simulated or measured materials data from users. Our solution supports complex submissions and provides interfaces that allow contributors to share analyses and graphs. A RESTful API exposes mechanisms for book-keeping, retrieval and aggregation of submitted entries, as well as persistent URIs or DOIs that can be used to reference the data in publications. Our approach isolates contributed data from a host project's quality-controlled core data and yet enables analyses across the entire dataset, programmatically or through customized web apps. We expect the developed framework to enhance collaborative determination of material properties and to maximize the impact of each contributor's dataset. In the long term, MPContribs seeks to make Materials Project an institutional, and thus community-wide, memory for computational and experimental materials science.
Submitted 16 October, 2015;
originally announced October 2015.
-
GMA Instrumentation of the Athena Framework using NetLogger
Authors:
Craig E. Tull,
Dan Gunter,
Wim Lavrijsen,
David Quarrie,
Brian Tierney
Abstract:
Grid applications are, by their nature, wide-area distributed applications. This WAN aspect of Grid applications makes the use of conventional monitoring and instrumentation tools (such as top, gprof, LSF Monitor, etc.) impractical for verification that the application is running correctly and efficiently. To be effective, monitoring data must be "end-to-end", meaning that all components between the Grid application endpoints must be monitored. Instrumented applications can generate a large amount of monitoring data, so typically the instrumentation is off by default. For jobs running on a Grid, there needs to be a general mechanism to remotely activate the instrumentation in running jobs. The NetLogger Toolkit Activation Service provides this mechanism.
To demonstrate this, we have instrumented the ATLAS Athena Framework with NetLogger to generate monitoring events. We then use a GMA-based activation service to control NetLogger's trigger mechanism. The NetLogger trigger mechanism allows one to easily start, stop, or change the logging level of a running program by modifying a trigger file. We present here details of the design of the NetLogger implementation of the GMA-based activation service and the instrumentation service for Athena. We also describe how this activation service allows us to non-intrusively collect and visualize the ATLAS Athena Framework monitoring data.
Submitted 14 June, 2003;
originally announced June 2003.
-
Hadronic Production of Doubly Charmed Baryons via Charm Exitation in Proton
Authors:
D. A. Gunter,
V. A. Saleev
Abstract:
The production of baryons containing two charmed quarks, Xi_cc, in hadronic interactions at high energies and large transverse momenta is considered. It is supposed that the Xi_cc baryon is formed during a non-perturbative fragmentation of the (cc)-diquark, which was produced in the hard process of $c$-quark scattering from the colliding protons: c+c -> (cc) +g. It is shown that such a mechanism enhances the expected doubly charmed baryon production cross section at the Tevatron and LHC colliders by approximately a factor of 2 compared with predictions obtained in the model of gluon-gluon production of (cc)-diquarks in the leading order of perturbative QCD.
Submitted 18 April, 2001;
originally announced April 2001.
-
Implicit integration of the TDGL equations of superconductivity
Authors:
D. O. Gunter,
H. G. Kaper,
G. K. Leaf
Abstract:
This article is concerned with the integration of the time-dependent Ginzburg-Landau (TDGL) equations of superconductivity. Four algorithms, ranging from fully explicit to fully implicit, are presented and evaluated for stability, accuracy, and compute time. The benchmark problem for the evaluation is the equilibration of a vortex configuration in a superconductor that is embedded in a thin insulator and subject to an applied magnetic field.
Submitted 13 July, 2000;
originally announced July 2000.
-
Implicit Integration of the Time-Dependent Ginzburg-Landau Equations of Superconductivity
Authors:
D. O. Gunter,
H. G. Kaper,
G. K. Leaf
Abstract:
This article is concerned with the integration of the time-dependent Ginzburg-Landau (TDGL) equations of superconductivity. Four algorithms, ranging from fully explicit to fully implicit, are presented and evaluated for stability, accuracy, and compute time. The benchmark problem for the evaluation is the equilibration of a vortex configuration in a superconductor that is embedded in a thin insulator and subject to an applied magnetic field.
Submitted 25 June, 1999;
originally announced June 1999.