-
CORD-19: The COVID-19 Open Research Dataset
Authors:
Lucy Lu Wang,
Kyle Lo,
Yoganand Chandrasekhar,
Russell Reas,
Jiangjiang Yang,
Doug Burdick,
Darrin Eide,
Kathryn Funk,
Yannis Katsis,
Rodney Kinney,
Yunyao Li,
Ziyang Liu,
William Merrill,
Paul Mooney,
Dewey Murdick,
Devvret Rishi,
Jerry Sheehan,
Zhihong Shen,
Brandon Stilson,
Alex Wade,
Kuansan Wang,
Nancy Xin Ru Wang,
Chris Wilhelm,
Boya Xie,
Douglas Raymond
, et al. (3 additional authors not shown)
Abstract:
The COVID-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on COVID-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development of text mining and information retrieval systems over its rich collection of metadata and structured full text papers. Since its release, CORD-19 has been downloaded over 200K times and has served as the basis of many COVID-19 text mining and discovery systems. In this article, we describe the mechanics of dataset construction, highlighting challenges and key design decisions, provide an overview of how CORD-19 has been used, and describe several shared tasks built around the dataset. We hope this resource will continue to bring together the computing community, biomedical experts, and policy makers in the search for effective treatments and management policies for COVID-19.
Submitted 10 July, 2020; v1 submitted 22 April, 2020;
originally announced April 2020.
-
Modified Legendre-Gauss-Radau Collocation Method for Solving Optimal Control Problems with Nonsmooth Solutions
Authors:
Joseph D. Eide,
William W. Hager,
Anil V. Rao
Abstract:
A new method is developed for solving optimal control problems whose solutions are nonsmooth. The method developed in this paper employs a modified form of the Legendre-Gauss-Radau orthogonal direct collocation method. This modified Legendre-Gauss-Radau method adds two variables and two constraints at the end of a mesh interval when compared with a previously developed standard Legendre-Gauss-Radau collocation method. The two additional variables are the time at the interface between two mesh intervals and the control at the end of each mesh interval. The two additional constraints are a collocation condition for those differential equations that depend upon the control and an inequality constraint on the control at the endpoint of each mesh interval. The additional constraints modify the search space of the nonlinear programming problem such that an accurate approximation to the location of the nonsmoothness is obtained. The transformed adjoint system of the modified Legendre-Gauss-Radau method is then developed. Using this transformed adjoint system, a method is developed to transform the Lagrange multipliers of the nonlinear programming problem to the costate of the optimal control problem. Furthermore, it is shown that the costate estimate satisfies one of the Weierstrass-Erdmann optimality conditions. Finally, the method developed in this paper is demonstrated on an example whose solution is nonsmooth.
Submitted 8 November, 2020; v1 submitted 7 September, 2019;
originally announced September 2019.
-
A Scalable Hybrid Research Paper Recommender System for Microsoft Academic
Authors:
Anshul Kanakia,
Zhihong Shen,
Darrin Eide,
Kuansan Wang
Abstract:
We present the design and methodology for the large-scale hybrid paper recommender system used by Microsoft Academic. The system provides recommendations for approximately 160 million English research papers and patents. Our approach handles incomplete citation information while also alleviating the cold-start problem that often affects other recommender systems. We use the Microsoft Academic Graph (MAG), titles, and available abstracts of research papers to build a recommendation list for all documents, thereby combining co-citation and content-based approaches. Tuning system parameters also allows for blending and prioritization of each approach, which in turn allows us to balance paper novelty versus authority in recommendation results. We evaluate the generated recommendations via a user study of 40 participants, with over 2400 recommendation pairs graded, and discuss the quality of the results using P@10 and nDCG scores. We see that there is a strong correlation between participant scores and the similarity rankings produced by our system, but that additional focus needs to be put towards improving recommender precision, particularly for content-based recommendations. The results of the user survey and associated analysis scripts are made available via GitHub, and the recommendations produced by our system are available as part of the MAG on Azure to facilitate further research and light up novel research paper recommendation applications.
Submitted 21 May, 2019;
originally announced May 2019.
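Note on the metrics named in the recommender abstract above: P@10 and nDCG are standard ranking measures, and a minimal sketch of how they are commonly computed follows. It is not the authors' analysis script. The function names, the linear gain, the log2 discount, and the choice to build the ideal ordering from the graded list itself are all assumptions made for illustration, and the abstract does not state the grading scale used in the user study.

```python
import math
from typing import Sequence


def precision_at_k(relevant: Sequence[bool], k: int = 10) -> float:
    """Fraction of the top-k ranked items judged relevant (P@k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Items missing from a short list count as non-relevant.
    return sum(1 for r in relevant[:k] if r) / k


def dcg_at_k(grades: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with linear gain and log2 discount."""
    # Rank 1 has discount log2(2) = 1, rank 2 has log2(3), and so on.
    return sum(g / math.log2(i + 2) for i, g in enumerate(grades[:k]))


def ndcg_at_k(grades: Sequence[float], k: int = 10) -> float:
    """DCG normalized by the DCG of the best possible ordering.

    Assumption: the ideal ordering is the same grades sorted descending.
    """
    ideal = dcg_at_k(sorted(grades, reverse=True), k)
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical participant grades for one ranked recommendation list.
    grades = [3, 2, 0, 1]
    print(precision_at_k([g > 0 for g in grades], k=4))
    print(ndcg_at_k(grades, k=4))
```

P@k treats each judgment as binary, whereas nDCG rewards placing higher-graded items nearer the top, which is why the two are often reported together.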