-
Lung and Colon Cancer Histopathological Image Dataset (LC25000)
Authors:
Andrew A. Borkowski,
Marilyn M. Bui,
L. Brannon Thomas,
Catherine P. Wilson,
Lauren A. DeLand,
Stephen M. Mastorides
Abstract:
The field of Machine Learning, a subset of Artificial Intelligence, has led to remarkable advancements in many areas, including medicine. Machine Learning algorithms require large datasets to train computer models successfully. Although there are medical image datasets available, more image datasets are needed from a variety of medical entities, especially cancer pathology. Even more scarce are ML-ready image datasets. To address this need, we created an image dataset (LC25000) with 25,000 color images in 5 classes. Each class contains 5,000 images of the following histologic entities: colon adenocarcinoma, benign colonic tissue, lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue. All images are de-identified, HIPAA compliant, validated, and freely available for download to AI researchers.
Submitted 16 December, 2019;
originally announced December 2019.
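The class layout described in the LC25000 abstract above (5 classes, 5,000 images each) lends itself to a per-class, or stratified, train/test split. The following is only a minimal sketch: the class identifiers, the synthetic image IDs, and the 20% test fraction are illustrative assumptions, not the dataset's actual folder names or a split used by the authors.

```python
import random
from collections import Counter

# Five histologic entities from the abstract. These short identifiers are
# hypothetical and are NOT the dataset's real folder or file names.
CLASSES = [
    "colon_adenocarcinoma",
    "colon_benign",
    "lung_adenocarcinoma",
    "lung_squamous_cell_carcinoma",
    "lung_benign",
]
IMAGES_PER_CLASS = 5000  # stated in the abstract


def build_index():
    """Return a synthetic (image_id, label) list standing in for the files."""
    return [
        (f"{name}_{i:04d}", name)
        for name in CLASSES
        for i in range(IMAGES_PER_CLASS)
    ]


def stratified_split(index, test_fraction=0.2, seed=0):
    """Split so that every class contributes the same share to the test set.

    test_fraction is an assumed value chosen for illustration.
    """
    rng = random.Random(seed)
    by_class = {}
    for image_id, label in index:
        by_class.setdefault(label, []).append(image_id)

    train, test = [], []
    for label, ids in by_class.items():
        ids = ids[:]  # copy so the caller's data is not shuffled in place
        rng.shuffle(ids)
        n_test = round(len(ids) * test_fraction)
        test.extend((image_id, label) for image_id in ids[:n_test])
        train.extend((image_id, label) for image_id in ids[n_test:])
    return train, test


if __name__ == "__main__":
    train, test = stratified_split(build_index())
    print(len(train), len(test))
    print(Counter(label for _, label in test))
```

Splitting within each class keeps the test set balanced in the same way as the full dataset, so per-class metrics are computed on equal numbers of images.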
-
Google Auto ML versus Apple Create ML for Histopathologic Cancer Diagnosis; Which Algorithms Are Better?
Authors:
Andrew A. Borkowski,
Catherine P. Wilson,
Steven A. Borkowski,
L. Brannon Thomas,
Lauren A. Deland,
Stefanie J. Grewe,
Stephen M. Mastorides
Abstract:
Artificial Intelligence is set to revolutionize multiple fields in the coming years. One subset of AI, machine learning, shows immense potential for application in a diverse set of medical specialties, including diagnostic pathology. In this study, we investigate the utility of Apple Create ML and Google Cloud Auto ML, two machine learning platforms, in a variety of pathological scenarios involving lung and colon pathology. First, we evaluate the ability of the platforms to differentiate normal lung tissue from cancerous lung tissue. We also examine and compare their ability to distinguish two subtypes of lung cancer (adenocarcinoma and squamous cell carcinoma). Similarly, as with lung tissue, we assess the ability of the two platforms to differentiate colon adenocarcinoma from normal colon. Cases of colon adenocarcinoma are also evaluated for the presence or absence of a KRAS gene mutation. Finally, we examine the ability of the Apple and Google platforms to differentiate adenocarcinomas of lung origin from those of colon origin. In our trained models for lung and colon cancer diagnosis, both Apple and Google machine learning algorithms performed very well individually, with no statistically significant differences between the two platforms. However, some critical factors set them apart. Apple Create ML can be used on local computers but is limited to the Apple ecosystem. Google Auto ML is not platform specific but runs only in Google Cloud, with associated computational fees. In the end, both are excellent machine learning tools with great potential in the field of diagnostic pathology, and the choice between them depends on personal preference, programming experience, and available storage space.
Submitted 19 March, 2019;
originally announced March 2019.
-
Apple Machine Learning Algorithms Successfully Detect Colon Cancer but Fail to Predict KRAS Mutation Status
Authors:
Andrew A. Borkowski,
Catherine P. Wilson,
Steven A. Borkowski,
L. Brannon Thomas,
Lauren A. Deland,
Stephen M. Mastorides
Abstract:
Colon cancer is the second leading cause of cancer-related death in the United States of America. Its prognosis has significantly improved with the advancement of targeted therapies based on underlying molecular changes. The KRAS mutation is one of the most frequent molecular alterations seen in colon cancer, and its presence can affect treatment selection. We attempted to use Apple machine learning algorithms to diagnose colon cancer and predict the KRAS mutation status from histopathological images. We captured 250 colon cancer images and 250 benign colon tissue images. Half of the colon cancer images were captured from KRAS mutation-positive tumors and the other half from KRAS mutation-negative tumors. Next, we created an Image Classifier model using the Apple CreateML machine learning module. The trained and validated model successfully differentiated between colon cancer and benign colon tissue images with 98% recall and 98% precision. However, our model failed to reliably identify KRAS mutations, with a highest realized accuracy of 66%. Although not yet perfected, Apple CreateML modules could in the near future be used in diagnostic smartphone-based applications and potentially alleviate shortages of medical professionals in understaffed parts of the world.
Submitted 15 January, 2019; v1 submitted 11 December, 2018;
originally announced December 2018.
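The abstract above reports recall, precision, and accuracy for a binary classifier. As a reminder of how these three figures relate to confusion-matrix counts, here is a minimal sketch. The counts used in the example are hypothetical and chosen only for illustration; they are not the study's actual results, which the abstract does not break down.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are truly positive."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of true positives that the model actually found."""
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    # Hypothetical counts for a cancer-vs-benign test set (NOT from the paper):
    # 50 cancer images and 50 benign images, one error in each direction.
    tp, fn = 49, 1   # cancer images classified as cancer / as benign
    tn, fp = 49, 1   # benign images classified as benign / as cancer

    print(f"precision = {precision(tp, fp):.2f}")
    print(f"recall    = {recall(tp, fn):.2f}")
    print(f"accuracy  = {accuracy(tp, tn, fp, fn):.2f}")
```

Precision and recall are reported separately because they penalize different errors: a false positive lowers precision, while a missed cancer (false negative) lowers recall.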