-
One Thousand and One Hours: Self-driving Motion Prediction Dataset
Authors:
John Houston,
Guido Zuidhof,
Luca Bergamini,
Yawei Ye,
Long Chen,
Ashesh Jain,
Sammy Omari,
Vladimir Iglovikov,
Peter Ondruska
Abstract:
Motivated by the impact of large-scale datasets on ML systems, we present the largest self-driving dataset for motion prediction to date, containing over 1,000 hours of data. This was collected by a fleet of 20 autonomous vehicles along a fixed route in Palo Alto, California, over a four-month period. It consists of 170,000 scenes, where each scene is 25 seconds long and captures the perception output of the self-driving system, which encodes the precise positions and motions of nearby vehicles, cyclists, and pedestrians over time. On top of this, the dataset contains a high-definition semantic map with 15,242 labelled elements and a high-definition aerial view over the area. We show that using a dataset of this size dramatically improves performance for key self-driving problems. Combined with the provided software kit, this collection forms the largest and most detailed dataset to date for the development of self-driving machine learning tasks, such as motion forecasting, motion planning and simulation. The full dataset is available at http://level5.lyft.com/.
Submitted 16 November, 2020; v1 submitted 25 June, 2020;
originally announced June 2020.
-
Context-aware stacked convolutional neural networks for classification of breast carcinomas in whole-slide histopathology images
Authors:
Babak Ehteshami Bejnordi,
Guido Zuidhof,
Maschenka Balkenhol,
Meyke Hermsen,
Peter Bult,
Bram van Ginneken,
Nico Karssemeijer,
Geert Litjens,
Jeroen van der Laak
Abstract:
Automated classification of histopathological whole-slide images (WSI) of breast tissue requires analysis at very high resolutions with a large contextual area. In this paper, we present context-aware stacked convolutional neural networks (CNN) for classification of breast WSIs into normal/benign, ductal carcinoma in situ (DCIS), and invasive ductal carcinoma (IDC). We first train a CNN using high pixel resolution patches to capture cellular level information. The feature responses generated by this model are then fed as input to a second CNN, stacked on top of the first. Training of this stacked architecture with large input patches enables learning of fine-grained (cellular) details and global interdependence of tissue structures. Our system is trained and evaluated on a dataset containing 221 WSIs of H&E stained breast tissue specimens. The system achieves an AUC of 0.962 for the binary classification of non-malignant and malignant slides and obtains a three-class accuracy of 81.3% for classification of WSIs into normal/benign, DCIS, and IDC, demonstrating its potential for routine diagnostics.
Submitted 10 May, 2017;
originally announced May 2017.
-
Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge
Authors:
Arnaud Arindra Adiyoso Setio,
Alberto Traverso,
Thomas de Bel,
Moira S. N. Berens,
Cas van den Bogaard,
Piergiorgio Cerello,
Hao Chen,
Qi Dou,
Maria Evelina Fantacci,
Bram Geurts,
Robbert van der Gugten,
Pheng Ann Heng,
Bart Jansen,
Michael M. J. de Kaste,
Valentin Kotov,
Jack Yu-Hung Lin,
Jeroen T. M. C. Manders,
Alexander Sónora-Mengana,
Juan Carlos García-Naranjo,
Evgenia Papavasileiou,
Mathias Prokop,
Marco Saletta,
Cornelia M Schaefer-Prokop,
Ernst T. Scholten,
Luuk Scholten,
et al. (7 additional authors not shown)
Abstract:
Automatic detection of pulmonary nodules in thoracic computed tomography (CT) scans has been an active area of research for the last two decades. However, there have been only a few studies that provide a comparative performance evaluation of different systems on a common database. We have therefore set up the LUNA16 challenge, an objective evaluation framework for automatic nodule detection algorithms using the largest publicly available reference database of chest CT scans, the LIDC-IDRI data set. In LUNA16, participants develop their algorithm and upload their predictions on 888 CT scans in one of the two tracks: 1) the complete nodule detection track where a complete CAD system should be developed, or 2) the false positive reduction track where a provided set of nodule candidates should be classified. This paper describes the setup of LUNA16 and presents the results of the challenge so far. Moreover, the impact of combining individual systems on the detection performance was investigated. It was observed that the leading solutions employed convolutional networks and used the provided set of nodule candidates. The combination of these solutions achieved an excellent sensitivity of over 95% at fewer than 1.0 false positives per scan. This highlights the potential of combining algorithms to improve the detection performance. Our observer study with four expert readers has shown that the best system detects nodules that were missed by expert readers who originally annotated the LIDC-IDRI data. We released this set of additional nodules for further development of CAD systems.
Submitted 15 July, 2017; v1 submitted 23 December, 2016;
originally announced December 2016.