Search | arXiv e-print repository

ContrastDiagnosis: Enhancing Interpretability in Lung Nodule Diagnosis Using Contrastive Learning

Authors: Chenglong Wang, Yinqiao Yi, Yida Wang, Chengxiu Zhang, Yun Liu, Kensaku Mori, Mei Yuan, Guang Yang

Abstract: With the ongoing development of deep learning, an increasing number of AI models have surpassed the performance levels of human clinical practitioners. However, the prevalence of AI diagnostic products in actual clinical practice remains significantly lower than desired. One crucial reason for this gap is the so-called `black box' nature of AI models. Clinicians' distrust of black box models has d… ▽ More With the ongoing development of deep learning, an increasing number of AI models have surpassed the performance levels of human clinical practitioners. However, the prevalence of AI diagnostic products in actual clinical practice remains significantly lower than desired. One crucial reason for this gap is the so-called `black box' nature of AI models. Clinicians' distrust of black box models has directly hindered the clinical deployment of AI products. To address this challenge, we propose ContrastDiagnosis, a straightforward yet effective interpretable diagnosis framework. This framework is designed to introduce inherent transparency and provide extensive post-hoc explainability for deep learning model, making them more suitable for clinical medical diagnosis. ContrastDiagnosis incorporates a contrastive learning mechanism to provide a case-based reasoning diagnostic rationale, enhancing the model's transparency and also offers post-hoc interpretability by highlighting similar areas. High diagnostic accuracy was achieved with AUC of 0.977 while maintain a high transparency and explainability. △ Less

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2309.12325 [pdf, other]

FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare

Authors: Karim Lekadir, Aasa Feragen, Abdul Joseph Fofanah, Alejandro F Frangi, Alena Buyx, Anais Emelie, Andrea Lara, Antonio R Porras, An-Wen Chan, Arcadi Navarro, Ben Glocker, Benard O Botwe, Bishesh Khanal, Brigit Beger, Carol C Wu, Celia Cintas, Curtis P Langlotz, Daniel Rueckert, Deogratias Mzurikwao, Dimitrios I Fotiadis, Doszhan Zhussupov, Enzo Ferrante, Erik Meijering, Eva Weicken, Fabio A González , et al. (93 additional authors not shown)

Abstract: Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real world adoption, it is essential that medical AI tools are trusted and accepted… ▽ More Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real world adoption, it is essential that medical AI tools are trusted and accepted by patients, clinicians, health organisations and authorities. This work describes the FUTURE-AI guideline as the first international consensus framework for guiding the development and deployment of trustworthy AI tools in healthcare. The FUTURE-AI consortium was founded in 2021 and currently comprises 118 inter-disciplinary experts from 51 countries representing all continents, including AI scientists, clinicians, ethicists, and social scientists. Over a two-year period, the consortium defined guiding principles and best practices for trustworthy AI through an iterative process comprising an in-depth literature review, a modified Delphi survey, and online consensus meetings. The FUTURE-AI framework was established based on 6 guiding principles for trustworthy AI in healthcare, i.e. Fairness, Universality, Traceability, Usability, Robustness and Explainability. Through consensus, a set of 28 best practices were defined, addressing technical, clinical, legal and socio-ethical dimensions. The recommendations cover the entire lifecycle of medical AI, from design, development and validation to regulation, deployment, and monitoring. FUTURE-AI is a risk-informed, assumption-free guideline which provides a structured approach for constructing medical AI tools that will be trusted, deployed and adopted in real-world practice. Researchers are encouraged to take the recommendations into account in proof-of-concept stages to facilitate future translation towards clinical practice of medical AI. △ Less

Submitted 11 August, 2023; originally announced September 2023.

ACM Class: I.2.0; I.4.0; I.5.0

arXiv:2308.04070 [pdf, other]

ConDistFL: Conditional Distillation for Federated Learning from Partially Annotated Data

Authors: Pochuan Wang, Chen Shen, Weichung Wang, Masahiro Oda, Chiou-Shann Fuh, Kensaku Mori, Holger R. Roth

Abstract: Develo** a generalized segmentation model capable of simultaneously delineating multiple organs and diseases is highly desirable. Federated learning (FL) is a key technology enabling the collaborative development of a model without exchanging training data. However, the limited access to fully annotated training data poses a major challenge to training generalizable models. We propose "ConDistFL… ▽ More Develo** a generalized segmentation model capable of simultaneously delineating multiple organs and diseases is highly desirable. Federated learning (FL) is a key technology enabling the collaborative development of a model without exchanging training data. However, the limited access to fully annotated training data poses a major challenge to training generalizable models. We propose "ConDistFL", a framework to solve this problem by combining FL with knowledge distillation. Local models can extract the knowledge of unlabeled organs and tumors from partially annotated data from the global model with an adequately designed conditional probability representation. We validate our framework on four distinct partially annotated abdominal CT datasets from the MSD and KiTS19 challenges. The experimental results show that the proposed framework significantly outperforms FedAvg and FedOpt baselines. Moreover, the performance on an external test dataset demonstrates superior generalizability compared to models trained on each dataset separately. Our ablation study suggests that ConDistFL can perform well without frequent aggregation, reducing the communication cost of FL. Our implementation will be available at https://github.com/NVIDIA/NVFlare/tree/dev/research/condist-fl. △ Less

Submitted 8 August, 2023; originally announced August 2023.

arXiv:2308.02266 [pdf, other]

SURE-Val: Safe Urban Relevance Extension and Validation

Authors: Kai Storms, Ken Mori, Steven Peters

Abstract: To evaluate perception components of an automated driving system, it is necessary to define the relevant objects. While the urban domain is popular among perception datasets, relevance is insufficiently specified for this domain. Therefore, this work adopts an existing method to define relevance in the highway domain and expands it to the urban domain. While different conceptualizations and defini… ▽ More To evaluate perception components of an automated driving system, it is necessary to define the relevant objects. While the urban domain is popular among perception datasets, relevance is insufficiently specified for this domain. Therefore, this work adopts an existing method to define relevance in the highway domain and expands it to the urban domain. While different conceptualizations and definitions of relevance are present in literature, there is a lack of methods to validate these definitions. Therefore, this work presents a novel relevance validation method leveraging a motion prediction component. The validation leverages the idea that removing irrelevant objects should not influence a prediction component which reflects human driving behavior. The influence on the prediction is quantified by considering the statistical distribution of prediction performance across a large-scale dataset. The validation procedure is verified using criteria specifically designed to exclude relevant objects. The validation method is successfully applied to the relevance criteria from this work, thus supporting their validity. △ Less

Submitted 4 August, 2023; originally announced August 2023.

Comments: Accepted at Uni-DAS e.V. Workshop Fahrerassistenz und automatisiertes Fahren

arXiv:2307.14058 [pdf, other]

Towards Establishing Systematic Classification Requirements for Automated Driving

Authors: Ken T. Mori, Trent Brown, Steven Peters

Abstract: Despite the presence of the classification task in many different benchmark datasets for perception in the automotive domain, few efforts have been undertaken to define consistent classification requirements. This work addresses the topic by proposing a structured method to generate a classification structure. First, legal categories are identified based on behavioral requirements for the vehicle.… ▽ More Despite the presence of the classification task in many different benchmark datasets for perception in the automotive domain, few efforts have been undertaken to define consistent classification requirements. This work addresses the topic by proposing a structured method to generate a classification structure. First, legal categories are identified based on behavioral requirements for the vehicle. This structure is further substantiated by considering the two aspects of collision safety for objects as well as perceptual categories. A classification hierarchy is obtained by applying the method to an exemplary legal text. A comparison of the results with benchmark dataset categories shows limited agreement. This indicates the necessity for explicit consideration of legal requirements regarding perception. △ Less

Submitted 26 July, 2023; originally announced July 2023.

Comments: Accepted to IEEE IV 2023

arXiv:2307.10873 [pdf, other]

Conservative Estimation of Perception Relevance of Dynamic Objects for Safe Trajectories in Automotive Scenarios

Authors: Ken Mori, Kai Storms, Steven Peters

Abstract: Having efficient testing strategies is a core challenge that needs to be overcome for the release of automated driving. This necessitates clear requirements as well as suitable methods for testing. In this work, the requirements for perception modules are considered with respect to relevance. The concept of relevance currently remains insufficiently defined and specified. In this paper, we propose… ▽ More Having efficient testing strategies is a core challenge that needs to be overcome for the release of automated driving. This necessitates clear requirements as well as suitable methods for testing. In this work, the requirements for perception modules are considered with respect to relevance. The concept of relevance currently remains insufficiently defined and specified. In this paper, we propose a novel methodology to overcome this challenge by exemplary application to collision safety in the highway domain. Using this general system and use case specification, a corresponding concept for relevance is derived. Irrelevant objects are thus defined as objects which do not limit the set of safe actions available to the ego vehicle under consideration of all uncertainties. As an initial step, the use case is decomposed into functional scenarios with respect to collision relevance. For each functional scenario, possible actions of both the ego vehicle and any other dynamic object are formalized as equations. This set of possible actions is constrained by traffic rules, yielding relevance criteria. As a result, we present a conservative estimation which dynamic objects are relevant for perception and need to be considered for a complete evaluation. The estimation provides requirements which are applicable for offline testing and validation of perception components. A visualization is presented for examples from the highD dataset, showing the plausibility of the results. Finally, a possibility for a future validation of the presented relevance concept is outlined. △ Less

Submitted 20 July, 2023; originally announced July 2023.

arXiv:2303.14901 [pdf, other]

Identifying Suspicious Regions of Covid-19 by Abnormality-Sensitive Activation Map**

Authors: Ryo Toda, Hayato Itoh, Masahiro Oda, Yuichiro Hayashi, Yoshito Otake, Masahiro Hashimoto, Toshiaki Akashi, Shigeki Aoki, Kensaku Mori

Abstract: This paper presents a fully-automated method for the identification of suspicious regions of a coronavirus disease (COVID-19) on chest CT volumes. One major role of chest CT scanning in COVID-19 diagnoses is identification of an inflammation particular to the disease. This task is generally performed by radiologists through an interpretation of the CT volumes, however, because of the heavy workloa… ▽ More This paper presents a fully-automated method for the identification of suspicious regions of a coronavirus disease (COVID-19) on chest CT volumes. One major role of chest CT scanning in COVID-19 diagnoses is identification of an inflammation particular to the disease. This task is generally performed by radiologists through an interpretation of the CT volumes, however, because of the heavy workload, an automatic analysis method using a computer is desired. Most computer-aided diagnosis studies have addressed only a portion of the elements necessary for the identification. In this work, we realize the identification method through a classification task by using a 2.5-dimensional CNN with three-dimensional attention mechanisms. We visualize the suspicious regions by applying a backpropagation based on positive gradients to attention-weighted features. We perform experiments on an in-house dataset and two public datasets to reveal the generalization ability of the proposed method. The proposed architecture achieved AUCs of over 0.900 for all the datasets, and mean sensitivity $0.853 \pm 0.036$ and specificity $0.870 \pm 0.040$. The method can also identify notable lesions pointed out in the radiology report as suspicious regions. △ Less

Submitted 26 March, 2023; originally announced March 2023.

Comments: 10 pages, 3 figures

arXiv:2302.00899 [pdf, other]

doi 10.1080/21681163.2022.2151938

KST-Mixer: Kinematic Spatio-Temporal Data Mixer For Colon Shape Estimation

Authors: Masahiro Oda, Kazuhiro Furukawa, Nassir Navab, Kensaku Mori

Abstract: We propose a spatio-temporal mixing kinematic data estimation method to estimate the shape of the colon with deformations caused by colonoscope insertion. Endoscope tracking or a navigation system that navigates physicians to target positions is needed to reduce such complications as organ perforations. Although many previous methods focused to track bronchoscopes and surgical endoscopes, few numb… ▽ More We propose a spatio-temporal mixing kinematic data estimation method to estimate the shape of the colon with deformations caused by colonoscope insertion. Endoscope tracking or a navigation system that navigates physicians to target positions is needed to reduce such complications as organ perforations. Although many previous methods focused to track bronchoscopes and surgical endoscopes, few number of colonoscope tracking methods were proposed. This is because the colon largely deforms during colonoscope insertion. The deformation causes significant tracking errors. Colon deformation should be taken into account in the tracking process. We propose a colon shape estimation method using a Kinematic Spatio-Temporal data Mixer (KST-Mixer) that can be used during colonoscope insertions to the colon. Kinematic data of a colonoscope and the colon, including positions and directions of their centerlines, are obtained using electromagnetic and depth sensors. The proposed method separates the data into sub-groups along the spatial and temporal axes. The KST-Mixer extracts kinematic features and mix them along the spatial and temporal axes multiple times. We evaluated colon shape estimation accuracies in phantom studies. The proposed method achieved 11.92 mm mean Euclidean distance error, the smallest of the previous methods. Statistical analysis indicated that the proposed method significantly reduced the error compared to the previous methods. △ Less

Submitted 2 February, 2023; originally announced February 2023.

Comments: Accepted paper as an oral presentation at Joint MICCAI workshop 2022, AE-CAI/CARE/OR2.0. Received the Outstanding Paper Award

Journal ref: Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 2023

arXiv:2206.05275 [pdf, other]

Spatial-temporal Concept based Explanation of 3D ConvNets

Authors: Ying Ji, Yu Wang, Kensaku Mori, Jien Kato

Abstract: Recent studies have achieved outstanding success in explaining 2D image recognition ConvNets. On the other hand, due to the computation cost and complexity of video data, the explanation of 3D video recognition ConvNets is relatively less studied. In this paper, we present a 3D ACE (Automatic Concept-based Explanation) framework for interpreting 3D ConvNets. In our approach: (1) videos are represe… ▽ More Recent studies have achieved outstanding success in explaining 2D image recognition ConvNets. On the other hand, due to the computation cost and complexity of video data, the explanation of 3D video recognition ConvNets is relatively less studied. In this paper, we present a 3D ACE (Automatic Concept-based Explanation) framework for interpreting 3D ConvNets. In our approach: (1) videos are represented using high-level supervoxels, which is straightforward for human to understand; and (2) the interpreting framework estimates a score for each voxel, which reflects its importance in the decision procedure. Experiments show that our method can discover spatial-temporal concepts of different importance-levels, and thus can explore the influence of the concepts on a target task, such as action classification, in-depth. The codes are publicly available. △ Less

Submitted 9 June, 2022; originally announced June 2022.

arXiv:2201.05331 [pdf, ps, other]

doi 10.1007/978-3-642-40811-3_42

Semi-automated Virtual Unfolded View Generation Method of Stomach from CT Volumes

Authors: Masahiro Oda, Tomoaki Suito, Yuichiro Hayashi, Takayuki Kitasaka, Kazuhiro Furukawa, Ryoji Miyahara, Yoshiki Hirooka, Hidemi Goto, Gen Iinuma, Kazunari Misawa, Shigeru Nawano, Kensaku Mori

Abstract: CT image-based diagnosis of the stomach is developed as a new way of diagnostic method. A virtual unfolded (VU) view is suitable for displaying its wall. In this paper, we propose a semi-automated method for generating VU views of the stomach. Our method requires minimum manual operations. The determination of the unfolding forces and the termination of the unfolding process are automated. The unf… ▽ More CT image-based diagnosis of the stomach is developed as a new way of diagnostic method. A virtual unfolded (VU) view is suitable for displaying its wall. In this paper, we propose a semi-automated method for generating VU views of the stomach. Our method requires minimum manual operations. The determination of the unfolding forces and the termination of the unfolding process are automated. The unfolded shape of the stomach is estimated based on its radius. The unfolding forces are determined so that the stomach wall is deformed to the expected shape. The iterative deformation process is terminated if the difference of the shapes between the deformed shape and expected shape is small. Our experiments using 67 CT volumes showed that our proposed method can generate good VU views for 76.1% cases. △ Less

Submitted 14 January, 2022; originally announced January 2022.

Comments: Accepted paper as a poster presentation at MICCAI 2013 (International Conference on Medical Image Computing and Computer-Assisted Intervention), Nagoya, Japan

Journal ref: Published in Proceedings of MICCAI 2013, LNCS 8149, pp.332-339, 2013

arXiv:2201.04918 [pdf, ps, other]

doi 10.1049/htl.2019.0071

Realistic Endoscopic Image Generation Method Using Virtual-to-real Image-domain Translation

Authors: Masahiro Oda, Kiyohito Tanaka, Hirotsugu Takabatake, Masaki Mori, Hiroshi Natori, Kensaku Mori

Abstract: This paper proposes a realistic image generation method for visualization in endoscopic simulation systems. Endoscopic diagnosis and treatment are performed in many hospitals. To reduce complications related to endoscope insertions, endoscopic simulation systems are used for training or rehearsal of endoscope insertions. However, current simulation systems generate non-realistic virtual endoscopic… ▽ More This paper proposes a realistic image generation method for visualization in endoscopic simulation systems. Endoscopic diagnosis and treatment are performed in many hospitals. To reduce complications related to endoscope insertions, endoscopic simulation systems are used for training or rehearsal of endoscope insertions. However, current simulation systems generate non-realistic virtual endoscopic images. To improve the value of the simulation systems, improvement of reality of their generated images is necessary. We propose a realistic image generation method for endoscopic simulation systems. Virtual endoscopic images are generated by using a volume rendering method from a CT volume of a patient. We improve the reality of the virtual endoscopic images using a virtual-to-real image-domain translation technique. The image-domain translator is implemented as a fully convolutional network (FCN). We train the FCN by minimizing a cycle consistency loss function. The FCN is trained using unpaired virtual and real endoscopic images. To obtain high quality image-domain translation results, we perform an image cleansing to the real endoscopic image set. We tested to use the shallow U-Net, U-Net, deep U-Net, and U-Net having residual units as the image-domain translator. The deep U-Net and U-Net having residual units generated quite realistic images. △ Less

Submitted 13 January, 2022; originally announced January 2022.

Comments: Accepted paper as an oral presentation at the Joint MICCAI workshop MIAR | AE-CAI | CARE 2019

Journal ref: Healthcare Technology Letters, Vol.6, No.6, pp.214-219, 2019

arXiv:2201.04485 [pdf, other]

doi 10.1080/21681163.2021.2012835

Depth Estimation from Single-shot Monocular Endoscope Image Using Image Domain Adaptation And Edge-Aware Depth Estimation

Authors: Masahiro Oda, Hayato Itoh, Kiyohito Tanaka, Hirotsugu Takabatake, Masaki Mori, Hiroshi Natori, Kensaku Mori

Abstract: We propose a depth estimation method from a single-shot monocular endoscopic image using Lambertian surface translation by domain adaptation and depth estimation using multi-scale edge loss. We employ a two-step estimation process including Lambertian surface translation from unpaired data and depth estimation. The texture and specular reflection on the surface of an organ reduce the accuracy of d… ▽ More We propose a depth estimation method from a single-shot monocular endoscopic image using Lambertian surface translation by domain adaptation and depth estimation using multi-scale edge loss. We employ a two-step estimation process including Lambertian surface translation from unpaired data and depth estimation. The texture and specular reflection on the surface of an organ reduce the accuracy of depth estimations. We apply Lambertian surface translation to an endoscopic image to remove these texture and reflections. Then, we estimate the depth by using a fully convolutional network (FCN). During the training of the FCN, improvement of the object edge similarity between an estimated image and a ground truth depth image is important for getting better results. We introduced a muti-scale edge loss function to improve the accuracy of depth estimation. We quantitatively evaluated the proposed method using real colonoscopic images. The estimated depth values were proportional to the real depth values. Furthermore, we applied the estimated depth images to automated anatomical location identification of colonoscopic images using a convolutional neural network. The identification accuracy of the network improved from 69.2% to 74.1% by using the estimated depth images. △ Less

Submitted 12 January, 2022; originally announced January 2022.

Comments: Accepted paper as an oral presentation at Joint MICCAI workshop 2021, AE-CAI/CARE/OR2.0

Journal ref: Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 2021

arXiv:2201.03053 [pdf, other]

doi 10.1007/978-3-030-90874-4_9

COVID-19 Infection Segmentation from Chest CT Images Based on Scale Uncertainty

Authors: Masahiro Oda, Tong Zheng, Yuichiro Hayashi, Yoshito Otake, Masahiro Hashimoto, Toshiaki Akashi, Shigeki Aoki, Kensaku Mori

Abstract: This paper proposes a segmentation method of infection regions in the lung from CT volumes of COVID-19 patients. COVID-19 spread worldwide, causing many infected patients and deaths. CT image-based diagnosis of COVID-19 can provide quick and accurate diagnosis results. An automated segmentation method of infection regions in the lung provides a quantitative criterion for diagnosis. Previous method… ▽ More This paper proposes a segmentation method of infection regions in the lung from CT volumes of COVID-19 patients. COVID-19 spread worldwide, causing many infected patients and deaths. CT image-based diagnosis of COVID-19 can provide quick and accurate diagnosis results. An automated segmentation method of infection regions in the lung provides a quantitative criterion for diagnosis. Previous methods employ whole 2D image or 3D volume-based processes. Infection regions have a considerable variation in their sizes. Such processes easily miss small infection regions. Patch-based process is effective for segmenting small targets. However, selecting the appropriate patch size is difficult in infection region segmentation. We utilize the scale uncertainty among various receptive field sizes of a segmentation FCN to obtain infection regions. The receptive field sizes can be defined as the patch size and the resolution of volumes where patches are clipped from. This paper proposes an infection segmentation network (ISNet) that performs patch-based segmentation and a scale uncertainty-aware prediction aggregation method that refines the segmentation result. We design ISNet to segment infection regions that have various intensity values. ISNet has multiple encoding paths to process patch volumes normalized by multiple intensity ranges. We collect prediction results generated by ISNets having various receptive field sizes. Scale uncertainty among the prediction results is extracted by the prediction aggregation method. We use an aggregation FCN to generate a refined segmentation result considering scale uncertainty among the predictions. In our experiments using 199 chest CT volumes of COVID-19 cases, the prediction aggregation method improved the dice similarity score from 47.6% to 62.1%. △ Less

Submitted 9 January, 2022; originally announced January 2022.

Comments: Accepted paper as a oral presentation at CILP2021, 10th MICCAI CLIP Workshop

Journal ref: DCL 2021, PPML 2021, LL-COVID19 2021, CLIP 2021, Lecture Notes in Computer Science (LNCS) 12969, pp.88-97

arXiv:2201.03050 [pdf, ps, other]

doi 10.1117/12.2582066

Lung infection and normal region segmentation from CT volumes of COVID-19 cases

Authors: Masahiro Oda, Yuichiro Hayashi, Yoshito Otake, Masahiro Hashimoto, Toshiaki Akashi, Kensaku Mori

Abstract: This paper proposes an automated segmentation method of infection and normal regions in the lung from CT volumes of COVID-19 patients. From December 2019, novel coronavirus disease 2019 (COVID-19) spreads over the world and giving significant impacts to our economic activities and daily lives. To diagnose the large number of infected patients, diagnosis assistance by computers is needed. Chest CT… ▽ More This paper proposes an automated segmentation method of infection and normal regions in the lung from CT volumes of COVID-19 patients. From December 2019, novel coronavirus disease 2019 (COVID-19) spreads over the world and giving significant impacts to our economic activities and daily lives. To diagnose the large number of infected patients, diagnosis assistance by computers is needed. Chest CT is effective for diagnosis of viral pneumonia including COVID-19. A quantitative analysis method of condition of the lung from CT volumes by computers is required for diagnosis assistance of COVID-19. This paper proposes an automated segmentation method of infection and normal regions in the lung from CT volumes using a COVID-19 segmentation fully convolutional network (FCN). In diagnosis of lung diseases including COVID-19, analysis of conditions of normal and infection regions in the lung is important. Our method recognizes and segments lung normal and infection regions in CT volumes. To segment infection regions that have various shapes and sizes, we introduced dense pooling connections and dilated convolutions in our FCN. We applied the proposed method to CT volumes of COVID-19 cases. From mild to severe cases of COVID-19, the proposed method correctly segmented normal and infection regions in the lung. Dice scores of normal and infection regions were 0.911 and 0.753, respectively. △ Less

Submitted 9 January, 2022; originally announced January 2022.

Comments: Accepted paper as a poster presentation at SPIE Medical Imaging 2021

Journal ref: Proceedings of SPIE Medical Imaging 2021: Computer-Aided Diagnosis, Vol.11597, 115972X-1-6

arXiv:2112.02796 [pdf, other]

Conditional Deep Hierarchical Variational Autoencoder for Voice Conversion

Authors: Kei Akuzawa, Kotaro Onishi, Keisuke Takiguchi, Kohki Mametani, Koichiro Mori

Abstract: Variational autoencoder-based voice conversion (VAE-VC) has the advantage of requiring only pairs of speeches and speaker labels for training. Unlike the majority of the research in VAE-VC which focuses on utilizing auxiliary losses or discretizing latent variables, this paper investigates how an increasing model expressiveness has benefits and impacts on the VAE-VC. Specifically, we first analyze… ▽ More Variational autoencoder-based voice conversion (VAE-VC) has the advantage of requiring only pairs of speeches and speaker labels for training. Unlike the majority of the research in VAE-VC which focuses on utilizing auxiliary losses or discretizing latent variables, this paper investigates how an increasing model expressiveness has benefits and impacts on the VAE-VC. Specifically, we first analyze VAE-VC from a rate-distortion perspective, and point out that model expressiveness is significant for VAE-VC because rate and distortion reflect similarity and naturalness of converted speeches. Based on the analysis, we propose a novel VC method using a deep hierarchical VAE, which has high model expressiveness as well as having fast conversion speed thanks to its non-autoregressive decoder. Also, our analysis reveals another problem that similarity can be degraded when the latent variable of VAEs has redundant information. We address the problem by controlling the information contained in the latent variable using $β$-VAE objective. In the experiment using VCTK corpus, the proposed method achieved mean opinion scores higher than 3.5 on both naturalness and similarity in inter-gender settings, which are higher than the scores of existing autoencoder-based VC methods. △ Less

Submitted 6 December, 2021; originally announced December 2021.

arXiv:2108.08537 [pdf, other]

Multi-task Federated Learning for Heterogeneous Pancreas Segmentation

Authors: Chen Shen, Pochuan Wang, Holger R. Roth, Dong Yang, Daguang Xu, Masahiro Oda, Weichung Wang, Chiou-Shann Fuh, Po-Ting Chen, Kao-Lang Liu, Wei-Chih Liao, Kensaku Mori

Abstract: Federated learning (FL) for medical image segmentation becomes more challenging in multi-task settings where clients might have different categories of labels represented in their data. For example, one client might have patient data with "healthy'' pancreases only while datasets from other clients may contain cases with pancreatic tumors. The vanilla federated averaging algorithm makes it possibl… ▽ More Federated learning (FL) for medical image segmentation becomes more challenging in multi-task settings where clients might have different categories of labels represented in their data. For example, one client might have patient data with "healthy'' pancreases only while datasets from other clients may contain cases with pancreatic tumors. The vanilla federated averaging algorithm makes it possible to obtain more generalizable deep learning-based segmentation models representing the training data from multiple institutions without centralizing datasets. However, it might be sub-optimal for the aforementioned multi-task scenarios. In this paper, we investigate heterogeneous optimization methods that show improvements for the automated segmentation of pancreas and pancreatic tumors in abdominal CT images with FL settings. △ Less

Submitted 19 August, 2021; originally announced August 2021.

Comments: Accepted by MICCAI DCL Workshop 2021

ACM Class: I.4.6

arXiv:2010.10207 [pdf, other]

Micro CT Image-Assisted Cross Modality Super-Resolution of Clinical CT Images Utilizing Synthesized Training Dataset

Authors: Tong Zheng, Hirohisa Oda, Masahiro Oda, Shota Nakamura, Masaki Mori, Hirotsugu Takabatake, Hiroshi Natori, Kensaku Mori

Abstract: This paper proposes a novel, unsupervised super-resolution (SR) approach for performing the SR of a clinical CT into the resolution level of a micro CT ($μ$CT). The precise non-invasive diagnosis of lung cancer typically utilizes clinical CT data. Due to the resolution limitations of clinical CT (about $0.5 \times 0.5 \times 0.5$ mm$^3$), it is difficult to obtain enough pathological information s… ▽ More This paper proposes a novel, unsupervised super-resolution (SR) approach for performing the SR of a clinical CT into the resolution level of a micro CT ($μ$CT). The precise non-invasive diagnosis of lung cancer typically utilizes clinical CT data. Due to the resolution limitations of clinical CT (about $0.5 \times 0.5 \times 0.5$ mm$^3$), it is difficult to obtain enough pathological information such as the invasion area at alveoli level. On the other hand, $μ$CT scanning allows the acquisition of volumes of lung specimens with much higher resolution ($50 \times 50 \times 50 μ{\rm m}^3$ or higher). Thus, super-resolution of clinical CT volume may be helpful for diagnosis of lung cancer. Typical SR methods require aligned pairs of low-resolution (LR) and high-resolution (HR) images for training. Unfortunately, obtaining paired clinical CT and $μ$CT volumes of human lung tissues is infeasible. Unsupervised SR methods are required that do not need paired LR and HR images. In this paper, we create corresponding clinical CT-$μ$CT pairs by simulating clinical CT images from $μ$CT images by modified CycleGAN. After this, we use simulated clinical CT-$μ$CT image pairs to train an SR network based on SRGAN. Finally, we use the trained SR network to perform SR of the clinical CT images. We compare our proposed method with another unsupervised SR method for clinical CT images named SR-CycleGAN. Experimental results demonstrate that the proposed method can successfully perform SR of clinical CT images of lung cancer patients with $μ$CT level resolution, and quantitatively and qualitatively outperformed conventional method (SR-CycleGAN), improving the SSIM (structure similarity) form 0.40 to 0.51. △ Less

Submitted 20 October, 2020; originally announced October 2020.

arXiv:2009.13148 [pdf, other]

Automated Pancreas Segmentation Using Multi-institutional Collaborative Deep Learning

Authors: Pochuan Wang, Chen Shen, Holger R. Roth, Dong Yang, Daguang Xu, Masahiro Oda, Kazunari Misawa, Po-Ting Chen, Kao-Lang Liu, Wei-Chih Liao, Weichung Wang, Kensaku Mori

Abstract: The performance of deep learning-based methods strongly relies on the number of datasets used for training. Many efforts have been made to increase the data in the medical image analysis field. However, unlike photography images, it is hard to generate centralized databases to collect medical images because of numerous technical, legal, and privacy issues. In this work, we study the use of federat… ▽ More The performance of deep learning-based methods strongly relies on the number of datasets used for training. Many efforts have been made to increase the data in the medical image analysis field. However, unlike photography images, it is hard to generate centralized databases to collect medical images because of numerous technical, legal, and privacy issues. In this work, we study the use of federated learning between two institutions in a real-world setting to collaboratively train a model without sharing the raw data across national boundaries. We quantitatively compare the segmentation models obtained with federated learning and local training alone. Our experimental results show that federated learning models have higher generalizability than standalone training. △ Less

Submitted 28 September, 2020; originally announced September 2020.

Comments: Accepted by MICCAI DCL Workshop 2020

arXiv:2005.03345 [pdf, ps, other]

doi 10.1007/978-3-319-46723-8_64

Regression Forest-Based Atlas Localization and Direction Specific Atlas Generation for Pancreas Segmentation

Authors: Masahiro Oda, Natsuki Shimizu, Ken'ichi Karasawa, Yukitaka Nimura, Takayuki Kitasaka, Kazunari Misawa, Michitaka Fujiwara, Daniel Rueckert, Kensaku Mori

Abstract: This paper proposes a fully automated atlas-based pancreas segmentation method from CT volumes utilizing atlas localization by regression forest and atlas generation using blood vessel information. Previous probabilistic atlas-based pancreas segmentation methods cannot deal with spatial variations that are commonly found in the pancreas well. Also, shape variations are not represented by an averag… ▽ More This paper proposes a fully automated atlas-based pancreas segmentation method from CT volumes utilizing atlas localization by regression forest and atlas generation using blood vessel information. Previous probabilistic atlas-based pancreas segmentation methods cannot deal with spatial variations that are commonly found in the pancreas well. Also, shape variations are not represented by an averaged atlas. We propose a fully automated pancreas segmentation method that deals with two types of variations mentioned above. The position and size of the pancreas is estimated using a regression forest technique. After localization, a patient-specific probabilistic atlas is generated based on a new image similarity that reflects the blood vessel position and direction information around the pancreas. We segment it using the EM algorithm with the atlas as prior followed by the graph-cut. In evaluation results using 147 CT volumes, the Jaccard index and the Dice overlap of the proposed method were 62.1% and 75.1%, respectively. Although we automated all of the segmentation processes, segmentation results were superior to the other state-of-the-art methods in the Dice overlap. △ Less

Submitted 7 May, 2020; originally announced May 2020.

Comments: Accepted paper as a poster presentation at MICCAI 2016 (International Conference on Medical Image Computing and Computer-Assisted Intervention), Athens, Greece

Journal ref: Published in Proceedings of MICCAI 2016, LNCS 9901, pp 556-563

arXiv:2005.01433 [pdf, ps, other]

doi 10.1117/12.2549951

Automated eye disease classification method from anterior eye image using anatomical structure focused image classification technique

Authors: Masahiro Oda, Takefumi Yamaguchi, Hideki Fukuoka, Yuta Ueno, Kensaku Mori

Abstract: This paper presents an automated classification method of infective and non-infective diseases from anterior eye images. Treatments for cases of infective and non-infective diseases are different. Distinguishing them from anterior eye images is important to decide a treatment plan. Ophthalmologists distinguish them empirically. Quantitative classification of them based on computer assistance is ne… ▽ More This paper presents an automated classification method of infective and non-infective diseases from anterior eye images. Treatments for cases of infective and non-infective diseases are different. Distinguishing them from anterior eye images is important to decide a treatment plan. Ophthalmologists distinguish them empirically. Quantitative classification of them based on computer assistance is necessary. We propose an automated classification method of anterior eye images into cases of infective or non-infective disease. Anterior eye images have large variations of the eye position and brightness of illumination. This makes the classification difficult. If we focus on the cornea, positions of opacified areas in the corneas are different between cases of the infective and non-infective diseases. Therefore, we solve the anterior eye image classification task by using an object detection approach targeting the cornea. This approach can be said as "anatomical structure focused image classification". We use the YOLOv3 object detection method to detect corneas of infective disease and corneas of non-infective disease. The detection result is used to define a classification result of a image. In our experiments using anterior eye images, 88.3% of images were correctly classified by the proposed method. △ Less

Submitted 4 May, 2020; originally announced May 2020.

Comments: Accepted paper as a poster presentation at SPIE Medical Imaging 2020, Houston, TX, USA

Journal ref: Proceedings of SPIE Medical Imaging 2020: Computer-Aided Diagnosis, Vol.11314, 1131446

arXiv:2004.13629 [pdf, ps, other]

doi 10.1007/978-3-030-00937-3_21

Colon Shape Estimation Method for Colonoscope Tracking using Recurrent Neural Networks

Authors: Masahiro Oda, Holger R. Roth, Takayuki Kitasaka, Kazuhiro Furukawa, Ryoji Miyahara, Yoshiki Hirooka, Hidemi Goto, Nassir Navab, Kensaku Mori

Abstract: We propose an estimation method using a recurrent neural network (RNN) of the colon's shape where deformation was occurred by a colonoscope insertion. Colonoscope tracking or a navigation system that navigates physician to polyp positions is needed to reduce such complications as colon perforation. Previous tracking methods caused large tracking errors at the transverse and sigmoid colons because… ▽ More We propose an estimation method using a recurrent neural network (RNN) of the colon's shape where deformation was occurred by a colonoscope insertion. Colonoscope tracking or a navigation system that navigates physician to polyp positions is needed to reduce such complications as colon perforation. Previous tracking methods caused large tracking errors at the transverse and sigmoid colons because these areas largely deform during colonoscope insertion. Colon deformation should be taken into account in tracking processes. We propose a colon deformation estimation method using RNN and obtain the colonoscope shape from electromagnetic sensors during its insertion into the colon. This method obtains positional, directional, and an insertion length from the colonoscope shape. From its shape, we also calculate the relative features that represent the positional and directional relationships between two points on a colonoscope. Long short-term memory is used to estimate the current colon shape from the past transition of the features of the colonoscope shape. We performed colon shape estimation in a phantom study and correctly estimated the colon shapes during colonoscope insertion with 12.39 (mm) estimation error. △ Less

Submitted 20 April, 2020; originally announced April 2020.

Comments: Accepted paper as a poster presentation at MICCAI 2018 (International Conference on Medical Image Computing and Computer-Assisted Intervention), Granada, Spain

Journal ref: Published in Proceedings of MICCAI 2018, LNCS 11073, pp 176-184

arXiv:2004.09056 [pdf, ps, other]

doi 10.1117/12.2512729

Colonoscope tracking method based on shape estimation network

Authors: Masahiro Oda, Holger R. Roth, Takayuki Kitasaka, Kazuhiro Furukawa, Ryoji Miyahara, Yoshiki Hirooka, Nassir Navab, Kensaku Mori

Abstract: This paper presents a colonoscope tracking method utilizing a colon shape estimation method. CT colonography is used as a less-invasive colon diagnosis method. If colonic polyps or early-stage cancers are found, they are removed in a colonoscopic examination. In the colonoscopic examination, understanding where the colonoscope running in the colon is difficult. A colonoscope navigation system is n… ▽ More This paper presents a colonoscope tracking method utilizing a colon shape estimation method. CT colonography is used as a less-invasive colon diagnosis method. If colonic polyps or early-stage cancers are found, they are removed in a colonoscopic examination. In the colonoscopic examination, understanding where the colonoscope running in the colon is difficult. A colonoscope navigation system is necessary to reduce overlooking of polyps. We propose a colonoscope tracking method for navigation systems. Previous colonoscope tracking methods caused large tracking errors because they do not consider deformations of the colon during colonoscope insertions. We utilize the shape estimation network (SEN), which estimates deformed colon shape during colonoscope insertions. The SEN is a neural network containing long short-term memory (LSTM) layer. To perform colon shape estimation suitable to the real clinical situation, we trained the SEN using data obtained during colonoscope operations of physicians. The proposed tracking method performs map** of the colonoscope tip position to a position in the colon using estimation results of the SEN. We evaluated the proposed method in a phantom study. We confirmed that tracking errors of the proposed method was enough small to perform navigation in the ascending, transverse, and descending colons. △ Less

Submitted 20 April, 2020; originally announced April 2020.

Comments: Accepted paper as an oral presentation at SPIE Medical Imaging 2019, San Diego, CA, USA

Journal ref: Proceedings of SPIE Medical Imaging 2019: Image-Guided Procedures, Robotic Interventions, and Modeling, Vol.10951, 109510Q

arXiv:2004.03272 [pdf]

Super-resolution of clinical CT volumes with modified CycleGAN using micro CT volumes

Authors: Tong ZHENG, Hirohisa ODA, Takayasu MORIYA, Takaaki SUGINO, Shota NAKAMURA, Masahiro ODA, Masaki MORI, Hirotsugu TAKABATAKE, Hiroshi NATORI, Kensaku MORI

Abstract: This paper presents a super-resolution (SR) method with unpaired training dataset of clinical CT and micro CT volumes. For obtaining very detailed information such as cancer invasion from pre-operative clinical CT volumes of lung cancer patients, SR of clinical CT volumes to $\m$}CT level is desired. While most SR methods require paired low- and high- resolution images for training, it is infeasib… ▽ More This paper presents a super-resolution (SR) method with unpaired training dataset of clinical CT and micro CT volumes. For obtaining very detailed information such as cancer invasion from pre-operative clinical CT volumes of lung cancer patients, SR of clinical CT volumes to $\m$}CT level is desired. While most SR methods require paired low- and high- resolution images for training, it is infeasible to obtain paired clinical CT and μCT volumes. We propose a SR approach based on CycleGAN, which could perform SR on clinical CT into $μ$CT level. We proposed new loss functions to keep cycle consistency, while training without paired volumes. Experimental results demonstrated that our proposed method successfully performed SR of clinical CT volume of lung cancer patients into $μ$CT level. △ Less

Submitted 7 April, 2020; originally announced April 2020.

Comments: 6 pages, 2 figures

arXiv:2003.10690 [pdf, other]

doi 10.1117/12.2551024

Organ Segmentation From Full-size CT Images Using Memory-Efficient FCN

Authors: Chenglong Wang, Masahiro Oda, Kensaku Mori

Abstract: In this work, we present a memory-efficient fully convolutional network (FCN) incorporated with several memory-optimized techniques to reduce the run-time GPU memory demand during training phase. In medical image segmentation tasks, subvolume crop** has become a common preprocessing. Subvolumes (or small patch volumes) were cropped to reduce GPU memory demand. However, small patch volumes captur… ▽ More In this work, we present a memory-efficient fully convolutional network (FCN) incorporated with several memory-optimized techniques to reduce the run-time GPU memory demand during training phase. In medical image segmentation tasks, subvolume crop** has become a common preprocessing. Subvolumes (or small patch volumes) were cropped to reduce GPU memory demand. However, small patch volumes capture less spatial context that leads to lower accuracy. As a pilot study, the purpose of this work is to propose a memory-efficient FCN which enables us to train the model on full size CT image directly without subvolume crop**, while maintaining the segmentation accuracy. We optimize our network from both architecture and implementation. With the development of computing hardware, such as graphics processing unit (GPU) and tensor processing unit (TPU), now deep learning applications is able to train networks with large datasets within acceptable time. Among these applications, semantic segmentation using fully convolutional network (FCN) also has gained a significant improvement against traditional image processing approaches in both computer vision and medical image processing fields. However, unlike general color images used in computer vision tasks, medical images have larger scales than color images such as 3D computed tomography (CT) images, micro CT images, and histopathological images. For training these medical images, the large demand of computing resource become a severe problem. In this paper, we present a memory-efficient FCN to tackle the high GPU memory demand challenge in organ segmentation problem from clinical CT images. The experimental results demonstrated that our GPU memory demand is about 40% of baseline architecture, parameter amount is about 30% of the baseline. △ Less

Submitted 24 March, 2020; originally announced March 2020.

Journal ref: Proc. SPIE 11314, Medical Imaging 2020: Computer-Aided Diagnosis, 113140I

arXiv:2003.01290 [pdf, other]

Visualizing intestines for diagnostic assistance of ileus based on intestinal region segmentation from 3D CT images

Authors: Hirohisa Oda, Kohei Nishio, Takayuki Kitasaka, Hizuru Amano, Aitaro Takimoto, Hiroo Uchida, Kojiro Suzuki, Hayato Itoh, Masahiro Oda, Kensaku Mori

Abstract: This paper presents a visualization method of intestine (the small and large intestines) regions and their stenosed parts caused by ileus from CT volumes. Since it is difficult for non-expert clinicians to find stenosed parts, the intestine and its stenosed parts should be visualized intuitively. Furthermore, the intestine regions of ileus cases are quite hard to be segmented. The proposed method… ▽ More This paper presents a visualization method of intestine (the small and large intestines) regions and their stenosed parts caused by ileus from CT volumes. Since it is difficult for non-expert clinicians to find stenosed parts, the intestine and its stenosed parts should be visualized intuitively. Furthermore, the intestine regions of ileus cases are quite hard to be segmented. The proposed method segments intestine regions by 3D FCN (3D U-Net). Intestine regions are quite difficult to be segmented in ileus cases since the inside the intestine is filled with fluids. These fluids have similar intensities with intestinal wall on 3D CT volumes. We segment the intestine regions by using 3D U-Net trained by a weak annotation approach. Weak-annotation makes possible to train the 3D U-Net with small manually-traced label images of the intestine. This avoids us to prepare many annotation labels of the intestine that has long and winding shape. Each intestine segment is volume-rendered and colored based on the distance from its endpoint in volume rendering. Stenosed parts (disjoint points of an intestine segment) can be easily identified on such visualization. In the experiments, we showed that stenosed parts were intuitively visualized as endpoints of segmented regions, which are colored by red or blue. △ Less

Submitted 2 March, 2020; originally announced March 2020.

Journal ref: SPIE Medical Imaging 2020, 11314-109

arXiv:1912.12838 [pdf, other]

Multi-modality super-resolution loss for GAN-based super-resolution of clinical CT images using micro CT image database

Authors: Tong Zheng, Hirohisa Oda, Takayasu Moriya, Shota Nakamura, Masahiro Oda, Masaki Mori, Horitsugu Takabatake, Hiroshi Natori, Kensaku Mori

Abstract: This paper newly introduces multi-modality loss function for GAN-based super-resolution that can maintain image structure and intensity on unpaired training dataset of clinical CT and micro CT volumes. Precise non-invasive diagnosis of lung cancer mainly utilizes 3D multidetector computed-tomography (CT) data. On the other hand, we can take micro CT images of resected lung specimen in 50 micro met… ▽ More This paper newly introduces multi-modality loss function for GAN-based super-resolution that can maintain image structure and intensity on unpaired training dataset of clinical CT and micro CT volumes. Precise non-invasive diagnosis of lung cancer mainly utilizes 3D multidetector computed-tomography (CT) data. On the other hand, we can take micro CT images of resected lung specimen in 50 micro meter or higher resolution. However, micro CT scanning cannot be applied to living human imaging. For obtaining highly detailed information such as cancer invasion area from pre-operative clinical CT volumes of lung cancer patients, super-resolution (SR) of clinical CT volumes to $μ$CT level might be one of substitutive solutions. While most SR methods require paired low- and high-resolution images for training, it is infeasible to obtain precisely paired clinical CT and micro CT volumes. We aim to propose unpaired SR approaches for clincial CT using micro CT images based on unpaired image translation methods such as CycleGAN or UNIT. Since clinical CT and micro CT are very different in structure and intensity, direct application of GAN-based unpaired image translation methods in super-resolution tends to generate arbitrary images. Aiming to solve this problem, we propose new loss function called multi-modality loss function to maintain the similarity of input images and corresponding output images in super-resolution task. Experimental results demonstrated that the newly proposed loss function made CycleGAN and UNIT to successfully perform SR of clinical CT images of lung cancer patients into micro CT level resolution, while original CycleGAN and UNIT failed in super-resolution. △ Less

Submitted 7 April, 2020; v1 submitted 30 December, 2019; originally announced December 2019.

Comments: 6 pages, 2 figures

arXiv:1909.11167 [pdf, other]

Intelligent image synthesis to attack a segmentation CNN using adversarial learning

Authors: Liang Chen, Paul Bentley, Kensaku Mori, Kazunari Misawa, Michitaka Fujiwara, Daniel Rueckert

Abstract: Deep learning approaches based on convolutional neural networks (CNNs) have been successful in solving a number of problems in medical imaging, including image segmentation. In recent years, it has been shown that CNNs are vulnerable to attacks in which the input image is perturbed by relatively small amounts of noise so that the CNN is no longer able to perform a segmentation of the perturbed ima… ▽ More Deep learning approaches based on convolutional neural networks (CNNs) have been successful in solving a number of problems in medical imaging, including image segmentation. In recent years, it has been shown that CNNs are vulnerable to attacks in which the input image is perturbed by relatively small amounts of noise so that the CNN is no longer able to perform a segmentation of the perturbed image with sufficient accuracy. Therefore, exploring methods on how to attack CNN-based models as well as how to defend models against attacks have become a popular topic as this also provides insights into the performance and generalization abilities of CNNs. However, most of the existing work assumes unrealistic attack models, i.e. the resulting attacks were specified in advance. In this paper, we propose a novel approach for generating adversarial examples to attack CNN-based segmentation models for medical images. Our approach has three key features: 1) The generated adversarial examples exhibit anatomical variations (in form of deformations) as well as appearance perturbations; 2) The adversarial examples attack segmentation models so that the Dice scores decrease by a pre-specified amount; 3) The attack is not required to be specified beforehand. We have evaluated our approach on CNN-based approaches for the multi-organ segmentation problem in 2D CT images. We show that the proposed approach can be used to attack different CNN-based segmentation models. △ Less

Submitted 24 September, 2019; originally announced September 2019.

arXiv:1908.01543 [pdf, other]

doi 10.1016/j.compmedimag.2019.101642

Precise Estimation of Renal Vascular Dominant Regions Using Spatially Aware Fully Convolutional Networks, Tensor-Cut and Voronoi Diagrams

Authors: Chenglong Wang, Holger R. Roth, Takayuki Kitasaka, Masahiro Oda, Yuichiro Hayashi, Yasushi Yoshino, Tokunori Yamamoto, Naoto Sassa, Momokazu Goto, Kensaku Mori

Abstract: This paper presents a new approach for precisely estimating the renal vascular dominant region using a Voronoi diagram. To provide computer-assisted diagnostics for the pre-surgical simulation of partial nephrectomy surgery, we must obtain information on the renal arteries and the renal vascular dominant regions. We propose a fully automatic segmentation method that combines a neural network and t… ▽ More This paper presents a new approach for precisely estimating the renal vascular dominant region using a Voronoi diagram. To provide computer-assisted diagnostics for the pre-surgical simulation of partial nephrectomy surgery, we must obtain information on the renal arteries and the renal vascular dominant regions. We propose a fully automatic segmentation method that combines a neural network and tensor-based graph-cut methods to precisely extract the kidney and renal arteries. First, we use a convolutional neural network to localize the kidney regions and extract tiny renal arteries with a tensor-based graph-cut method. Then we generate a Voronoi diagram to estimate the renal vascular dominant regions based on the segmented kidney and renal arteries. The accuracy of kidney segmentation in 27 cases with 8-fold cross validation reached a Dice score of 95%. The accuracy of renal artery segmentation in 8 cases obtained a centerline overlap ratio of 80%. Each partition region corresponds to a renal vascular dominant region. The final dominant-region estimation accuracy achieved a Dice coefficient of 80%. A clinical application showed the potential of our proposed estimation approach in a real clinical surgical environment. Further validation using large-scale database is our future work. △ Less

Submitted 5 August, 2019; originally announced August 2019.

Journal ref: Computerized Medical Imaging and Graphics 77 (2019): 101642

arXiv:1812.09567 [pdf, other]

doi 10.1109/ISGT.2019.8791624

Learning Dynamical Demand Response Model in Real-Time Pricing Program

Authors: Hanchen Xu, Hongbo Sun, Daniel Nikovski, Kitamura Shoichi, Kazuyuki Mori

Abstract: Price responsiveness is a major feature of end use customers (EUCs) that participate in demand response (DR) programs, and has been conventionally modeled with static demand functions, which take the electricity price as the input and the aggregate energy consumption as the output. This, however, neglects the inherent temporal correlation of the EUC behaviors, and may result in large errors when p… ▽ More Price responsiveness is a major feature of end use customers (EUCs) that participate in demand response (DR) programs, and has been conventionally modeled with static demand functions, which take the electricity price as the input and the aggregate energy consumption as the output. This, however, neglects the inherent temporal correlation of the EUC behaviors, and may result in large errors when predicting the actual responses of EUCs in real-time pricing (RTP) programs. In this paper, we propose a dynamical DR model so as to capture the temporal behavior of the EUCs. The states in the proposed dynamical DR model can be explicitly chosen, in which case the model can be represented by a linear function or a multi-layer feedforward neural network, or implicitly chosen, in which case the model can be represented by a recurrent neural network or a long short-term memory unit network. In both cases, the dynamical DR model can be learned from historical price and energy consumption data. Numerical simulation illustrated how the states are chosen and also showed the proposed dynamical DR model significantly outperforms the static ones. △ Less

Submitted 22 December, 2018; originally announced December 2018.

Comments: Accepted to IEEE ISGT NA 2019

arXiv:1806.03019 [pdf, ps, other]

doi 10.1007/978-3-319-67558-9_26

3D FCN Feature Driven Regression Forest-Based Pancreas Localization and Segmentation

Authors: Masahiro Oda, Natsuki Shimizu, Holger R. Roth, Ken'ichi Karasawa, Takayuki Kitasaka, Kazunari Misawa, Michitaka Fujiwara, Daniel Rueckert, Kensaku Mori

Abstract: This paper presents a fully automated atlas-based pancreas segmentation method from CT volumes utilizing 3D fully convolutional network (FCN) feature-based pancreas localization. Segmentation of the pancreas is difficult because it has larger inter-patient spatial variations than other organs. Previous pancreas segmentation methods failed to deal with such variations. We propose a fully automated… ▽ More This paper presents a fully automated atlas-based pancreas segmentation method from CT volumes utilizing 3D fully convolutional network (FCN) feature-based pancreas localization. Segmentation of the pancreas is difficult because it has larger inter-patient spatial variations than other organs. Previous pancreas segmentation methods failed to deal with such variations. We propose a fully automated pancreas segmentation method that contains novel localization and segmentation. Since the pancreas neighbors many other organs, its position and size are strongly related to the positions of the surrounding organs. We estimate the position and the size of the pancreas (localized) from global features by regression forests. As global features, we use intensity differences and 3D FCN deep learned features, which include automatically extracted essential features for segmentation. We chose 3D FCN features from a trained 3D U-Net, which is trained to perform multi-organ segmentation. The global features include both the pancreas and surrounding organ information. After localization, a patient-specific probabilistic atlas-based pancreas segmentation is performed. In evaluation results with 146 CT volumes, we achieved 60.6% of the Jaccard index and 73.9% of the Dice overlap. △ Less

Submitted 8 June, 2018; originally announced June 2018.

Comments: Presented in MICCAI 2017 workshop, DLMIA 2017 (Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support)

Report number: Published in LNCS Vol.10553

Journal ref: DLMIA 2017, ML-CDS 2017: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pp.222-230

arXiv:1806.03014 [pdf, ps, other]

doi 10.1117/12.2293936

Machine learning-based colon deformation estimation method for colonoscope tracking

Authors: Masahiro Oda, Takayuki Kitasaka, Kazuhiro Furukawa, Ryoji Miyahara, Yoshiki Hirooka, Hidemi Goto, Nassir Navab, Kensaku Mori

Abstract: This paper presents a colon deformation estimation method, which can be used to estimate colon deformations during colonoscope insertions. Colonoscope tracking or navigation system that navigates a physician to polyp positions during a colonoscope insertion is required to reduce complications such as colon perforation. A previous colonoscope tracking method obtains a colonoscope position in the co… ▽ More This paper presents a colon deformation estimation method, which can be used to estimate colon deformations during colonoscope insertions. Colonoscope tracking or navigation system that navigates a physician to polyp positions during a colonoscope insertion is required to reduce complications such as colon perforation. A previous colonoscope tracking method obtains a colonoscope position in the colon by registering a colonoscope shape and a colon shape. The colonoscope shape is obtained using an electromagnetic sensor, and the colon shape is obtained from a CT volume. However, large tracking errors were observed due to colon deformations occurred during colonoscope insertions. Such deformations make the registration difficult. Because the colon deformation is caused by a colonoscope, there is a strong relationship between the colon deformation and the colonoscope shape. An estimation method of colon deformations occur during colonoscope insertions is necessary to reduce tracking errors. We propose a colon deformation estimation method. This method is used to estimate a deformed colon shape from a colonoscope shape. We use the regression forests algorithm to estimate a deformed colon shape. The regression forests algorithm is trained using pairs of colon and colonoscope shapes, which contains deformations occur during colonoscope insertions. As a preliminary study, we utilized the method to estimate deformations of a colon phantom. In our experiments, the proposed method correctly estimated deformed colon phantom shapes. △ Less

Submitted 8 June, 2018; originally announced June 2018.

Comments: Accepted paper for oral presentation at SPIE Medical Imaging 2018, Houston, TX, USA

Report number: Published in Proceedings of SPIE 10576, Medical Imaging 2018: Image-Guided Procedures, Robotic Interventions, and Modeling, 1057619

Journal ref: SPIE Medical Imaging 2018: Image-Guided Procedures, Robotic Interventions, and Modeling

arXiv:1806.02237 [pdf, other]

A multi-scale pyramid of 3D fully convolutional networks for abdominal multi-organ segmentation

Authors: Holger R. Roth, Chen Shen, Hirohisa Oda, Takaaki Sugino, Masahiro Oda, Yuichiro Hayashi, Kazunari Misawa, Kensaku Mori

Abstract: Recent advances in deep learning, like 3D fully convolutional networks (FCNs), have improved the state-of-the-art in dense semantic segmentation of medical images. However, most network architectures require severely downsampling or crop** the images to meet the memory limitations of today's GPU cards while still considering enough context in the images for accurate segmentation. In this work, w… ▽ More Recent advances in deep learning, like 3D fully convolutional networks (FCNs), have improved the state-of-the-art in dense semantic segmentation of medical images. However, most network architectures require severely downsampling or crop** the images to meet the memory limitations of today's GPU cards while still considering enough context in the images for accurate segmentation. In this work, we propose a novel approach that utilizes auto-context to perform semantic segmentation at higher resolutions in a multi-scale pyramid of stacked 3D FCNs. We train and validate our models on a dataset of manually annotated abdominal organs and vessels from 377 clinical CT images used in gastric surgery, and achieve promising results with close to 90% Dice score on average. For additional evaluation, we perform separate testing on datasets from different sources and achieve competitive results, illustrating the robustness of the model and approach. △ Less

Submitted 6 June, 2018; originally announced June 2018.

Comments: Accepted for presentation at the 21st International Conference on Medical Image Computing and Computer Assisted Intervention - MICCAI 2018, September 16-20, Granada, Spain

arXiv:1804.03999 [pdf, other]

Attention U-Net: Learning Where to Look for the Pancreas

Authors: Ozan Oktay, Jo Schlemper, Loic Le Folgoc, Matthew Lee, Mattias Heinrich, Kazunari Misawa, Kensaku Mori, Steven McDonagh, Nils Y Hammerla, Bernhard Kainz, Ben Glocker, Daniel Rueckert

Abstract: We propose a novel attention gate (AG) model for medical imaging that automatically learns to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ localisation… ▽ More We propose a novel attention gate (AG) model for medical imaging that automatically learns to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ localisation modules of cascaded convolutional neural networks (CNNs). AGs can be easily integrated into standard CNN architectures such as the U-Net model with minimal computational overhead while increasing the model sensitivity and prediction accuracy. The proposed Attention U-Net architecture is evaluated on two large CT abdominal datasets for multi-class image segmentation. Experimental results show that AGs consistently improve the prediction performance of U-Net across different datasets and training sizes while preserving computational efficiency. The code for the proposed architecture is publicly available. △ Less

Submitted 20 May, 2018; v1 submitted 11 April, 2018; originally announced April 2018.

Comments: Accepted to published in MIDL'18 (Revised Version) / OpenReview link: https://openreview.net/forum?id=Skft7cijM

arXiv:1804.03830 [pdf, other]

doi 10.1117/12.2293414

Unsupervised Segmentation of 3D Medical Images Based on Clustering and Deep Representation Learning

Authors: Takayasu Moriya, Holger R. Roth, Shota Nakamura, Hirohisa Oda, Kai Nagara, Masahiro Oda, Kensaku Mori

Abstract: This paper presents a novel unsupervised segmentation method for 3D medical images. Convolutional neural networks (CNNs) have brought significant advances in image segmentation. However, most of the recent methods rely on supervised learning, which requires large amounts of manually annotated data. Thus, it is challenging for these methods to cope with the growing amount of medical images. This pa… ▽ More This paper presents a novel unsupervised segmentation method for 3D medical images. Convolutional neural networks (CNNs) have brought significant advances in image segmentation. However, most of the recent methods rely on supervised learning, which requires large amounts of manually annotated data. Thus, it is challenging for these methods to cope with the growing amount of medical images. This paper proposes a unified approach to unsupervised deep representation learning and clustering for segmentation. Our proposed method consists of two phases. In the first phase, we learn deep feature representations of training patches from a target image using joint unsupervised learning (JULE) that alternately clusters representations generated by a CNN and updates the CNN parameters using cluster labels as supervisory signals. We extend JULE to 3D medical images by utilizing 3D convolutions throughout the CNN architecture. In the second phase, we apply k-means to the deep representations from the trained CNN and then project cluster labels to the target image in order to obtain the fully segmented image. We evaluated our methods on three images of lung cancer specimens scanned with micro-computed tomography (micro-CT). The automatic segmentation of pathological regions in micro-CT could further contribute to the pathological examination process. Hence, we aim to automatically divide each image into the regions of invasive carcinoma, noninvasive carcinoma, and normal tissue. Our experiments show the potential abilities of unsupervised deep representation learning for medical image segmentation. △ Less

Submitted 11 April, 2018; originally announced April 2018.

Comments: This paper was presented at SPIE Medical Imaging 2018, Houston, TX, USA

Journal ref: Proc. SPIE 10578, Medical Imaging 2018: Biomedical Applications in Molecular, Structural, and Functional Imaging, 1057820 (12 March 2018)

arXiv:1804.03828 [pdf, other]

doi 10.1117/12.2292172

Unsupervised Pathology Image Segmentation Using Representation Learning with Spherical K-means

Authors: Takayasu Moriya, Holger R. Roth, Shota Nakamura, Hirohisa Oda, Kai Nagara, Masahiro Oda, Kensaku Mori

Abstract: This paper presents a novel method for unsupervised segmentation of pathology images. Staging of lung cancer is a major factor of prognosis. Measuring the maximum dimensions of the invasive component in a pathology images is an essential task. Therefore, image segmentation methods for visualizing the extent of invasive and noninvasive components on pathology images could support pathological exami… ▽ More This paper presents a novel method for unsupervised segmentation of pathology images. Staging of lung cancer is a major factor of prognosis. Measuring the maximum dimensions of the invasive component in a pathology images is an essential task. Therefore, image segmentation methods for visualizing the extent of invasive and noninvasive components on pathology images could support pathological examination. However, it is challenging for most of the recent segmentation methods that rely on supervised learning to cope with unlabeled pathology images. In this paper, we propose a unified approach to unsupervised representation learning and clustering for pathology image segmentation. Our method consists of two phases. In the first phase, we learn feature representations of training patches from a target image using the spherical k-means. The purpose of this phase is to obtain cluster centroids which could be used as filters for feature extraction. In the second phase, we apply conventional k-means to the representations extracted by the centroids and then project cluster labels to the target images. We evaluated our methods on pathology images of lung cancer specimen. Our experiments showed that the proposed method outperforms traditional k-means segmentation and the multithreshold Otsu method both quantitatively and qualitatively with an improved normalized mutual information (NMI) score of 0.626 compared to 0.168 and 0.167, respectively. Furthermore, we found that the centroids can be applied to the segmentation of other slices from the same sample. △ Less

Submitted 11 April, 2018; originally announced April 2018.

Comments: This paper was presented at SPIE Medical Imaging 2018, Houston, TX, USA

Journal ref: Proc. SPIE 10581, Medical Imaging 2018: Digital Pathology, 1058111 (6 March 2018)

arXiv:1803.08691 [pdf, other]

doi 10.11409/mit.36.63

Deep learning and its application to medical image segmentation

Authors: Holger R. Roth, Chen Shen, Hirohisa Oda, Masahiro Oda, Yuichiro Hayashi, Kazunari Misawa, Kensaku Mori

Abstract: One of the most common tasks in medical imaging is semantic segmentation. Achieving this segmentation automatically has been an active area of research, but the task has been proven very challenging due to the large variation of anatomy across different patients. However, recent advances in deep learning have made it possible to significantly improve the performance of image recognition and semant… ▽ More One of the most common tasks in medical imaging is semantic segmentation. Achieving this segmentation automatically has been an active area of research, but the task has been proven very challenging due to the large variation of anatomy across different patients. However, recent advances in deep learning have made it possible to significantly improve the performance of image recognition and semantic segmentation methods in the field of computer vision. Due to the data driven approaches of hierarchical feature learning in deep learning frameworks, these advances can be translated to medical images without much difficulty. Several variations of deep convolutional neural networks have been successfully applied to medical images. Especially fully convolutional architectures have been proven efficient for segmentation of 3D medical images. In this article, we describe how to build a 3D fully convolutional network (FCN) that can process 3D images in order to produce automatic semantic segmentations. The model is trained and evaluated on a clinical computed tomography (CT) dataset and shows state-of-the-art performance in multi-organ segmentation. △ Less

Submitted 23 March, 2018; originally announced March 2018.

Comments: Accepted for publication in the journal of the Japanese Society of Medical Imaging Technology (JAMIT)

Journal ref: Medical Imaging Technology, Volume 36 (2018), Issue 2, p. 63-71

arXiv:1803.05431 [pdf, other]

doi 10.1016/j.compmedimag.2018.03.001

An application of cascaded 3D fully convolutional networks for medical image segmentation

Authors: Holger R. Roth, Hirohisa Oda, Xiangrong Zhou, Natsuki Shimizu, Ying Yang, Yuichiro Hayashi, Masahiro Oda, Michitaka Fujiwara, Kazunari Misawa, Kensaku Mori

Abstract: Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting featur… ▽ More Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that will first use a 3D FCN to roughly define a candidate region, which will then be used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5 to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve a significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: https://github.com/holgerroth/3Dunet_abdomen_cascade. △ Less

Submitted 20 March, 2018; v1 submitted 14 March, 2018; originally announced March 2018.

Comments: Preprint accepted for publication in Computerized Medical Imaging and Graphics. Substantial extension of arXiv:1704.06382; Corrected references to figure numbers in this version

Journal ref: Computerized Medical Imaging and Graphics, Elsevier, Volume 66, June 2018, Pages 90-99

arXiv:1801.05912 [pdf, other]

On the influence of Dice loss function in multi-class organ segmentation of abdominal CT using 3D fully convolutional networks

Authors: Chen Shen, Holger R. Roth, Hirohisa Oda, Masahiro Oda, Yuichiro Hayashi, Kazunari Misawa, Kensaku Mori

Abstract: Deep learning-based methods achieved impressive results for the segmentation of medical images. With the development of 3D fully convolutional networks (FCNs), it has become feasible to produce improved results for multi-organ segmentation of 3D computed tomography (CT) images. The results of multi-organ segmentation using deep learning-based methods not only depend on the choice of networks archi… ▽ More Deep learning-based methods achieved impressive results for the segmentation of medical images. With the development of 3D fully convolutional networks (FCNs), it has become feasible to produce improved results for multi-organ segmentation of 3D computed tomography (CT) images. The results of multi-organ segmentation using deep learning-based methods not only depend on the choice of networks architecture, but also strongly rely on the choice of loss function. In this paper, we present a discussion on the influence of Dice-based loss functions for multi-class organ segmentation using a dataset of abdominal CT volumes. We investigated three different types of weighting the Dice loss functions based on class label frequencies (uniform, simple and square) and evaluate their influence on segmentation accuracies. Furthermore, we compared the influence of different initial learning rates. We achieved average Dice scores of 81.3%, 59.5% and 31.7% for uniform, simple and square types of weighting when the learning rate is 0.001, and 78.2%, 81.0% and 58.5% for each weighting when the learning rate is 0.01. Our experiments indicated a strong relationship between class balancing weights and initial learning rate in training. △ Less

Submitted 17 January, 2018; originally announced January 2018.

Comments: presented at MI-ken, November 2017, Takamatsu, Japan (http://www.ieice.org/iss/mi/)

arXiv:1711.06439 [pdf, other]

doi 10.1117/12.2293499

Towards dense volumetric pancreas segmentation in CT using 3D fully convolutional networks

Authors: Holger Roth, Masahiro Oda, Natsuki Shimizu, Hirohisa Oda, Yuichiro Hayashi, Takayuki Kitasaka, Michitaka Fujiwara, Kazunari Misawa, Kensaku Mori

Abstract: Pancreas segmentation in computed tomography imaging has been historically difficult for automated methods because of the large shape and size variations between patients. In this work, we describe a custom-build 3D fully convolutional network (FCN) that can process a 3D image including the whole pancreas and produce an automatic segmentation. We investigate two variations of the 3D FCN architectu… ▽ More Pancreas segmentation in computed tomography imaging has been historically difficult for automated methods because of the large shape and size variations between patients. In this work, we describe a custom-build 3D fully convolutional network (FCN) that can process a 3D image including the whole pancreas and produce an automatic segmentation. We investigate two variations of the 3D FCN architecture; one with concatenation and one with summation skip connections to the decoder part of the network. We evaluate our methods on a dataset from a clinical trial with gastric cancer patients, including 147 contrast enhanced abdominal CT scans acquired in the portal venous phase. Using the summation architecture, we achieve an average Dice score of 89.7 $\pm$ 3.8 (range [79.8, 94.8]) % in testing, achieving the new state-of-the-art performance in pancreas segmentation on this dataset. △ Less

Submitted 18 January, 2018; v1 submitted 17 November, 2017; originally announced November 2017.

Comments: Accepted for oral presentation at SPIE Medical Imaging 2018, Houston, TX, USA Updated experiment in Fig. 4

Report number: Published in SPIE Proceedings Vol. 10574

Journal ref: Medical Imaging 2018: Image Processing

arXiv:1710.05379 [pdf, other]

Towards Automatic Abdominal Multi-Organ Segmentation in Dual Energy CT using Cascaded 3D Fully Convolutional Network

Authors: Shuqing Chen, Holger Roth, Sabrina Dorn, Matthias May, Alexander Cavallaro, Michael M. Lell, Marc Kachelrieß, Hirohisa Oda, Kensaku Mori, Andreas Maier

Abstract: Automatic multi-organ segmentation of the dual energy computed tomography (DECT) data can be beneficial for biomedical research and clinical applications. However, it is a challenging task. Recent advances in deep learning showed the feasibility to use 3-D fully convolutional networks (FCN) for voxel-wise dense predictions in single energy computed tomography (SECT). In this paper, we proposed a 3… ▽ More Automatic multi-organ segmentation of the dual energy computed tomography (DECT) data can be beneficial for biomedical research and clinical applications. However, it is a challenging task. Recent advances in deep learning showed the feasibility to use 3-D fully convolutional networks (FCN) for voxel-wise dense predictions in single energy computed tomography (SECT). In this paper, we proposed a 3D FCN based method for automatic multi-organ segmentation in DECT. The work was based on a cascaded FCN and a general model for the major organs trained on a large set of SECT data. We preprocessed the DECT data by using linear weighting and fine-tuned the model for the DECT data. The method was evaluated using 42 torso DECT data acquired with a clinical dual-source CT system. Four abdominal organs (liver, spleen, left and right kidneys) were evaluated. Cross-validation was tested. Effect of the weight on the accuracy was researched. In all the tests, we achieved an average Dice coefficient of 93% for the liver, 90% for the spleen, 91% for the right kidney and 89% for the left kidney, respectively. The results show our method is feasible and promising. △ Less

Submitted 15 October, 2017; originally announced October 2017.

Comments: 5 pagens, 4 figures, conference

arXiv:1708.06297 [pdf, other]

Employing Weak Annotations for Medical Image Analysis Problems

Authors: Martin Rajchl, Lisa M. Koch, Christian Ledig, Jonathan Passerat-Palmbach, Kazunari Misawa, Kensaku Mori, Daniel Rueckert

Abstract: To efficiently establish training databases for machine learning methods, collaborative and crowdsourcing platforms have been investigated to collectively tackle the annotation effort. However, when this concept is ported to the medical imaging domain, reading expertise will have a direct impact on the annotation accuracy. In this study, we examine the impact of expertise and the amount of availab… ▽ More To efficiently establish training databases for machine learning methods, collaborative and crowdsourcing platforms have been investigated to collectively tackle the annotation effort. However, when this concept is ported to the medical imaging domain, reading expertise will have a direct impact on the annotation accuracy. In this study, we examine the impact of expertise and the amount of available annotations on the accuracy outcome of a liver segmentation problem in an abdominal computed tomography (CT) image database. In controlled experiments, we study this impact for different types of weak annotations. To address the decrease in accuracy associated with lower expertise, we propose a method for outlier correction making use of a weakly labelled atlas. Using this approach, we demonstrate that weak annotations subject to high error rates can achieve a similarly high accuracy as state-of-the-art multi-atlas segmentation approaches relying on a large amount of expert manual segmentations. Annotations of this nature can realistically be obtained from a non-expert crowd and can potentially enable crowdsourcing of weak annotation tasks for medical image analysis. △ Less

Submitted 21 August, 2017; originally announced August 2017.

arXiv:1706.05122 [pdf, ps, other]

Bib2vec: An Embedding-based Search System for Bibliographic Information

Authors: Takuma Yoneda, Koki Mori, Makoto Miwa, Yutaka Sasaki

Abstract: We propose a novel embedding model that represents relationships among several elements in bibliographic information with high representation ability and flexibility. Based on this model, we present a novel search system that shows the relationships among the elements in the ACL Anthology Reference Corpus. The evaluation results show that our model can achieve a high prediction ability and produce… ▽ More We propose a novel embedding model that represents relationships among several elements in bibliographic information with high representation ability and flexibility. Based on this model, we present a novel search system that shows the relationships among the elements in the ACL Anthology Reference Corpus. The evaluation results show that our model can achieve a high prediction ability and produce reasonable search results. △ Less

Submitted 5 April, 2018; v1 submitted 15 June, 2017; originally announced June 2017.

Comments: EACL2017 extended version. The demonstration is available at http://tti-coin.jp/demo/bib2vec/

Journal ref: Proceedings of the EACL 2017 Software Demonstrations, Valencia, Spain, April 3-7 2017, pages 112-115

arXiv:1704.08030 [pdf]

Airway segmentation from 3D chest CT volumes based on volume of interest using gradient vector flow

Authors: Qier Meng, Takayuki Kitasaka, Masahiro Oda, Junji Ueno, Kensaku Mori

Abstract: Some lung diseases are related to bronchial airway structures and morphology. Although airway segmentation from chest CT volumes is an important task in the computer-aided diagnosis and surgery assistance systems for the chest, complete 3-D airway structure segmentation is a quite challenging task due to its complex tree-like structure. In this paper, we propose a new airway segmentation method fr… ▽ More Some lung diseases are related to bronchial airway structures and morphology. Although airway segmentation from chest CT volumes is an important task in the computer-aided diagnosis and surgery assistance systems for the chest, complete 3-D airway structure segmentation is a quite challenging task due to its complex tree-like structure. In this paper, we propose a new airway segmentation method from 3D chest CT volumes based on volume of interests (VOI) using gradient vector flow (GVF). This method segments the bronchial regions by applying the cavity enhancement filter (CEF) to trace the bronchial tree structure from the trachea. It uses the CEF in the VOI to segment each branch. And a tube-likeness function based on GVF and the GVF magnitude map in each VOI are utilized to assist predicting the positions and directions of child branches. By calculating the tube-likeness function based on GVF and the GVF magnitude map, the airway-like candidate structures are identified and their centrelines are extracted. Based on the extracted centrelines, we can detect the branch points of the bifurcations and directions of the airway branches in the next level. At the same time, a leakage detection is performed to avoid the leakage by analysing the pixel information and the shape information of airway candidate regions extracted in the VOI. Finally, we unify all of the extracted bronchial regions to form an integrated airway tree. Preliminary experiments using four cases of chest CT volumes demonstrated that the proposed method can extract more bronchial branches in comparison with other methods. △ Less

Submitted 26 April, 2017; originally announced April 2017.

arXiv:1704.06382 [pdf, other]

Hierarchical 3D fully convolutional networks for multi-organ segmentation

Authors: Holger R. Roth, Hirohisa Oda, Yuichiro Hayashi, Masahiro Oda, Natsuki Shimizu, Michitaka Fujiwara, Kazunari Misawa, Kensaku Mori

Abstract: Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of full volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of seven abdominal structures (artery, vein, liver, spleen, stomach, gallbladder, and pancreas) can achieve competitive segmentation results, while avoiding the need for… ▽ More Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of full volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of seven abdominal structures (artery, vein, liver, spleen, stomach, gallbladder, and pancreas) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training organ-specific models. To this end, we propose a two-stage, coarse-to-fine approach that trains an FCN model to roughly delineate the organs of interest in the first stage (seeing $\sim$40% of the voxels within a simple, automatically generated binary mask of the patient's body). We then use these predictions of the first-stage FCN to define a candidate region that will be used to train a second FCN. This step reduces the number of voxels the FCN has to classify to $\sim$10% while maintaining a recall high of $>$99%. This second-stage FCN can now focus on more detailed segmentation of the organs. We respectively utilize training and validation sets consisting of 281 and 50 clinical CT images. Our hierarchical approach provides an improved Dice score of 7.5 percentage points per organ on average in our validation set. We furthermore test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans with three anatomical labels (liver, spleen, and pancreas). In such challenging organs as the pancreas, our hierarchical approach improves the mean Dice score from 68.5 to 82.2%, achieving the highest reported average score on this dataset. △ Less

Submitted 20 April, 2017; originally announced April 2017.

arXiv:1703.04967 [pdf, other]

Comparison of the Deep-Learning-Based Automated Segmentation Methods for the Head Sectioned Images of the Virtual Korean Human Project

Authors: Mohammad Eshghi, Holger R. Roth, Masahiro Oda, Min Suk Chung, Kensaku Mori

Abstract: This paper presents an end-to-end pixelwise fully automated segmentation of the head sectioned images of the Visible Korean Human (VKH) project based on Deep Convolutional Neural Networks (DCNNs). By converting classification networks into Fully Convolutional Networks (FCNs), a coarse prediction map, with smaller size than the original input image, can be created for segmentation purposes. To refi… ▽ More This paper presents an end-to-end pixelwise fully automated segmentation of the head sectioned images of the Visible Korean Human (VKH) project based on Deep Convolutional Neural Networks (DCNNs). By converting classification networks into Fully Convolutional Networks (FCNs), a coarse prediction map, with smaller size than the original input image, can be created for segmentation purposes. To refine this map and to obtain a dense pixel-wise output, standard FCNs use deconvolution layers to upsample the coarse map. However, upsampling based on deconvolution increases the number of network parameters and causes loss of detail because of interpolation. On the other hand, dilated convolution is a new technique introduced recently that attempts to capture multi-scale contextual information without increasing the network parameters while kee** the resolution of the prediction maps high. We used both a standard FCN and a dilated convolution based FCN for semantic segmentation of the head sectioned images of the VKH dataset. Quantitative results showed approximately 20% improvement in the segmentation accuracy when using FCNs with dilated convolutions. △ Less

Submitted 15 March, 2017; originally announced March 2017.

Comments: Accepted for presentation at the 15th IAPR Conference on Machine Vision Applications (MVA2017), Nagoya, Japan

arXiv:1702.08155 [pdf, other]

Multi-scale Image Fusion Between Pre-operative Clinical CT and X-ray Microtomography of Lung Pathology

Authors: Holger R. Roth, Kai Nagara, Hirohisa Oda, Masahiro Oda, Tomoshi Sugiyama, Shota Nakamura, Kensaku Mori

Abstract: Computational anatomy allows the quantitative analysis of organs in medical images. However, most analysis is constrained to the millimeter scale because of the limited resolution of clinical computed tomography (CT). X-ray microtomography ($μ$CT) on the other hand allows imaging of ex-vivo tissues at a resolution of tens of microns. In this work, we use clinical CT to image lung cancer patients b… ▽ More Computational anatomy allows the quantitative analysis of organs in medical images. However, most analysis is constrained to the millimeter scale because of the limited resolution of clinical computed tomography (CT). X-ray microtomography ($μ$CT) on the other hand allows imaging of ex-vivo tissues at a resolution of tens of microns. In this work, we use clinical CT to image lung cancer patients before partial pneumonectomy (resection of pathological lung tissue). The resected specimen is prepared for $μ$CT imaging at a voxel resolution of 50 $μ$m (0.05 mm). This high-resolution image of the lung cancer tissue allows further insides into understanding of tumor growth and categorization. For making full use of this additional information, image fusion (registration) needs to be performed in order to re-align the $μ$CT image with clinical CT. We developed a multi-scale non-rigid registration approach. After manual initialization using a few landmark points and rigid alignment, several levels of non-rigid registration between down-sampled (in the case of $μ$CT) and up-sampled (in the case of clinical CT) representations of the image are performed. Any non-lung tissue is ignored during the computation of the similarity measure used to guide the registration during optimization. We are able to recover the volume differences introduced by the resection and preparation of the lung specimen. The average ($\pm$ std. dev.) minimum surface distance between $μ$CT and clinical CT at the resected lung surface is reduced from 3.3 $\pm$ 2.9 (range: [0.1, 15.9]) to 2.3 mm $\pm$ 2.8 (range: [0.0, 15.3]) mm. The alignment of clinical CT with $μ$CT will allow further registration with even finer resolutions of $μ$CT (up to 10 $μ$m resolution) and ultimately with histopathological microscopy images for further macro to micro image fusion that can aid medical image analysis. △ Less

Submitted 27 February, 2017; originally announced February 2017.

Comments: In proceedings of International Forum on Medical Imaging, IFMIA 2017, Okinawa, Japan

arXiv:1605.04278 [pdf, ps, other]

Universal Dependencies for Learner English

Authors: Yevgeni Berzak, Jessica Kenney, Carolyn Spadine, **g Xian Wang, Lucia Lam, Keiko Sophie Mori, Sebastian Garza, Boris Katz

Abstract: We introduce the Treebank of Learner English (TLE), the first publicly available syntactic treebank for English as a Second Language (ESL). The TLE provides manually annotated POS tags and Universal Dependency (UD) trees for 5,124 sentences from the Cambridge First Certificate in English (FCE) corpus. The UD annotations are tied to a pre-existing error annotation of the FCE, whereby full syntactic… ▽ More We introduce the Treebank of Learner English (TLE), the first publicly available syntactic treebank for English as a Second Language (ESL). The TLE provides manually annotated POS tags and Universal Dependency (UD) trees for 5,124 sentences from the Cambridge First Certificate in English (FCE) corpus. The UD annotations are tied to a pre-existing error annotation of the FCE, whereby full syntactic analyses are provided for both the original and error corrected versions of each sentence. Further on, we delineate ESL annotation guidelines that allow for consistent syntactic treatment of ungrammatical English. Finally, we benchmark POS tagging and dependency parsing performance on the TLE dataset and measure the effect of grammatical errors on parsing accuracy. We envision the treebank to support a wide range of linguistic and computational research on second language acquisition as well as automatic processing of ungrammatical language. The treebank is available at universaldependencies.org. The annotation manual used in this project and a graphical query engine are available at esltreebank.org. △ Less

Submitted 7 June, 2016; v1 submitted 13 May, 2016; originally announced May 2016.

Comments: Updated parsing experiments to EWT v1.3, improved grammatical error marking, minor revisions. To appear in ACL 2016

arXiv:1604.07904 [pdf]

Image Colorization Using a Deep Convolutional Neural Network

Authors: Tung Nguyen, Kazuki Mori, Ruck Thawonmas

Abstract: In this paper, we present a novel approach that uses deep learning techniques for colorizing grayscale images. By utilizing a pre-trained convolutional neural network, which is originally designed for image classification, we are able to separate content and style of different images and recombine them into a single image. We then propose a method that can add colors to a grayscale image by combin… ▽ More In this paper, we present a novel approach that uses deep learning techniques for colorizing grayscale images. By utilizing a pre-trained convolutional neural network, which is originally designed for image classification, we are able to separate content and style of different images and recombine them into a single image. We then propose a method that can add colors to a grayscale image by combining its content with style of a color image having semantic similarity with the grayscale one. As an application, to our knowledge the first of its kind, we use the proposed method to colorize images of ukiyo-e a genre of Japanese painting?and obtain interesting results, showing the potential of this method in the growing field of computer assisted art. △ Less

Submitted 26 April, 2016; originally announced April 2016.

ACM Class: I.2.6; I.4.9; J.5

Journal ref: Proc. of ASIAGRAPH 2016, Toyama, Japan, pp. 49-50, Mar. 5-6, 2016

arXiv:1404.3958 [pdf, other]

doi 10.3217/978-3-85125-378-8-20

Head-related Impulse Response-based Spatial Auditory Brain-computer Interface

Authors: Chisaki Nakaizumi, Toshie Matsui, Koichi Mori, Shoji Makino, Tomasz M. Rutkowski

Abstract: This study provides a comprehensive test of the head-related impulse response (HRIR) to an auditory spatial speller brain-computer interface (BCI) paradigm, including a comparison with a conventional virtual headphone-based spatial auditory modality. Five BCI-naive users participated in an experiment based on five Japanese vowels. The auditory evoked potentials obtained produced encouragingly good… ▽ More This study provides a comprehensive test of the head-related impulse response (HRIR) to an auditory spatial speller brain-computer interface (BCI) paradigm, including a comparison with a conventional virtual headphone-based spatial auditory modality. Five BCI-naive users participated in an experiment based on five Japanese vowels. The auditory evoked potentials obtained produced encouragingly good and stable P300-responses in online BCI experiments. Our case study indicates that the auditory HRIR spatial sound paradigm reproduced with headphones could be a viable alternative to established multi-loudspeaker surround sound BCI-speller applications. △ Less

Submitted 15 April, 2014; originally announced April 2014.

Comments: 5 pages, 1 figure, submitted to 6th International Brain-Computer Interface Conference 2014, Graz, Austria

arXiv:1312.4106 [pdf, other]

doi 10.1109/SITIS.2013.131

Auditory Brain-Computer Interface Paradigm with Head Related Impulse Response-based Spatial Cues

Authors: Chisaki Nakaizumi, Koichi Mori, Toshie Matsui, Shoji Makino, Tomasz M. Rutkowski

Abstract: The aim of this study is to provide a comprehensive test of head related impulse response (HRIR) for an auditory spatial speller brain-computer interface (BCI) paradigm. The study is conducted with six users in an experimental set up based on five Japanese hiragana vowels. Auditory evoked potentials resulted with encouragingly good and stable "aha-" or P300-responses in real-world online BCI exper… ▽ More The aim of this study is to provide a comprehensive test of head related impulse response (HRIR) for an auditory spatial speller brain-computer interface (BCI) paradigm. The study is conducted with six users in an experimental set up based on five Japanese hiragana vowels. Auditory evoked potentials resulted with encouragingly good and stable "aha-" or P300-responses in real-world online BCI experiments. Our case study indicated that the auditory HRIR spatial sound reproduction paradigm could be a viable alternative to the established multi-loudspeaker surround sound BCI-speller applications, as far as healthy pilot study users are concerned. △ Less

Submitted 14 December, 2013; originally announced December 2013.

Journal ref: Proceedings of the 9th International Conference on Signal Image Technology and Internet Based Systems. Kyoto, Japan: IEEE Computer Society; 2013. p. 806-811

Showing 1–50 of 54 results for author: MORI, K