1 Institute of Image Analysis and Computer Vision, University of Regensburg, Regensburg, Germany
2 Institute of Pathology, Hannover Medical School, Hannover, Germany
3 Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
* These authors contributed equally to this work.
@ Correspondence: [email protected]

Supplementary Material
Unsupervised Latent Stain Adaptation
for Computational Pathology

Daniel Reisenbüchler (1, *, @), Lucas Luttner (1, *), Nadine S. Schaadt (2), Friedrich Feuerhake (2), Dorit Merhof (1, 3)

1 Test Datasets

The following tables give an overview of the number of patches used for each staining and tissue combination in the segmentation task.

Table 1: Overview of the NEPTUNE dataset used for the segmentation experiments

| Staining | Tissue class | #Images |
|---|---|---|
| PAS | Glomerulus | 329 |
| PAS | Glomerular Tuft | 352 |
| PAS | Tubule | 231 |
| PAS | Artery | 264 |
| SIL | Glomerulus | 248 |
| SIL | Glomerular Tuft | 217 |
| SIL | Artery | 223 |
| TRI | Glomerulus | 260 |
| TRI | Glomerular Tuft | 233 |
| TRI | Artery | 324 |
| H&E | Artery | 402 |

Table 2: Overview of the HuBMAP dataset used for the segmentation experiments

| Staining | Tissue class | #Images |
|---|---|---|
| PAS | Glomerulus | 2670 |

2 Hyperparameter search

All hyperparameter searches were performed via grid search on the validation sets only. The selected hyperparameters are detailed below.

2.0.1 cGAN.

We selected the hyperparameters carefully so that images were translated into the target stainings as faithfully as possible. The number of epochs for the adversarial model was searched within the interval $[200, 500]$ and set to $300$, while the learning rate was set to $1.5 \times 10^{-4}$, within the search range $[10^{-5}, 10^{-3}]$. The momentum term of Adam was set to $0.5$, within the interval $[0.01, 1]$. The buffer for storing artificial images was fixed at $50$ (search interval $[10, 200]$), and the batch size was fixed at $2$, which achieved the best results within the interval $[1, 4]$. Finally, the number of unlabeled training images was set to $10000$, within the interval $[1000, 50000]$.
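As an illustration of the image buffer mentioned above, the following is a minimal sketch that assumes the buffer follows the image-history scheme commonly used with cycle-consistent GANs: once the buffer is full, the discriminator receives either the newly generated image or a randomly chosen stored one. The class name `ImageBuffer`, the swap probability of 0.5, and the string placeholders standing in for images are assumptions for illustration and are not taken from the authors' code.

```python
import random


class ImageBuffer:
    """History buffer of generated images (hypothetical helper)."""

    def __init__(self, capacity=50, seed=None):
        self.capacity = capacity          # 50 is the value reported above
        self.images = []
        self.rng = random.Random(seed)

    def query(self, image):
        """Return the image to show the discriminator and update the buffer."""
        if self.capacity == 0:
            return image
        if len(self.images) < self.capacity:
            # Buffer not full yet: store the new image and pass it through.
            self.images.append(image)
            return image
        if self.rng.random() < 0.5:       # assumed swap probability
            # Return a stored image and keep the new one in its place.
            idx = self.rng.randrange(self.capacity)
            old = self.images[idx]
            self.images[idx] = image
            return old
        # Otherwise pass the new image through without storing it.
        return image


# Placeholder strings stand in for generated image tensors.
buffer = ImageBuffer(capacity=50, seed=0)
returned = [buffer.query(f"fake_{i}") for i in range(200)]
```

With this scheme the buffer never grows beyond its capacity, and the discriminator sees a mix of recent and older generator outputs.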

2.0.2 ULSA.

We performed a grid search for the weighting $\lambda$ between $\mathcal{L_{S}}$ and $\mathcal{L_{U}}$ in the range $[0.3, 1.5]$ with step size $\Delta = 0.1$. The overall batch size was split between labeled samples $b_{L}$ and unlabeled samples $b_{U}$, with $b_{overall} = b_{L} + b_{U} = 128$, where we tried $b_{U} = \lambda b_{L}$ for batch size factors $\lambda = 1, 2, 3$. We tested several noise injection approaches, including salt-and-pepper noise, Gaussian blurring, and Gaussian noise. The best results were achieved with Gaussian blurring (kernel size: $(3, 5)$, intensity: $(0.01, 0.4)$). Other augmentation methods, such as color jitter and random sharpness adjustments, were tried but did not yield promising results. We further tried to replace Reinhard with Macenko, which was not feasible due to the computational cost (ULSA with Reinhard: 7-10 h; ULSA with Macenko: at least 48-60 h). Translating the stains offline and storing them locally was also not feasible because of the large number of images that would need to be stored: $x^{U} = 1{,}749{,}458^{|T|}$, $|T| \in [3, 4]$.
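The search grid and the batch split described above can be written out explicitly. The sketch below enumerates the loss weights from 0.3 to 1.5 in steps of 0.1 and splits an overall batch of 128 so that the unlabeled part is approximately a given multiple of the labeled part. The function names and the rounding rule for non-integer splits are assumptions; the text does not state how such cases were handled.

```python
def lambda_grid(start=0.3, stop=1.5, step=0.1):
    """Loss weights from start to stop (inclusive) in increments of step."""
    n = round((stop - start) / step) + 1
    # Round each value to one decimal to avoid floating-point drift.
    return [round(start + i * step, 1) for i in range(n)]


def split_batch(total, factor):
    """Split total into (b_L, b_U) with b_U ~ factor * b_L and b_L + b_U == total.

    Rounding b_L to the nearest integer is an assumption of this sketch.
    """
    b_l = round(total / (1 + factor))
    return b_l, total - b_l


grid = lambda_grid()
splits = {factor: split_batch(128, factor) for factor in (1, 2, 3)}
```

Under these assumptions the grid contains 13 candidate weights, and the factors 1 and 3 give exact splits of 64/64 and 32/96, while the factor 2 can only be met approximately (43/85) because 128 is not divisible by 3.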

2.0.3 Comparable methods.

Reinhard and Macenko. For each image in the mini-batch, we used a randomly chosen target-stained image as the reference for the transformation. Thus, each image was translated multiple times into different target stains during training.

UDA. We used various combinations of augmentations, such as color jitter and Gaussian blurring (see also the augmentations for ULSA), for data augmentation in the semi-supervised part. Other augmentations led to worse results. The batch size factor $\lambda$ for unlabeled data in the unsupervised data augmentation procedure was set to $3$, as proposed by the authors, after searching within the interval $[3, 5]$.

FixMatch. We used a confidence threshold of $0.95$, the same value the authors used in their implementation. All other parameters were chosen as in UDA.
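Two of the per-batch rules described above are simple enough to sketch directly: drawing a random target-stained reference for every image in a mini-batch, and keeping a pseudo-label only when the highest predicted class probability reaches the confidence threshold of 0.95. The function names, the string placeholders for images, and the list-of-probabilities input format are assumptions for illustration; the stain transformation itself is not shown.

```python
import random


def assign_references(batch, targets, rng):
    """Pair every image in the batch with a randomly drawn target-stained reference."""
    return [(image, rng.choice(targets)) for image in batch]


def confident_pseudo_labels(probs, threshold=0.95):
    """For each probability vector, return (predicted class, keep flag).

    A pseudo-label is kept only if its maximum probability is at least
    the threshold.
    """
    results = []
    for p in probs:
        confidence = max(p)
        label = p.index(confidence)
        results.append((label, confidence >= threshold))
    return results


rng = random.Random(0)
pairs = assign_references(["img_a", "img_b", "img_c"], ["ref_1", "ref_2"], rng)

predictions = [[0.97, 0.02, 0.01], [0.60, 0.30, 0.10], [0.01, 0.04, 0.95]]
labels = confident_pseudo_labels(predictions)
```

Because a new reference is drawn every time a batch is formed, the same image is paired with different target stains over the course of training, as described above.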