Huyen Le, Khiet Dang, Tien Lai, Nhung Nguyen, Mai Tran, and Hieu Pham

1 VinUni-Illinois Smart Health Center, VinUniversity
2 College of Engineering & Computer Science, VinUniversity
3 College of Health Science, VinUniversity
Corresponding address: [email protected]
* These authors contributed equally

SarcNet: A Novel AI-based Framework to Automatically Analyze and Score Sarcomere Organizations in Fluorescently Tagged hiPSC-CMs

Huyen Le 1,**    Khiet Dang 1,**    Tien Lai 2    Nhung Nguyen 3    Mai Tran 1,2    Hieu Pham 1,2,*
Abstract

Quantifying sarcomere structure organization in human-induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) is crucial for understanding cardiac disease pathology, improving drug screening, and advancing regenerative medicine. Traditional methods, such as manual annotation and Fourier transform analysis, are labor-intensive, error-prone, and lack high-throughput capabilities. In this study, we present a novel deep learning-based framework that leverages cell images and integrates cell features to automatically evaluate the sarcomere structure of hiPSC-CMs from the onset of differentiation. This framework overcomes the limitations of traditional methods through automated, high-throughput analysis, providing consistent, reliable results while accurately detecting complex sarcomere patterns across diverse samples. The proposed framework contains SarcNet, a ResNet-18 module extended with additional linear layers, which outputs a continuous score ranging from one to five that captures the level of sarcomere structure organization. It is trained and validated on an open-source dataset of hiPSC-CM images with the endogenously GFP-tagged alpha-actinin-2 structure developed by the Allen Institute for Cell Science (AICS). SarcNet achieves a Spearman correlation of 0.831 with expert evaluations, an improvement of 0.075 over the current state-of-the-art approach, which uses linear regression. Our results also show a consistent pattern of increasing organization from day 18 to day 32 of differentiation, aligning with expert evaluations. By integrating the quantitative features calculated directly from the images with the visual features learned by the deep learning model, our framework offers a more comprehensive and accurate assessment, enhancing the utility of hiPSC-CMs in medical research and therapy development.

keywords:
hiPSC-CMs, sarcomere structure organization, SarcNet, linear regression, deep learning.

1 Introduction

Human-induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) are cells safely collected from the blood or skin of living humans and then reprogrammed into human heart muscle cells [9]. Fully characterized hiPSC-CMs have a range of critical applications, including disease modeling, drug discovery, regenerative and precision medicine [3, 10, 12]. Specifically, these cells have attracted considerable attention as a promising alternative for modeling arrhythmogenic disorders and assessing the cardiac function of patients [24]. In another instance, Vicente et al. (2018) introduced the Comprehensive in vitro proarrhythmia Assay (CiPA) as a method to assess the impact of drugs on various ion channels in hiPSC-CMs and to predict the risk of proarrhythmia [22].

To effectively utilize these applications, hiPSC-CMs must acquire key features and capacities that accurately reflect the electrical activity, calcium dynamics, structure, and contractility of mature cardiomyocytes [3, 4]. During the differentiation of hiPSC-CMs, gene expression varies, with immature cardiomyocytes showing higher levels of the TNNI1 gene, while mature cells primarily express TNNI3 [2, 8]. Therefore, it is necessary to establish a framework to quantify hiPSC-CM maturation.

Quantitative approaches for tracking hiPSC-CM development are vital but currently limited [3]. Traditional methods for assessing cardiomyocyte structures include manual annotation, which is labor-intensive and lacks high-throughput capabilities, and Fourier transform analysis of hiPSC-CM alignment, which is error-prone on images with thick multi-cell layers [15]. Recently, various methodologies have been employed to assess the maturation of hiPSC-CMs, focusing on distinct characteristics such as sarcomere configuration, electrophysiological attributes, metabolism, and gene expression profiles [1]. For example, measuring the TNNI3 to TNNI1 ratio is one way to quantify hiPSC-CM maturation based on gene expression [2]. The structure of the sarcomere can also be evaluated from hiPSC-CM images. This is particularly relevant because the contraction capabilities of adult cardiomyocytes are closely tied to the organization and structure of the sarcomeres and myofilaments [21]. In particular, mature hiPSC-CMs are notably longer and display a higher degree of structural organization than their immature counterparts [17].

Several studies have employed artificial intelligence (AI) to quantify the maturity of hiPSC-CMs by analyzing fluorescent images [19, 7]. The authors of [19] (2015) first defined a set of 11 metrics to capture the increasing organization of sarcomeres within striated muscle cells during development and then used machine learning algorithms to score the phenotypic maturity of hiPSC-CMs in an unbiased manner. However, due to the limited availability of hiPSC-CMs at various developmental stages, the model was trained on primary cardiomyocytes from neonatal rats (rpCMs) and was tested on immature hiPSC-CMs only. In [7], published in 2021, Gerbin et al. used linear regression to distinguish stages of myofibrillar organization in hiPSC-CMs at the single-cell level. They utilized 11 cell features, six of which originated from deep learning. The model reached Spearman correlations of 0.63 and 0.67 on two test sets. Despite these advances, translation to clinical application remains a distant goal.

To address this gap, we propose SarcNet, a ResNet-18 convolutional neural network (CNN) augmented with concatenated cell features and additional linear layers, for automatically quantifying sarcomere structure organization in single-cell images of hiPSC-CMs. In particular, the model concatenates the output of the ResNet-18 module with a representation vector of quantitative single-cell measurements of subcellular organization and then applies four linear layers to obtain deeper representations. To evaluate the proposed framework, we compare SarcNet with a combined model, which integrates predictions from linear regression and the ResNet-18 module to generate the final score. SarcNet achieves higher performance, with a Spearman correlation of 0.831. Furthermore, we assess each approach using two feature extraction protocols, described in Section 2.3.1. While the performance of the combined model drops significantly with Protocol 1, SarcNet maintains nearly the same performance. This result indicates that, with SarcNet, the features extracted by Protocol 1 can be sufficient for scoring sarcomere structure organization without the more complicated deep learning-based feature extraction of Protocol 2. In addition, SarcNet outperforms linear regression and other state-of-the-art CNN models. Finally, we interpret the proposed model using Gradient-weighted Class Activation Mapping (Grad-CAM) [20], an explanatory algorithm. The heatmaps generated by Grad-CAM reveal that the important regions in the correctly predicted images correspond to the majority of sarcomere patterns within the cell. Our main contributions can be summarized as follows:

  1. We introduce a novel AI-based framework to effectively analyze and score sarcomere organization in fluorescently tagged single-cell images of hiPSC-CMs by integrating feature extraction with an existing state-of-the-art CNN model.

  2. We conduct thorough experiments to evaluate the efficacy of the proposed solution and compare it to other methods. The experimental findings demonstrate that our SarcNet model with 11 features extracted by Protocol 2 improves the Spearman correlation by 0.075 and 0.022 compared to linear regression and the combined model, respectively. Notably, using only the five features from Protocol 1 does not significantly affect the model performance.

  3. We speed up training and inference by simplifying the feature extraction process and decreasing the number of features while maintaining performance. Our code and models will be released at https://github.com/vinuni-vishc/sarcnet.

The remainder of the paper is organized as follows. Section 2 gives an overview of the proposed framework for quantifying sarcomere structure organization. Section 3 describes the dataset and the implementation details. Section 4 presents the experimental results, visualizations, and limitations. Section 5 concludes the paper.

2 Methodology

This section introduces details of the proposed approach. We first give an overview of our framework for predicting a continuous score of sarcomere structure organization on single-cell imaging of hiPSC-CMs (Section 2.1). We then provide a formulation of the problem as a regression task (Section 2.2). Next, the framework architecture is described (Section 2.3).

2.1 Overall Framework

Figure 1: An illustration of our overall framework, which aims to quantify sarcomere structure organization. The system takes hiPSC-CM images as input and outputs scores of alpha-actinin-2 patterns, ranging from one to five. For feature extraction, three sets with a total of 11 cell features can be extracted using the techniques described in [7], including CellProfiler-based feature extraction, Gray-Level Co-occurrence Matrix-based (GLCM-based) feature extraction, and deep learning-based feature extraction. Our model also localizes the most relevant portions of the image using a heatmap generated by the Grad-CAM approach and shows the increasing organization between day 18 and day 32 using a histogram.

Fig. 1 provides an overview of the proposed approach, which is a regression framework using a CNN. It predicts a continuous score of sarcomere structure organization by taking fluorescent images of single hiPSC-CM cells as input. To train the proposed model, an expert-scored dataset of 5,761 single-cell hiPSC-CM images has been used (Section 3.1). We enhance the prediction performance by integrating feature extraction into the model architecture. Model performance is evaluated using four metrics: Spearman correlation, mean absolute error (MAE), mean squared error (MSE), and $R^{2}$ score (Section 3.3). A visual explanation module based on Grad-CAM is also used for model interpretation. Last but not least, we confirm the pattern of increasing maturity levels of sarcomere structure between the day 18 and day 32 time points.

2.2 Problem Formulation

In a regression problem setting, we are given a training set $\mathcal{D}$ consisting of $N$ samples, $\mathcal{D} = \{(x^{(i)}, y^{(i)});\ i = 1, \dots, N\}$, where each input image $x^{(i)} \in \mathcal{X}$ is associated with a continuous value $y^{(i)} \in \mathcal{Y}$. Our goal is to approximate a mapping function $f_{\theta}: \mathcal{X} \rightarrow \mathcal{Y}$ from input images, which are fluorescently tagged single-cell images of hiPSC-CMs, to continuous output variables representing the organization of sarcomere structures. In general, this learning task can be performed by training a CNN, parameterized by weights $\theta$, such that the MSE loss function is minimized over the training set $\mathcal{D}$. As the desired output is a continuous quantity, the activation function at the output layer is linear. The MSE loss function is given by

$$\mathcal{L}(\theta)=\frac{1}{N}\sum_{i=1}^{N}|y^{(i)}-\hat{y}^{(i)}|^{2}, \tag{1}$$

where $y^{(i)}$ is the ground truth and $\hat{y}^{(i)}$ is the value predicted by the network.

2.3 Framework Architecture

2.3.1 Feature Extraction

Three sets with a total of 11 cell features can be extracted for each input image using the techniques provided in [7]. In more detail, Set 1, the CellProfiler-based feature extraction [18], generates two cell morphology metrics: cell area and aspect ratio. Set 2, the GLCM-based method, produces three metrics describing sarcomere alignment: max coefficient of variation, peak height, and peak distance. Finally, Set 3, the deep learning-based method, provides six additional metrics representing the fraction of the cell area covered by each class: background, diffuse/other, fibers, disorganized puncta, organized puncta, and organized z-discs. The features from Set 1 and Set 2 together are referred to as Protocol 1. The features from Set 1, Set 2, and Set 3 together are referred to as Protocol 2.
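The grouping of features into sets and protocols can be sketched as a small lookup. This is only an illustration: the snake_case feature keys and the helper `feature_vector` are hypothetical names chosen here, not identifiers from the dataset package, and the example values are simply the Table 2 means for cells with a score of 3.0.

```python
# Hypothetical feature keys; the dataset's real column names may differ.
SET_1 = ["cell_area", "cell_aspect_ratio"]                       # CellProfiler-based
SET_2 = ["max_coeff_var", "peak_height", "peak_distance"]        # GLCM-based
SET_3 = [                                                        # deep learning-based
    "frac_background", "frac_diffuse_other", "frac_fibers",
    "frac_disorganized_puncta", "frac_organized_puncta", "frac_organized_zdiscs",
]

# Protocol 1 = Set 1 + Set 2 (5 features); Protocol 2 = all three sets (11 features).
PROTOCOLS = {1: SET_1 + SET_2, 2: SET_1 + SET_2 + SET_3}


def feature_vector(cell, protocol):
    """Return the ordered feature list for one cell under the given protocol."""
    return [float(cell[name]) for name in PROTOCOLS[protocol]]


# Illustrative cell built from the Table 2 means for score 3.0.
example_cell = {
    "cell_area": 2272, "cell_aspect_ratio": 0.50,
    "max_coeff_var": 0.30, "peak_height": 0.44, "peak_distance": 1.76,
    "frac_background": 0.03, "frac_diffuse_other": 0.19, "frac_fibers": 0.27,
    "frac_disorganized_puncta": 0.08, "frac_organized_puncta": 0.30,
    "frac_organized_zdiscs": 0.13,
}

print(len(feature_vector(example_cell, 1)), len(feature_vector(example_cell, 2)))
```

Keeping the protocol definition in one place makes it easy to switch between the five-feature and eleven-feature inputs without touching the rest of a pipeline.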

2.3.2 ResNet-18 Module

ResNet-18 [11], one of the state-of-the-art image classification models, is adapted to score single-cell images. It receives input images of size $3\times 224\times 224$ and outputs a score ranging from one to five, corresponding to the level of sarcomere structure organization. The ResNet-18 module used in this study consists of 18 convolution layers, an average pooling layer, and three linear layers. Compared to the original architecture, two fully connected layers are added at the end before the output.

2.3.3 SarcNet

Figure 2: SarcNet architecture for scoring sarcomere structure organization. The scores are continuous values ranging from one to five.

In this study, we propose SarcNet, which comprises two branches, the ResNet-18 module and a stack of linear layers, as depicted in Fig. 2. In the first branch, an input image of size $3\times 224\times 224$ is passed through the ResNet-18 module (Section 2.3.2). In the second branch, the input feature vector is passed through three linear layers with rectified linear unit (ReLU) activation functions to extract deeper representations. Concatenating this result with the output of the ResNet-18 module creates a new representation vector that combines information from both the input image and the input features. This vector is then processed through four linear layers to produce the final score.
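The two-branch design can be written compactly in the notation of Section 2.2. This is only a sketch: the symbols $v$, $g_{\phi}$, $h_{\psi}$, $m_{\omega}$, $d_u$, and $d_w$ are introduced here for illustration and do not appear in the original text, and the layer widths are left symbolic because they are not stated.

```latex
\begin{align*}
u &= g_{\phi}(x) \in \mathbb{R}^{d_u}
  && \text{ResNet-18 module applied to the image } x \in \mathbb{R}^{3\times 224\times 224} \\
w &= h_{\psi}(v) \in \mathbb{R}^{d_w}
  && \text{three linear layers with ReLU applied to the feature vector } v \\
\hat{y} &= m_{\omega}\bigl([u;\, w]\bigr) \in \mathbb{R}
  && \text{four linear layers applied to the concatenation } [u;\, w] \in \mathbb{R}^{d_u + d_w}
\end{align*}
```

Under this reading, $v \in \mathbb{R}^{5}$ for Protocol 1 and $v \in \mathbb{R}^{11}$ for Protocol 2, and the parameter vector $\theta$ of Section 2.2 would collect $(\phi, \psi, \omega)$.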

3 Experiments

3.1 Dataset and Settings

3.1.1 Dataset

In this study, we use a publicly available dataset of alpha-actinin-2-mEGFP-tagged single hiPSC-CM cells at the day 18 and day 32 time points after the initiation of differentiation, provided by the Allen Institute for Cell Science (AICS) [7]. The dataset consists of two components: images of single cells and tabular data containing cell features, along with the overall alpha-actinin-2 organization score corresponding to each image.

As described in [7], each cell is manually scored for the structural maturity of its sarcomere organization by two experts; thus, all cells are assessed twice. Based on the predominant organization in each cell, the experts categorize cells into five score groups, from one to five. A score of 1 indicates cells with sparse, disorganized puncta; a score of 2 indicates cells with denser, more organized puncta; a score of 3 indicates cells with a combination of puncta and other structures such as fibers and z-discs; a score of 4 indicates cells with regular but not aligned z-discs; and a score of 5 indicates cells with mostly organized and aligned z-discs. Cells lacking alpha-actinin-2 mEGFP protein expression were assigned a score of “0” and subsequently excluded from further analysis. After filtering, there are a total of 5,761 images of different sizes. The number of cell images in each score category is shown in Table 1. For consistency, we define the ground truth (GT) in our downstream analysis as the mean score from the two experts.
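The ground-truth definition can be illustrated with a few lines of code. The function name `ground_truth` is a hypothetical helper introduced here. Averaging two integer scores from one to five yields exactly the nine half-step values that head the columns of Table 2.

```python
VALID_SCORES = {1, 2, 3, 4, 5}  # cells scored 0 were excluded before this step


def ground_truth(score_expert_1, score_expert_2):
    """Ground truth (GT) as the mean of the two expert scores."""
    for score in (score_expert_1, score_expert_2):
        if score not in VALID_SCORES:
            raise ValueError(f"expert score must be in 1..5, got {score!r}")
    return (score_expert_1 + score_expert_2) / 2


# All GT values that can arise from two experts scoring on the 1-5 scale.
possible_gt = sorted({ground_truth(a, b) for a in VALID_SCORES for b in VALID_SCORES})
print(possible_gt)
```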

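Table 1: Number of cells in each score category

|          | Score 1 | Score 2 | Score 3 | Score 4 | Score 5 |
|----------|---------|---------|---------|---------|---------|
| Expert 1 | 293     | 708     | 3,868   | 798     | 94      |
| Expert 2 | 115     | 1,007   | 2,979   | 1,527   | 113     |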

The tabular data component provides cell features extracted by CellProfiler, GLCM, and a deep learning model. In this work, we utilize a subset of 11 features from the tabular data in the dataset package: cell area, cell aspect ratio, max coefficient of variation, peak height, peak distance, fraction cell area background, fraction cell area diffuse/other, fraction cell area fibers, fraction cell area disorganized puncta, fraction cell area organized puncta, and fraction cell area organized z-discs. Table 2 shows the mean and standard deviation (SD) of these 11 metrics for each score category. Cell area, max coefficient of variation, and peak distance tend to increase as cells mature. Also, the fraction cell area background is highest in cells scored as 1, while the fraction cell area organized z-discs is highest in cells scored as 5. These features are used differently in the two protocols: Protocol 1 employs the first five features, whereas Protocol 2 incorporates all 11 features, as described in Section 2.3.1.

Table 2: Statistical summary of cell features in each score category. Values are mean (SD).

| Set | Feature | 1.0 | 1.5 | 2.0 | 2.5 | 3.0 | 3.5 | 4.0 | 4.5 | 5.0 |
|-----|---------|-----|-----|-----|-----|-----|-----|-----|-----|-----|
|       | Count | 81 | 234 | 440 | 622 | 2,543 | 1,108 | 572 | 120 | 41 |
| Set 1 | Cell area | 1,732 (1,040) | 1,620 (1,016) | 1,925 (1,070) | 1,923 (993) | 2,272 (1,207) | 2,557 (1,489) | 3,090 (2,086) | 2,669 (1,394) | 2,717 (1,300) |
| Set 1 | Cell aspect ratio | 0.49 (0.16) | 0.50 (0.18) | 0.48 (0.19) | 0.50 (0.19) | 0.50 (0.18) | 0.54 (0.18) | 0.50 (0.18) | 0.46 (0.18) | 0.46 (0.19) |
| Set 2 | Max coefficient of variation | 0.26 (0.16) | 0.25 (0.16) | 0.24 (0.14) | 0.25 (0.14) | 0.30 (0.15) | 0.32 (0.15) | 0.37 (0.16) | 0.49 (0.16) | 0.57 (0.17) |
| Set 2 | Peak height | 0.47 (0.09) | 0.48 (0.10) | 0.44 (0.09) | 0.44 (0.09) | 0.44 (0.10) | 0.44 (0.10) | 0.46 (0.10) | 0.55 (0.08) | 0.60 (0.07) |
| Set 2 | Peak distance | 1.50 (0.50) | 1.72 (0.41) | 1.69 (0.31) | 1.72 (0.25) | 1.76 (0.19) | 1.77 (0.12) | 1.78 (0.10) | 1.79 (0.08) | 1.79 (0.06) |
| Set 3 | Fraction cell area background | 0.20 (0.28) | 0.10 (0.19) | 0.04 (0.11) | 0.04 (0.11) | 0.03 (0.09) | 0.02 (0.08) | 0.02 (0.06) | 0.04 (0.08) | 0.05 (0.10) |
| Set 3 | Fraction cell area diffuse/other | 0.04 (0.06) | 0.09 (0.09) | 0.14 (0.11) | 0.15 (0.12) | 0.19 (0.14) | 0.18 (0.14) | 0.11 (0.10) | 0.11 (0.09) | 0.08 (0.07) |
| Set 3 | Fraction cell area fiber | 0.30 (0.22) | 0.34 (0.20) | 0.34 (0.19) | 0.32 (0.19) | 0.27 (0.18) | 0.14 (0.12) | 0.09 (0.10) | 0.08 (0.09) | 0.06 (0.07) |
| Set 3 | Fraction cell area disorganized puncta | 0.09 (0.11) | 0.13 (0.13) | 0.20 (0.15) | 0.15 (0.13) | 0.08 (0.09) | 0.03 (0.06) | 0.01 (0.02) | 0.01 (0.02) | 0.01 (0.02) |
| Set 3 | Fraction cell area organized puncta | 0.35 (0.23) | 0.34 (0.20) | 0.27 (0.17) | 0.30 (0.20) | 0.30 (0.18) | 0.29 (0.18) | 0.26 (0.16) | 0.20 (0.13) | 0.16 (0.09) |
| Set 3 | Fraction cell area organized z-discs | 0.01 (0.02) | 0.01 (0.02) | 0.01 (0.03) | 0.03 (0.05) | 0.13 (0.14) | 0.34 (0.19) | 0.49 (0.20) | 0.56 (0.19) | 0.63 (0.19) |

3.1.2 Settings

To evaluate the effectiveness of the proposed method, several experiments have been conducted. We divide the dataset into training, validation, and testing sets, resulting in 3,686 cells for training, 922 cells for validation, and 1,153 cells for testing. For all experiments, we preprocess the images by resizing them to $3\times 224\times 224$ to match the ResNet-18 input shape, followed by normalization, and we preprocess the tabular data by applying the standard scaler from the Scikit-learn library. To investigate the impact of feature extraction, we run the proposed method with both Protocol 1 and Protocol 2.
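The split sizes and the tabular preprocessing can be sketched in plain Python. Two assumptions are made here and are not stated in the text: first, that the reported counts arise from two successive 80/20 splits with the held-out size rounded up (this happens to reproduce 3,686 / 922 / 1,153 exactly); second, that the scaler statistics are fitted on the training split only. The helpers `split_counts`, `fit_scaler`, and `transform` are hypothetical names, and the scaling mirrors a standard z-score with the population standard deviation.

```python
import math
from statistics import fmean, pstdev


def split_counts(n_total, test_fraction=0.2, val_fraction=0.2):
    """Train/val/test sizes for two successive splits (assumed 80/20 each)."""
    n_test = math.ceil(test_fraction * n_total)   # held-out size rounded up
    n_rest = n_total - n_test
    n_val = math.ceil(val_fraction * n_rest)
    return n_rest - n_val, n_val, n_test


def fit_scaler(train_column):
    """Mean and population SD of one feature column from the training split."""
    std = pstdev(train_column)
    return fmean(train_column), (std if std > 0 else 1.0)  # guard constant columns


def transform(column, mean, std):
    """Z-score a column using statistics fitted on the training split."""
    return [(value - mean) / std for value in column]


print(split_counts(5761))

train_area = [1732.0, 2272.0, 3090.0]             # illustrative values only
mean, std = fit_scaler(train_area)
scaled = transform(train_area, mean, std)
```

Fitting the scaler on the training split and reusing its statistics for validation and testing avoids leaking information from held-out cells into preprocessing.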

3.2 Training Methodology

In this study, an Adam optimizer with a learning rate of 0.0005 is used to minimize the MSE objective function (Section 2.2). The models are trained with a batch size of 40 for 100 epochs using the PyTorch framework. The network with the highest Spearman correlation on the validation dataset is selected and subsequently evaluated on the testing dataset. All models are trained on a GeForce RTX 3090 GPU.

3.3 Evaluation Metrics

The model performance is evaluated using four metrics: Spearman correlation, MAE, MSE, and $R^{2}$ score. First, Spearman correlation [13] is a non-parametric statistical method to assess the relationship between two variables based on ranks. To calculate this coefficient, we first rank the values of $\hat{y}$ and $y$ from least to greatest. Let $D^{(i)}$ be the difference between the two corresponding ranks of the $i$-th sample. The Spearman correlation is computed using the formula

$$r_{S}=1-\frac{6\sum_{i=1}^{N}{D^{(i)}}^{2}}{N(N^{2}-1)}, \tag{2}$$

where $i=1,\dots,N$.
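A minimal sketch of Eq. (2) follows. It assumes that no two values within a variable are tied; with tied values (as in the expert-derived ground truth, which takes only nine distinct values) implementations typically assign average ranks and compute the Pearson correlation of the ranks instead. The function names are hypothetical.

```python
def ranks(values):
    """1-based ranks from least to greatest; assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_no_ties(y_true, y_pred):
    """Spearman correlation via the rank-difference formula (tie-free data)."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(ranks(y_true), ranks(y_pred)))
    return 1 - 6 * squared_diffs / (n * (n ** 2 - 1))


# Two predictions swap places in the ranking, so the sum of squared
# rank differences is 2 and r_S = 1 - 12 / 120.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.2, 1.9, 3.5, 3.4, 4.8]
print(spearman_no_ties(y_true, y_pred))
```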

Second, MAE is the mean of the absolute difference between $\hat{y}$ and $y$, calculated as

$$MAE=\frac{1}{N}\sum_{i=1}^{N}|y^{(i)}-\hat{y}^{(i)}|. \tag{3}$$

MSE is the mean of the squared difference between $\hat{y}$ and $y$, given by

$$MSE=\frac{1}{N}\sum_{i=1}^{N}|y^{(i)}-\hat{y}^{(i)}|^{2}. \tag{4}$$

$R^{2}$ (also called the coefficient of determination) [23] is defined as the fraction of the variation in the ground truth $y$ that is explained by the predictions $\hat{y}$. It has a worst value of $-\infty$ and a best value of $+1$ [5], and can be calculated with the formula

$$R^{2}=1-\frac{\sum_{i=1}^{N}|y^{(i)}-\hat{y}^{(i)}|^{2}}{\sum_{i=1}^{N}|y^{(i)}-\overline{y}|^{2}}, \tag{5}$$

where $\overline{y}$ is the mean value of $y$.
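The three error-based metrics translate directly into code. The sketch below uses hypothetical function names and a toy example whose values are chosen so the results can be checked by hand; it is not tied to the model outputs reported later.

```python
from statistics import fmean


def mae(y_true, y_pred):
    """Mean absolute error, Eq. (3)."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error, Eq. (4)."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination, Eq. (5)."""
    y_bar = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 2.0, 2.5, 4.0, 5.5]
# Errors are 0.5, 0, 0.5, 0, 0.5: MAE = 1.5 / 5, MSE = 0.75 / 5,
# and R^2 = 1 - 0.75 / 10 because the total sum of squares is 10.
print(mae(y_true, y_pred), mse(y_true, y_pred), r2_score(y_true, y_pred))
```

Predicting the mean $\overline{y}$ for every sample gives $R^{2}=0$, and predictions worse than that baseline give negative values, which is why the metric is unbounded below.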

4 Results & Discussion

4.1 Model Performance

Table 3: Result summary of four approaches on the testing dataset

| Protocol | Approach | Model          | Spearman correlation | MAE   | MSE   | $R^{2}$ score |
|----------|----------|----------------|----------------------|-------|-------|---------------|
| 1        | 1        | Combined model | 0.689                | 0.373 | 0.249 | 0.483         |
| 1        | 2        | SarcNet        | 0.825                | 0.310 | 0.159 | 0.671         |
| 2        | 3        | Combined model | 0.809                | 0.320 | 0.180 | 0.627         |
| 2        | 4        | SarcNet        | 0.831                | 0.310 | 0.161 | 0.668         |

In this section, we compare the performance of SarcNet with a combined model, which integrates the strengths of both the ResNet-18 module and linear regression to enhance the prediction performance of the sarcomere structure organization score. Given an input image of a single cell, the ResNet-18 module directly processes the raw image, while linear regression utilizes a set of extracted features. Both component models generate a predicted score ranging from one to five. The final score is computed as the mean of two outputs derived from the individual predictions from the ResNet-18 module and linear regression.
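The averaging step of the combined baseline is simple enough to state in code. The function name and the example numbers are hypothetical and serve only to show how a weak component prediction pulls the final score away from the stronger one.

```python
def combined_score(resnet_score, linreg_score):
    """Final score of the combined model: mean of the two component outputs."""
    return (resnet_score + linreg_score) / 2.0


# Illustrative values: the image branch predicts 3.9, the feature-based
# linear regression predicts 2.5, so the averaged score is 3.2.
print(combined_score(3.9, 2.5))
```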

To understand the effect of feature extraction and deep learning models on quantifying sarcomere structure organization, Table 3 reports four performance metrics (Spearman correlation, MAE, MSE, and $R^{2}$ score) for four approaches on the testing dataset. The last two approaches (3 and 4), which use Protocol 2, achieve higher Spearman correlations (0.809-0.831) than their Protocol 1 counterparts (0.689-0.825). Notably, while the combined model performs dramatically better with Protocol 2 than with Protocol 1, SarcNet shows nearly identical results on all performance metrics (Spearman: 0.83, MAE: 0.31, MSE: 0.16, and $R^{2}$ score: 0.67) under the two protocols. A possible explanation is that the linear regression model performs poorly with fewer features, which significantly affects the combined model and lowers its Spearman correlation from 0.809 to 0.689.

In addition, SarcNet is more effective than the combined model at extracting useful information from the input images and features. While the combined model requires the six features extracted by the deep learning model to reach its best performance (Table 3), SarcNet can forgo this step and still maintain similar results. This would effectively speed up the framework and reduce the memory required for assessing sarcomere structure organization in hiPSC-CMs.

Figure 3: Visualization results for the GT and the corresponding predictions from the SarcNet model. The first row illustrates the predicted examples with $MAE \leq 0.1$, the second row with $0.1 < MAE \leq 1$, and the last row with $MAE > 1$.

Fig. 3 presents prediction examples with different ranges of MAE. The predictions from the proposed architecture closely match the ground truth, especially those in the first row. The last row, however, shows some incorrectly predicted results. In particular, Fig. 3(g) is mispredicted because the cell has a section with considerably organized z-discs, which confuses the model. Furthermore, Fig. 3(h) is enlarged too much, making the alpha-actinin-2 patterns in this image appear crisper and denser, resulting in a higher prediction score. In contrast, Fig. 3(i) is shrunk too much relative to the original, resulting in blurrier alpha-actinin-2 patterns and a substantially lower prediction score.

4.2 Model Interpretation

Figure 4: Visualization of Grad-CAM heatmaps for different examples.

Grad-CAM heatmaps (Fig. 4) are used to interpret the SarcNet model predictions. Fig. 4(c), (d), and (e) produce the highest values focusing mainly on the sarcomere structure, while Fig. 4(b) has the lowest value at the center region of the cell. An explanation for this could be that the two sides of the cell have clearer fiber patterns while the center does not; as a result, it would be easier for the model to make decisions based on the two sides of the cell. In contrast, Fig. 4(a) illustrates a mistaken case in which the model seems to focus more on the regions with organized z-discs than the diffuse regions.

4.3 Changes in Cell Organization Over Time

Figure 5: Histogram of the predicted scores for day 18 (blue) and day 32 (red).

Fig. 5 confirms the pattern of increasing organization levels between days 18 and 32. The predicted scores from the SarcNet model are consistent with the culture period and expert annotations, with the day-32 histogram shifted slightly to the right. While most of the hiPSC-CM organization on day 18 concentrates around level 3, with most of the regions being disorganized puncta, day 32 shows more organized regions at level 4. However, most of the cells on day 32 are still immature and need further culture. This finding agrees with the research in [6], which investigated the culture of hiPSC-CMs on days 30, 90, and 200 and reported that cell contractility reaches its maximum level at the 200-day time point.

4.4 Comparison

Table 4: Comparison with other models on the testing dataset

| Model                 | Feature extraction | Spearman correlation | MAE   | MSE   | $R^{2}$ score |
|-----------------------|--------------------|----------------------|-------|-------|---------------|
| ResNet-18 [11]        | No                 | 0.829                | 0.453 | 0.300 | 0.379         |
| DenseNet [14]         | No                 | 0.813                | 0.340 | 0.194 | 0.599         |
| AlexNet [16]          | No                 | 0.788                | 0.346 | 0.195 | 0.596         |
| Linear Regression [7] | Protocol 2         | 0.756                | 0.373 | 0.234 | 0.514         |
| Ours                  | Protocol 2         | 0.831                | 0.310 | 0.161 | 0.668         |

Table 4 compares the performance of our proposed framework to other models. As can be seen from the table, the ResNet-18 model and SarcNet achieve similar Spearman correlations of 0.829 and 0.831, respectively, but differ significantly on the other metrics. A possible explanation is that the Spearman correlation only measures the rank-based association between the GT and the predictions and, unlike MAE and MSE, does not consider the distance between them. As a result, the ResNet-18 model appears to rank cells by sarcomere structure organization correctly but still performs poorly in MAE and MSE (0.453 and 0.300, respectively). In contrast, thanks to feature extraction, SarcNet can draw on more relevant information and score the sarcomere structure with smaller MAE and MSE values of 0.310 and 0.161, respectively.
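The distinction between rank agreement and distance can be made concrete with a toy example. The numbers below are invented for illustration and are not model outputs: shifting every prediction by a constant leaves the ordering of the cells, and therefore any rank-based correlation, unchanged, while the MAE grows.

```python
def rank_order(values):
    """Indices of the samples sorted from lowest to highest value."""
    return sorted(range(len(values)), key=lambda i: values[i])


def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_close = [1.1, 2.1, 2.9, 4.2, 4.9]             # well calibrated
pred_shifted = [p + 0.5 for p in pred_close]       # same ordering, biased upward

# Identical orderings imply identical rank-based correlation with y_true.
same_ranking = rank_order(pred_close) == rank_order(pred_shifted) == rank_order(y_true)

print(same_ranking, mae(y_true, pred_close), mae(y_true, pred_shifted))
```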

4.5 Limitations and Future Directions

This work has limitations that need to be addressed before sarcomere structure organization can be quantified fully automatically. The most noteworthy is the lack of automated single-cell segmentation. In the AICS dataset [7], experts mark single-cell boundaries manually, as no automated framework is accessible. This task is challenging because individual cells are not completely separated from one another but overlap, and regions of high cell density must be avoided. In the future, we intend to develop an automated procedure that addresses this challenge and facilitates more accurate quantification of sarcomere structure organization.

5 Conclusion

This work makes three contributions to single-cell imaging of hiPSC-CMs. First, we develop SarcNet, a deep learning-based method for quantifying sarcomere structure organization. In particular, the model improves its predictions by concatenating the ResNet-18 module output with a representation vector of quantitative single-cell measurements of subcellular organization. Second, we conduct extensive experiments and compare SarcNet with other approaches to demonstrate the usefulness of the proposed method, which achieves the best Spearman correlation of 0.831. Finally, we speed up the training and inference processes by reducing the number of features while retaining a Spearman correlation of 0.825.

Conflict of Interest

The authors have no conflicts of interest to declare.

Acknowledgment

This study is supported by the VinUni-Illinois Smart Health Center, VinUniversity.

References

  • [1] Ahmed, R.E., Anzai, T., Chanthra, N., Uosaki, H.: A brief review of current maturation methods for human induced pluripotent stem cells-derived cardiomyocytes. Frontiers in Cell and Developmental Biology 8, 178 (2020)
  • [2] Bedada, F.B., Chan, S.S., Metzger, S.K., Zhang, L., Zhang, J., Garry, D.J., Kamp, T.J., Kyba, M., Metzger, J.M.: Acquisition of a quantitative, stoichiometrically conserved ratiometric marker of maturation status in stem cell-derived cardiac myocytes. Stem Cell Reports 3(4), 594–605 (2014)
  • [3] Bedada, F.B., Wheelwright, M., Metzger, J.M.: Maturation status of sarcomere structure and function in human iPSC-derived cardiac myocytes. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research 1863(7), 1829–1838 (2016)
  • [4] Cai, W., Zhang, J., de Lange, W.J., Gregorich, Z.R., Karp, H., Farrell, E.T., Mitchell, S.D., Tucholski, T., Lin, Z., Biermann, M., et al.: An unbiased proteomics method to assess the maturation of human pluripotent stem cell-derived cardiomyocytes. Circulation Research 125(11), 936–953 (2019)
  • [5] Chicco, D., Warrens, M.J., Jurman, G.: The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Computer Science 7, e623 (2021)
  • [6] Ebert, A., Joshi, A.U., Andorf, S., Dai, Y., Sampathkumar, S., Chen, H., Li, Y., Garg, P., Toischer, K., Hasenfuss, G., et al.: Proteasome-dependent regulation of distinct metabolic states during long-term culture of human iPSC-derived cardiomyocytes. Circulation Research 125(1), 90–103 (2019)
  • [7] Gerbin, K.A., Grancharova, T., Donovan-Maiye, R.M., Hendershott, M.C., Anderson, H.G., Brown, J.M., Chen, J., Dinh, S.Q., Gehring, J.L., Johnson, G.R., Lee, H., Nath, A., Nelson, A.M., Sluzewski, M.F., Viana, M.P., Yan, C., Zaunbrecher, R.J., Cordes Metzler, K.R., Gaudreault, N., Knijnenburg, T.A., Rafelski, S.M., Theriot, J.A., Gunawardane, R.N.: Cell states beyond transcriptomics: Integrating structural organization and gene expression in hiPSC-derived cardiomyocytes. Cell Systems 12(6), 670–687.e10 (2021). URL https://doi.org/10.1016/j.cels.2021.05.001
  • [8] Guo, Y., Pu, W.T.: Cardiomyocyte maturation: New phase in development. Circulation Research 126(8), 1086–1106 (2020)
  • [9] Hamledari, H., Asghari, P., Jayousi, F., Aguirre, A., Maaref, Y., Barszczewski, T., Ser, T., Moore, E., Wasserman, W., Klein Geltink, R., et al.: Using human induced pluripotent stem cell-derived cardiomyocytes to understand the mechanisms driving cardiomyocyte maturation. Frontiers in Cardiovascular Medicine 9, 967659 (2022)
  • [10] Häneke, T., Sahara, M.: Progress in bioengineering strategies for heart regenerative medicine. International Journal of Molecular Sciences 23(7), 3482 (2022)
  • [11] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
  • [12] Hnatiuk, A.P., Briganti, F., Staudt, D.W., Mercola, M.: Human iPSC modeling of heart disease for drug development. Cell Chemical Biology 28(3), 271–282 (2021)
  • [13] Hollander, M., Wolfe, D.A., Chicken, E.: Nonparametric statistical methods. John Wiley & Sons (2013)
  • [14] Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
  • [15] Khan, M., Xu, Y., Hua, S., Johnson, J., Belevych, A., Janssen, P.M., Gyorke, S., Guan, J., Angelos, M.G.: Evaluation of changes in morphology and function of human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs) cultured on an aligned-nanofiber cardiac patch. PLOS One 10(5), e0126338 (2015)
  • [16] Krizhevsky, A.: One weird trick for parallelizing convolutional neural networks. CoRR abs/1404.5997 (2014). URL http://arxiv.org/abs/1404.5997
  • [17] Machiraju, P., Greenway, S.C.: Current methods for the maturation of induced pluripotent stem cell-derived cardiomyocytes. World Journal of Stem Cells 11(1), 33 (2019)
  • [18] McQuin, C., Goodman, A., Chernyshev, V., Kamentsky, L., Cimini, B.A., Karhohs, K.W., Doan, M., Ding, L., Rafelski, S.M., Thirstrup, D., et al.: CellProfiler 3.0: Next-generation image processing for biology. PLOS Biology 16(7), e2005970 (2018)
  • [19] Pasqualini, F.S., Sheehy, S.P., Agarwal, A., Aratyn-Schaus, Y., Parker, K.K.: Structural phenotyping of stem cell-derived cardiomyocytes. Stem Cell Reports 4(3), 340–347 (2015)
  • [20] Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 618–626 (2017)
  • [21] Skorska, A., Johann, L., Chabanovska, O., Vasudevan, P., Kussauer, S., Hillemanns, M., Wolfien, M., Jonitz-Heincke, A., Wolkenhauer, O., Bader, R., et al.: Monitoring the maturation of the sarcomere network: a super-resolution microscopy-based approach. Cellular and Molecular Life Sciences 79(3), 149 (2022)
  • [22] Vicente, J., Zusterzeel, R., Johannesen, L., Mason, J., Sager, P., Patel, V., Matta, M.K., Li, Z., Liu, J., Garnett, C., et al.: Mechanistic model-informed proarrhythmic risk assessment of drugs: review of the “CiPA” initiative and design of a prospective clinical validation study. Clinical Pharmacology & Therapeutics 103(1), 54–66 (2018)
  • [23] Wright, S.: Correlation and causation. Journal of Agricultural Research 20(7), 557 (1921)
  • [24] Yang, X., Ribeiro, A.J., Pang, L., Strauss, D.G.: Use of human iPSC-CMs in nonclinical regulatory studies for cardiac safety assessment. Toxicological Sciences 190(2), 117–126 (2022)