ISUOG. All pregnant women were told to relax and lie in the supine position. Echocardiographic volume data were obtained with a Voluson E10 US system (GE Healthcare, USA) equipped with an eM6C electronic matrix transducer (2–7 MHz), with the fetus in the supine position and its cardiac apex facing towards the front (the 11, 12, or 1 o'clock directions). Once data collection was initiated, the pregnant woman was asked to avoid any movement and to hold her breath in order to minimize stitching artifacts.13 Meanwhile, the principal sections, including the abdominal transverse section, the standard FCV, the left ventricular outflow tract section, the right ventricular outflow tract section, and the three-vessel section and its derivative sections, were examined strictly in accordance with the three-segment analysis method to ensure the proper condition of the fetal hearts. Finally, the end-systolic phase, with the valves completely closed, was precisely selected using M-mode; the dataset was saved and output in Cartesian volume format for editing and 3D modeling.
2.3. Preprocessing of volume data into 2D images
A standard full scan includes 50–100 volumes, one for each time phase of the fetal heart. In this study, one end-diastolic volume containing the spatial structure of the fetal heart was extracted from each individual and decomposed into 50–70 2D images, spaced 0.02 cm apart, with the number depending on the size of the heart. All images were saved in JPEG format. According to the suggestions of ISUOG, the images were divided into FCV, OTV, and TVV. To generate labels that can be trained by a deep neural network (DNN), the graphical image annotation tool "LabelImg" was used by professional sonographers to annotate the data. A standard data stream is made up of a JPEG 2D image as the data and an annotation (in XML format) as the label. The complete dataset contained 5255 data streams from 110 unique individuals. Images were randomly split into 3297, 1097, and 403 images as the training, test, and validation sets, respectively. Each image and annotation was manually rechecked by two sonographers who did not participate in the annotation of the original data to confirm the reliability of the dataset. The summary statistics of the data are presented in Table 1.
Table 1. Summary statistics of the training, test, and validation sets
View Training Test Validation Total
FCV 1462 627 224 2313
OTV 1096 470 164 1730
TVV 739 317 156 1212
Total 3297 1097 403 5255
Note: The values in the table refer to the number of annotated 2D ultrasound images. Abbreviations: FCV: Four-chamber view; OTV: Outflow tract view;
TVV: Three-vessel view.
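To make the data-stream format concrete, the sketch below pairs each JPEG slice with its LabelImg annotation (LabelImg writes Pascal VOC-style XML) and reproduces a random split of the stated sizes. The directory layout, file naming, and random seed are illustrative assumptions, not the authors' pipeline.

```python
import random
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical layout: one JPEG slice plus one LabelImg XML file
# (Pascal VOC format) per data stream.
DATA_DIR = Path("fetal_heart_slices")

def read_annotation(xml_path):
    """Parse a LabelImg (Pascal VOC) XML file into (view_label, box) pairs."""
    root = ET.parse(xml_path).getroot()
    objects = []
    for obj in root.iter("object"):
        name = obj.findtext("name")              # e.g., "FCV", "OTV", or "TVV"
        box = obj.find("bndbox")
        coords = tuple(int(float(box.findtext(k)))
                       for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects

# Build (image, annotation) data streams and split them randomly.
streams = [(p, p.with_suffix(".xml")) for p in sorted(DATA_DIR.glob("*.jpg"))]
random.seed(42)                                   # assumed seed, for reproducibility
random.shuffle(streams)
n_train, n_test = 3297, 1097                      # split sizes from Table 1
train = streams[:n_train]
test = streams[n_train:n_train + n_test]
validation = streams[n_train + n_test:]
```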
2.4. Model development and training for fetal heart position detection
As displayed in Figure 1, FRT is an interactive method of medical image segmentation, combining both position detection and interactive binary threshold segmentation. Position detection was performed using the Faster R-CNN architecture, which contains a feature extractor, a region proposal network (RPN), region of interest (ROI) pooling, and a classifier. A convolutional neural network (CNN) was used to extract potential features in the detection field during image classification, as it can accurately identify human-identifiable phenotypes as well as characteristics that are not recognized by human experts.14–18 The Visual Geometry Group (VGG) CNN architecture, without its fully connected and classification layers, was retrained by backpropagation and used as the feature extractor.19 Each image was resized to 300 × 300 pixels to ensure compatibility with the dimensions of the VGG network architecture before processing in the feature extractor. Region proposals are the output of the RPN, which consists of classification and regression layers. The classification layer produces a score indicating whether a region is an ROI or background, while the regression layer outputs four coordinates that indicate the position of the fetal heart. ROI pooling, which accepts feature-map regions of non-fixed size, collects the region proposals generated by the RPN together with the feature maps, creating a fixed-shape vector for each proposal. In the classifier, bounding box regression and classification layers were used to generate a more accurate target detection box and the proposal class.

For detecting the position of the fetal heart, the parameters were initialized randomly and trained with a learning rate of 0.001, momentum of 0.9, and batch size of 16 to minimize the loss function for each object proposal. The model was trained using the training set and tested using the test set. The data were split such that the same US image did not appear in both the training and test sets. Both training and testing were performed in Python using Google's TensorFlow deep learning framework.20 The training was terminated if the internal validation loss did not decrease for 10 epochs (early stopping criterion), and
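As a rough illustration of this pipeline, the following Keras sketch builds a VGG16 backbone without its fully connected and classification layers and attaches minimal RPN heads. The 300 × 300 input matches the resizing described above, while the anchor count, the sigmoid objectness score (in place of a two-way softmax), and the ImageNet initialization are simplifying assumptions rather than the authors' configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

NUM_ANCHORS = 9  # assumed anchor count per feature-map location

# VGG16 without its fully connected and classification layers is the
# feature extractor; its weights are fine-tuned by backpropagation.
backbone = tf.keras.applications.VGG16(
    include_top=False, weights="imagenet", input_shape=(300, 300, 3)
)
backbone.trainable = True

inputs = tf.keras.Input(shape=(300, 300, 3))
feature_map = backbone(inputs)

# RPN: a shared 3x3 convolution followed by two sibling 1x1 convolutions,
# one scoring ROI vs. background and one regressing four box coordinates
# for each anchor.
shared = layers.Conv2D(512, 3, padding="same", activation="relu")(feature_map)
objectness = layers.Conv2D(NUM_ANCHORS, 1, activation="sigmoid",
                           name="rpn_cls")(shared)
box_deltas = layers.Conv2D(4 * NUM_ANCHORS, 1, name="rpn_reg")(shared)

rpn = tf.keras.Model(inputs, [objectness, box_deltas])
rpn.summary()
```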
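The fixed-shape output of ROI pooling can be approximated in TensorFlow with tf.image.crop_and_resize, which crops each variable-sized proposal from the feature map and resamples it to a fixed grid; the shapes, boxes, and 7 × 7 crop size below are illustrative assumptions.

```python
import tensorflow as tf

# Each proposal, whatever its size, is mapped to a same-shape tensor for
# the downstream classification and bounding box regression layers.
feature_map = tf.random.normal([1, 9, 9, 512])      # VGG output for one image
proposals = tf.constant([[0.0, 0.0, 0.5, 0.5],      # boxes in normalized
                         [0.2, 0.3, 0.9, 0.8]])     # [y1, x1, y2, x2] form
box_indices = tf.zeros([2], dtype=tf.int32)         # both boxes -> image 0

pooled = tf.image.crop_and_resize(
    feature_map, proposals, box_indices, crop_size=[7, 7]
)
print(pooled.shape)  # (2, 7, 7, 512): fixed shape regardless of box size
```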
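The reported training settings (SGD with a learning rate of 0.001, momentum of 0.9, batch size of 16, and early stopping after 10 epochs without improvement in validation loss) map directly onto the Keras API; the stand-in model and random data below are placeholders, not the actual detector or dataset.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in model so the training configuration can run end to end.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(300, 300, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9),
    loss="binary_crossentropy",
)

early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",            # internal validation loss
    patience=10,                   # early stopping criterion from the text
    restore_best_weights=True,
)

# Random placeholder data standing in for the annotated image streams.
x = np.random.rand(64, 300, 300, 3).astype("float32")
y = np.random.randint(0, 2, size=(64, 1)).astype("float32")
model.fit(x, y, batch_size=16, epochs=100,
          validation_split=0.2, callbacks=[early_stopping])
```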