
Artificial Intelligence in Health                         Improved liver tumor segmentation with dense networks

Figure 1. Overview of our proposed liver tumor segmentation pipeline. The model takes three consecutive slices as input to predict the middle one. In the training phase, training samples with dimensions of 224 × 224 × 3 are cropped randomly from raw CT volumes for data augmentation and then fed into the model. In the testing phase, the trained model processes a 3D CT volume by taking three adjacent slices at their original size as input and sliding along the z-axis of the CT volume at a step size of 1. The segmentation of the entire 3D volume is completed in this way. Figure created by the authors.
Abbreviations: CT: computed tomography; FCN: fully convolutional network; n: total number of images captured in a CT scan.
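The testing-phase procedure described in the caption can be sketched as follows. This is a minimal sketch, not the authors' implementation: `model` is assumed to map an (H, W, 3) stack of three adjacent slices to an (H, W) mask for the middle slice, and since the caption does not specify how the first and last slices of the volume are handled, clamping the neighbor indices at the boundaries is an assumption.

```python
import numpy as np

def segment_volume(model, volume):
    """Slide a three-slice window along the z-axis of a CT volume.

    volume: array of shape (Z, H, W).
    model:  callable mapping an (H, W, 3) slice stack to an (H, W)
            mask for the middle slice (hypothetical interface).
    Returns an array of per-slice masks with shape (Z, H, W).
    """
    z, h, w = volume.shape
    masks = np.zeros((z, h, w), dtype=np.float32)
    for i in range(z):
        lo = max(i - 1, 0)          # clamp at the first slice (assumption)
        hi = min(i + 1, z - 1)      # clamp at the last slice (assumption)
        stack = np.stack([volume[lo], volume[i], volume[hi]], axis=-1)
        masks[i] = model(stack)     # prediction for the middle slice
    return masks
```

With a step size of 1 along the z-axis, each slice is predicted exactly once from its own three-slice neighborhood, so the per-slice masks stack directly back into a 3D segmentation of the whole volume.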
Figure 2. The CT image pre-processing module.
Abbreviations: CT: computed tomography; Conv: convolutional layer.

2.2. I²-DenseFCN segmentation network

The proposed I²-DenseFCN segmentation network is built on the classical encoder-decoder architecture, as illustrated in Figure 3. The encoder is derived from the DenseNet classification network.[32] The decoder introduces dense connections among upsampling blocks of different levels. UNet-like long skip connections are established between the encoder and decoder.

2.2.1. Encoder

The representation ability of features largely affects the performance of semantic image segmentation.[33] As deeper networks generally provide stronger feature representation capabilities,[32] network structures for feature extraction continue to deepen. However, deep neural networks become more difficult to train when their depth is attained simply by stacking layers. To mitigate this issue, ResNets[20] introduce residual connections that facilitate the training of very deep neural networks. Recently, a novel connectivity pattern, called dense connections, was designed in DenseNets[32] to further improve the information flow between layers, yielding state-of-the-art classification results across several datasets. Considering the superior feature extraction ability of DenseNets, we employ the DenseNet-161 classification network[32] as the encoder, excluding the global average pooling, fully connected, and softmax layers. There are four dense blocks in the DenseNet-161 network, and the dense connections in each block are referred to as intra-block dense connections. In this design, direct connections are introduced from every layer within a dense block to all subsequent layers. Each layer therefore receives the concatenated feature maps of all preceding layers and produces k feature maps through a composite function comprising three consecutive operations: batch normalization, a rectified linear unit, and a 3 × 3 convolution. Here, k is referred to as the growth rate, indicating the amount by which each module's output grows relative to its input. Transition-down blocks, consisting of a 1 × 1 convolution followed by a 2 × 2 pooling operation, are inserted between dense blocks to reduce the spatial dimensionality of the feature maps.

2.2.2. Decoder

Considering the existence of multiscale tumors – one of the primary challenges in the liver tumor segmentation

            Volume 2 Issue 2 (2025)                         63                               doi: 10.36922/aih.5001