Figure 4. (A and B) Our Aligned Disentangled Generative Adversarial Network model consists of same-domain translation (top and bottom) and cross-domain translation (middle). The content representation is a tensor with spatial dimensions, while the style representation is a vector learned by a multilayer perceptron from the domain label. During same-domain translation, the encoder G_enc embeds an input into the shared content space and the decoder G_dec reconstructs the content to an image. Cross-domain translation is performed by swapping content representations.
symmetric architecture with four ResNet blocks and two up-sampling modules. Both the encoder and decoder are equipped with Adaptive Instance Normalization layers to integrate the style representations, which are generated from domain labels through a multilayer perceptron. A single style representation is assigned to each domain using one-hot domain labels. Therefore, we can disentangle content representations (the underlying spatial structure) from styles (the rendering of the structure) under a specific domain.
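To make this conditioning mechanism concrete, the following is a minimal PyTorch-style sketch of an Adaptive Instance Normalization layer driven by a style vector that a multilayer perceptron derives from a one-hot domain label. The module names, layer widths, and style dimension are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (PyTorch): AdaIN conditioned on a style vector that an MLP
# derives from a one-hot domain label. Names and layer sizes are illustrative.
import torch
import torch.nn as nn

class StyleMLP(nn.Module):
    """Maps a one-hot domain label to a learned style vector."""
    def __init__(self, n_domains=2, style_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_domains, 128), nn.ReLU(inplace=True),
            nn.Linear(128, style_dim))

    def forward(self, domain_onehot):
        return self.net(domain_onehot)

class AdaIN3d(nn.Module):
    """Adaptive Instance Normalization: normalize the content features, then
    scale/shift them with per-channel parameters predicted from the style."""
    def __init__(self, style_dim, num_features):
        super().__init__()
        self.norm = nn.InstanceNorm3d(num_features, affine=False)
        self.affine = nn.Linear(style_dim, 2 * num_features)

    def forward(self, content, style):
        gamma, beta = self.affine(style).chunk(2, dim=1)
        gamma = gamma.view(gamma.size(0), -1, 1, 1, 1)
        beta = beta.view(beta.size(0), -1, 1, 1, 1)
        return (1 + gamma) * self.norm(content) + beta

# Cross-domain translation then amounts to decoding a content tensor taken
# from one stream with the style vector of the other domain, i.e., swapping
# the content representations as in Figure 4.
```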
The PatchGAN discriminator in this AD-GAN model can identify the inputs' domain and their verisimilitude, that is, whether they are reconstructed images (microscopy images or synthetic masks) or generated fake images. Rather than defining a global discriminator network, this discriminator classifies local image patches and forces the generator to learn local properties in real CLSM images or synthetic masks.
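As a rough illustration of the patch-level design, the sketch below builds a small fully convolutional 3D discriminator that outputs a grid of per-patch logits instead of a single global score; the channel widths, kernel sizes, and depth are assumptions for illustration only.

```python
# Sketch of a PatchGAN-style 3D discriminator: a fully convolutional stack
# whose output is a grid of per-patch scores rather than one global score.
# Channel widths, kernel sizes, and depth are assumptions.
import torch.nn as nn

def conv_block(c_in, c_out, stride):
    return nn.Sequential(
        nn.Conv3d(c_in, c_out, kernel_size=4, stride=stride, padding=1),
        nn.InstanceNorm3d(c_out),
        nn.LeakyReLU(0.2, inplace=True))

patch_discriminator = nn.Sequential(
    nn.Conv3d(1, 64, kernel_size=4, stride=2, padding=1),
    nn.LeakyReLU(0.2, inplace=True),
    conv_block(64, 128, stride=2),
    conv_block(128, 256, stride=2),
    # per-patch logits; extra output channels could carry a domain prediction
    nn.Conv3d(256, 1, kernel_size=4, stride=1, padding=1))
```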
The ADAM solver[22] is used to train the model from scratch with a learning rate of 0.0002. As empirically tuned in the experiments, the learning rate is kept constant for the first 100 epochs and linearly decays to zero over the next 100 epochs. Common data augmentation, including random crops and random rotations, is applied to avoid overfitting.
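The schedule described above can be expressed as an optimizer plus a per-epoch multiplier, as in the sketch below; the stand-in model and the omission of other optimizer settings (which the text does not specify) are assumptions.

```python
# Sketch of the stated schedule: Adam with lr = 0.0002, held constant for the
# first 100 epochs, then decayed linearly to zero over the next 100 epochs.
import torch

model = torch.nn.Conv3d(1, 1, kernel_size=3)  # stand-in for the AD-GAN networks
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)

def lr_lambda(epoch):
    # multiplier: 1.0 for epochs 0-99, then a linear ramp down to 0 at epoch 200
    return 1.0 if epoch < 100 else 1.0 - (epoch - 100) / 100.0

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_lambda)

for epoch in range(200):
    # ... per-batch forward/backward passes and optimizer.step() go here ...
    scheduler.step()  # advance the schedule once per epoch
```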
The loss function in AD-GAN contains four terms: the reconstruction loss L_rec, the cycle-consistency loss L_cyc, the semantic consistency loss L_sc, and the adversarial loss L_adv. L_rec measures the difference between the original inputs and the reconstructed outputs in the same-domain translation so as to extract useful features. L_cyc measures the consistency between the original inputs and the cycled outputs so as to keep the transferred content in unpaired image-to-image translation. L_sc measures the difference of the disentangled features between the original inputs and the transferred images in the content space. To keep more low-frequency details, the Mean Absolute Error is used to calculate the above three losses. The domain discriminator loss L_adv measures the verisimilitude of the reconstructed images/masks and the generated fake images/masks. The loss function is defined as Equation 1:

min_{G_enc, G_dec} max_D L_total = L_adv + λ_sc L_sc + λ_cyc L_cyc + λ_rec L_rec    (1)

where λ_rec, λ_cyc, and λ_sc are used to adjust the importance of each term.
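As a sketch of how Equation 1 could be assembled, the snippet below combines L1 (Mean Absolute Error) terms for reconstruction, cycle consistency, and semantic consistency with a least-squares adversarial term; the tensor names, the λ values, and the function signature are placeholders rather than the paper's settings.

```python
# Sketch of Equation 1: L1 (MAE) for the reconstruction, cycle-consistency and
# semantic-consistency terms, plus a least-squares adversarial term.
# Tensor names and lambda values are placeholders, not the paper's settings.
import torch
import torch.nn.functional as F

def generator_loss(x, x_rec, x_cyc, c_x, c_fake, d_fake_logits,
                   lam_rec=10.0, lam_cyc=10.0, lam_sc=1.0):
    l_rec = F.l1_loss(x_rec, x)     # same-domain reconstruction
    l_cyc = F.l1_loss(x_cyc, x)     # cycle consistency of cycled outputs
    l_sc  = F.l1_loss(c_fake, c_x)  # semantic consistency in the content space
    l_adv = F.mse_loss(d_fake_logits, torch.ones_like(d_fake_logits))  # LSGAN-style
    return l_adv + lam_sc * l_sc + lam_cyc * l_cyc + lam_rec * l_rec
```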
During training, we followed the settings in CycleGAN[23] and used LSGANs[24] to stabilize the training. We randomly cropped the original volumes to a size of 64 × 64 × 64 and trained the model with a batch size of 4. With this novel training strategy, the proposed AD-GAN model can readily align the disentangled content representations of the two domains in one latent space.
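The random 64 × 64 × 64 cropping can be pictured as in the sketch below, which draws four random sub-volumes from a larger volume to form one training batch; the (C, D, H, W) layout and the dummy volume are assumptions.

```python
# Sketch of the described sampling: random 64x64x64 sub-volumes drawn from a
# larger (C, D, H, W) volume and stacked into a batch of 4 training crops.
import torch

volume = torch.rand(1, 128, 128, 128)  # stand-in for a CLSM volume or synthetic mask

def random_crop_3d(vol, size=64):
    _, d, h, w = vol.shape
    z = torch.randint(0, d - size + 1, (1,)).item()
    y = torch.randint(0, h - size + 1, (1,)).item()
    x = torch.randint(0, w - size + 1, (1,)).item()
    return vol[:, z:z + size, y:y + size, x:x + size]

batch = torch.stack([random_crop_3d(volume) for _ in range(4)])  # batch size of 4
```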
3. Segmentation results and discussion

Several commercial software packages with numerous tutorials are available to segment/analyze nuclei in cell aggregates, spheroids, and organoids. The most well-known 3D nuclei segmentation tools are CellProfiler 3.0 and Squassh in ImageJ. Their specific image processing pipeline/steps/

