Artificial Intelligence in Health    Efficient knowledge distillation for breast US

Table 5. Results of SOTA methods

Article              DSC (%)   No. of parameters (millions)
Liang et al.52       84        20.5
Gao et al.63         85        2.34
Lou et al.69         90        26.63
Lee et al.55         89        7.7
Ours (L_KLD_WAug)    80        0.82

Abbreviation: DSC: Dice similarity score.

the results reported in their respective papers. Table 5 exclusively showcases our best model alongside the top-performing SOTA models that have reported the number of trainable parameters in their corresponding papers.

Our proposed best model demonstrates performance comparable to that of SOTA models despite having significantly fewer trainable parameters. This observation highlights the efficiency of our model architecture in achieving competitive results while keeping the number of parameters minimal. By leveraging innovative design in distilling knowledge, our model strikes a balance between computational complexity and performance, making it well suited for resource-constrained environments or applications where model size is a critical consideration.

6. Discussion

In this study, we investigated various aspects of KD techniques and their implications for enhancing student performance. Through an extensive analysis of KD pathways, loss functions, and the impact of augmentation, we gained valuable insights into the mechanisms underlying knowledge transfer from teacher to student networks. Our findings revealed that the proposed KD paths consistently achieved performance closely aligned with that of the teacher model, indicating effective knowledge transfer. Additionally, the comparative analysis between MSE and KLD loss functions showed comparable efficacy in facilitating knowledge transfer across different KD pathways. Furthermore, exploring the impact of different augmentations on the teacher model underscored the fundamental role of teacher guidance in improving student performance, despite the negligible effect of augmentation on the teacher model itself.

Finally, our comparison with SOTA models showcased the efficiency of our proposed model architecture. Despite having significantly fewer trainable parameters, our best model demonstrated performance comparable to that of SOTA models, highlighting the effectiveness of our approach in achieving competitive results while minimizing model complexity. Therefore, by leveraging the rich knowledge encapsulated within the teacher network, students can effectively learn from the expertise encoded in the teacher's parameters, leading to significant performance gains. Such endeavors hold promise for advancing the state of the art in model compression and facilitating the deployment of efficient deep learning solutions across various domains and applications.

Even though our study provides valuable insights, it would be advantageous to explore various student models with differing numbers of trainable parameters to assess the trend of their performance relative to parameter count. This investigation would offer a deeper understanding of the scalability and efficiency of the proposed KD-based framework. Furthermore, expanding our research to encompass additional publicly available US datasets with diverse applications would improve the generalizability and robustness of the proposed framework.

7. Conclusion

This study demonstrates how KD can improve the performance of lightweight student models for US breast tumor segmentation. Through a systematic analysis of KD routes, loss functions, and augmentation effects, the proposed framework achieves competitive performance with substantially fewer parameters, showing promise for resource-constrained applications such as POCUS. The results highlight how crucial it is to choose the best teacher representations and to use teacher guidance to promote efficient knowledge transfer. Future research examining scalability with different student model sizes and incorporating additional US datasets would further validate the generalizability and resilience of this method, opening the door to more effective and broadly applicable deep learning solutions in medical imaging.

Acknowledgement

None.

Funding

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC RGPIN-2020-04612).

Conflict of interest

The authors declare that they have no competing interests.

Author contributions

Conceptualization: All authors
Formal analysis: All authors
Investigation: Bahareh Behboodi
Methodology: All authors
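As a supplementary illustration of the two distillation objectives compared in the Discussion, the sketch below contrasts a KLD loss over temperature-softened class probabilities with an MSE loss applied directly to logits. The temperature value and the T-squared scaling follow common KD practice and are illustrative assumptions, not necessarily the exact configuration of the proposed framework; for segmentation, such losses would be applied per pixel over the class logits.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kld_distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    The temperature=2.0 default and the T**2 scaling are conventional
    choices in KD, assumed here for illustration.
    """
    p = softmax(teacher_logits, temperature)  # teacher: target distribution
    q = softmax(student_logits, temperature)  # student: predicted distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

def mse_distillation_loss(teacher_logits, student_logits):
    """Mean squared error computed directly between raw logits."""
    n = len(teacher_logits)
    return sum((t - s) ** 2 for t, s in zip(teacher_logits, student_logits)) / n
```

Both objectives vanish when student and teacher logits agree; the KLD variant matches the teacher's softened output distribution (emphasizing relative class probabilities), whereas MSE penalizes absolute logit differences.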
Volume 2 Issue 2 (2025) 82 doi: 10.36922/aih.3509

