Artificial Intelligence in Health Machine learning in arthroplasty

to outcomes and complications specific to TKA, further reinforcing cost-saving principles by identifying them before scheduling surgery. This use of ML may potentially be augmented if researchers can meta-analyze cases and employ new ML models to identify the best alternative treatments or the best perioperative management for these patients. Future directions may include the automation of recommendations based on a series of radiographic images. It is feasible for ML to expand its capabilities to examine pre-operative plain films and compute recommendations for implant models and positioning.

At present, ML in TKA has limited ability for patient counseling, mostly due to a lack of consumer trust. As described above for THA, a lack of education and transparency surrounding the technology may contribute to the hesitation seen by patients and surgeons alike. Many of the studies included in this review fail to adequately explain or reproduce the outcomes or predictions generated by their developed models. For example, Mehta et al.⁴⁸ adequately described how their model for predicting OA versus RA was built but failed to provide details of the steps the model took in histopathology identification. Other studies included in this review trained their models on few or small data sets, limiting their generalizability. The novelty of ML in healthcare is irrefutable, as there is not enough randomized trial-backed data to fully support its use on human subjects on a large scale.

These concerns have expanded beyond physicians and researchers, making their way to the desks of legislators in the United States. The executive branch, under former U.S. President Biden, has prioritized the development and safe use of AI in healthcare. Beginning with an executive order that calls for safeguards, transparency, and replicability, the Department of Health and Human Services has released a new AI transparency rule aimed at addressing concerns by aligning the application of the technology with the “FAVES” principles: fair, appropriate, valid, effective, and safe. Standardization of AI practices from a legislative level will be key in addressing the transparency challenge that prevents the technology from reaching a broader application. Many of the studies discussed in this article pre-date these new standards; thus, future studies will likely have to adapt their research models to meet the regulations of new certification programs, reporting metrics, and information-sharing policies.⁶²

Collaborative efforts between industry, academia, and regulatory bodies are essential to implement fair legislation and standards. By working together, stakeholders can develop standardized approaches for validating AI algorithms, promote the adoption of open-source models, and enhance the explainability of AI systems.

Yet, the preliminary evidence, largely upheld by retrospective chart reviews used to develop models, offers promise for its application in TKA and may signal the next wave of research focusing on prospective models and eventually randomized controlled trials. For the present and the near future, AI in TKA is likely best utilized as a supplementary tool to surgeons and radiologists as efforts continue to bolster the basis for future research to prove its value in becoming accepted as a standard in patient care.

4. Shoulder arthroplasty

4.1. Radiographic imaging

4.1.1. Implant identification

Pre-surgical implant identification is a key aspect of shoulder arthroplasty revision operations, as it allows surgeons to anticipate needed hardware kits and implants. However, 10% of shoulder implants are not identified preoperatively.⁶³ Previously cited challenges to improving implant identification include incomplete medical records, the evolution of implants, and a surgeon’s familiarity with implant selection.⁶⁴,⁶⁵

To address this gap in care, Kunze et al.⁶⁶ employed a DL model to evaluate implants from specific manufacturers and specific models. Their model displayed an accuracy of 97.1% and an AUROC between 0.99 and 1.00 when identifying implants from eight different manufacturers with an average time of 0.079 seconds (±0.002 s).⁶⁶ Furthermore, when identifying specific implant models, their DL model displayed an accuracy of 99.4% with an AUROC between 0.99 and 1.00, with a similar average identification time of 0.079 seconds (±0.002 s).⁶⁶ Urban et al.⁶⁴ also used various DL models to predict implants for total shoulder arthroplasty (TSA) and compared their outcomes with and without pre-training using images from ImageNet. Their results demonstrated that CNN models with pre-training had TSA implant accuracies between 74% and 80%, while their equivalent models without pre-training displayed 51–56% accuracy.⁶⁴ Furthermore, Sultan et al.⁶⁵ developed a dense residual ensemble network (DRE-Net) by combining two CNN models, yielding a model with an accuracy of 85.92% and precision of 85.33% that outperformed previously established models for shoulder implants.⁶⁵

Yi et al.⁶³ used DCNNs in a different approach to detect the presence of shoulder implants, classify them as reverse or anatomic, and differentiate between five models. When detecting the presence or absence of shoulder implants, their models achieved a perfect AUROC of 1.0 with specificities and sensitivities of 100%.⁶³ When distinguishing between reverse and anatomic shoulder implants, an AUROC of 0.97

Volume 2 Issue 2 (2025) 19 doi: 10.36922/aih.3278
