Journal of Clinical and Translational Research
AI and LLMs in iPSC cardiac research

Table 4. Benchmarking key LLM and AI tools across cardiovascular and regenerative contexts
| Name | Domain specificity | Primary input modality | Cardiovascular application | Key advantage |
| --- | --- | --- | --- | --- |
| AlphaFold | Protein structure prediction | Amino acid sequences | Accurate modeling of cardiac proteins (e.g., sarcomere variants TTN and MYH7) | High-resolution protein folding for CM variant interpretation |
| AlphaMissense | Variant pathogenicity prediction | Gene variants | Interpreting missense mutations, for example, in cardiomyopathy-related genes | Enables classification of VUS in cardiac genomics using ClinVar-linked benchmarking |
| BioBERT | Biomedical NLP | Scholarly biomedical text | Named entity recognition and relation extraction in cardiology studies | Domain-tuned language understanding for gene-disease mining |
| BioGPT | Biomedical LLM (text generation and mining) | Biomedical text | Gene-disease annotation, literature summarization | High precision and recall in domain-specific NLP tasks |
| BioMedLM | Biomedical LLM | Text corpora of medical publications | Competitive QA performance on medical exams (~57–69%), QA systems in medical informatics, preliminary cardiovascular insights | Strong domain-specific NLP for QA |
| Cardiogen AI | Cardiac genomics ML | Genomic variant profiles | Predicting disease phenotype severity in monogenic CVDs | Superior variant-to-outcome interpretation in cardiology |
| ChatGPT-4 | General-purpose LLM | Broad text corpora and multimodal inputs | Clinical guideline interpretation, literature synthesis, and preclinical planning | Broad fluency and multi-step reasoning capability |
| Chemputer | Automated synthesis | Chemical synthesis pipelines | In silico synthesis for cardiac-regenerative compound generation | Automated drug-generation workflows tied to target biology |
| ClinVar | Variant database | Raw clinical variant records | Reference database for variant pathogenicity annotation | Standard resource for variant interpretation benchmarking |
| DeepChem | Drug modeling library | In silico molecular prediction | Toxicity screening of compounds in cardiac assays | Efficient compound efficacy and toxicity modeling tools |
| DeepSeek-R1/Med | General-purpose LLM (China-owned) | Bilingual reasoning tasks | Potential use in the Chinese cardiovascular research context | Scalable, open-source model with strong multilingual NLP, rivaling GPT-4 |
| Ensembl Genome Browser | Genomic data platform | Genomic and transcriptomic query | Identifying regulatory regions and variants relevant to the iPSC-CM pipeline | Centralized gene/variant annotation hub |
| ESMFold | Structure-prediction and evolutionary LLM | Protein sequence | Efficient structure prediction aiding cardiac variant annotation | Fast and scalable folding predictions; alternative to AlphaFold |
| GEO | Public gene expression repository | Transcriptomic microarray and RNA-seq | Data source for cardiac gene expression variation and iPSC-CM training | Large-scale expression datasets for CM modeling |
| GROK | Open-source LLM (xAI) | Text reasoning | Emerging general reasoning tasks, limited iPSC-CM application yet | Early-stage reasoning capabilities in open models |
| HuggingFace Transformers | Model library and fine-tuning hub | NLP/ML frameworks | Used to fine-tune BioBERT/BioGPT/REALM for cardiac-specific tasks | Ecosystem support with model sharing and fine-tuning infrastructure |
| JAX | ML computation framework | Neural network training | Training LLMs or multimodal models for iPSC-CM omics interpretation | High-performance, accelerated neural architecture support |
| PyTorch | ML framework | Deep-learning neural modeling | Foundation of variant annotation and regression models in cardiac biology | Large community and ecosystem for model development |
| REALM | Retrieval-augmented LLM | Document retrieval and text modeling | EHR mining and real-time guideline retrieval in cardiology workflows | Efficient integration of large-text archives with LLM query mechanisms |
(Cont'd)

Volume 11 Issue 5 (2025) 16 doi: 10.36922/JCTR025230026
