
Journal of Clinical and Translational Research                    AI and LLMs in iPSC cardiac research



                  machine learning to enable the automated design and synthesis of complex molecules. It allows chemists to program and control the synthesis of specific compounds, improving efficiency. The ultimate goal of Chemputer is to accelerate the discovery and development of new compounds for applications in drug discovery, materials science, and beyond 61
            (ix)   ClinVar: A public variant database owned by the National Institutes of Health (NIH) that archives reports of the relationships between human genetic variants and their clinical significance. It collects submissions from research labs, clinics, and researchers, helping to classify whether specific variants are benign, pathogenic, or of uncertain significance 62
            (x)   DeepChem: DL framework for in silico drug modeling and structure-activity prediction; integrated into iPSC-CM cardiotoxicity screening pipelines using LLM-generated compound profiling 63,64
            (xi)  DeepSeek-Med: Chinese biomedical LLM initiative; mentioned as a potential future collaborator in international AI consortia for regenerative platforms and clinical translation 19,58
            (xii)   Ensembl Genome Browser: A genomic data platform that provides integrated, annotated reference genomes for a wide range of species, including humans. It enables researchers to explore genes, variants, regulatory regions, and comparative genomic data. In cardiac research and clinical translation, Ensembl plays a crucial role in identifying genetic mutations and regulatory elements linked to heart diseases 65,66
            (xiii)  ESMFold: DL model developed by Meta AI that predicts protein 3D structures directly from amino acid sequences, similar to AlphaFold but optimized for speed and scalability. It applies LLM principles, with training on millions of protein sequences, to capture protein folding patterns without relying on multiple sequence alignments. In cardiac research and clinical translation, ESMFold can help predict how mutations in cardiac-related proteins, such as ion channels or sarcomeric proteins, alter their structure and function. This is crucial for understanding diseases such as cardiomyopathies or channelopathies 69
            (xiv)  GEO: NIH-owned repository of high-throughput gene expression and sequencing data, such as RNA-seq and microarray results. Researchers submit datasets from various tissues, cell types, and experimental conditions, including cardiac cells and disease models. GEO enables scientists to explore gene regulation in heart disease, development, or drug response. It is used for discovering biomarkers, understanding disease mechanisms, and validating experimental findings in iPSC-derived cardiomyocytes 69
            (xv)   GROK: LLM developed by xAI; referenced in the context of AI landscape expansion but not yet applied in iPSC-CM pipelines 19
            (xvi)  HuggingFace Transformers: LLM library hub used for implementing transformer-based models, including BioBERT and REALM; provides the backbone for LLM fine-tuning in multi-modal omics data pipelines 43
            (xvii)  JAX: High-performance ML framework used in cardiac LLMs for omics modeling and optimization tasks, including protocol efficiency simulations in iPSC-CM studies 41,42
            (xviii)  PyTorch: DL library used to build and train LLMs for cardiac modeling, including time-series prediction and transformer network construction 40,70,71
            (xix)  REALM: Retrieval-augmented language model combining LLMs with document retrieval; used in EHR mining and real-time diagnostic applications, including arrhythmia detection 72
            (xx)   RoseTTAFold (Baker Lab): A DL tool that predicts protein structures from amino acid sequences with high accuracy. It utilizes a three-track neural network to integrate sequence, distance, and coordinate data, enabling rapid modeling of protein structures. In cardiac research, RoseTTAFold aids in understanding the structural implications of genetic mutations associated with heart diseases. By predicting how specific mutations affect protein folding and function, researchers can identify potential targets for therapeutic intervention. This is particularly valuable when experimental structures are unavailable, allowing for exploration of disease mechanisms at the molecular level 56,73-75
            (xxi)  scFoundation: LLM foundation model for single-cell data integration; supports high-resolution subtype prediction and cardiac developmental mapping in iPSC-CM pipelines 47
            (xxii)  scGPT: Generative LLM tailored for single-cell omics; used in predicting cell fate trajectories, cardiac subtype classification, and transcriptomic modeling 46
            (xxiii)  TensorFlow: DL library used for implementing models, including convolutional neural networks and recurrent neural networks, for cardiac imaging, time-series EHR data, or iPSC-CM signal traces 70,71,76
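The idea behind entry (xix), pairing a language model with document retrieval, can be illustrated with a minimal sketch of the retrieval step alone. This is not REALM's actual method: REALM learns a dense neural retriever, whereas the sketch below substitutes a simple bag-of-words cosine similarity as a stand-in. The function names (`bag_of_words`, `cosine`, `retrieve`, `build_prompt`) and the three-sentence toy corpus `DOCS` are hypothetical and exist only for illustration; no model is called.

```python
import math
from collections import Counter

# Hypothetical toy corpus (illustrative sentences, not real clinical data).
DOCS = [
    "Irregular RR intervals on ECG suggest atrial fibrillation.",
    "Sarcomeric protein mutations are linked to hypertrophic cardiomyopathy.",
    "RNA-seq datasets describe gene expression in iPSC-derived cardiomyocytes.",
]


def bag_of_words(text):
    """Lowercase, split on whitespace, strip basic punctuation; return term counts."""
    tokens = (t.strip(".,;:()?!").lower() for t in text.split())
    return Counter(t for t in tokens if t)


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query.

    Assumption: bag-of-words cosine stands in for a learned dense retriever.
    """
    q = bag_of_words(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prepend the retrieved passages to the question, as a retrieval-augmented
    pipeline would do before handing the text to a language model."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("Which ECG finding suggests atrial fibrillation?", DOCS))
```

In a real pipeline the string returned by `build_prompt` would be passed to a language model; the sketch stops at prompt construction so that it stays self-contained.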


            Volume 11 Issue 5 (2025)                        10                         doi: 10.36922/JCTR025230026