
INNOSC Theranostics and Pharmacological Sciences
PI3K-α inhibitors for cancer immunotherapy



considering the inherent complexity and imperfections in data preparation operations.³⁷,³⁸ It serves as a basis for valid data analysis.³⁷ Preprocessing includes various techniques such as cleaning, integration, transformation, imputation of missing values, and reduction.³³,³⁸

In this study, lists of PI3K-α inhibitory molecules were obtained from the binding databank, resulting in a dataset comprising 3994 inhibitory molecules in 3D geometry. The dataset also included columns containing the IC₅₀ values of the molecules, molecule IDs, ligand names, ROMol object information of the ligands, etc. The IC₅₀ column contained affinity information, indicating the potency of each molecule against the PI3K-α target. The dataset was in SDF (structure-data file) format, and data preprocessing was performed using the Python programming language.

The IC₅₀ values of the compounds, expressed in nanomolar (nM) units and ranging from 0.07 to 7200 nM, helped capture a broader chemical space, enhancing the identification of novel ligands. The IC₅₀ column was also used as a reference column: duplicate rows sharing the same IC₅₀ value were dropped, and the code kept the first entry of each set of duplicates. This rested on the assumption that two or more ligands with the same IC₅₀ value exhibit similar potency or affinity, pharmacological effects, and functional activities toward the target protein or receptor. Dropping duplicate IC₅₀ entries gave a normal distribution of values, which made the dataset more amenable to statistical analysis. Moreover, docking ligands with similar half-maximal inhibitory concentrations may not provide significant additional insight. Later, the IC₅₀ values were converted to pIC₅₀ values to enable dataset standardization and consistency.

IC₅₀ values reported in multiple units can complicate the analysis of results across different concentrations. Hence, it was necessary to convert the IC₅₀ values in the dataset to pIC₅₀ values. Presenting the data as pIC₅₀ values, which express each IC₅₀ as the negative base-10 logarithm of its molar concentration (pIC₅₀ = −log₁₀ IC₅₀, with IC₅₀ in mol/L), is considered a better approach. This method enhances data clarity, minimizes potential errors in data representation, and improves reproducibility, with standardization, linearity, normal distribution, and precision as additional attributes. Relevant columns were selected and preserved for further analysis. The data preprocessing stage functions as a preliminary filtering technique to minimize the compound selection size before executing virtual screening campaigns.

2.3. Protein complex refinement

The PI3K-α protein structure was obtained from the RCSB Protein Data Bank (rcsb.org). The structure of the human PI3K-α protein, deposited under the PDB ID 6PYS, is a protein complex composed of a ligand and several water molecules. The structure of 6PYS, obtained through X-ray diffraction, has a resolution of 2.19 Å, with R-free, R-work, and R-observed values of 0.259, 0.2243, and 0.225, respectively. The composition of 6PYS includes a total structural weight of 110.61 kDa, an atom count of 7558, a modeled residue count of 890, a deposited residue count of 945, and one unique protein chain, A. Furthermore, no mutations were associated with the 6PYS polymer sequence that was engineered from the reference sequence.

The protein preparation involved isolating the ligand from the 6PYS protein-ligand complex, followed by protein content modification using the protein preparation and refinement wizard embedded in Schrödinger Maestro (Schrödinger Release 2020-3: Maestro, Schrödinger, LLC, United States, 2023). The Maestro software is an intuitive molecular modeling environment for various scientific discoveries based on materials science, as well as an integrated predictive computational modeling and machine-learning platform for small-molecule drug development. During refinement, simulation settings were configured for a pH of 7.0, and hetero groups (HETs) were used to detect ligands, metals, and ions. In addition, the refinement process incorporated various measures: the assignment of bond orders; the use of a chemical component dictionary (CCD) database to help identify and characterize the ligand present in the protein structure in connection with its binding modes and their potential functional or therapeutic roles; the inclusion of missing hydrogens in the protein; the addition of terminal oxygens to the protein; the conversion of selenomethionines to methionines; the filling of missing loops; the capping of termini; the deletion of water molecules beyond 0 Å from HETs; and the generation of HET states at pH 7.0 ± 2.0. The Kabat antibody annotation scheme was employed to facilitate the design and analysis of antibody-based therapeutics by comparing the protein sequences and structures of antibodies. Furthermore, to mimic the natural environment of the protein and prevent unwanted interactions or structural distortions that may arise from exposed termini, the termini of the protein were capped with small peptide fragments.

Hydrogen bond assignment was carried out in the refinement stage so that hydrogen bonds were assigned the correct geometry. The hydrogen bond assignment scheme was optimized using PROPKA, an empirical pKₐ prediction program in Maestro that facilitated a quantitative analysis of the pKₐ values of the protein's ionizable groups. More specifically, PROPKA was utilized
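Illustrative sketch for the data preprocessing steps (duplicate removal and pIC₅₀ conversion). The snippet below is a minimal, standard-library Python example of the two operations described in the text: dropping rows that share an IC₅₀ value while keeping the first entry, and converting IC₅₀ values in nM to pIC₅₀. It is not the code used in the study. The function names, the record layout, and the sample molecule IDs are hypothetical, and only the 0.07 nM and 7200 nM endpoints are taken from the reported IC₅₀ range.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in nM.

    Since 1 nM = 1e-9 mol/L, this equals 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def drop_duplicate_ic50(records, key="ic50_nm"):
    """Keep only the first record for each distinct IC50 value.

    Input order is preserved, mirroring a 'keep first' duplicate policy.
    """
    seen = set()
    kept = []
    for record in records:
        value = record[key]
        if value not in seen:
            seen.add(value)
            kept.append(record)
    return kept


# Hypothetical records; the IDs and the 10 nM entries are made up.
ligands = [
    {"molecule_id": "mol-1", "ic50_nm": 0.07},
    {"molecule_id": "mol-2", "ic50_nm": 10.0},
    {"molecule_id": "mol-3", "ic50_nm": 10.0},  # same IC50 as mol-2, dropped
    {"molecule_id": "mol-4", "ic50_nm": 7200.0},
]

unique_ligands = drop_duplicate_ic50(ligands)
for record in unique_ligands:
    record["pic50"] = ic50_nm_to_pic50(record["ic50_nm"])
    print(record["molecule_id"], round(record["pic50"], 2))
```

On this scale, the reported IC₅₀ range of 0.07 to 7200 nM corresponds to pIC₅₀ values of roughly 10.15 down to 5.14, so more potent compounds receive larger values. For an SDF-derived table, the same two steps would typically be applied to a data frame column rather than to a list of dictionaries.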


            Volume 7 Issue 2 (2024)                         5                                doi: 10.36922/itps.2340