
Tumor Discovery                                               Highly accurate gene panels for cancer screening



of genes required for a perfect panel depend on the size of the tumor sample set?

The results, summarized in Figure S2, revealed that in the smaller external dataset, a single gene identifies 98% of the tumor samples, and the addition of a second gene completes the panel, achieving maximal sensitivity and specificity without requiring TRIM27. In contrast, for the larger TCGA dataset, the first gene alone covers only 95% of tumors, and the two-gene panel still leaves 1% of samples unclassified. In that case, TRIM27 is necessary to achieve full classification. These observations suggest that rare tumor variants emerge only in larger datasets. Their low frequency means that they are often absent in smaller cohorts, where simpler panels may suffice.

For illustration, a hypothetical cohort of 5,000 tumor samples is also considered in Figure S2. In that scenario, the 3-gene panel covers 99.7% of tumors, indicating that a fourth gene would likely be needed to achieve complete coverage. The figure also shows that saturation is reached very quickly: the number of classified tumor samples increases steeply with the addition of genes to the panel. This strongly supports our assertion that a small number of genes can effectively capture the global state of the Gene Regulatory Network, consistent with the effective reduced dimensionality of the tumor manifold.⁵¹

In summary, the expression distribution functions used to define the panels depend on the sample set size. When the sample size reaches the order of hundreds, the distribution appears “saturated,” showing only minor changes when the number of samples is further increased.

This insight allowed us to evaluate how our panels would change with an increased number of normal samples. For instance, assuming that the distribution functions are saturated in BRCA (112 normal samples and 1,094 tumor samples), we performed re-sampling to assess the performance of the six-gene only-T-above panel found for BRCA (Supplementary File) under highly imbalanced situations, such as 20 normal samples and 500 tumor samples. The results, shown in Figure S3, indicate that while the panel size tends to decrease in the reduced sets, two genes from the original panel still classify more than 95% of samples in all cases.

Thus, we expect, for example, that the single-gene only-T-above panel found for uterine corpus endometrial carcinoma (23 normal samples) may change as the normal sample size grows, but the original gene will continue to cover at least 85% of the tumor samples.

It is worth noting that Figure S3 can also be interpreted as a form of validation of the six-gene only-T-above panel for BRCA across different experimental conditions.

4.4. Cancer diagnosis, tumor taxonomy, and gene therapy

Our construction of perfect gene panels follows a data-driven approach to gene expression profiles that does not require prior domain knowledge of the biological relevance of individual genes in a given tissue. These panels have an apparent value as candidate combinatorial biomarkers for diagnosis, which could be further enhanced by incorporating information about gene ontology and function into our data mining process.

In addition, the perfect T-gene panels could be leveraged in tumor taxonomy. Typically, tumor classification and the associated therapeutic decisions are made based on the most frequently mutated genes in a given tumor (for example, Ruiz-Cordero et al.,⁶¹ for lung cancer). However, the classification is often incomplete, with a subset of tumors assigned to the so-called “wild-type” category, meaning that none of the genes in the reference panel are mutated. In our framework, any perfect T-gene panel enables a complete classification of tumors by providing the list of dysregulated genes in each tumor sample. Moreover, since multiple perfect panels may exist for a given tissue, tumors could be fully classified under different but equally valid criteria.

Consider, for example, the only-T-above panel for LUAD, examined above. Both the ALDH10A1 and PYCR1 genes, which are related to glutamine metabolism, are known to play an important role in lung cancer.⁶²,⁶³ The taxonomy based on this panel indicates that around 98% of LUAD tumors rely on glutamine metabolism to foster cell proliferation and induce an immune-suppressive tumor microenvironment. In the remaining 2% of tumors, cell proliferation is regulated by TRIM27 through the SIX homeobox 3-β-catenin signaling pathway.⁶⁴ These statements reflect the known role of these genes and their dysregulation frequencies in the tumor subpopulation. Nevertheless, further research is needed to validate these findings and translate them into therapeutic recommendations.

Moreover, N- and T-genes included in the perfect panels may have important applications in gene therapy. Consider, for instance, a gene belonging to both N- and T-groups, such as the AGER gene in LUAD. This gene is silenced in tumors and strongly expressed in normal samples. What happens if, through a transfection vector, its expression were shifted from the N-region to the T-region or vice versa? Such an experiment has already been conducted on cell lines,⁶⁵ and the results indicate a significant change in the proliferation rate and invasion capacity of both tumor and normal cells. These astonishing results warrant further investigation.
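The rapid saturation of coverage discussed for Figure S2 can be illustrated with a toy greedy selection. This is only a minimal sketch and not the panel-construction procedure used in this work: the gene names (`GENE_A`, `GENE_B`, `GENE_C`), the sample identifiers, and the function `greedy_panel` are hypothetical, and each gene is assumed to come with a precomputed set of tumor samples that it flags as dysregulated.

```python
def greedy_panel(flags, n_tumors):
    """Greedily add genes until every tumor sample is flagged.

    flags    : dict mapping a (hypothetical) gene name to the set of
               tumor sample ids that the gene flags as dysregulated
    n_tumors : total number of tumor samples in the cohort
    Returns a list of (gene, cumulative coverage fraction) pairs.
    """
    covered = set()
    panel = []
    while len(covered) < n_tumors:
        # Pick the gene that flags the most still-uncovered samples;
        # iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(flags), key=lambda g: len(flags[g] - covered))
        gain = flags[best] - covered
        if not gain:
            break  # no gene adds coverage: a perfect panel is impossible
        covered |= gain
        panel.append((best, len(covered) / n_tumors))
    return panel


# Toy cohort of 100 tumors (invented numbers, for illustration only).
flags = {
    "GENE_A": set(range(0, 98)),    # flags 98 of 100 tumors
    "GENE_B": set(range(90, 100)),  # flags the rare remaining tumors
    "GENE_C": set(range(0, 50)),    # redundant with GENE_A
}
panel = greedy_panel(flags, n_tumors=100)
for gene, frac in panel:
    print(f"{gene}: cumulative coverage {frac:.1%}")
```

In this toy cohort the first gene already covers most tumors and a second gene closes the gap, which mirrors the qualitative pattern described in the text; in a larger cohort, rarer variants would require additional genes.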

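The re-sampling check described for Figure S3 could be sketched along the following lines. This is a hedged illustration on synthetic data, not the analysis actually performed: it assumes that a gene flags a tumor sample when its expression exceeds the highest value observed among the normal samples (one possible reading of "only-T-above"), and the Gaussian expression values, the seeds, and the function names `coverage` and `resample_coverage` are all hypothetical.

```python
import random


def coverage(normal, tumor):
    """Fraction of tumor values strictly above the highest normal value."""
    threshold = max(normal)
    return sum(v > threshold for v in tumor) / len(tumor)


def resample_coverage(normal, tumor, n_normal, n_tumor, n_rounds, seed=0):
    """Recompute single-gene coverage on random imbalanced subsamples."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_rounds):
        sub_normal = rng.sample(normal, n_normal)  # without replacement
        sub_tumor = rng.sample(tumor, n_tumor)
        results.append(coverage(sub_normal, sub_tumor))
    return results


# Synthetic expression values for one gene; cohort sizes mimic the
# BRCA example in the text (112 normal, 1094 tumor samples).
data_rng = random.Random(1)
normal = [data_rng.gauss(0.0, 1.0) for _ in range(112)]
tumor = [data_rng.gauss(4.0, 1.5) for _ in range(1094)]

full = coverage(normal, tumor)
sub = resample_coverage(normal, tumor, n_normal=20, n_tumor=500, n_rounds=50)
print(f"full cohort coverage: {full:.3f}")
print(f"subsampled coverage: min {min(sub):.3f}, max {max(sub):.3f}")
```

Comparing the spread of the subsampled coverages with the full-cohort value gives a rough sense of how stable a gene's contribution is when far fewer normal samples are available.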

            Volume 4 Issue 3 (2025)                         64                           doi: 10.36922/TD025190035