
Artificial Intelligence in Health                                 ISM: A new multi-view space-learning model




Table 6. Computational time observed in the TEA-seq multi-omic single-cell data

Method          Time (min)
MVMDS           5.31
ISM             1.17
ILSM            1.31
GFA             23.14
MOFA+           19.38
MOWGLI (20%)    82.31
NMF             0.55

Note: The parallelization of separate factorizations was not activated for ILSM, hence the slightly higher computational time compared to ISM.
Abbreviations: GFA: Group factor analysis; ILSM: Integrated latent sources model; ISM: Integrated sources model; MOFA+: Multi-Omics Factor Analysis+; MOWGLI: Multi-Omics Wasserstein inteGrative anaLysIs; MVMDS: Multi-view multidimensional scaling; NMF: Non-negative matrix factorization.

However, the proportion of known categories retrieved and other metrics depend on the data being analyzed. For example, for the Reuters data, only three out of six categories are recognized at best using ISM or MVMDS, suggesting that latent-space-based methods may not be the most effective approaches with bag-of-words data.

In contrast to the other approaches studied, MVMDS and ISM are the only ones that perform relatively well on all the datasets analyzed, demonstrating their versatility. The main advantages of ISM over MVMDS are its speed and the increased sparsity of its latent-space representation. Regarding missing data, the ISM implementation uses an NTF package that can handle missing values, unlike MVMDS.

To the best of our knowledge, ISM is the first approach that uses NMF to transform heterogeneous views into a 3D array and then uses NTF to extract consistent information from the transformed views. However, apparent commonalities with anchor-based MVC methods (A-MVC) are worth mentioning to further illustrate the originality of ISM:
(i) In the first step, ISM relies on anchors, akin to A-MVC. ISM anchors correspond to zero-loading attributes in the latent spaces defined by the H_v, whereas A-MVC anchors are observations well distributed over existing clusters. Both act as intermediaries to derive either a latent space or cluster labels shared by all views.
(ii) In the second step, ISM applies NTF to the embedded views. A-MVC applies NTF to a tensor of anchor graphs, albeit with added constraints that ensure orthogonality and consistency of the cluster labels across all views.

A-MVC requires a specialized algorithm to select the anchor points that are best distributed across clusters. Since clusters must be sufficiently populated with A-MVC anchors for the method to work, the number of anchors must be set higher than the number of clusters. In contrast, ISM attribute anchors are found automatically through the process of parsimonization. This process requires setting a sparsity parameter that relaxes the reciprocal of the HHI, which may otherwise lead to excessive sparsity. In the examples considered in this article, this value is experiment-independent and set to 0.8. Further reducing the sparsity parameter risks a lack of overlap between the simplicial cones, potentially rendering the tensor decomposition ineffective. Therefore, until more experience is gained with ISM, we do not recommend changing this parameter.

Just as NMF and NTF factors are more interpretable and meaningful owing to the non-negativity of their loadings, ISM produces latent factors whose interpretation is greatly facilitated by the non-negativity and sparsity of the attribute loadings. This is illustrated by the example of the Signature 915 dataset. It is noteworthy that all non-negative approaches result in a high sparsity index of the view-mapping, in contrast to the mixed-sign approaches.

ISM has only three hyperparameters, which is very few compared to alternative methods: the sparsity coefficient, the embedding dimension, and the rank dimension. As mentioned, the sparsity coefficient should be kept at its default value of 0.8. Regarding the rank and embedding dimensions chosen for the ISM model, an objective and natural choice was the known number of classes in our examples, as we expect each factor to be distinctly assigned to a particular class. The only exception was the UCI digit dataset, where reducing the embedding dimension by one unit significantly reduced the error rate. However, this is only possible in a supervised setting where classes are known. More generally, as with all factorization methods, the factorization rank must be determined in advance.

This raises the issue of the subjectivity of the choice made, especially in an unsupervised setting where cross-validation cannot be used. For PCA, MVMDS, MOFA+, and GFA, setting the rank by inspecting the scree plot of the variance ratio is indeed a subjective choice, owing to the variety of possible criteria that can be used to identify an "elbow" in the scree plot. We tried a range of values around the "observed" elbow. The resulting changes in the close-neighborhood metric had no impact on the conclusions about the performance of ISM relative to the other approaches (Tables S1 and S2). Since GFA and MOFA+ include automatic rank detection (ARD), increasing the rank should not adversely affect performance, as
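The two-step ISM pipeline described above, per-view NMF embeddings stacked into a 3D array, followed by NTF, can be sketched in a few lines. This is a minimal illustration using a toy multiplicative-update NMF, not the released ISM implementation; the dimensions and the stacking convention are assumptions for the sake of the example.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0):
    """Minimal multiplicative-update NMF: X ~ W @ H with non-negative factors."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + 1e-6
    H = rng.random((k, m)) + 1e-6
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + 1e-9)  # update attribute loadings
        W *= (X @ H.T) / (W @ H @ H.T + 1e-9)  # update sample embedding
    return W, H

rng = np.random.default_rng(1)
# Three heterogeneous views sharing the same 50 samples (toy data).
views = [rng.random((50, d)) for d in (12, 20, 8)]
k = 4  # embedding dimension

# Step 1: NMF maps each view onto a common n_samples x k embedding.
embeddings = [nmf(X, k)[0] for X in views]

# Step 2: stack the embedded views into a 3D array (samples x k x views).
# ISM then applies NTF to this tensor to extract consistent factors
# (e.g., a non-negative CP decomposition).
tensor = np.stack(embeddings, axis=2)
print(tensor.shape)  # (50, 4, 3)
```

The point of the sketch is the data flow: heterogeneous view widths (12, 20, 8) are reconciled into a single non-negative tensor on which a standard NTF routine can operate.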

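The parsimonization step can be illustrated in the same spirit. The reciprocal of the Herfindahl-Hirschman index (HHI) of a normalized loading vector estimates its effective number of non-zero components; keeping only that many loadings can be excessively sparse, so a sparsity coefficient relaxes the target. The exact relaxation rule below (dividing 1/HHI by the coefficient) is an assumption of this sketch, not the published algorithm.

```python
import numpy as np

def effective_components(h):
    """Reciprocal of the HHI: effective number of non-zero loadings."""
    p = h / h.sum()            # normalize loadings to proportions
    return 1.0 / np.sum(p**2)  # 1/HHI

def parsimonize(h, sparsity=0.8):
    """Zero out all but the largest loadings.

    Hypothetical rule: keep ceil((1/HHI) / sparsity) components, so a
    smaller coefficient keeps more loadings (relaxes the sparsity).
    """
    k = int(np.ceil(effective_components(h) / sparsity))
    keep = np.argsort(h)[::-1][:k]  # indices of the k largest loadings
    out = np.zeros_like(h)
    out[keep] = h[keep]
    return out

h = np.array([0.70, 0.20, 0.05, 0.03, 0.02])
print(effective_components(h))  # ~1.87: mass concentrated on few loadings
print(parsimonize(h))           # keeps the 3 largest loadings, zeroes the rest
```

Pushing `sparsity` well above its default would shrink the kept support toward 1/HHI components per factor, which is the "excessive sparsity" regime the text warns against.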

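The subjectivity of scree-based rank selection comes precisely from the choice of elbow criterion. The sketch below uses one common heuristic (the sharpest drop in consecutive variance ratios); other criteria, such as thresholding cumulative variance, can return a different rank on the same spectrum.

```python
import numpy as np

def scree_elbow(eigvals):
    """One elbow heuristic among many: the rank just before the largest
    drop in consecutive explained-variance ratios."""
    ratios = eigvals / eigvals.sum()
    drops = ratios[:-1] - ratios[1:]
    return int(np.argmax(drops) + 1)  # rank = position of the sharpest drop

# Simulated spectrum: three strong components followed by a noise floor.
eigvals = np.array([5.0, 3.2, 2.8, 0.4, 0.35, 0.3, 0.25])
print(scree_elbow(eigvals))  # -> 3
```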
            Volume 1 Issue 3 (2024)                        108                               doi: 10.36922/aih.3427