it can be automatically reduced if the ARD criteria are met. Notably, for both experiments, increasing the chosen rank decreased performance in terms of cluster association with known classes. This again illustrates the difficulty of choosing the "right" rank. However, non-negative factorization-based methods, including ISM, are not subject to orthogonality constraints and can, therefore, create a new dimension by, for example, splitting a given component into two parts to disentangle close mechanisms that are otherwise intertwined in that component.⁴⁰ For this reason, the rank can be set to the number of known classes in a more logical and objective way. Finding the correct rank is, therefore, less critical than with mixed-signed factorization approaches such as singular value decomposition (SVD), where low-variance components tend to represent the noisy part of the data. Nevertheless, multiple solutions have been proposed, among which the cophenetic correlation coefficient is widely used to estimate the rank that provides the most stable clustering derived from the NMF components.⁴¹ A similar criterion, named concordance, has been proposed,³¹ for which extensive simulations showed that NMF finds the most stable solutions around the correct rank, even if the latent factors are strongly correlated. While such an approach could be used with ISM to determine the best combination of the preliminary embedding and latent space dimensions, it would become too computationally intensive. However, since the embedding and latent spaces are later merged in the ISM workflow, the approach can still be applied when the model imposes the same dimension on both parameters. As demonstrated in the proof-of-concept analysis of our examples, the embedding dimension can be further optimized by examining the approximation error in the neighborhood of the chosen rank.
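For concreteness, a rank scan of this kind can be sketched as follows. This is a minimal illustration in the spirit of the cophenetic correlation criterion, not the ISM implementation: the data matrix, the number of restarts, and the candidate rank range are placeholders, and scikit-learn's NMF stands in for the factorization.

    # Minimal sketch of cophenetic-correlation rank selection for NMF.
    # Placeholder data and parameters; not the ISM implementation.
    import numpy as np
    from scipy.cluster.hierarchy import linkage, cophenet
    from scipy.spatial.distance import squareform
    from sklearn.decomposition import NMF

    def cophenetic_score(X, rank, n_runs=20, seed=0):
        """Average sample connectivity over random restarts, then measure
        how faithfully hierarchical clustering preserves the consensus."""
        n = X.shape[0]
        consensus = np.zeros((n, n))
        for run in range(n_runs):
            W = NMF(n_components=rank, init="random", max_iter=500,
                    random_state=seed + run).fit_transform(X)
            labels = W.argmax(axis=1)        # hard cluster of each sample
            consensus += labels[:, None] == labels[None, :]
        consensus /= n_runs
        dist = squareform(1.0 - consensus, checks=False)  # condensed form
        coph, _ = cophenet(linkage(dist, method="average"), dist)
        return coph

    X = np.abs(np.random.default_rng(0).normal(size=(100, 40)))  # placeholder
    for rank in range(2, 8):    # scan the neighborhood of a candidate rank
        print(rank, round(cophenetic_score(X, rank), 3))

The rank maximizing the cophenetic correlation would then be cross-checked against the approximation error, as described above.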

Redundancy in the latent factors is a known issue for NMF-based techniques, as identified and illustrated early on with Donoho's swimmer dataset, where a ghost torso appeared in all basis vectors representing body parts in different orientations.³² L1 regularization techniques, such as those based on Hoyer's sparsity index,⁴²,⁴³ or appropriate initialization, such as non-negative SVD (NNSVD),⁴⁴ can help mitigate these problems. Notably, in our ISM workflow implementation, the HHI used in the embedding step is mathematically equivalent to Hoyer's sparsity index, and NNSVD is used for NMF and NTF initialization.
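This equivalence can be made explicit: for a non-negative vector x of length n, the HHI of the shares equals (‖x‖₂/‖x‖₁)², so Hoyer's index, (√n − ‖x‖₁/‖x‖₂)/(√n − 1), is a monotone transform of the HHI. The standalone check below (our own illustration, not the ISM code) verifies this relation numerically.

    # Standalone numerical check that Hoyer's sparsity index is a monotone
    # transform of the HHI for a non-negative vector; not the ISM code.
    import numpy as np

    x = np.abs(np.random.default_rng(1).normal(size=50))
    n = x.size
    hhi = np.sum((x / x.sum()) ** 2)               # HHI of the shares
    hoyer = (np.sqrt(n) - np.linalg.norm(x, 1) / np.linalg.norm(x, 2)) \
            / (np.sqrt(n) - 1)
    assert np.isclose(hoyer, (np.sqrt(n) - 1 / np.sqrt(hhi)) / (np.sqrt(n) - 1))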
ISM's intrinsic view loadings also enable the automatic weighting of views within each latent factor. This allows the simultaneous analysis of views of very different sizes without the need for prior normalization to give each view the same importance, as is necessary with methods such as consensus PCA. However, this property reaches its limits when view sizes are extremely unbalanced, as seen in the prokaryotic dataset. In such cases, it is recommended to use ILSM: because ISM is then applied to transformed views of equal size, equal weight is given to the original views with the smallest size, whereas a global factorization tends to ignore them at initialization. In addition, ILSM requires significantly less computational time because the view factorizations can be parallelized.

Recently, graph transformers and deep learning approaches have been proposed for the inference of biological single-cell networks.⁴⁵ The preliminary NMF in Unit 1 of Workflow 1, which combines the data before the application of NTF, is somewhat reminiscent of the "attention" mechanism used in transformers before the application of a lightweight neural network.⁴⁶ This could explain why ISM can outperform NTF when applied to a multidimensional array, even if the data structure is suitable for the direct application of NTF, as shown by the clustering of marker genes achieved in the Signature 915 dataset example. It also explains why, in the first two examples, although NMF is close to ISM in terms of purity index and other metrics, ISM outperforms NMF in the number of classes detected and, in the second example, generates a better positioning of the detected cell types on the 2D map projection. Likewise, in the multi-omic single-cell TEA-seq dataset, only ISM identifies a naïve cell subtype and places it next to the most biologically relevant one.
Like other latent space methods, ISM is not limited to MVC. The ISM components and the view-mapping matrix can be used for data reduction on newly collected data (i.e., data that was not part of the data used to train the model) by fixing these components in the ISM model. Data reduction for newly collected data remains feasible even if some of the views contained in the training data are missing, as the ISM parameters are compartmentalized by view.
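As a sketch of such a projection, assuming the learned components for one view are available as a non-negative matrix of shape (rank, n_features), the loadings of each new sample can be obtained by non-negative least squares against the fixed components. The function name and shapes below are illustrative, not the actual ISM API.

    # Sketch of data reduction on newly collected data with fixed
    # components. Illustrative names and shapes; not the actual ISM API.
    import numpy as np
    from scipy.optimize import nnls

    def project_new_samples(X_new, components):
        """X_new: (n_samples, n_features); components: (rank, n_features).
        Returns non-negative loadings of shape (n_samples, rank)."""
        return np.array([nnls(components.T, x)[0] for x in X_new])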
ISM is not limited to views with non-negative data: each mixed-signed view can be split into its positive part and the absolute value of its negative part, resulting in two non-negative views, as illustrated in the UCI Digits and prokaryotic data examples.
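A minimal sketch of this split (the function name is ours):

    # Split a mixed-signed view into two non-negative views: the positive
    # part and the absolute value of the negative part.
    import numpy as np

    def split_mixed_view(X):
        return np.maximum(X, 0.0), np.maximum(-X, 0.0)

The two resulting non-negative views are then processed like any other views.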


An important limitation of ISM and other multi-view latent space approaches is the requirement that multi-view data be available for all observations in the training set. For financial or logistical reasons, a particular view may be missing in a subset of the observations, and this subset may vary depending on the view under consideration. We are currently developing a variant of ISM that can process multi-view data with missing views.