
Artificial Intelligence in Health                                 ISM: A new multi-view space-learning model



the simplicial cone Γ_Wv contained in the positive orthant of ℝ^n and generated by the columns of W_v. In this simplicial cone, each view-attribute corresponds to a point, with coordinates found in the corresponding row of H_v. To identify a consensus simplicial cone between the Γ_Wv, NTF decomposes the tensor formed by the W_v into a sum of rank-1 tensors (Unit 4). However, for such a decomposition to be meaningful, the dimensions defined by the columns of the W_v must be consistent from one view to another. This implies a strong overlap between the simplicial cones Γ_Wv. Such consistency is achieved by the multiple zeros found across the columns of H_v when starting the embedding process (Unit 3). These are “inactive” attributes, as their zero status cannot be changed by multiplicative update rules. They can be interpreted as anchors ensuring that the W_v do not deviate significantly from their common ancestor W, estimated in the preliminary NMF over concatenated views (Unit 1). The parsimonization process (Unit 2) is designed to ensure that there will be a sufficient number of anchor attributes to rein in the multiplicative updates.

(b) Workflow 2: Projection of new observations

For new observations Y comprising k views, k ≤ m, the ISM parameters H*, Q*, and the view-mapping matrix H can be used to project Y onto the latent ISM components, as described in Workflow 2.

Workflow 2. Projection of new observations
Input: New observations Y (k views, k ≤ m), NTF factors H*, Q*, and mapping matrix H.
Output: Estimation of Y*.
1: Disregard any views in Q*, H that are absent in Y;
2: Apply Unit 3 of Workflow 1 to embed Y, with W initialized with ones and with fixed mapping matrix H;
3: Apply step 2 of Unit 4 of Workflow 1 to calculate W* with fixed NTF factors H*, Q*, and define the projection of Y onto the latent space as Y* = W*.
Abbreviation: NTF: Non-negative tensor factorization.

(c) Workflow 3: Proof-of-concept analysis

Each dataset is analyzed using ISM, ILSM, NMF, MVMDS, GFA, MOFA+, and MOWGLI. PCA is also applied to the concatenated views of the UCI Digits and Signature 915 datasets, mainly to show the added value of alternative approaches over this widely used method.

To facilitate interpretation, the transformed data are projected onto a 2D map before being subjected to K-means clustering, where k is the known number of classes (K-means clustering was chosen for its versatility and simplicity, as it only requires the number of clusters to be found, and this number is known for our example datasets). Within each cluster, the class that contains the majority of the points, that is, the main class, is identified. If two clusters share the same main class, they are merged, unless they are not contiguous (the ratio of the distance between the centroids to the intra-cluster distance between points exceeds 1); in this case, the non-contiguous clusters are excluded, because they are assigned to the same class, which should appear homogeneous in the representation. Similarly, any cluster that does not contain an absolute majority is not considered clearly representative of the class to which it is assigned and is excluded from the study. A global purity index is then calculated for the remaining clusters using Workflow 3. To enhance clarity, the clusters are visualized using 95% confidence ellipses, while the classes are represented using distinct colors. In addition to the proportion of classes retrieved and the global purity index, the adjusted Rand index (ARI) [34], normalized mutual information (NMI) index [35], and Fowlkes-Mallows score (FMS) [36] are also included, along with the factor specificity index (FSI) and view-mapping sparsity index (VSI), defined as follows:

The FSI reflects the level of factor specificity with respect to a given class: a value close to 1 means that only one factor contributes significantly to the explanation of the class, while a value close to 0 means that the class is explained by a large number of factors. This index was proposed in Huizing et al. [21], but in its original definition, it measures the level of specificity of each factor relative to the class. The FSI is defined as the ratio of the maximum specificity observed across all factors over the number of significant factors. To estimate the number of significant factors, we use the inverse Herfindahl-Hirschman index (HHI) of all factor indices.

The VSI reflects the level of sparsity of the mapping matrix H. To obtain the VSI: (i) estimate, for each view and each ISM component, the number of significant loadings, using the inverse HHI; (ii) for each view, define the view-sparsity as the average sparsity over all ISM components; and (iii) define the VSI as the average view-sparsity over all views.

Multidimensional scaling (MDS) is applied to achieve the 2D map projection. MDS uses a simple metric objective to find a low-dimensional embedding that accurately represents the distances between points in the latent space [37]. MDS is, therefore, agnostic to the intrinsic clustering performance of the methods that we want to evaluate. Effective embedding methods, for example, uniform manifold approximation and projection (UMAP) or t-distributed stochastic neighbor embedding, are not as optimal for preserving the global geometric structure in the latent space [38]. For example, a resolution parameter needs to
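The core of Workflow 2 is an embedding with the mapping matrix held fixed: only W is updated, starting from a matrix of ones. A minimal sketch of that idea, using the standard NMF multiplicative update for W with H frozen (this is a generic illustration, not the paper's exact Unit 3; the function name and parameters are hypothetical):

```python
import numpy as np

def project_fixed_h(Y, H, n_iter=500, eps=1e-10):
    """Embed new observations Y (samples x attributes) onto latent
    components, keeping the loading matrix H (components x attributes)
    fixed. Only W is updated, via the standard multiplicative rule,
    so that Y ~ W @ H with W >= 0."""
    n, k = Y.shape[0], H.shape[0]
    W = np.ones((n, k))  # Workflow 2 initializes W with ones
    for _ in range(n_iter):
        # Multiplicative update for W; eps guards against division by zero
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
    return W
```

With H fixed, each row of W solves an independent non-negative least-squares problem, which is why the projection of new observations is cheap compared to refitting the full model.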
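The main-class and absolute-majority filtering applied to the K-means clusters can be sketched as follows. This is a simplified, hypothetical helper (the centroid-contiguity test and the exact purity formula of Workflow 3 are not reproduced here; the global purity is computed only over retained clusters):

```python
import numpy as np

def cluster_report(labels, classes):
    """For each cluster, find its main class (the class holding the
    majority of its points). Keep only clusters where the main class
    has an absolute majority, and compute a global purity over the
    retained clusters: fraction of their points in the main class."""
    labels, classes = np.asarray(labels), np.asarray(classes)
    main_class = {}
    kept_correct = kept_total = 0
    for c in np.unique(labels):
        members = classes[labels == c]
        vals, counts = np.unique(members, return_counts=True)
        top = counts.max()
        main_class[int(c)] = vals[counts.argmax()]
        if top > len(members) / 2:  # absolute majority required
            kept_correct += top
            kept_total += len(members)
    purity = kept_correct / kept_total if kept_total else 0.0
    return main_class, purity
```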
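The inverse HHI used by both indices is the reciprocal of the sum of squared normalized shares, i.e., an "effective number" of significant entries. One plausible reading of the FSI and VSI definitions above can be sketched as follows; the normalization choices (shares summing to 1, per-component sparsity as 1 minus the effective fraction of active loadings) are assumptions, not the paper's exact formulas:

```python
import numpy as np

def inverse_hhi(x, eps=1e-12):
    """Effective number of significant entries in a non-negative
    vector: 1 / sum(p_i^2), where p_i are the normalized shares."""
    p = np.asarray(x, dtype=float)
    p = p / (p.sum() + eps)
    return 1.0 / (np.sum(p ** 2) + eps)

def fsi(specificities):
    """Factor specificity index for one class: maximum specificity
    across factors divided by the effective number of significant
    factors. Close to 1 when a single factor dominates."""
    s = np.asarray(specificities, dtype=float)
    return s.max() / inverse_hhi(s)

def vsi(H_list):
    """View-mapping sparsity: per component, sparsity is 1 minus the
    effective fraction of active loadings; average over components
    within each view, then over views."""
    view_scores = []
    for H in H_list:
        n_attr = H.shape[1]
        comp = [1.0 - inverse_hhi(row) / n_attr for row in H]
        view_scores.append(np.mean(comp))
    return float(np.mean(view_scores))
```

Under this reading, a one-hot specificity vector yields FSI = 1, a uniform vector yields a value near 0, and a mapping matrix with one active loading per component maximizes the VSI.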
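The MDS projection, K-means step, and external metrics (ARI, NMI, FMS) described above are all available in scikit-learn; a minimal sketch of such an evaluation loop, on a placeholder latent embedding rather than the paper's datasets:

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import KMeans
from sklearn.metrics import (adjusted_rand_score,
                             normalized_mutual_info_score,
                             fowlkes_mallows_score)

def evaluate_embedding(Z, classes, n_classes, seed=0):
    """Project a latent embedding Z to 2D with metric MDS, run K-means
    with k equal to the known number of classes, and report the three
    external clustering metrics against the true class labels."""
    Z2 = MDS(n_components=2, random_state=seed).fit_transform(Z)
    labels = KMeans(n_clusters=n_classes, n_init=10,
                    random_state=seed).fit_predict(Z2)
    return {"ARI": adjusted_rand_score(classes, labels),
            "NMI": normalized_mutual_info_score(classes, labels),
            "FMS": fowlkes_mallows_score(classes, labels)}
```

Because metric MDS only tries to preserve pairwise distances, the scores measure the clustering quality already present in each method's latent space, which is the point made in the text.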

            Volume 1 Issue 3 (2024)                         96                               doi: 10.36922/aih.3427