
Artificial Intelligence in Health                                 ISM: A new multi-view space-learning model



the simplicial cone Γ_Wv contained in the positive orthant of ℝ^n and generated by the columns of W_v. In this simplicial cone, each view-attribute corresponds to a point, with coordinates found in the corresponding row of H_v. To identify a consensus simplicial cone between the Γ_Wv, NTF decomposes the tensor formed by the W_v into a sum of rank-1 tensors (Unit 4). However, for such a decomposition to be meaningful, the dimensions defined by the columns of the W_v must be consistent from one view to another. This implies a strong overlap between the simplicial cones Γ_Wv. Such consistency is achieved by the multiple zeros found across the columns of H_v when starting the embedding process (Unit 3). These are “inactive” attributes, as their zero status cannot be changed by multiplicative update rules. They can be interpreted as anchors ensuring that the W_v do not deviate significantly from their common ancestor W, estimated in the preliminary NMF over concatenated views (Unit 1). The parsimonization process (Unit 2) is designed to ensure that there will be a sufficient number of anchor attributes to rein in the multiplicative updates.

(b) Workflow 2: Projection of new observations

For new observations Y comprising k views, k ≤ m, the ISM parameters H*, Q*, and the view-mapping matrix H can be used to project Y onto the latent ISM components, as described in Workflow 2.

Workflow 2. Projection of new observations
Input: New observations Y (k views, k ≤ m), NTF factors H*, Q*, and mapping matrix H.
Output: Estimation of Y*.
1: Disregard any views in Q*, H that are absent in Y;
2: Apply Unit 3 of Workflow 1 to embed Y, with W initialized with ones and with fixed mapping matrix H;
3: Apply step 2 of Unit 4 of Workflow 1 to calculate W* with fixed NTF factors H*, Q*, and define the projection of Y onto the latent space as Y* = W*.
Abbreviation: NTF: Non-negative tensor factorization.

(c) Workflow 3: Proof-of-concept analysis

Each dataset is analyzed using ISM, ILSM, NMF, MVMDS, GFA, MOFA+, and MOWGLI. PCA is also applied to the concatenated views of the UCI Digits and Signature 915 datasets, mainly to show the added value of alternative approaches over this widely used method.

To facilitate interpretation, the transformed data are projected onto a 2D map before being subjected to K-means clustering, where k is the known number of classes (K-means clustering was chosen for its versatility and simplicity, as it only requires the number of clusters to be found, and this number is known for our example datasets). Within each cluster, the class that contains the majority of the points, that is, the main class, is identified. If two clusters share the same main class, they are merged, unless they are not contiguous (the ratio of the distance between the centroids to the intra-cluster distance between points exceeds 1); in this case, the non-contiguous clusters are excluded, because they are assigned to the same class, which should appear homogeneous in the representation. Similarly, any cluster that does not contain an absolute majority is not considered clearly representative of the class to which it is assigned and is excluded from the study. A global purity index is then calculated for the remaining clusters using Workflow 3. To enhance clarity, the clusters are visualized using 95% confidence ellipses, while the classes are represented using distinct colors. In addition to the proportion of classes retrieved and the global purity index, the adjusted Rand index (ARI) [34], normalized mutual information (NMI) index [35], and Fowlkes-Mallows score (FMS) [36] are also included, along with the factor specificity index (FSI) and view-mapping sparsity index (VSI), defined as follows:

The FSI reflects the level of factor specificity with respect to a given class: a value close to 1 means that only one factor contributes significantly to the explanation of the class, while a value close to 0 means that the class is explained by a large number of factors. This index was proposed in Huizing et al. [21], but in its original definition, it measures the level of specificity of each factor relative to the class. The FSI is defined as the ratio of the maximum specificity observed across all factors over the number of significant factors. To estimate the number of significant factors, we use the inverse Herfindahl-Hirschman index (HHI) of all factor indices.

The VSI reflects the level of sparsity of the mapping matrix H. To obtain the VSI: (i) estimate, for each view and each ISM component, the number of significant loadings, using the inverse HHI; (ii) for each view, define the view-sparsity as the average sparsity over all ISM components; and (iii) define the VSI as the average view-sparsity over all views.

Multidimensional scaling (MDS) is applied to achieve the 2D map projection. MDS uses a simple metric objective to find a low-dimensional embedding that accurately represents the distances between points in the latent space [37]. MDS is, therefore, agnostic to the intrinsic clustering performance of the methods that we want to evaluate. Effective embedding methods, for example, uniform manifold approximation and projection (UMAP) or t-distributed stochastic neighbor embedding, are not as optimal for preserving the global geometric structure in the latent space [38]. For example, a resolution parameter needs to
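The core of Workflow 2 is an embedding with the mapping matrix held fixed: only W is updated, starting from a matrix of ones. A minimal sketch of that idea, using the standard NMF multiplicative update for W with H frozen (this is a generic illustration, not the paper's exact Unit 3; the function name and parameters are hypothetical):

```python
import numpy as np

def project_fixed_h(Y, H, n_iter=500, eps=1e-10):
    """Embed new observations Y (samples x attributes) onto latent
    components, keeping the loading matrix H (components x attributes)
    fixed. Only W is updated, via the standard multiplicative rule,
    so that Y ~ W @ H with W >= 0."""
    n, k = Y.shape[0], H.shape[0]
    W = np.ones((n, k))  # Workflow 2 initializes W with ones
    for _ in range(n_iter):
        # Multiplicative update for W; eps guards against division by zero
        W *= (Y @ H.T) / (W @ H @ H.T + eps)
    return W
```

With H fixed, each row of W solves an independent non-negative least-squares problem, which is why the projection of new observations is cheap compared to refitting the full model.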
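The main-class and absolute-majority filtering applied to the K-means clusters can be sketched as follows. This is a simplified, hypothetical helper (the centroid-contiguity test and the exact purity formula of Workflow 3 are not reproduced here; the global purity is computed only over retained clusters):

```python
import numpy as np

def cluster_report(labels, classes):
    """For each cluster, find its main class (the class holding the
    majority of its points). Keep only clusters where the main class
    has an absolute majority, and compute a global purity over the
    retained clusters: fraction of their points in the main class."""
    labels, classes = np.asarray(labels), np.asarray(classes)
    main_class = {}
    kept_correct = kept_total = 0
    for c in np.unique(labels):
        members = classes[labels == c]
        vals, counts = np.unique(members, return_counts=True)
        top = counts.max()
        main_class[int(c)] = vals[counts.argmax()]
        if top > len(members) / 2:  # absolute majority required
            kept_correct += top
            kept_total += len(members)
    purity = kept_correct / kept_total if kept_total else 0.0
    return main_class, purity
```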
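The inverse HHI used by both indices is the reciprocal of the sum of squared normalized shares, i.e., an "effective number" of significant entries. One plausible reading of the FSI and VSI definitions above can be sketched as follows; the normalization choices (shares summing to 1, per-component sparsity as 1 minus the effective fraction of active loadings) are assumptions, not the paper's exact formulas:

```python
import numpy as np

def inverse_hhi(x, eps=1e-12):
    """Effective number of significant entries in a non-negative
    vector: 1 / sum(p_i^2), where p_i are the normalized shares."""
    p = np.asarray(x, dtype=float)
    p = p / (p.sum() + eps)
    return 1.0 / (np.sum(p ** 2) + eps)

def fsi(specificities):
    """Factor specificity index for one class: maximum specificity
    across factors divided by the effective number of significant
    factors. Close to 1 when a single factor dominates."""
    s = np.asarray(specificities, dtype=float)
    return s.max() / inverse_hhi(s)

def vsi(H_list):
    """View-mapping sparsity: per component, sparsity is 1 minus the
    effective fraction of active loadings; average over components
    within each view, then over views."""
    view_scores = []
    for H in H_list:
        n_attr = H.shape[1]
        comp = [1.0 - inverse_hhi(row) / n_attr for row in H]
        view_scores.append(np.mean(comp))
    return float(np.mean(view_scores))
```

Under this reading, a one-hot specificity vector yields FSI = 1, a uniform vector yields a value near 0, and a mapping matrix with one active loading per component maximizes the VSI.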
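The MDS projection, K-means step, and external metrics (ARI, NMI, FMS) described above are all available in scikit-learn; a minimal sketch of such an evaluation loop, on a placeholder latent embedding rather than the paper's datasets:

```python
import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import KMeans
from sklearn.metrics import (adjusted_rand_score,
                             normalized_mutual_info_score,
                             fowlkes_mallows_score)

def evaluate_embedding(Z, classes, n_classes, seed=0):
    """Project a latent embedding Z to 2D with metric MDS, run K-means
    with k equal to the known number of classes, and report the three
    external clustering metrics against the true class labels."""
    Z2 = MDS(n_components=2, random_state=seed).fit_transform(Z)
    labels = KMeans(n_clusters=n_classes, n_init=10,
                    random_state=seed).fit_predict(Z2)
    return {"ARI": adjusted_rand_score(classes, labels),
            "NMI": normalized_mutual_info_score(classes, labels),
            "FMS": fowlkes_mallows_score(classes, labels)}
```

Because metric MDS only tries to preserve pairwise distances, the scores measure the clustering quality already present in each method's latent space, which is the point made in the text.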

            Volume 1 Issue 3 (2024)                         96                               doi: 10.36922/aih.3427