Artificial Intelligence in Health | ISM: A new multi-view space-learning model
it can be automatically reduced if the ARD criteria are met. Notably, for both experiments, increasing the chosen rank decreased performance in terms of cluster association with known classes. This again illustrates the difficulty of choosing the "right" rank. However, non-negative factorization-based methods, including ISM, are not subject to orthogonality constraints and can, therefore, create a new dimension by, for example, splitting a given component into two parts to disentangle close mechanisms that are otherwise intertwined in that component.40 For this reason, the rank could be set to the number of known classes in a more logical and objective way. Finding the correct rank is, therefore, less critical than with mixed-signed factorization approaches such as singular value decomposition (SVD), where low-variance components tend to represent the noisy part of the data. Multiple solutions have nonetheless been proposed, among which the cophenetic correlation coefficient is widely used to estimate the rank that provides the most stable clustering derived from the NMF components.41 A similar criterion, named concordance, has been proposed,31 where extensive simulations showed that NMF finds the most stable solutions around the correct rank, even if the latent factors are strongly correlated. While such an approach could be used with ISM to determine the best combination of the preliminary embedding and latent space dimensions, it would become too computationally intensive. However, given that the embedding and latent spaces are later merged in the ISM workflow, it can still be applied when the model imposes the same dimension on both parameters. As demonstrated in the proof-of-concept analysis of our examples, the embedding dimension can be further optimized by examining the approximation error in the neighborhood of the chosen rank.

Redundancy in the latent factors is a known issue for NMF-based techniques, as identified and illustrated early on with Donoho's swimmer dataset, where a ghost torso appeared in all basis vectors representing body parts in different orientations.32 L1 regularization techniques, such as those based on Hoyer's sparsity index,42,43 or appropriate initialization, such as non-negative SVD (NNSVD),44 can help mitigate these problems. Notably, in our ISM workflow implementation, the HHI used in the embedding step is mathematically equivalent to Hoyer's sparsity index, and NNSVD is used for NMF and NTF initialization.

ISM's intrinsic view loadings also enable the automatic weighting of views within each latent factor. This allows the simultaneous analysis of views of very different sizes without the need for prior normalization to give each view the same importance, as is necessary with methods like consensus PCA. However, this property reaches its limits when view sizes are extremely unbalanced, as seen in the prokaryotic dataset. In such cases, it is recommended to use ILSM, in which ISM is applied to transformed views of equal size, giving equal weight to the original views with the smallest size, whereas global factorization tends to ignore them at initialization. In addition, ILSM requires significantly less computational time, as the view factorizations are parallelizable.

Recently, graph transformers and deep learning approaches have been proposed for the inference of biological single-cell networks.45 The preliminary NMF in Unit 1 of Workflow 1, which combines the data before the application of NTF, is somewhat reminiscent of the "attention" mechanism used in transformers before the application of a lightweight neural network.46 This could explain why ISM can outperform NTF when applied to a multidimensional array, even if the data structure is suitable for the direct application of NTF, as shown by the clustering of marker genes achieved in the Signature 915 dataset example. It also explains why, in the first two examples, although NMF is close to ISM in terms of purity index and other metrics, ISM outperforms NMF in the number of classes detected and, in the second example, in generating a better positioning of the detected cell types on the 2D map projection. Likewise, in the multi-omic single-cell TEA-seq dataset, only ISM identifies a naïve cell subtype and places it next to the most biologically relevant one.

Like other latent space methods, ISM is not limited to MVC. The ISM components and the view-mapping matrix can be used for data reduction on newly collected data (i.e., data that was not used to train the model) by fixing these components in the ISM model. Data reduction for newly collected data remains feasible even if some of the views contained in the training data are missing, as the ISM parameters are compartmentalized by view.

ISM is not limited to views with non-negative data. Each mixed-signed view can be split into its positive part and the absolute value of its negative part, resulting in two non-negative views, as illustrated in the UCI Digits and prokaryotic data examples.

An important limitation of ISM and other multi-view latent space approaches is the requirement that multi-view data be available for all observations in the training set. For financial or logistical reasons, a particular view may be missing for a subset of the observations, and this subset may vary depending on the view under consideration. We are currently developing a variant of ISM that can process multi-view data with missing views.
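The stated equivalence between the HHI and Hoyer's sparsity index can be checked numerically: both reduce to functions of the ratio between the L1 and L2 norms of a vector, so one is a monotone transformation of the other. The following is a minimal numpy sketch of this equivalence; the function names are illustrative and not taken from the ISM implementation.

```python
import numpy as np

def hoyer_sparsity(x):
    """Hoyer's sparsity index: 1 for a one-hot vector, 0 for a uniform one."""
    x = np.abs(np.asarray(x, dtype=float))
    n = x.size
    return (np.sqrt(n) - x.sum() / np.sqrt((x ** 2).sum())) / (np.sqrt(n) - 1)

def hhi(x):
    """Herfindahl-Hirschman Index of the normalized shares x_i / sum(x)."""
    x = np.abs(np.asarray(x, dtype=float))
    p = x / x.sum()
    return (p ** 2).sum()

def hoyer_from_hhi(h, n):
    """Hoyer's index recovered from the HHI: the two are in one-to-one
    correspondence, since HHI = (||x||_2 / ||x||_1)**2."""
    return (np.sqrt(n) - 1 / np.sqrt(h)) / (np.sqrt(n) - 1)

x = np.array([0.7, 0.2, 0.1, 0.0])
print(hoyer_sparsity(x))          # direct computation (≈0.639)
print(hoyer_from_hhi(hhi(x), 4))  # same value, recovered via the HHI
```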
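Data reduction on newly collected data with fixed components, as described above, amounts to solving a non-negative least-squares problem for the new scores only. The sketch below illustrates the idea for a single NMF-style view using standard multiplicative updates; it is a simplified illustration under that assumption, not the ISM implementation, and the function name is hypothetical.

```python
import numpy as np

def project_onto_fixed_components(X_new, H, n_iter=500):
    """Fit non-negative scores W for new observations X_new while keeping
    the trained component matrix H fixed (multiplicative updates,
    Frobenius loss). Only W is updated; H never changes."""
    rng = np.random.default_rng(0)
    W = rng.random((X_new.shape[0], H.shape[0]))
    HHt = H @ H.T  # precomputed once, since H is fixed
    for _ in range(n_iter):
        W *= (X_new @ H.T) / np.maximum(W @ HHt, 1e-12)
    return W

# Toy check: new data generated exactly from the trained components
rng = np.random.default_rng(1)
H = rng.random((2, 8))        # trained components (fixed)
W_true = rng.random((6, 2))   # unknown scores of the new observations
X_new = W_true @ H
W = project_onto_fixed_components(X_new, H)
rel_err = np.linalg.norm(X_new - W @ H) / np.linalg.norm(X_new)
```

Because H is fixed, the problem is convex in W, and the same scheme extends view by view, which is what makes reduction feasible when some training views are absent for the new data.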
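The splitting of a mixed-signed view into two non-negative views is the elementwise positive part and the absolute value of the negative part; the original view is recovered as their difference. A short numpy sketch (the helper name is illustrative):

```python
import numpy as np

def split_signed_view(X):
    """Split a mixed-signed view into two non-negative views:
    the positive part and the absolute value of the negative part."""
    X = np.asarray(X, dtype=float)
    return np.maximum(X, 0.0), np.maximum(-X, 0.0)

X = np.array([[1.5, -2.0],
              [-0.5, 3.0]])
X_pos, X_neg = split_signed_view(X)
# X_pos = [[1.5, 0.0], [0.0, 3.0]], X_neg = [[0.0, 2.0], [0.5, 0.0]],
# and X_pos - X_neg reconstructs X exactly.
```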
Volume 1 Issue 3 (2024) 109 doi: 10.36922/aih.3427

