
Artificial Intelligence in Health                                 ISM: A new multi-view space-learning model



1. Introduction

In machine learning, multi-view data involve multiple distinct sets of attributes (“views”) for a common set of observations. In the special case where each view has the same attributes but is considered in different contexts, the data form a multidimensional array of order three that can be conceptualized as a tensor. For example, an RGB image has three color channels (red, green, and blue), each being a non-negative two-dimensional (2D) matrix in which the intensity of the respective color is stored for each pixel. Non-negative tensor factorization (NTF) is a powerful latent space representation technique designed to analyze non-negative multidimensional arrays of order three or more. In the RGB image example, NTF captures both color and spatial information using non-negative factors, which can be used for various tasks such as image compression, enhancement, segmentation, classification, and fusion.¹

Unfortunately, NTF cannot be applied to multi-view data when the views have heterogeneous content with distinct sets of attributes. For example, a text document can be mapped to different views, such as bag-of-words, topic modeling, or sentiment analysis, each with a different set of attributes. Another example is the transformed University of California Irvine Pen-Based Recognition of Handwritten Digits (UCI Digits) dataset analyzed in this article. In this dataset, the original bitmaps of handwritten digits, extracted from a preprinted form, have been subjected to various transformations (e.g., Fourier, profile correlations, Karhunen-Loève coefficients, pixel averages of images from 2 × 3 windows, Zernike moments, and morphological features), resulting in views with very different formats unsuitable for the direct application of NTF. Numerous algorithms have been proposed for handling such heterogeneous multi-view data, some of which have become popular in the machine learning community. For example, the MVLEARN package uses the scikit-learn API to make it easily accessible to Python users,² while the Multi-Omics Factor Analysis (MOFA and MOFA+) Bioconductor packages³,⁴ are widely used for the analysis of multi-omics datasets. However, since these algorithms assume a heterogeneous data structure, they do not incorporate NTF’s explicit factorization of a three-dimensional (3D) array.

Other methods first convert each view into a similarity matrix between the observations, using techniques such as cosine similarity, Euclidean distance, transition probability, or self-representation learning. Since all views refer to the same observations, the similarity matrices have the same shape regardless of the view they originate from, resulting in a tensor of similarity matrices. Multi-view clustering (MVC) is performed on these similarity matrices, sometimes using tensor-based approaches.⁵,⁶ However, these clustering approaches cannot be applied to other tasks, such as dimensionality reduction. This is because the representations of such similarity matrices do not project the data from multiple views into a common latent space with a small number of common attributes, such as underlying factors or concepts.

Another strategy, which allows the use of tensor decomposition techniques, starts by selecting representative points from the data, known as anchor points. These anchor points act as intermediaries to derive transition probabilities from samples to clusters. Within each view, an anchor graph estimates the probability transition matrix from the observations to the anchor points, typically by imposing a sum-to-one constraint on non-negative similarity indices over all anchor points for each point. Within each view, the probability transition matrices from anchor points to clusters and from observations to clusters need to be estimated, together with the clustering labels of the observations. For this purpose, NTF is applied with an orthogonality constraint on the cluster indicator matrices. A Schatten p-norm constraint ensures that the cluster labels are consistent across views.⁷⁻⁹ This approach is primarily designed for MVC, as it requires a special algorithm to select the anchor points that are best distributed across the clusters. It should be noted that many MVC approaches do not involve tensor decomposition techniques. For example, fuzzy-model-based robust clustering on multivariate t-mixture distributions (F-MB-T)¹⁰ uses a t-mixture model in the expectation-maximization algorithm, resulting in more robust clustering. Unsupervised multi-view K-means or fuzzy C-means¹¹,¹² consider a K-means-like membership architecture across different views. To eliminate the need for a predefined number of clusters, these methods add penalty terms to construct an unsupervised regularization structure. Starting with each data point forming its own cluster, an agglomerative process allows such approaches to be initialization-free.

This article introduces the integrated sources model (ISM), which allows NTF to analyze non-negative heterogeneous views, albeit indirectly, by means of a preliminary embedding of the data in a latent space common to all views. To this end, each view is subjected to non-negative matrix factorization (NMF), using a simple process that ensures consistency between the NMF components across all views. This consistency guarantees that the embedded views share the same (synthetic) attributes, forming a non-negative 3D array that can be analyzed by NTF. Our goal in pursuing this strategy is to directly benefit from the proven performance and convergence properties of the NMF and NTF algorithms,


            Volume 1 Issue 3 (2024)                         90                               doi: 10.36922/aih.3427