Decomposition

Multiview ICA

class mvlearn.decomposition.MultiviewICA(n_components=None, noise=1.0, max_iter=1000, init='permica', multiview_output=True, random_state=None, tol=0.001, verbose=False, n_jobs=30)[source]

Multiview ICA for which views share a common source but separate mixing matrices.

Parameters
  • n_components (int, optional) -- Number of components to extract. If None, n_components is set to the minimum number of features in the dataset.

  • noise (float, default=1.0) -- Gaussian noise level

  • max_iter (int, default=1000) -- Maximum number of iterations to perform

  • init ({'permica', 'groupica'} or np array of shape) -- (n_groups, n_components, n_components), default='permica' If permica: initialize with perm ICA, if groupica, initialize with group ica. Else, use the provided array to initialize.

  • preproc ('pca' or a ViewTransformer-like instance,) -- default='pca' Preprocessing method to use to reduce data. If "pca", performs PCA separately on each view to reduce dimension of each view. Otherwise the dimension reduction is performed using the transform method of the ViewTransformer-like object. This instance also needs an inverse transform method to recover original data from reduced data.

  • multiview_output (bool, optional (default True)) -- If True, the .transform method returns one dataset per view. Otherwise, it returns one dataset, of shape (n_samples, n_components)

  • random_state (int, RandomState instance or None, default=None) -- Used to perform a random initialization. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

  • tol (float, default=1e-3) -- A positive scalar giving the tolerance at which the un-mixing matrices are considered to have converged.

  • verbose (bool, default=False) -- Print information

pcas_

The fitted ViewTransformer used to reduce the data. The ViewTransformer is given by ViewTransformer(PCA(n_components=n_components)) where n_components is the number of chosen. Only used if n_components is not None.

Type

ViewTransformer instance

mixing_

The square mixing matrices, linking preprocessed data and the independent components.

Type

array, shape (n_views, n_components, n_components)

pca_components_

Principal axes in feature space, representing the directions of maximum variance in the data. Only used if n_components is not None.

Type

array, shape (n_components, n_features)

components_

The square unmixing matrices

Type

array, shape (n_views, n_components, n_components)

individual_components_

Individual unmixing matrices estimated by least squares. individual_components_[i] is an array of shape (n_components, n_features) where n_features is the number of features in the dataset i.

Type

list of array

individual_mixing_

Individual mixing matrices estimated by least squares. individual_components_[i] is an array of shape (n_features, n_components) where n_features is the number of features in the dataset i.

Type

list of array

See also

groupica

Notes

Given each view \(X_i\) It optimizes:

\[l(W) = \frac{1}{T} \sum_{t=1}^T [\sum_k log(cosh(Y_{avg,k,t})) + \sum_i l_i(X_{i,.,t})]\]

where

\[l _i(X_{i,.,t}) = - log(|W_i|) + 1/(2 \sigma) ||X_{i,.,t}W_i - Y_{avg,.,t}||^2,\]

\(W_i\) is the mixing matrix for view \(i\), \(Y_{avg} = \frac{1}{n} \sum_{i=1}^n X_i W_i\), and \(\sigma\) is the noise level.

References

1

Hugo Richard, et al. "Modeling shared responses in neuroimaging studies through multiview ICA." In Pre-proceedings of Advances in Neural Information Processing Systems, volume 33, 2020

Examples

>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.decomposition import MultiviewICA
>>> Xs, _ = load_UCImultifeature()
>>> ica = MultiviewICA(n_components=3, max_iter=10)
>>> sources = ica.fit_transform(Xs)
>>> print(sources.shape)
(6, 2000, 3)

Group ICA

class mvlearn.decomposition.GroupICA(n_components=None, n_individual_components='auto', multiview_output=True, prewhiten=False, solver='picard', ica_kwargs={}, random_state=None)[source]

Group Independent component analysis.

Each dataset in Xs is reduced with usual PCA (this step is optional). Then, datasets are concatenated in the features direction, and a PCA is performed on this matrix, yielding a single dataset. ICA is finally performed yielding the output dataset S. The unmixing matrix W corresponding to data X are obtained by solving argmin_{W} ||S - WX||^2.

Parameters
  • n_components (int, optional) -- Number of components to extract. If None, n_components is set to the minimum number of features in the dataset

  • n_individual_components (int or list of int or 'auto', optional) -- The number of individual components to extract as a preprocessing. If None, no preprocessing is applied. If an int, each dataset is reduced to this dimension. If a list, the dataset i is reduced to the dimension n_individual_components[i]. If 'auto', set to the minimum between n_components and the smallest number of features in each dataset.

  • multiview_output (bool, optional (default True)) -- If True, the .transform method returns one dataset per view. Otherwise, it returns one dataset, of shape (n_samples, n_components)

  • prewhiten (bool, optional (default False)) -- Whether the data should be whitened after the original preprocessing.

  • solver (str {'picard', 'fastica'}) -- Chooses which ICA solver to use. picard is generally faster and more reliable.

  • ica_kwargs (dict) -- Optional keyword arguments for the ICA solver. If solver='fastica', see the documentation of sklearn.decomposition.fastica. If solver='picard', see the documentation of picard.picard.

  • random_state (int, RandomState instance, default=None) -- Controls the random number generator used in the estimator. Pass an int for reproducible output across multiple function calls.

means_

The mean of each dataset

Type

list of arrays of shape (n_components,)

grouppca_

A GroupPCA class for preprocessing and dimension reduction

Type

mvlearn.decomposition.GroupPCA instance

mixing_

The square mixing matrix, linking the output of the group-pca and the independent components.

Type

array, shape (n_components, n_components)

components_

The square unmixing matrix

Type

array, shape (n_components, n_components)

individual_components_

Individual unmixing matrices estimated by least squares. individual_components_[i] is an array of shape (n_components, n_features) where n_features is the number of features in the dataset i.

Type

list of array

individual_mixing_

Individual mixing matrices estimated by least squares. individual_components_[i] is an array of shape (n_features, n_components) where n_features is the number of features in the dataset i.

Type

list of array

n_components_

The estimated number of components.

Type

int

n_features_

Number of features in each training dataset.

Type

list of int

n_samples_

Number of samples in the training data.

Type

int

n_views_

Number of views in the training data

Type

int

See also

GroupPCA, multiviewica

References

2

Vince D Calhoun, et al. "A method for making group inferences from functional MRI data using independent component analysis." Human Brain Mapping, 14(3):140–151, 2001.

Examples

>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.decomposition import GroupICA
>>> Xs, _ = load_UCImultifeature()
>>> ica = GroupICA(n_components=3)
>>> Xs_transformed = ica.fit_transform(Xs)
>>> print(len(Xs_transformed))
6
>>> print(Xs_transformed[0].shape)
(2000, 3)

Group PCA

class mvlearn.decomposition.GroupPCA(n_components=None, n_individual_components='auto', multiview_output=True, prewhiten=False, whiten=False, random_state=None)[source]

Group Principal Component Analysis.

As an optional preprocessing, each dataset in Xs is reduced with usual PCA. Then, datasets are concatenated in the features direction, and a PCA is performed on this matrix, yielding a single output dataset. Linear coefficients linking the output dataset and each view are computed using least squares estimation, which permits to return one dataset per view.

Parameters
  • n_components (int, optional) -- Number of components to extract. If None, n_components is set to the minimum number of features in the dataset.

  • n_individual_components (int or list of int or 'auto', optional) -- The number of individual components to extract as a preprocessing. If None, no preprocessing is applied. If an int, each dataset is reduced to this dimension. If a list, the dataset i is reduced to the dimension n_individual_components[i]. If 'auto', set to the minimum between n_components and the smallest number of features in each dataset.

  • multiview_output (bool, optional (default True)) -- If True, the .transform method returns one dataset per view. Otherwise, it returns one dataset, of shape (n_samples, n_components)

  • prewhiten (bool, optional (default False)) -- Whether the data should be whitened after the original preprocessing.

  • whiten (bool, optional (default False)) -- When True (False by default) the components_ vectors are multiplied by the square root of n_samples and then divided by the singular values to ensure uncorrelated outputs with unit component-wise variances. Whitening will remove some information from the transformed signal (the relative variance scales of the components) but can sometime improve the predictive accuracy of the downstream estimators by making their data respect some hard-wired assumptions.

  • random_state (int, RandomState instance, default=None) -- Used when svd_solver == 'arpack' or 'randomized'. Pass an int for reproducible results across multiple function calls.

components_

Principal axes in feature space, representing the directions of maximum variance in the data. The components are sorted by explained_variance_. n_total_features is the sum of all the number of input features.

Type

array, shape (n_components, n_total_features)

explained_variance_

The amount of variance explained by each of the selected components.

Type

array, shape (n_components,)

explained_variance_ratio_

Percentage of variance explained by each of the selected components. If n_components is not set then all components are stored and the sum of the ratios is equal to 1.0.

Type

array, shape (n_components,)

individual_components_

Individual components for each individual PCA. individual_components_[i] is an array of shape (n_individual_components, n_features) where n_features is the number of features in the dataset i.

Type

list of array

individual_explained_variance_

Individual explained variance for each individual PCA. individual_explained_variance_[i] is an array of shape (n_individual_components, ).

Type

list of array

individual_explained_variance_ratio_

Individual explained variance ratio for each individual PCA. individual_explained_variance_ratio_[i] is an array of shape (n_individual_components, ) where n_features is the number of features in the dataset i.

Type

list of array

individual_mean_

Mean of each dataset, estimated on the training data individual_mean_[i] is an array of shape (n_features) where n_features is the number of features in the dataset i.

Type

list of array

individual_projections_

List containing the linear transform linking the dataset to the output. Xs[i].dot(individual_projections_[i].T) gives the estimated reduced dataset for view i. This is obtained by least squares estimation.

Type

list of array

individual_embeddings_

List containing the pseudo-inverses of individual_projections_. Allows to recover each original dataset from reduced data.

Type

list of array

n_components_

The estimated number of components.

Type

int

n_features_

Number of features in each training dataset.

Type

list of int

n_samples_

Number of samples in the training data.

Type

int

n_views_

Number of views in the training data

Type

int

References

3

Vince D Calhoun, et al. "A method for making group inferences from functional MRI data using independent component analysis." Human Brain Mapping, 14(3):140–151, 2001.

Examples

>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.decomposition import GroupPCA
>>> Xs, _ = load_UCImultifeature()
>>> pca = GroupPCA(n_components=3)
>>> Xs_transformed = pca.fit_transform(Xs)
>>> print(len(Xs_transformed))
6
>>> print(Xs_transformed[0].shape)
(2000, 3)

Angle-Based Joint and Individual Variation Explained (AJIVE)

AJIVE

class mvlearn.decomposition.AJIVE(init_signal_ranks=None, joint_rank=None, individual_ranks=None, n_elbows=2, reconsider_joint_components=True, wedin_percentile=5, n_wedin_samples=1000, randdir_percentile=95, n_randdir_samples=1000, verbose=False, n_jobs=None, random_state=None)[source]

An implementation of Angle-based Joint and Individual Variation Explained 4. This algorithm takes multiple views and decomposes them into 3 distinct matrices representing:

  • Low rank approximation of joint variation between views

  • Low rank approximation of individual variation within each view

  • Residual noise

AJIVE can handle any number of views, not just two.

Parameters
  • init_signal_ranks (list, default = None) -- Initial guesses as to the rank of each view's signal.

  • joint_rank (int, default = None) -- Rank of the joint variation matrix. If None, will estimate the joint rank.

  • individual_ranks (list, default = None) -- Ranks of individual variation matrices. If None, will estimate the individual ranks. Otherwise, will use provided individual ranks.

  • n_elbows (int, optional, default = 2) -- If init_signal_ranks=None, then computes the initial signal rank guess using mvlearn.embed.utils.select_dimension() with n_elbows for each view.

  • reconsider_joint_components (boolean, default = True) -- Triggers _reconsider_joint_components function to run and removes columns of joint scores according to identifiability constraint.

  • wedin_percentile (int, default = 5) -- Percentile used for wedin (lower) bound cutoff for squared singular values used to estimate joint rank.

  • n_wedin_samples (int, default = 1000) -- Number of wedin bound samples to draw.

  • randdir_percentile (int, default = 95) -- Percentile for random direction (lower) bound cutoff for squared singular values used to estimate joint rank.

  • n_randdir_samples (int, default = 1000) -- Number of random direction samples to draw.

  • verbose (boolean, default = False) -- Prints information during runtime if True.

  • n_jobs (int (positive) or None, default=None) -- The number of jobs to run in parallel. None will run 1 job, -1 uses all processors.

  • random_state (int, RandomState instance or None, default=None) -- Used to seed a random initialization for reproducible results. If None, a random initialization is used.

means_

The means of each view.

Type

numpy.ndarray

sv_thresholds_

The singular value thresholds for each view based on initial SVD. Used to estimate the individual ranks.

Type

list of floats

all_joint_svals_

All singular values from the concatenated joint matrix.

Type

list of floats

random_sv_samples_

Random singular value samples from random direction bound.

Type

list of floats

rand_cutoff_

Singular value squared cutoff for the random direction bound.

Type

float

wedin_samples_

The wedin samples for each view.

Type

list of numpy.ndarray

wedin_cutoff_

Singular value squared cutoff for the wedin bound.

Type

float

svalsq_cutoff_

max(rand_cutoff_, wedin_cutoff_)

Type

float

joint_rank_wedin_est_

The joint rank estimated using the wedin/random direction bound

Type

int

init_signal_ranks_

Provided or estimated init_signal_ranks

Type

list of ints

joint_rank_

The rank of the joint matrix

Type

int

joints_scores_

Left singular vectors of the joint matrix

Type

numpy.ndarray

individual_ranks_

Ranks of the individual matrices for each view.

Type

list of ints

individual_scores_

Left singular vectors of each view after joint matrix is removed

Type

list of numpy.ndarrays

individual_svals_ = list of numpy.ndarrays

Singular values of each view after joint matrix is removed

individual_loadings_

Right singular vectors of each view after joint matrix is removed

Type

list of numpy.ndarrays

individual_mats_

Individual matrices for each view, reconstructed from the SVD

Type

list of numpy.ndarrays

Notes

Angle-Based Joint and Individual Variation Explained (AJIVE) is a specfic variation of the Joint and Individual Variation Explained (JIVE) algorithm. This algorithm takes \(k\) different views with \(n\) observations and \(d\) variables and finds a basis that represents the joint variation and \(k\) bases with their own ranks representing the individual variation of each view. Each of these individual bases is orthonormal to the joint basis. These bases are then used to create the following \(k\) statements:

\[X^{(i)}= J^{(i)} + I^{(i)} + E^{(i)}\]

where \(X^{(i)}\) represents the i-th view of data and \(J^{(i)}\), \(I^{(i)}\), and \(E^{(i)}\) represent its joint, individual, and noise signal estimates respectively.

The AJIVE algorithm calculations can be split into three seperate steps:
  • Signal Space Initial Extraction

  • Score Space Segmentation

  • Final Decomposition and Outputs

In the Signal Space Initial Extraction step we compute a rank \(r_{initial}^{(i)}\) singular value decomposition for each \(X^{(i)}\), the value of \(r_{initial}^{(i)}\) can be found by looking at the scree plots of each view or thresholding based on singular value. From each singular value decomposition, the first \(r_{initial}^{(i)}\) columns of of the scores matrix (\(U^{(i)}\)) are taken to form \(\widetilde{U}^{(i)}\).

After this, the Score Space Segmentation step concatenates the k \(\widetilde{U}^{(i)}\) matrices found in the first step as follows:

\[M = [\widetilde{U}^{(1)}, \dots, \widetilde{U}^{(k)}]\]

From here, the \(r_{joint}\) singular value decomposition is taken. \(r_{joint}\) is estimated individually or using the wedin bound which quantifies how the theoretical singular subspaces are affected by noise as thereby quantifying the distance between rank of the original input and the estimation. For the scores of the singular value decomposition of \(M\), the first \(r_{joint}\) columns are taken to obtain the basis, \(U_{joint}\). The \(J^{(i)}\) matrix (joint signal estimate) can be found by projecting \(X^{(i)}\) onto \(U_{joint}\).

In the Final Decomposition and Outputs step, we project each \(X^{i}\) matrix onto the orthogonal complement of \(U_{joint}\):

\[X^{(i), orthog} = (I - U_{joint}U_{joint}^T)X^{(i)}\]

\(I\) in the above equation represents the identity matrix. From here, the \(I^{(i)}\) matrix (individual signal estimate) can be found by performing the rank \(r_{individual}^{(i)}\) singular value decomposition of \(X^{(i), orthog}\). \(r_{individual}^{(i)}\) can be found by using the aforementioned singular value thresholding method.

Finally, we can solve for the noise estimates, \(E^{(i)}\) by using the equation:

\[E^{(i)}= X^{(i)} - (J^{(i)} + I^{(i)})\]

Much of this implementation has been adapted from Iain Carmichael Ph.D.'s pip-installable package, jive, the code for which is linked here.

References

4

Qing Feng, Meilei Jiang, Jan Hannig, and J.S. Marron. "Angle-based joint and individual variation explained." Journal of Multivariate Analysis, 166:241–265, 2018.

Examples

>>> from mvlearn.factorization.ajive import AJIVE
>>> from mvlearn.datasets import load_UCImultifeature
>>> Xs, _ = load_UCImultifeature()
>>> print(len(Xs)) # number of views
6
>>> print(Xs[0].shape, Xs[1].shape) # number of samples in each view
(2000, 76) (2000, 216)
>>> ajive = AJIVE()
>>> Xs_transformed = ajive.fit_transform(Xs)
>>> print(Xs_transformed[0].shape)
(2000, 76)