Decomposition¶
Multiview ICA¶
- class mvlearn.decomposition.MultiviewICA(n_components=None, noise=1.0, max_iter=1000, init='permica', multiview_output=True, random_state=None, tol=0.001, verbose=False, n_jobs=30)[source]¶
Multiview ICA for which views share a common source but separate mixing matrices.
- Parameters
n_components (int, optional) -- Number of components to extract. If None, n_components is set to the minimum number of features in the dataset.
noise (float, default=1.0) -- Gaussian noise level
max_iter (int, default=1000) -- Maximum number of iterations to perform
init ({'permica', 'groupica'} or np array of shape) -- (n_groups, n_components, n_components), default='permica' If permica: initialize with perm ICA, if groupica, initialize with group ica. Else, use the provided array to initialize.
preproc ('pca' or a ViewTransformer-like instance,) -- default='pca' Preprocessing method to use to reduce data. If "pca", performs PCA separately on each view to reduce dimension of each view. Otherwise the dimension reduction is performed using the transform method of the ViewTransformer-like object. This instance also needs an inverse transform method to recover original data from reduced data.
multiview_output (bool, optional (default True)) -- If True, the .transform method returns one dataset per view. Otherwise, it returns one dataset, of shape (n_samples, n_components)
random_state (int, RandomState instance or None, default=None) -- Used to perform a random initialization. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
tol (float, default=1e-3) -- A positive scalar giving the tolerance at which the un-mixing matrices are considered to have converged.
verbose (bool, default=False) -- Print information
- pcas_¶
The fitted ViewTransformer used to reduce the data. The ViewTransformer is given by ViewTransformer(PCA(n_components=n_components)) where n_components is the number of chosen. Only used if n_components is not None.
- Type
ViewTransformer instance
- mixing_¶
The square mixing matrices, linking preprocessed data and the independent components.
- Type
array, shape (n_views, n_components, n_components)
- pca_components_¶
Principal axes in feature space, representing the directions of maximum variance in the data. Only used if n_components is not None.
- Type
array, shape (n_components, n_features)
- components_¶
The square unmixing matrices
- Type
array, shape (n_views, n_components, n_components)
- individual_components_¶
Individual unmixing matrices estimated by least squares. individual_components_[i] is an array of shape (n_components, n_features) where n_features is the number of features in the dataset i.
- Type
list of array
- individual_mixing_¶
Individual mixing matrices estimated by least squares. individual_components_[i] is an array of shape (n_features, n_components) where n_features is the number of features in the dataset i.
- Type
list of array
See also
groupica
Notes
Given each view \(X_i\) It optimizes:
\[l(W) = \frac{1}{T} \sum_{t=1}^T [\sum_k log(cosh(Y_{avg,k,t})) + \sum_i l_i(X_{i,.,t})]\]where
\[l _i(X_{i,.,t}) = - log(|W_i|) + 1/(2 \sigma) ||X_{i,.,t}W_i - Y_{avg,.,t}||^2,\]\(W_i\) is the mixing matrix for view \(i\), \(Y_{avg} = \frac{1}{n} \sum_{i=1}^n X_i W_i\), and \(\sigma\) is the noise level.
References
- 1
Hugo Richard, et al. "Modeling shared responses in neuroimaging studies through multiview ICA." In Pre-proceedings of Advances in Neural Information Processing Systems, volume 33, 2020
Examples
>>> from mvlearn.datasets import load_UCImultifeature >>> from mvlearn.decomposition import MultiviewICA >>> Xs, _ = load_UCImultifeature() >>> ica = MultiviewICA(n_components=3, max_iter=10) >>> sources = ica.fit_transform(Xs) >>> print(sources.shape) (6, 2000, 3)
Group ICA¶
- class mvlearn.decomposition.GroupICA(n_components=None, n_individual_components='auto', multiview_output=True, prewhiten=False, solver='picard', ica_kwargs={}, random_state=None)[source]¶
Group Independent component analysis.
Each dataset in Xs is reduced with usual PCA (this step is optional). Then, datasets are concatenated in the features direction, and a PCA is performed on this matrix, yielding a single dataset. ICA is finally performed yielding the output dataset S. The unmixing matrix W corresponding to data X are obtained by solving argmin_{W} ||S - WX||^2.
- Parameters
n_components (int, optional) -- Number of components to extract. If None, n_components is set to the minimum number of features in the dataset
n_individual_components (int or list of int or 'auto', optional) -- The number of individual components to extract as a preprocessing. If None, no preprocessing is applied. If an int, each dataset is reduced to this dimension. If a list, the dataset i is reduced to the dimension n_individual_components[i]. If 'auto', set to the minimum between n_components and the smallest number of features in each dataset.
multiview_output (bool, optional (default True)) -- If True, the .transform method returns one dataset per view. Otherwise, it returns one dataset, of shape (n_samples, n_components)
prewhiten (bool, optional (default False)) -- Whether the data should be whitened after the original preprocessing.
solver (str {'picard', 'fastica'}) -- Chooses which ICA solver to use. picard is generally faster and more reliable.
ica_kwargs (dict) -- Optional keyword arguments for the ICA solver. If solver='fastica', see the documentation of sklearn.decomposition.fastica. If solver='picard', see the documentation of picard.picard.
random_state (int, RandomState instance, default=None) -- Controls the random number generator used in the estimator. Pass an int for reproducible output across multiple function calls.
- means_¶
The mean of each dataset
- Type
list of arrays of shape (n_components,)
- grouppca_¶
A GroupPCA class for preprocessing and dimension reduction
- Type
mvlearn.decomposition.GroupPCA instance
- mixing_¶
The square mixing matrix, linking the output of the group-pca and the independent components.
- Type
array, shape (n_components, n_components)
- components_¶
The square unmixing matrix
- Type
array, shape (n_components, n_components)
- individual_components_¶
Individual unmixing matrices estimated by least squares. individual_components_[i] is an array of shape (n_components, n_features) where n_features is the number of features in the dataset i.
- Type
list of array
- individual_mixing_¶
Individual mixing matrices estimated by least squares. individual_components_[i] is an array of shape (n_features, n_components) where n_features is the number of features in the dataset i.
- Type
list of array
- n_features_¶
Number of features in each training dataset.
- Type
list of int
See also
GroupPCA
,multiviewica
References
- 2
Vince D Calhoun, et al. "A method for making group inferences from functional MRI data using independent component analysis." Human Brain Mapping, 14(3):140–151, 2001.
Examples
>>> from mvlearn.datasets import load_UCImultifeature >>> from mvlearn.decomposition import GroupICA >>> Xs, _ = load_UCImultifeature() >>> ica = GroupICA(n_components=3) >>> Xs_transformed = ica.fit_transform(Xs) >>> print(len(Xs_transformed)) 6 >>> print(Xs_transformed[0].shape) (2000, 3)
Group PCA¶
- class mvlearn.decomposition.GroupPCA(n_components=None, n_individual_components='auto', multiview_output=True, prewhiten=False, whiten=False, random_state=None)[source]¶
Group Principal Component Analysis.
As an optional preprocessing, each dataset in Xs is reduced with usual PCA. Then, datasets are concatenated in the features direction, and a PCA is performed on this matrix, yielding a single output dataset. Linear coefficients linking the output dataset and each view are computed using least squares estimation, which permits to return one dataset per view.
- Parameters
n_components (int, optional) -- Number of components to extract. If None, n_components is set to the minimum number of features in the dataset.
n_individual_components (int or list of int or 'auto', optional) -- The number of individual components to extract as a preprocessing. If None, no preprocessing is applied. If an int, each dataset is reduced to this dimension. If a list, the dataset i is reduced to the dimension n_individual_components[i]. If 'auto', set to the minimum between n_components and the smallest number of features in each dataset.
multiview_output (bool, optional (default True)) -- If True, the .transform method returns one dataset per view. Otherwise, it returns one dataset, of shape (n_samples, n_components)
prewhiten (bool, optional (default False)) -- Whether the data should be whitened after the original preprocessing.
whiten (bool, optional (default False)) -- When True (False by default) the components_ vectors are multiplied by the square root of n_samples and then divided by the singular values to ensure uncorrelated outputs with unit component-wise variances. Whitening will remove some information from the transformed signal (the relative variance scales of the components) but can sometime improve the predictive accuracy of the downstream estimators by making their data respect some hard-wired assumptions.
random_state (int, RandomState instance, default=None) -- Used when
svd_solver
== 'arpack' or 'randomized'. Pass an int for reproducible results across multiple function calls.
- components_¶
Principal axes in feature space, representing the directions of maximum variance in the data. The components are sorted by
explained_variance_
. n_total_features is the sum of all the number of input features.- Type
array, shape (n_components, n_total_features)
- explained_variance_¶
The amount of variance explained by each of the selected components.
- Type
array, shape (n_components,)
- explained_variance_ratio_¶
Percentage of variance explained by each of the selected components. If
n_components
is not set then all components are stored and the sum of the ratios is equal to 1.0.- Type
array, shape (n_components,)
- individual_components_¶
Individual components for each individual PCA. individual_components_[i] is an array of shape (n_individual_components, n_features) where n_features is the number of features in the dataset i.
- Type
list of array
- individual_explained_variance_¶
Individual explained variance for each individual PCA. individual_explained_variance_[i] is an array of shape (n_individual_components, ).
- Type
list of array
- individual_explained_variance_ratio_¶
Individual explained variance ratio for each individual PCA. individual_explained_variance_ratio_[i] is an array of shape (n_individual_components, ) where n_features is the number of features in the dataset i.
- Type
list of array
- individual_mean_¶
Mean of each dataset, estimated on the training data individual_mean_[i] is an array of shape (n_features) where n_features is the number of features in the dataset i.
- Type
list of array
- individual_projections_¶
List containing the linear transform linking the dataset to the output. Xs[i].dot(individual_projections_[i].T) gives the estimated reduced dataset for view i. This is obtained by least squares estimation.
- Type
list of array
- individual_embeddings_¶
List containing the pseudo-inverses of individual_projections_. Allows to recover each original dataset from reduced data.
- Type
list of array
- n_features_¶
Number of features in each training dataset.
- Type
list of int
References
- 3
Vince D Calhoun, et al. "A method for making group inferences from functional MRI data using independent component analysis." Human Brain Mapping, 14(3):140–151, 2001.
Examples
>>> from mvlearn.datasets import load_UCImultifeature >>> from mvlearn.decomposition import GroupPCA >>> Xs, _ = load_UCImultifeature() >>> pca = GroupPCA(n_components=3) >>> Xs_transformed = pca.fit_transform(Xs) >>> print(len(Xs_transformed)) 6 >>> print(Xs_transformed[0].shape) (2000, 3)
Angle-Based Joint and Individual Variation Explained (AJIVE)¶
AJIVE¶
- class mvlearn.decomposition.AJIVE(init_signal_ranks=None, joint_rank=None, individual_ranks=None, n_elbows=2, reconsider_joint_components=True, wedin_percentile=5, n_wedin_samples=1000, randdir_percentile=95, n_randdir_samples=1000, verbose=False, n_jobs=None, random_state=None)[source]¶
An implementation of Angle-based Joint and Individual Variation Explained 4. This algorithm takes multiple views and decomposes them into 3 distinct matrices representing:
Low rank approximation of joint variation between views
Low rank approximation of individual variation within each view
Residual noise
AJIVE can handle any number of views, not just two.
- Parameters
init_signal_ranks (list, default = None) -- Initial guesses as to the rank of each view's signal.
joint_rank (int, default = None) -- Rank of the joint variation matrix. If None, will estimate the joint rank.
individual_ranks (list, default = None) -- Ranks of individual variation matrices. If None, will estimate the individual ranks. Otherwise, will use provided individual ranks.
n_elbows (int, optional, default = 2) -- If init_signal_ranks=None, then computes the initial signal rank guess using
mvlearn.embed.utils.select_dimension()
with n_elbows for each view.reconsider_joint_components (boolean, default = True) -- Triggers _reconsider_joint_components function to run and removes columns of joint scores according to identifiability constraint.
wedin_percentile (int, default = 5) -- Percentile used for wedin (lower) bound cutoff for squared singular values used to estimate joint rank.
n_wedin_samples (int, default = 1000) -- Number of wedin bound samples to draw.
randdir_percentile (int, default = 95) -- Percentile for random direction (lower) bound cutoff for squared singular values used to estimate joint rank.
n_randdir_samples (int, default = 1000) -- Number of random direction samples to draw.
verbose (boolean, default = False) -- Prints information during runtime if True.
n_jobs (int (positive) or None, default=None) -- The number of jobs to run in parallel. None will run 1 job, -1 uses all processors.
random_state (int, RandomState instance or None, default=None) -- Used to seed a random initialization for reproducible results. If None, a random initialization is used.
- means_¶
The means of each view.
- Type
- sv_thresholds_¶
The singular value thresholds for each view based on initial SVD. Used to estimate the individual ranks.
- Type
list of floats
- all_joint_svals_¶
All singular values from the concatenated joint matrix.
- Type
list of floats
- random_sv_samples_¶
Random singular value samples from random direction bound.
- Type
list of floats
- wedin_samples_¶
The wedin samples for each view.
- Type
list of numpy.ndarray
- svalsq_cutoff_¶
max(rand_cutoff_, wedin_cutoff_)
- Type
- init_signal_ranks_¶
Provided or estimated init_signal_ranks
- Type
list of ints
- joints_scores_¶
Left singular vectors of the joint matrix
- Type
- individual_ranks_¶
Ranks of the individual matrices for each view.
- Type
list of ints
- individual_scores_¶
Left singular vectors of each view after joint matrix is removed
- Type
list of numpy.ndarrays
- individual_svals_ = list of numpy.ndarrays
Singular values of each view after joint matrix is removed
- individual_loadings_¶
Right singular vectors of each view after joint matrix is removed
- Type
list of numpy.ndarrays
- individual_mats_¶
Individual matrices for each view, reconstructed from the SVD
- Type
list of numpy.ndarrays
Notes
Angle-Based Joint and Individual Variation Explained (AJIVE) is a specfic variation of the Joint and Individual Variation Explained (JIVE) algorithm. This algorithm takes \(k\) different views with \(n\) observations and \(d\) variables and finds a basis that represents the joint variation and \(k\) bases with their own ranks representing the individual variation of each view. Each of these individual bases is orthonormal to the joint basis. These bases are then used to create the following \(k\) statements:
\[X^{(i)}= J^{(i)} + I^{(i)} + E^{(i)}\]where \(X^{(i)}\) represents the i-th view of data and \(J^{(i)}\), \(I^{(i)}\), and \(E^{(i)}\) represent its joint, individual, and noise signal estimates respectively.
- The AJIVE algorithm calculations can be split into three seperate steps:
Signal Space Initial Extraction
Score Space Segmentation
Final Decomposition and Outputs
In the Signal Space Initial Extraction step we compute a rank \(r_{initial}^{(i)}\) singular value decomposition for each \(X^{(i)}\), the value of \(r_{initial}^{(i)}\) can be found by looking at the scree plots of each view or thresholding based on singular value. From each singular value decomposition, the first \(r_{initial}^{(i)}\) columns of of the scores matrix (\(U^{(i)}\)) are taken to form \(\widetilde{U}^{(i)}\).
After this, the Score Space Segmentation step concatenates the k \(\widetilde{U}^{(i)}\) matrices found in the first step as follows:
\[M = [\widetilde{U}^{(1)}, \dots, \widetilde{U}^{(k)}]\]From here, the \(r_{joint}\) singular value decomposition is taken. \(r_{joint}\) is estimated individually or using the wedin bound which quantifies how the theoretical singular subspaces are affected by noise as thereby quantifying the distance between rank of the original input and the estimation. For the scores of the singular value decomposition of \(M\), the first \(r_{joint}\) columns are taken to obtain the basis, \(U_{joint}\). The \(J^{(i)}\) matrix (joint signal estimate) can be found by projecting \(X^{(i)}\) onto \(U_{joint}\).
In the Final Decomposition and Outputs step, we project each \(X^{i}\) matrix onto the orthogonal complement of \(U_{joint}\):
\[X^{(i), orthog} = (I - U_{joint}U_{joint}^T)X^{(i)}\]\(I\) in the above equation represents the identity matrix. From here, the \(I^{(i)}\) matrix (individual signal estimate) can be found by performing the rank \(r_{individual}^{(i)}\) singular value decomposition of \(X^{(i), orthog}\). \(r_{individual}^{(i)}\) can be found by using the aforementioned singular value thresholding method.
Finally, we can solve for the noise estimates, \(E^{(i)}\) by using the equation:
\[E^{(i)}= X^{(i)} - (J^{(i)} + I^{(i)})\]Much of this implementation has been adapted from Iain Carmichael Ph.D.'s pip-installable package, jive, the code for which is linked here.
References
- 4
Qing Feng, Meilei Jiang, Jan Hannig, and J.S. Marron. "Angle-based joint and individual variation explained." Journal of Multivariate Analysis, 166:241–265, 2018.
Examples
>>> from mvlearn.factorization.ajive import AJIVE >>> from mvlearn.datasets import load_UCImultifeature >>> Xs, _ = load_UCImultifeature() >>> print(len(Xs)) # number of views 6 >>> print(Xs[0].shape, Xs[1].shape) # number of samples in each view (2000, 76) (2000, 216) >>> ajive = AJIVE() >>> Xs_transformed = ajive.fit_transform(Xs) >>> print(Xs_transformed[0].shape) (2000, 76)