Decomposition¶
Multiview ICA¶
- class mvlearn.decomposition.MultiviewICA(n_components=None, noise=1.0, max_iter=1000, init='permica', preproc='pca', multiview_output=True, random_state=None, tol=0.001, verbose=False, n_jobs=30)[source]¶
- Multiview ICA in which the views share common sources but have separate mixing matrices. A minimal synthetic sketch of this generative model follows the parameter list. - Parameters
- n_components (int, optional) -- Number of components to extract. If None, n_components is set to the minimum number of features in the dataset. 
- noise (float, default=1.0) -- Gaussian noise level 
- max_iter (int, default=1000) -- Maximum number of iterations to perform 
- init ({'permica', 'groupica'} or np array of shape (n_groups, n_components, n_components), default='permica') -- If 'permica', initialize with permutation ICA; if 'groupica', initialize with group ICA. Otherwise, use the provided array to initialize. 
- preproc ('pca' or a ViewTransformer-like instance, default='pca') -- Preprocessing method used to reduce the data. If 'pca', performs PCA separately on each view to reduce the dimension of each view. Otherwise the dimension reduction is performed using the transform method of the ViewTransformer-like object. This instance also needs an inverse transform method to recover original data from reduced data. 
- multiview_output (bool, optional (default True)) -- If True, the .transform method returns one dataset per view. Otherwise, it returns one dataset, of shape (n_samples, n_components) 
- random_state (int, RandomState instance or None, default=None) -- Used to perform a random initialization. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random. 
- tol (float, default=1e-3) -- A positive scalar giving the tolerance at which the un-mixing matrices are considered to have converged. 
- verbose (bool, default=False) -- Whether to print information during fitting 
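
- A minimal synthetic sketch of the generative model (an illustration, not part of the library's documented examples; the Laplace sources are an assumption chosen for non-Gaussianity):

>>> import numpy as np
>>> from mvlearn.decomposition import MultiviewICA
>>> rng = np.random.RandomState(0)
>>> S = rng.laplace(size=(1000, 3))  # sources shared by every view
>>> # each view observes the same sources through its own mixing matrix
>>> Xs = [S.dot(rng.randn(3, 3)) for _ in range(4)]
>>> ica = MultiviewICA(n_components=3, max_iter=100, random_state=0)
>>> sources = ica.fit_transform(Xs)  # shape (4, 1000, 3) per the docs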
 
 - pcas_¶
- The fitted ViewTransformer used to reduce the data. The ViewTransformer is given by ViewTransformer(PCA(n_components=n_components)) where n_components is the chosen number of components. Only used if n_components is not None. - Type
- ViewTransformer instance 
 
 - mixing_¶
- The square mixing matrices, linking preprocessed data and the independent components. - Type
- array, shape (n_views, n_components, n_components) 
 
 - pca_components_¶
- Principal axes in feature space, representing the directions of maximum variance in the data. Only used if n_components is not None. - Type
- array, shape (n_components, n_features) 
 
 - components_¶
- The square unmixing matrices - Type
- array, shape (n_views, n_components, n_components) 
 
 - individual_components_¶
- Individual unmixing matrices estimated by least squares. individual_components_[i] is an array of shape (n_components, n_features) where n_features is the number of features in the dataset i. - Type
- list of array 
 
 - individual_mixing_¶
- Individual mixing matrices estimated by least squares. individual_mixing_[i] is an array of shape (n_features, n_components) where n_features is the number of features in the dataset i. - Type
- list of array 
 
 - See also
- groupica
- Notes
- Given each view \(X_i\), the method optimizes:
\[l(W) = \frac{1}{T} \sum_{t=1}^T \left[ \sum_k \log(\cosh(Y_{avg,k,t})) + \sum_i l_i(X_{i,\cdot,t}) \right]\]
- where
\[l_i(X_{i,\cdot,t}) = - \log(|W_i|) + \frac{1}{2 \sigma} \|X_{i,\cdot,t} W_i - Y_{avg,\cdot,t}\|^2,\]
- \(W_i\) is the unmixing matrix for view \(i\), \(Y_{avg} = \frac{1}{n} \sum_{i=1}^n X_i W_i\), and \(\sigma\) is the noise level.
- References
- 1
- Hugo Richard, et al. "Modeling shared responses in neuroimaging studies through multiview ICA." In Pre-proceedings of Advances in Neural Information Processing Systems, volume 33, 2020 
 - Examples
>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.decomposition import MultiviewICA
>>> Xs, _ = load_UCImultifeature()
>>> ica = MultiviewICA(n_components=3, max_iter=10)
>>> sources = ica.fit_transform(Xs)
>>> print(sources.shape)
(6, 2000, 3)
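- With multiview_output=False, transform should instead return a single array; a short sketch assuming the documented behavior, continuing the example above:

>>> ica = MultiviewICA(n_components=3, max_iter=10, multiview_output=False)
>>> sources = ica.fit_transform(Xs)  # one array of shape (n_samples, n_components)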
Group ICA¶
- class mvlearn.decomposition.GroupICA(n_components=None, n_individual_components='auto', multiview_output=True, prewhiten=False, solver='picard', ica_kwargs={}, random_state=None)[source]¶
- Group Independent component analysis. - Each dataset in Xs is reduced with usual PCA (this step is optional). Then, datasets are concatenated in the features direction, and a PCA is performed on this matrix, yielding a single dataset. ICA is finally performed, yielding the output dataset S. The unmixing matrix W corresponding to data X is obtained by solving \(\arg\min_{W} \|S - WX\|^2\); a rough sketch of this pipeline follows the parameter list. - Parameters
- n_components (int, optional) -- Number of components to extract. If None, n_components is set to the minimum number of features in the dataset 
- n_individual_components (int or list of int or 'auto', optional) -- The number of individual components to extract as a preprocessing. If None, no preprocessing is applied. If an int, each dataset is reduced to this dimension. If a list, the dataset i is reduced to the dimension n_individual_components[i]. If 'auto', set to the minimum between n_components and the smallest number of features in each dataset. 
- multiview_output (bool, optional (default True)) -- If True, the .transform method returns one dataset per view. Otherwise, it returns one dataset, of shape (n_samples, n_components) 
- prewhiten (bool, optional (default False)) -- Whether the data should be whitened after the original preprocessing. 
- solver (str {'picard', 'fastica'}) -- Chooses which ICA solver to use. picard is generally faster and more reliable. 
- ica_kwargs (dict) -- Optional keyword arguments for the ICA solver. If solver='fastica', see the documentation of sklearn.decomposition.fastica. If solver='picard', see the documentation of picard.picard. 
- random_state (int, RandomState instance, default=None) -- Controls the random number generator used in the estimator. Pass an int for reproducible output across multiple function calls. 
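
- As referenced in the description, a rough sketch of the pipeline in plain numpy and scikit-learn (FastICA stands in here for the default picard solver; all shapes are illustrative):

>>> import numpy as np
>>> from sklearn.decomposition import PCA, FastICA
>>> rng = np.random.RandomState(0)
>>> Xs = [rng.randn(500, 20), rng.randn(500, 30)]  # two toy views
>>> # optional per-view reduction (the n_individual_components step)
>>> Xs_red = [PCA(n_components=5).fit_transform(X) for X in Xs]
>>> X_cat = np.hstack(Xs_red)  # concatenate in the features direction
>>> X_group = PCA(n_components=3).fit_transform(X_cat)  # group-level PCA
>>> S = FastICA(n_components=3, random_state=0).fit_transform(X_group)  # sources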
 
 - means_¶
- The mean of each dataset - Type
- list of arrays of shape (n_components,) 
 
 - grouppca_¶
- A GroupPCA instance used for preprocessing and dimension reduction - Type
- mvlearn.decomposition.GroupPCA instance 
 
 - mixing_¶
- The square mixing matrix, linking the output of the group-pca and the independent components. - Type
- array, shape (n_components, n_components) 
 
 - components_¶
- The square unmixing matrix - Type
- array, shape (n_components, n_components) 
 
 - individual_components_¶
- Individual unmixing matrices estimated by least squares. individual_components_[i] is an array of shape (n_components, n_features) where n_features is the number of features in the dataset i. - Type
- list of array 
 
 - individual_mixing_¶
- Individual mixing matrices estimated by least squares. individual_mixing_[i] is an array of shape (n_features, n_components) where n_features is the number of features in the dataset i. - Type
- list of array 
 
 - n_features_¶
- Number of features in each training dataset. - Type
- list of int 
 
 - See also
- GroupPCA, multiviewica
- References
- 2
- Vince D Calhoun, et al. "A method for making group inferences from functional MRI data using independent component analysis." Human Brain Mapping, 14(3):140–151, 2001. 
 - Examples
>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.decomposition import GroupICA
>>> Xs, _ = load_UCImultifeature()
>>> ica = GroupICA(n_components=3)
>>> Xs_transformed = ica.fit_transform(Xs)
>>> print(len(Xs_transformed))
6
>>> print(Xs_transformed[0].shape)
(2000, 3)
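- Keyword arguments can be forwarded to the chosen solver through ica_kwargs; for instance, max_iter is a parameter of sklearn.decomposition.fastica (a sketch continuing the example above):

>>> ica = GroupICA(n_components=3, solver='fastica', ica_kwargs={'max_iter': 500})
>>> Xs_transformed = ica.fit_transform(Xs)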
Group PCA¶
- class mvlearn.decomposition.GroupPCA(n_components=None, n_individual_components='auto', multiview_output=True, prewhiten=False, whiten=False, random_state=None)[source]¶
- Group Principal Component Analysis. - As an optional preprocessing step, each dataset in Xs is reduced with usual PCA. Then, datasets are concatenated in the features direction, and a PCA is performed on this matrix, yielding a single output dataset. Linear coefficients linking the output dataset and each view are computed using least squares estimation, which makes it possible to return one dataset per view; a short sketch illustrating the whiten option follows the parameter list. - Parameters
- n_components (int, optional) -- Number of components to extract. If None, n_components is set to the minimum number of features in the dataset. 
- n_individual_components (int or list of int or 'auto', optional) -- The number of individual components to extract as a preprocessing. If None, no preprocessing is applied. If an int, each dataset is reduced to this dimension. If a list, the dataset i is reduced to the dimension n_individual_components[i]. If 'auto', set to the minimum between n_components and the smallest number of features in each dataset. 
- multiview_output (bool, optional (default True)) -- If True, the .transform method returns one dataset per view. Otherwise, it returns one dataset, of shape (n_samples, n_components) 
- prewhiten (bool, optional (default False)) -- Whether the data should be whitened after the original preprocessing. 
- whiten (bool, optional (default False)) -- When True (False by default) the components_ vectors are multiplied by the square root of n_samples and then divided by the singular values to ensure uncorrelated outputs with unit component-wise variances. Whitening will remove some information from the transformed signal (the relative variance scales of the components) but can sometimes improve the predictive accuracy of the downstream estimators by making their data respect some hard-wired assumptions. 
- random_state (int, RandomState instance, default=None) -- Used when svd_solver == 'arpack' or 'randomized'. Pass an int for reproducible results across multiple function calls.
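
- A short sketch combining whiten and multiview_output; assuming the documented semantics, the empirical covariance of the single whitened output should be close to the identity:

>>> import numpy as np
>>> from mvlearn.decomposition import GroupPCA
>>> rng = np.random.RandomState(0)
>>> Xs = [rng.randn(500, 20), rng.randn(500, 30)]
>>> pca = GroupPCA(n_components=3, whiten=True, multiview_output=False)
>>> Z = pca.fit_transform(Xs)  # single array of shape (500, 3)
>>> np.allclose(Z.T.dot(Z) / Z.shape[0], np.eye(3))  # expected: True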
 
 - components_¶
- Principal axes in feature space, representing the directions of maximum variance in the data. The components are sorted by explained_variance_. n_total_features is the sum of the numbers of features in all input datasets. - Type
- array, shape (n_components, n_total_features) 
 
 - explained_variance_¶
- The amount of variance explained by each of the selected components. - Type
- array, shape (n_components,) 
 
 - explained_variance_ratio_¶
- Percentage of variance explained by each of the selected components. If n_components is not set then all components are stored and the sum of the ratios is equal to 1.0. - Type
- array, shape (n_components,) 
 
 - individual_components_¶
- Individual components for each individual PCA. individual_components_[i] is an array of shape (n_individual_components, n_features) where n_features is the number of features in the dataset i. - Type
- list of array 
 
 - individual_explained_variance_¶
- Individual explained variance for each individual PCA. individual_explained_variance_[i] is an array of shape (n_individual_components, ). - Type
- list of array 
 
 - individual_explained_variance_ratio_¶
- Individual explained variance ratio for each individual PCA. individual_explained_variance_ratio_[i] is an array of shape (n_individual_components,). - Type
- list of array 
 
 - individual_mean_¶
- Mean of each dataset, estimated on the training data. individual_mean_[i] is an array of shape (n_features,) where n_features is the number of features in the dataset i. - Type
- list of array 
 
 - individual_projections_¶
- List containing the linear transform linking the dataset to the output. Xs[i].dot(individual_projections_[i].T) gives the estimated reduced dataset for view i. This is obtained by least squares estimation. - Type
- list of array 
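
- The least squares estimation mentioned above can be sketched in plain numpy (synthetic shapes; P plays the role of one entry of individual_projections_):

>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X = rng.randn(100, 20)  # one view
>>> S = rng.randn(100, 3)   # group-level reduced output
>>> # solve min_P ||S - X P^T||^2 via least squares
>>> P = np.linalg.lstsq(X, S, rcond=None)[0].T
>>> X_red = X.dot(P.T)      # estimated reduced dataset for this view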
 
 - individual_embeddings_¶
- List containing the pseudo-inverses of individual_projections_. Allows recovering each original dataset from its reduced representation. - Type
- list of array 
 
 - n_features_¶
- Number of features in each training dataset. - Type
- list of int 
 
 - References - 3
- Vince D Calhoun, et al. "A method for making group inferences from functional MRI data using independent component analysis." Human Brain Mapping, 14(3):140–151, 2001. 
 - Examples
>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.decomposition import GroupPCA
>>> Xs, _ = load_UCImultifeature()
>>> pca = GroupPCA(n_components=3)
>>> Xs_transformed = pca.fit_transform(Xs)
>>> print(len(Xs_transformed))
6
>>> print(Xs_transformed[0].shape)
(2000, 3)
Angle-Based Joint and Individual Variation Explained (AJIVE)¶
AJIVE¶
- class mvlearn.decomposition.AJIVE(init_signal_ranks=None, joint_rank=None, individual_ranks=None, n_elbows=2, reconsider_joint_components=True, wedin_percentile=5, n_wedin_samples=1000, randdir_percentile=95, n_randdir_samples=1000, verbose=False, n_jobs=None, random_state=None)[source]¶
- An implementation of Angle-based Joint and Individual Variation Explained 4. This algorithm takes multiple views and decomposes them into 3 distinct matrices representing: - Low rank approximation of joint variation between views 
- Low rank approximation of individual variation within each view 
- Residual noise 
 - AJIVE can handle any number of views, not just two. - Parameters
- init_signal_ranks (list, default = None) -- Initial guesses as to the rank of each view's signal. 
- joint_rank (int, default = None) -- Rank of the joint variation matrix. If None, will estimate the joint rank. 
- individual_ranks (list, default = None) -- Ranks of individual variation matrices. If None, will estimate the individual ranks. Otherwise, will use provided individual ranks. 
- n_elbows (int, optional, default = 2) -- If init_signal_ranks=None, then computes the initial signal rank guess using - mvlearn.embed.utils.select_dimension()with n_elbows for each view.
- reconsider_joint_components (boolean, default = True) -- If True, runs the _reconsider_joint_components function, which removes columns of the joint scores according to the identifiability constraint. 
- wedin_percentile (int, default = 5) -- Percentile used for the Wedin (lower) bound cutoff of the squared singular values used to estimate the joint rank. 
- n_wedin_samples (int, default = 1000) -- Number of Wedin bound samples to draw. 
- randdir_percentile (int, default = 95) -- Percentile for random direction (lower) bound cutoff for squared singular values used to estimate joint rank. 
- n_randdir_samples (int, default = 1000) -- Number of random direction samples to draw. 
- verbose (boolean, default = False) -- Prints information during runtime if True. 
- n_jobs (int (positive) or None, default=None) -- The number of jobs to run in parallel. None will run 1 job, -1 uses all processors. 
- random_state (int, RandomState instance or None, default=None) -- Used to seed a random initialization for reproducible results. If None, a random initialization is used. 
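
- When the ranks are known a priori, the estimation machinery can be bypassed by passing them directly; a minimal sketch with toy data (the rank values are arbitrary):

>>> import numpy as np
>>> from mvlearn.decomposition import AJIVE
>>> rng = np.random.RandomState(0)
>>> Xs = [rng.randn(100, 20), rng.randn(100, 30)]
>>> ajive = AJIVE(init_signal_ranks=[3, 3], joint_rank=1,
...               individual_ranks=[2, 2], random_state=0)
>>> Js = ajive.fit_transform(Xs)  # joint signal estimate, one array per view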
 
 - means_¶
- The means of each view. - Type
 
 - sv_thresholds_¶
- The singular value thresholds for each view based on initial SVD. Used to estimate the individual ranks. - Type
- list of floats 
 
 - all_joint_svals_¶
- All singular values from the concatenated joint matrix. - Type
- list of floats 
 
 - random_sv_samples_¶
- Random singular value samples from random direction bound. - Type
- list of floats 
 
 - wedin_samples_¶
- The Wedin samples for each view. - Type
- list of numpy.ndarray 
 
 - svalsq_cutoff_¶
- max(rand_cutoff_, wedin_cutoff_) - Type
 
 - init_signal_ranks_¶
- Provided or estimated init_signal_ranks - Type
- list of ints 
 
 - joints_scores_¶
- Left singular vectors of the joint matrix - Type
 
 - individual_ranks_¶
- Ranks of the individual matrices for each view. - Type
- list of ints 
 
 - individual_scores_¶
- Left singular vectors of each view after joint matrix is removed - Type
- list of numpy.ndarrays 
 
 - individual_svals_¶
- Singular values of each view after the joint matrix is removed - Type
- list of numpy.ndarrays

 - individual_loadings_¶
- Right singular vectors of each view after joint matrix is removed - Type
- list of numpy.ndarrays 
 
 - individual_mats_¶
- Individual matrices for each view, reconstructed from the SVD - Type
- list of numpy.ndarrays 
 
 - Notes
- Angle-Based Joint and Individual Variation Explained (AJIVE) is a specific variation of the Joint and Individual Variation Explained (JIVE) algorithm. This algorithm takes \(k\) different views with \(n\) observations and \(d\) variables and finds a basis that represents the joint variation and \(k\) bases with their own ranks representing the individual variation of each view. Each of these individual bases is orthonormal to the joint basis. These bases are then used to create the following \(k\) statements:
\[X^{(i)} = J^{(i)} + I^{(i)} + E^{(i)}\]
- where \(X^{(i)}\) represents the i-th view of data and \(J^{(i)}\), \(I^{(i)}\), and \(E^{(i)}\) represent its joint, individual, and noise signal estimates respectively.
- The AJIVE algorithm calculations can be split into three separate steps:
- Signal Space Initial Extraction 
- Score Space Segmentation 
- Final Decomposition and Outputs 
 
 - In the Signal Space Initial Extraction step we compute a rank \(r_{initial}^{(i)}\) singular value decomposition for each \(X^{(i)}\); the value of \(r_{initial}^{(i)}\) can be found by looking at the scree plots of each view or by thresholding the singular values. From each singular value decomposition, the first \(r_{initial}^{(i)}\) columns of the scores matrix (\(U^{(i)}\)) are taken to form \(\widetilde{U}^{(i)}\).
- After this, the Score Space Segmentation step concatenates the \(k\) \(\widetilde{U}^{(i)}\) matrices found in the first step as follows:
\[M = [\widetilde{U}^{(1)}, \dots, \widetilde{U}^{(k)}]\]
- From here, the rank \(r_{joint}\) singular value decomposition of \(M\) is taken. \(r_{joint}\) is either provided or estimated using the Wedin bound, which quantifies how the theoretical singular subspaces are affected by noise, thereby quantifying the distance between the rank of the original input and the estimate. The first \(r_{joint}\) columns of the scores of this singular value decomposition are taken to obtain the basis \(U_{joint}\). The \(J^{(i)}\) matrix (joint signal estimate) can be found by projecting \(X^{(i)}\) onto \(U_{joint}\).
- In the Final Decomposition and Outputs step, we project each \(X^{(i)}\) matrix onto the orthogonal complement of \(U_{joint}\):
\[X^{(i), orthog} = (I - U_{joint}U_{joint}^T)X^{(i)}\]
- \(I\) in the above equation represents the identity matrix. From here, the \(I^{(i)}\) matrix (individual signal estimate) can be found by performing the rank \(r_{individual}^{(i)}\) singular value decomposition of \(X^{(i), orthog}\); \(r_{individual}^{(i)}\) can be found using the aforementioned singular value thresholding method.
- Finally, we can solve for the noise estimate \(E^{(i)}\) using the equation:
\[E^{(i)} = X^{(i)} - (J^{(i)} + I^{(i)})\]
- A simplified numpy sketch of these three steps follows the Examples below. Much of this implementation has been adapted from Iain Carmichael Ph.D.'s pip-installable package, jive.
- References
- 4
- Qing Feng, Meilei Jiang, Jan Hannig, and J.S. Marron. "Angle-based joint and individual variation explained." Journal of Multivariate Analysis, 166:241–265, 2018. 
 - Examples
>>> from mvlearn.decomposition import AJIVE
>>> from mvlearn.datasets import load_UCImultifeature
>>> Xs, _ = load_UCImultifeature()
>>> print(len(Xs))  # number of views
6
>>> print(Xs[0].shape, Xs[1].shape)  # shapes of the first two views
(2000, 76) (2000, 216)
>>> ajive = AJIVE()
>>> Xs_transformed = ajive.fit_transform(Xs)
>>> print(Xs_transformed[0].shape)
(2000, 76)
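- The three steps described in the Notes can also be sketched directly in numpy. This is a simplified illustration that assumes all ranks are known in advance, whereas the estimator selects them via the singular value thresholds and the Wedin and random-direction bounds:

>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> joint = rng.randn(200, 1)  # one direction shared by all views
>>> Xs = [joint.dot(rng.randn(1, 30))                # joint signal
...       + rng.randn(200, 2).dot(rng.randn(2, 30))  # individual signal
...       + 0.1 * rng.randn(200, 30)                 # noise
...       for _ in range(3)]
>>> # Step 1: rank r_initial SVD of each view, keep the scores
>>> U_tilde = [np.linalg.svd(X, full_matrices=False)[0][:, :3] for X in Xs]
>>> # Step 2: concatenate the scores, take the rank r_joint SVD
>>> M = np.hstack(U_tilde)
>>> U_joint = np.linalg.svd(M, full_matrices=False)[0][:, :1]
>>> # Step 3: split each view into joint, individual, and noise parts
>>> P = U_joint.dot(U_joint.T)   # projection onto the joint score space
>>> J = [P.dot(X) for X in Xs]   # joint signal estimates
>>> I_mats = []
>>> for X, Ji in zip(Xs, J):
...     U, s, Vt = np.linalg.svd(X - Ji, full_matrices=False)
...     I_mats.append((U[:, :2] * s[:2]).dot(Vt[:2]))  # rank-2 individual part
>>> E = [X - Ji - Ii for X, Ji, Ii in zip(Xs, J, I_mats)]  # noise estimates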