
Dimensionality Reduction and Feature Extraction

PCA, factor analysis, feature selection, feature extraction, and more

Feature transformation techniques reduce the dimensionality of the data by transforming it into new features. Feature selection techniques are preferable when transformation of the variables is not possible, e.g., when there are categorical variables in the data. For a feature selection technique that is specifically suitable for least-squares fitting, see Stepwise Regression.
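As a quick contrast between the two approaches, the following minimal sketch (assuming the fisheriris sample data set that ships with the toolbox; keeping two components or two predictors is an arbitrary choice for illustration) transforms features with pca and, alternatively, ranks and keeps the original predictors with fscmrmr:

    load fisheriris                     % meas: 150-by-4 measurements, species: class labels
    % Feature transformation: replace the original variables with principal components
    [coeff,score] = pca(meas);
    measReduced = score(:,1:2);         % keep the first two principal components
    % Feature selection: rank the original predictors and keep the top-ranked ones unchanged
    idx = fscmrmr(meas,species);        % MRMR ranking for classification
    measSelected = meas(:,idx(1:2));    % keep the two highest-ranked original predictors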

Live Editor Tasks

Reduce Dimensionality - Reduce dimensionality using Principal Component Analysis (PCA) in Live Editor (Since R2022b)

Functions


fscchi2 - Univariate feature ranking for classification using chi-square tests (Since R2020a)
fscmrmr - Rank features for classification using minimum redundancy maximum relevance (MRMR) algorithm (Since R2019b)
fscnca - Feature selection using neighborhood component analysis for classification
fsrftest - Univariate feature ranking for regression using F-tests (Since R2020a)
fsrmrmr - Rank features for regression using minimum redundancy maximum relevance (MRMR) algorithm (Since R2022a)
fsrnca - Feature selection using neighborhood component analysis for regression
fsulaplacian - Rank features for unsupervised learning using Laplacian scores (Since R2019b)
partialDependence - Compute partial dependence (Since R2020b)
plotPartialDependence - Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots
oobPermutedPredictorImportance - Out-of-bag predictor importance estimates for random forest of classification trees by permutation
oobPermutedPredictorImportance - Out-of-bag predictor importance estimates for random forest of regression trees by permutation
predictorImportance - Estimates of predictor importance for classification tree
predictorImportance - Estimates of predictor importance for classification ensemble of decision trees
predictorImportance - Estimates of predictor importance for regression tree
predictorImportance - Estimates of predictor importance for regression ensemble of decision trees
relieff - Rank importance of predictors using ReliefF or RReliefF algorithm
sequentialfs - Sequential feature selection using custom criterion
stepwiselm - Perform stepwise regression
stepwiseglm - Create generalized linear regression model by stepwise regression
rica - Feature extraction by using reconstruction ICA
sparsefilt - Feature extraction by using sparse filtering
transform - Transform predictors into extracted features
tsne - t-Distributed Stochastic Neighbor Embedding
barttest - Bartlett’s test
canoncorr - Canonical correlation
pca - Principal component analysis of raw data
pcacov - Principal component analysis on covariance matrix
pcares - Residuals from principal component analysis
ppca - Probabilistic principal component analysis
incrementalPCA - Incremental principal component analysis (Since R2024a)
fit - Fit principal component analysis model to streaming data (Since R2024a)
transform - Transform data into principal component scores (Since R2024a)
reset - Reset incremental principal component analysis model (Since R2024a)
factoran - Factor analysis
rotatefactors - Rotate factor loadings
nnmf - Nonnegative matrix factorization
cmdscale - Classical multidimensional scaling
mahal - Mahalanobis distance to reference samples
mdscale - Nonclassical multidimensional scaling
pdist - Pairwise distance between pairs of observations
squareform - Format distance matrix
procrustes - Procrustes analysis

Objects


FeatureSelectionNCAClassification - Feature selection for classification using neighborhood component analysis (NCA)
FeatureSelectionNCARegression - Feature selection for regression using neighborhood component analysis (NCA)
ReconstructionICA - Feature extraction by reconstruction ICA
SparseFiltering - Feature extraction by sparse filtering

Topics

Feature Selection

Feature Extraction

t-SNE Multidimensional Visualization

  • t-SNE
    t-SNE is a method for visualizing high-dimensional data by nonlinear reduction to two or three dimensions, while preserving some features of the original data. A minimal tsne call is sketched after this list.
  • Visualize High-Dimensional Data Using t-SNE
    This example shows how t-SNE creates a useful low-dimensional embedding of high-dimensional data.
  • tsne Settings
    This example shows the effects of various tsne settings.
  • t-SNE Output Function
    Output function description and example for t-SNE.
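
The following sketch shows the basic tsne call behind these topics. It assumes the fisheriris sample data set that ships with the toolbox, and the perplexity value is just a common default choice, not a recommendation:

    load fisheriris                      % meas: 150-by-4 measurements, species: class labels
    rng default                          % for reproducibility of the embedding
    Y = tsne(meas,'Perplexity',30);      % embed the 4-D measurements in 2-D
    gscatter(Y(:,1),Y(:,2),species)      % color the embedding by species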

PCA and Canonical Correlation

Factor Analysis

  • Factor Analysis
    Factor analysis is a way to fit a model to multivariate data to estimate the interdependence of measured variables on a smaller number of unobserved (latent) factors. A minimal factoran call is sketched after this list.
  • Analyze Stock Prices Using Factor Analysis
    Use factor analysis to investigate whether companies within the same sector experience similar week-to-week changes in stock prices.
  • Perform Factor Analysis on Exam Grades
    This example shows how to perform factor analysis using Statistics and Machine Learning Toolbox™.
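
As a minimal sketch of the factoran interface (this assumes the examgrades sample data set shipped with the toolbox, and the choice of two common factors is arbitrary for illustration):

    load examgrades                              % grades: 120 students by 5 exams
    [loadings,specificVar] = factoran(grades,2); % estimate loadings on two latent factors
    disp(loadings)                               % how each exam loads on the two factors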

Nonnegative Matrix Factorization

Multidimensional Scaling

Procrustes Analysis