ClassificationDiscriminant

Discriminant analysis classification

Description

A ClassificationDiscriminant object encapsulates a discriminant analysis classifier, which is a Gaussian mixture model for data generation. Use the predict method to predict responses for new data. Because the object stores the data used for training, it can also compute resubstitution predictions.

Creation

Create a ClassificationDiscriminant object by using fitcdiscr.

Properties

BetweenSigma

This property is read-only.

Between-class covariance, specified as a p-by-p matrix, where p is the number of predictors.

Data Types: double

CategoricalPredictors

This property is read-only.

Categorical predictor indices. This property is always empty ([]).

ClassNames

This property is read-only.

Class names in the training data Y, with duplicates removed. ClassNames has the same data type as the response argument Y used for training. ClassNames can have the following data types:

  • Categorical array

  • Cell array of character vectors

  • Character array

  • Logical vector

  • Numeric vector

(The software treats string arrays as cell arrays of character vectors.)

Data Types: single | double | logical | char | string | cell | categorical

Coeffs

This property is read-only.

Coefficient matrices, specified as a k-by-k structure, where k is the number of classes. If the FillCoeffs name-value pair of fitcdiscr was set to 'off' when the classifier was constructed, Coeffs is empty ([]).

Coeffs(i,j) contains coefficients of the linear or quadratic boundaries between classes i and j. Fields in Coeffs(i,j):

  • DiscrimType

  • Class1 — ClassNames(i)

  • Class2 — ClassNames(j)

  • Const — A scalar

  • Linear — A vector with p components, where p is the number of columns in X

  • Quadratic — A p-by-p matrix; exists only for quadratic DiscrimType

The equation of the boundary between class i and class j is

Const + Linear * x + x' * Quadratic * x = 0,

where x is a column vector of length p.

Data Types: struct
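As a minimal sketch of the boundary equation above, the following code evaluates the left side of the equation for classes 2 and 3 at one training observation. It assumes the Fisher iris data from the Examples section; the variable names K, L, x, and f are illustrative only. The code uses dot so that it does not depend on whether Linear is stored as a row or a column vector.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);      % linear discriminant by default

K = Mdl.Coeffs(2,3).Const;          % scalar constant term
L = Mdl.Coeffs(2,3).Linear;         % vector with p = 4 components
x = meas(100,:)';                   % one observation as a column vector

% Left side of Const + Linear * x = 0. A linear model has no
% Quadratic term, so that term is omitted here.
f = K + dot(L,x);
```

The sign of f indicates on which side of the boundary between the two classes the point x lies.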

Cost

Cost of classifying a point, specified as a square matrix. Cost(i,j) is the cost of classifying a point into class j if its true class is i (the rows correspond to the true class and the columns correspond to the predicted class). The order of the rows and columns of Cost corresponds to the order of the classes in ClassNames. The number of rows and columns in Cost is the number of unique classes in the response.

Change a Cost matrix using dot notation: obj.Cost = costMatrix.

Data Types: double
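The following sketch shows one way to assign a cost matrix with dot notation. It assumes the three-class Fisher iris model from the Examples section; the cost values and the variable name costMatrix are hypothetical.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);

% Rows are true classes and columns are predicted classes, both in the
% order of Mdl.ClassNames (setosa, versicolor, virginica).
% Hypothetical costs: predicting virginica for a true versicolor
% (row 2, column 3) costs 5; every other misclassification costs 1.
costMatrix = [0 1 1
              1 0 5
              1 1 0];
Mdl.Cost = costMatrix;
```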

Delta

Value of the Delta threshold for a linear discriminant model, specified as a nonnegative scalar. If a coefficient of obj has magnitude smaller than Delta, obj sets this coefficient to 0, so you can eliminate the corresponding predictor from the model. Set Delta to a higher value to eliminate more predictors.

Delta must be 0 for quadratic discriminant models.

Change Delta using dot notation: obj.Delta = newDelta.

Data Types: double

DeltaPredictor

This property is read-only.

Minimum value of the Delta coefficient for each predictor to be in the model, specified as a row vector of length p, where p is the number of predictors in obj. If DeltaPredictor(i) < Delta, then coefficient i of the model is 0.

If obj is a quadratic discriminant model, all elements of DeltaPredictor are 0.

Data Types: double
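The relationship between Delta and DeltaPredictor can be explored with a short sketch like the one below. It assumes the Fisher iris model from the Examples section and uses the nLinearCoeffs object function; the threshold choice (the median of DeltaPredictor) and the variable names are hypothetical.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);       % linear discriminant, Delta is 0

nBefore = nLinearCoeffs(Mdl);        % nonzero linear coefficients before thresholding

% Hypothetical threshold: the median of DeltaPredictor, chosen so that
% some predictors are likely to fall below it.
newDelta = median(Mdl.DeltaPredictor);
Mdl.Delta = newDelta;

nAfter  = nLinearCoeffs(Mdl);                % count after thresholding
dropped = Mdl.DeltaPredictor < Mdl.Delta;    % predictors whose coefficients are 0
```

Comparing nBefore with nAfter shows how many coefficients the threshold removed, and dropped marks the affected predictors.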

DiscrimType

Discriminant type, specified as a character vector or string. Available values:

  • 'linear'

  • 'quadratic'

  • 'diagLinear'

  • 'diagQuadratic'

  • 'pseudoLinear'

  • 'pseudoQuadratic'

Change DiscrimType using dot notation: obj.DiscrimType = newDiscrimType. You can change between linear types, or between quadratic types, but cannot change between linear and quadratic types.

Data Types: char | string

Gamma

Value of the Gamma regularization parameter, specified as a scalar from 0 through 1. Change Gamma using dot notation: obj.Gamma = newGamma.

  • If you set Gamma to 1 for a linear discriminant, the discriminant sets its type to 'diagLinear'.

  • If you set Gamma to a value between MinGamma and 1 for a linear discriminant, the discriminant sets its type to 'linear'.

  • You cannot set Gamma below the value of the MinGamma property.

  • For a quadratic discriminant, you can set Gamma to either 0 (for DiscrimType 'quadratic') or 1 (for DiscrimType 'diagQuadratic').

Data Types: double
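The rules above for a linear discriminant can be sketched as follows. The example assumes the Fisher iris model from the Examples section, for which MinGamma is expected to be 0 so that 0.5 is an allowed value; the variable names typeAtOne and typeAtHalf are illustrative.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);       % DiscrimType is 'linear'

Mdl.Gamma = 1;                       % full regularization
typeAtOne = Mdl.DiscrimType;         % 'diagLinear' according to the first rule

Mdl.Gamma = 0.5;                     % a value between MinGamma and 1
typeAtHalf = Mdl.DiscrimType;        % 'linear' according to the second rule
```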

HyperparameterOptimizationResults

This property is read-only.

Description of the cross-validation optimization of hyperparameters, returned as a BayesianOptimization object or a table of hyperparameters and associated values. This property is nonempty when the OptimizeHyperparameters name-value pair is nonempty at creation. Its value depends on the Optimizer setting of the HyperparameterOptimizationOptions name-value pair at creation:

  • 'bayesopt' (default) — Object of class BayesianOptimization

  • 'gridsearch' or 'randomsearch' — Table of hyperparameters used, observed objective function values (cross-validation loss), and rank of observations from lowest (best) to highest (worst)

LogDetSigma

This property is read-only.

Logarithm of the determinant of the within-class covariance matrix, returned as a scalar or vector. The type of LogDetSigma depends on the discriminant type:

  • Scalar for linear discriminant analysis

  • Vector of length K for quadratic discriminant analysis, where K is the number of classes

Data Types: double

MinGamma

This property is read-only.

Minimal value of the Gamma parameter so that the correlation matrix is invertible, specified as a nonnegative scalar. If the correlation matrix is not singular, MinGamma is 0.

Data Types: double

ModelParameters

This property is read-only.

Parameters used in training the model, returned as a DiscriminantParams object. The returned parameters have the following properties.

Property      Value
DiscrimType   'linear', 'quadratic', 'diagLinear', 'diagQuadratic', 'pseudoLinear', or 'pseudoQuadratic'
Gamma         Scalar from 0 through 1
Delta         Nonnegative scalar
FillCoeffs    Logical scalar
SaveMemory    Logical scalar
Version       Scalar
Method        'Discriminant'
Type          'classification'

Mu

This property is read-only.

Class means, specified as a K-by-p matrix of real values, where K is the number of classes and p is the number of predictors. Each row of Mu represents the mean of the multivariate normal distribution of the corresponding class. The rows of Mu follow the order of the classes in the ClassNames property.

Data Types: double

NumObservations

This property is read-only.

Number of observations in the training data, returned as a positive integer. NumObservations can be less than the number of rows of input data when there are missing values in the input data or response data.

Data Types: double

PredictorNames

This property is read-only.

Names of predictor variables, returned as a cell array. The names are in the order in which they appear in the training data X.

Data Types: cell

Prior

Prior probabilities for each class, returned as a numeric vector. The order of the elements of Prior corresponds to the order of the classes in ClassNames.

Add or change a Prior vector using dot notation: obj.Prior = priorVector.

Data Types: double
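A short sketch of changing the priors with dot notation follows. It assumes the Fisher iris model from the Examples section; the prior values and the variable name priorBefore are hypothetical.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);

priorBefore = Mdl.Prior;             % priors stored at training time

% Hypothetical priors, ordered as in Mdl.ClassNames
% (setosa, versicolor, virginica).
Mdl.Prior = [0.5 0.25 0.25];
```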

ResponseName

This property is read-only.

Name of the response variable Y, returned as a character vector.

Data Types: char | string

RowsUsed

This property is read-only.

Rows of the original predictor data X used for fitting, returned as an n-element logical vector, where n is the number of rows of X. If the software uses all rows of X for constructing the object, then RowsUsed is an empty array ([]).

Data Types: logical

ScoreTransform

Score transformation function, specified as a character vector or string representing a built-in transformation function, or as a function handle for transforming scores. 'none' means no transformation; equivalently, 'none' means @(x)x. For a list of built-in transformation functions and the syntax of custom transformation functions, see fitcdiscr.

Use dot notation to add or change a ScoreTransform function in one of the following ways:

  • obj.ScoreTransform = 'function'

  • obj.ScoreTransform = @function

Data Types: char | string | function_handle
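Both forms of assignment can be sketched as follows, assuming the Fisher iris model from the Examples section. 'logit' is one of the built-in transformation names; the anonymous function and the variable names scoreLogit and scoreCustom are hypothetical.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);

Mdl.ScoreTransform = 'logit';              % built-in transformation, set by name
[~,scoreLogit] = predict(Mdl,meas(1,:));   % scores after the logit transformation

Mdl.ScoreTransform = @(x) 2*x;             % hypothetical custom transformation
[~,scoreCustom] = predict(Mdl,meas(1,:));  % scores after the custom transformation

Mdl.ScoreTransform = 'none';               % restore untransformed scores
```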

Sigma

This property is read-only.

Within-class covariance, returned as a numeric array. The dimensions depend on DiscrimType:

  • 'linear' (default) — Matrix of size p-by-p, where p is the number of predictors

  • 'quadratic' — Array of size p-by-p-by-K, where K is the number of classes

  • 'diagLinear' — Row vector of length p

  • 'diagQuadratic' — Array of size 1-by-p-by-K

  • 'pseudoLinear' — Matrix of size p-by-p

  • 'pseudoQuadratic' — Array of size p-by-p-by-K

Data Types: double
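The dimensions listed above can be checked with a small sketch. It assumes the Fisher iris data from the Examples section, which has p = 4 predictors and K = 3 classes; the variable names are illustrative.

```matlab
load fisheriris
MdlLinear = fitcdiscr(meas,species);                            % 'linear' (default)
MdlQuad   = fitcdiscr(meas,species,'DiscrimType','quadratic');  % 'quadratic'

sizeLinear = size(MdlLinear.Sigma);   % p-by-p, here 4-by-4
sizeQuad   = size(MdlQuad.Sigma);     % p-by-p-by-K, here 4-by-4-by-3
```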

W

This property is read-only.

Scaled observation weights, returned as a numeric vector of length n, where n is the number of rows in X.

Data Types: double

X

This property is read-only.

Predictor values, returned as a real matrix. Each column of X represents one predictor (variable), and each row represents one observation.

Data Types: single | double

Xcentered

This property is read-only.

X data with class means subtracted, returned as a real matrix. If Y(i) is of class j,

Xcentered(i,:) = X(i,:) - Mu(j,:),

where Mu is the class mean property.

Data Types: single | double
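As a rough check of this definition, the sketch below recomputes the centered data from X, Y, Mu, and ClassNames and compares it with the stored property. It assumes the Fisher iris model from the Examples section, trained with default (equal) observation weights; the variable names classIdx, Xc, and maxDiff are illustrative.

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);

% Map each response in Y to its row index in ClassNames (and hence in Mu).
[~,classIdx] = ismember(Mdl.Y,Mdl.ClassNames);

% Subtract the mean of each observation's class from that observation.
Xc = Mdl.X - Mdl.Mu(classIdx,:);

% Largest absolute difference from the stored property; this is expected
% to be at the level of floating-point round-off.
maxDiff = max(abs(Xc(:) - Mdl.Xcentered(:)));
```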

Y

This property is read-only.

Row classifications, returned as a categorical array, cell array of character vectors, character array, logical vector, or numeric vector with the same number of rows as X. Each row of Y represents the classification of the corresponding row of X.

Data Types: single | double | logical | char | string | cell | categorical

Object Functions

compact                Reduce size of discriminant analysis classifier
compareHoldout         Compare accuracies of two classification models using new data
crossval               Cross-validate machine learning model
cvshrink               Cross-validate regularization of linear discriminant
edge                   Classification edge for discriminant analysis classifier
lime                   Local interpretable model-agnostic explanations (LIME)
logp                   Log unconditional probability density for discriminant analysis classifier
loss                   Classification loss for discriminant analysis classifier
mahal                  Mahalanobis distance to class means of discriminant analysis classifier
margin                 Classification margins for discriminant analysis classifier
nLinearCoeffs          Number of nonzero linear coefficients in discriminant analysis classifier
partialDependence      Compute partial dependence
plotPartialDependence  Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots
predict                Predict labels using discriminant analysis classifier
resubEdge              Resubstitution classification edge for discriminant analysis classifier
resubLoss              Resubstitution classification loss for discriminant analysis classifier
resubMargin            Resubstitution classification margins for discriminant analysis classifier
resubPredict           Classify observations in discriminant analysis classifier by resubstitution
shapley                Shapley values
testckfold             Compare accuracies of two classification models by repeated cross-validation

Examples

Load Fisher's iris data set.

load fisheriris

Train a discriminant analysis model using the entire data set.

Mdl = fitcdiscr(meas,species)
Mdl = 
  ClassificationDiscriminant
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'setosa'  'versicolor'  'virginica'}
           ScoreTransform: 'none'
          NumObservations: 150
              DiscrimType: 'linear'
                       Mu: [3x4 double]
                   Coeffs: [3x3 struct]


Mdl is a ClassificationDiscriminant model. To access its properties, use dot notation. For example, display the group means for each predictor.

Mdl.Mu
ans = 3×4

    5.0060    3.4280    1.4620    0.2460
    5.9360    2.7700    4.2600    1.3260
    6.5880    2.9740    5.5520    2.0260

To predict labels for new observations, pass Mdl and predictor data to predict.
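For instance, the following sketch predicts labels for two new observations. The measurements in Xnew are hypothetical values in the same column order as meas (sepal length, sepal width, petal length, petal width).

```matlab
load fisheriris
Mdl = fitcdiscr(meas,species);

% Hypothetical new flowers, one per row.
Xnew = [5.0 3.4 1.5 0.2
        6.5 3.0 5.5 2.0];

% label holds one predicted class name per row of Xnew; score holds one
% column per class, in the order of Mdl.ClassNames.
[label,score] = predict(Mdl,Xnew);
```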

Version History

Introduced in R2011b
