Multivariate Linear Regression in MATLAB
Large, high-dimensional data sets are common in the modern age of computer-based
instrumentation and electronic data storage. High-dimensional data present many challenges
for statistical visualization, analysis, and modeling. Direct data visualization, of course, is
impossible beyond a few dimensions. As a result, pattern recognition, data preprocessing, and
model selection must rely heavily on numerical methods.
A fundamental challenge in high-dimensional data analysis is the so-called curse of dimensionality.
Observations in a high-dimensional space are necessarily sparser and less representative than those in
a low-dimensional space. In higher dimensions, data over-represent the edges of a sampling
distribution, because regions of higher-dimensional space contain the majority of their volume
near the surface.
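The claim about volume near the surface can be checked with a small calculation. As an illustrative sketch (the unit hypercube, the shell width epsilon, and the variable names are assumptions chosen for this example, not part of the original text), the fraction of a d-dimensional unit hypercube lying within epsilon of its boundary is 1 - (1 - 2*epsilon)^d:

```matlab
% Fraction of the unit hypercube [0,1]^d that lies within epsilon of the boundary.
% The inner cube of side (1 - 2*epsilon) has volume (1 - 2*epsilon)^d,
% so the outer shell holds the remaining volume.
epsilon = 0.05;                      % assumed shell width, for illustration only
d = [1 2 5 10 50 100];               % dimensions to compare
shellFraction = 1 - (1 - 2*epsilon).^d;

% Print one line per dimension.
for k = 1:numel(d)
    fprintf('d = %3d: %.4f of the volume is in the outer shell\n', ...
        d(k), shellFraction(k));
end
```

With this shell width, the outer shell holds 10% of the volume in one dimension but roughly 65% in ten dimensions, which is why uniformly scattered samples in high dimensions tend to fall near the edges.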
Often, many of the dimensions in a data set (the measured features) are not
useful in producing a model. Features may be irrelevant or redundant. Regression and
classification algorithms may require large amounts of storage and computation time to process raw
data, and even if the algorithms are successful, the resulting models may contain an
incomprehensibly large number of terms.
Because of these challenges, multivariate statistical methods often begin with
some type of dimension reduction, in which data are approximated by points in a lower-dimensional
space. Dimension reduction is the goal of the methods presented in this section. Dimension
reduction often leads to simpler models and fewer measured variables, with consequent benefits
when measurements are expensive and visualization is important. For multivariate regression
itself, MATLAB provides the mvregress function, which can be called with the following syntaxes:
beta = mvregress(X,Y)
beta = mvregress(X,Y,Name,Value)
[beta,Sigma] = mvregress(___)
[beta,Sigma,E,CovB,logL] = mvregress(___)
beta = mvregress(X,Y) returns the estimated coefficients for a multivariate normal
regression of the d-dimensional responses in Y on the design matrices in X.
beta = mvregress(X,Y,Name,Value) returns the estimated coefficients using
additional options specified by one or more name-value pair arguments.
[beta,Sigma] = mvregress(___) also returns the estimated d-by-d
variance-covariance matrix of Y, using any of the input arguments from the previous syntaxes.
[beta,Sigma,E,CovB,logL] = mvregress(___) also returns a matrix of residuals E, the
estimated variance-covariance matrix of the regression coefficients CovB, and the value of the
log likelihood objective function after the last iteration, logL.
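The syntaxes above can be tried on a small synthetic problem. The following is a minimal sketch, not a definitive workflow: the data, the true coefficient matrix Btrue, the noise level, and the variable names are all assumptions made up for illustration. It assumes the Statistics and Machine Learning Toolbox is installed. Note that mvregress does not add an intercept automatically, so the design matrix includes a column of ones.

```matlab
% Synthetic data (assumed for illustration): n observations, d = 2 responses,
% one predictor plus an intercept column, so p = 2 coefficients per response.
rng(0);                               % fix the seed for reproducibility
n = 200;
d = 2;
x = randn(n,1);
X = [ones(n,1), x];                   % n-by-p design matrix with intercept column
Btrue = [1 -2; 0.5 3];                % hypothetical true p-by-d coefficients
Y = X*Btrue + 0.1*randn(n,d);         % n-by-d response matrix with small noise

% Full output syntax: coefficients, response covariance, residuals,
% coefficient covariance, and final log likelihood.
[beta,Sigma,E,CovB,logL] = mvregress(X,Y);

% Name-value syntax: request conditionally weighted least squares
% instead of the default maximum likelihood estimation.
betaCwls = mvregress(X,Y,'algorithm','cwls');

disp(beta)                            % p-by-d estimates, close to Btrue
disp(Sigma)                           % d-by-d covariance estimate
```

Because the same n-by-p design matrix is used for every response dimension, beta comes back as a p-by-d matrix whose columns hold the coefficients for each response, while CovB covers all p*d coefficients jointly.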