Main Content

random

Generate random responses from fitted linear mixed-effects model

Description

example

ysim = random(lme) returns a vector of simulated responses ysim from the fitted linear mixed-effects model lme at the original fixed- and random-effects design points, used to fit lme.

random simulates new random-effects vector and new observation errors. So, the simulated response is

ysim=Xβ^+Zb^+ε,

where β^ is the estimated fixed-effects coefficients, b^ is the new random effects, and ε is the new observation error.

random also accounts for the effect of observation weights, if you use any when fitting the model.

example

ysim = random(lme,tblnew) returns a vector of simulated responses ysim from the fitted linear mixed-effects model lme at the values in the new table or dataset array tblnew. Use a table or dataset array for random if you use a table or dataset array for fitting the model lme.

ysim = random(lme,Xnew,Znew) returns a vector of simulated responses ysim from the fitted linear mixed-effects model lme at the values in the new fixed- and random-effects design matrices, Xnew and Znew, respectively. Znew can also be a cell array of matrices. Use the matrix format for random if you use design matrices for fitting the model lme.

example

ysim = random(lme,Xnew,Znew,Gnew) returns a vector of simulated responses ysim from the fitted linear mixed-effects model lme at the values in the new fixed- and random-effects design matrices, Xnew and Znew, respectively, and the grouping variable Gnew.

Znew and Gnew can also be cell arrays of matrices and grouping variables, respectively.

Examples

collapse all

Load the sample data.

load('fertilizer.mat');

The dataset array includes data from a split-plot experiment, where soil is divided into three blocks based on the soil type: sandy, silty, and loamy. Each block is divided into five plots, where five different types of tomato plants (cherry, heirloom, grape, vine, and plum) are randomly assigned to these plots. The tomato plants in the plots are then divided into subplots, where each subplot is treated by one of four fertilizers. This is simulated data.

Store the data in a dataset array called ds, for practical purposes, and define Tomato, Soil, and Fertilizer as categorical variables.

ds = fertilizer;
ds.Tomato = nominal(ds.Tomato);
ds.Soil = nominal(ds.Soil);
ds.Fertilizer = nominal(ds.Fertilizer);

Fit a linear mixed-effects model, where Fertilizer and Tomato are the fixed-effects variables, and the mean yield varies by the block (soil type), and the plots within blocks (tomato types within soil types) independently.

lme = fitlme(ds,'Yield ~ Fertilizer * Tomato + (1|Soil) + (1|Soil:Tomato)');

Generate random response values at the original design points. Display the first five values.

rng(123,'twister') % For reproducibility
ysim = random(lme);
ysim(1:5)
ans = 5×1

  114.8785
  134.2018
  154.2818
  169.7554
   84.6089

Load the sample data.

load carsmall

Fit a linear mixed-effects model, with a fixed-effects for Weight, and a random intercept grouped by Model_Year. First, store the data in a table.

tbl = table(MPG,Weight,Model_Year);
lme = fitlme(tbl,'MPG ~ Weight + (1|Model_Year)');

Randomly generate responses using the original data.

rng(123,'twister') % For reproducibility
ysim = random(lme,tbl);

Plot the original and the randomly generated responses to see how they differ. Group them by model year.

figure()
gscatter(Weight,MPG,Model_Year)
hold on
gscatter(Weight,ysim,Model_Year,[],'o+x')
legend('70-data','76-data','82-data','70-sim','76-sim','82-sim')
hold off

Note that the simulated random response values for year 82 are lower than the original data for that year. This might be due to a lower simulated random effect for year 82 than the estimated random effect in the original data.

Load the sample data.

load('fertilizer.mat');

The dataset array includes data from a split-plot experiment, where soil is divided into three blocks based on the soil type: sandy, silty, and loamy. Each block is divided into five plots, where five different types of tomato plants (cherry, heirloom, grape, vine, and plum) are randomly assigned to these plots. The tomato plants in the plots are then divided into subplots, where each subplot is treated by one of four fertilizers. This is simulated data.

Store the data in a dataset array called ds, for practical purposes, and define Tomato, Soil, and Fertilizer as categorical variables.

ds = fertilizer;
ds.Tomato = nominal(ds.Tomato);
ds.Soil = nominal(ds.Soil);
ds.Fertilizer = nominal(ds.Fertilizer);

Fit a linear mixed-effects model, where Fertilizer and Tomato are the fixed-effects variables, and the mean yield varies by the block (soil type), and the plots within blocks (tomato types within soil types) independently.

lme = fitlme(ds,'Yield ~ Fertilizer * Tomato + (1|Soil) + (1|Soil:Tomato)');

Create a new dataset array with design values. The new dataset array must have the same variables as the original dataset array you use for fitting the model lme.

dsnew = dataset();
dsnew.Soil = nominal({'Sandy';'Silty';'Silty'});
dsnew.Tomato = nominal({'Cherry';'Vine';'Plum'});
dsnew.Fertilizer = nominal([2;2;4]);

Generate random responses at the new points.

rng(123,'twister') % For reproducibility
ysim = random(lme,dsnew)
ysim = 3×1

   99.6006
  101.9911
  161.4026

Load the sample data.

load carbig

Fit a linear mixed-effects model for miles per gallon (MPG), with fixed effects for acceleration, horsepower, and cylinders, and potentially correlated random effect for intercept and acceleration grouped by model year.

First, prepare the design matrices for fitting the linear mixed-effects model.

X = [ones(406,1) Acceleration Horsepower];
Z = [ones(406,1) Acceleration];
Model_Year = nominal(Model_Year);
G = Model_Year;

Now, fit the model using fitlmematrix with the defined design matrices and grouping variables.

lme = fitlmematrix(X,MPG,Z,G,'FixedEffectPredictors',....
{'Intercept','Acceleration','Horsepower'},'RandomEffectPredictors',...
{{'Intercept','Acceleration'}},'RandomEffectGroups',{'Model_Year'});

Create the design matrices that contain the data at which to predict the response values. Xnew must have three columns as in X. The first column must be a column of 1s. And the values in the last two columns must correspond to Acceleration and Horsepower, respectively. The first column of Znew must be a column of 1s, and the second column must contain the same Acceleration values as in Xnew. The original grouping variable in G is the model year. So, Gnew must contain values for the model year. Note that Gnew must contain nominal values.

Xnew = [1,13.5,185; 1,17,205; 1,21.2,193];
Znew = [1,13.5; 1,17; 1,21.2];
Gnew = nominal([73 77 82]);

Generate random responses for the data in the new design matrices.

rng(123,'twister') % For reproducibility
ysim = random(lme,Xnew,Znew,Gnew)
ysim = 3×1

   15.7416
   10.6085
    6.8796

Now, repeat the same for a linear mixed-effects model with uncorrelated random-effects terms for intercept and acceleration. First, change the original random effects design and the random effects grouping variables. Then, fit the model.

Z = {ones(406,1),Acceleration};
G = {Model_Year,Model_Year};

lme = fitlmematrix(X,MPG,Z,G,'FixedEffectPredictors',....
{'Intercept','Acceleration','Horsepower'},'RandomEffectPredictors',...
{{'Intercept'},{'Acceleration'}},'RandomEffectGroups',{'Model_Year','Model_Year'});

Now, recreate the new random effects design, Znew, and the grouping variable design, Gnew, using which to predict the response values.

Znew = {[1;1;1],[13.5;17;21.2]};
MY = nominal([73 77 82]);
Gnew = {MY,MY};

Generate random responses using the new design matrices.

rng(123,'twister') % For reproducibility
ysim = random(lme,Xnew,Znew,Gnew)
ysim = 3×1

   16.8280
   10.4375
    4.1027

Input Arguments

collapse all

Linear mixed-effects model, specified as a LinearMixedModel object constructed using fitlme or fitlmematrix.

New input data, which includes the response variable, predictor variables, and grouping variables, specified as a table or dataset array. The predictor variables can be continuous or grouping variables. tblnew must have the same variables as in the original table or dataset array used to fit the linear mixed-effects model lme.

New fixed-effects design matrix, specified as an n-by-p matrix, where n is the number of observations and p is the number of fixed predictor variables. Each row of X corresponds to one observation and each column of X corresponds to one variable.

Data Types: single | double

New random-effects design, specified as an n-by-q matrix or a cell array of R design matrices Z{r}, where r = 1, 2, ..., R. If Znew is a cell array, then each Z{r} is an n-by-q(r) matrix, where n is the number of observations, and q(r) is the number of random predictor variables.

Data Types: single | double | cell

New grouping variable or variables, specified as a vector or a cell array, of length R, of grouping variables used to fit the linear mixed-effects model, lme.

random treats all levels of each grouping variable as new levels. It draws an independent random effects vector for each level of each grouping variable.

Data Types: single | double | categorical | logical | char | string | cell

Output Arguments

collapse all

Simulated response values, returned as an n-by-1 vector, where n is the number of observations.

Version History

Introduced in R2013b