Firstly, model.frame() creates a data.frame and therefore keeps the column classes as they are (e.g., numeric, logical, ordered or factor), whereas model.matrix() creates a numeric matrix, so every column is converted to numeric. Variables of class logical are converted to a single 0-1 variable (with "TRUE" appended to the variable's name). Variables of class factor are converted to (number of factor levels - 1) 0-1 variables using treatment contrasts by default (each factor level is compared with the first level of the factor). Variables of class ordered are converted to (number of factor levels - 1) numeric columns using polynomial contrasts by default (linear, quadratic, cubic, and so on); unlike the treatment-contrast columns, these are not 0-1 variables.
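As a small sketch of the conversions described above, the snippet below builds a toy data set (the names d, flag, grp and lvl are made up for illustration) and compares the column classes in the model.frame with the expanded column names in the model.matrix. It assumes R's default contrast settings (contr.treatment for factors, contr.poly for ordered factors).

```r
## Hypothetical toy data: one logical, one factor and one ordered factor
d <- data.frame(
  flag = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
  grp  = factor(c("a", "b", "c", "a", "b", "c")),
  lvl  = factor(c(1, 2, 3, 1, 2, 3), ordered = TRUE)
)

mf <- model.frame(~ flag + grp + lvl, data = d)
mm <- model.matrix(~ flag + grp + lvl, data = d)

## model.frame keeps the original classes (first class of each column):
sapply(mf, function(col) class(col)[1])

## model.matrix expands them into numeric columns:
colnames(mm)
## "(Intercept)" "flagTRUE" "grpb" "grpc" "lvl.L" "lvl.Q"

## The polynomial-contrast columns are numeric, but not 0-1:
mm[, c("lvl.L", "lvl.Q")]
```

The suffixes .L and .Q stand for the linear and quadratic polynomial terms; with four levels there would also be a cubic term, .C.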

Secondly, model.frame() keeps the y variable (the left-hand side of the formula) in the result by default, whereas model.matrix() does not. If you want to leave the y variable out of the model.frame, simply leave it out of the formula (e.g., "~ x1 + x2" instead of "y ~ x1 + x2").
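A quick way to see this is to compare the column names that each function returns. The data set below is a made-up example (d, y, x1 and x2 are illustrative names only):

```r
## Hypothetical toy data
d <- data.frame(y = 1:4, x1 = c(2, 4, 6, 8), x2 = c(TRUE, FALSE, TRUE, FALSE))

## Two-sided formula: y is kept in the model.frame
mf_with_y <- model.frame(y ~ x1 + x2, data = d)
names(mf_with_y)      # "y" "x1" "x2"

## One-sided formula: y is left out of the model.frame
mf_without_y <- model.frame(~ x1 + x2, data = d)
names(mf_without_y)   # "x1" "x2"

## model.matrix never contains y, even with a two-sided formula
mm <- model.matrix(y ~ x1 + x2, data = d)
colnames(mm)          # "(Intercept)" "x1" "x2TRUE"
```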

Thirdly, model.matrix() includes a column of ones for the intercept by default. If you do not want this column in your model.matrix, include "-1" on the right-hand side of the formula (e.g., "y ~ -1 + x1 + x2").
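The effect of "-1" can be checked on a made-up data set (d, y, x1, x2 and g are illustrative names only). The last call is a side note rather than something stated above: when the intercept is removed and the formula contains a factor, R's default treatment contrasts give that factor one 0-1 column per level instead of (number of levels - 1).

```r
## Hypothetical toy data
d <- data.frame(y = 1:4, x1 = c(2, 4, 6, 8), x2 = c(1, 0, 1, 0),
                g = factor(c("a", "b", "a", "b")))

## Default: a column of ones named "(Intercept)" comes first
mm_int <- model.matrix(y ~ x1 + x2, data = d)
colnames(mm_int)      # "(Intercept)" "x1" "x2"

## With -1 the intercept column is dropped
mm_noint <- model.matrix(y ~ -1 + x1 + x2, data = d)
colnames(mm_noint)    # "x1" "x2"

## Without an intercept, a factor gets a 0-1 column for every level
mm_fac <- model.matrix(~ -1 + g, data = d)
colnames(mm_fac)      # "ga" "gb"
```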

The following R code also illustrates some of the differences:

```r
## Create example dataset and model formula:
exdata <- data.frame(y = 1:8, x1 = c(1:4, 1:4),
                     x2 = rep(c(TRUE, FALSE), times = 4),
                     x3 = factor(rep(letters[1:2], times = 4)),
                     x4 = factor(rep(1:4, times = 2), ordered = TRUE))
exformula <- y ~ x1 + x2 + x3 + x4

## Create an example model.frame:
exmodframe <- model.frame(exformula, exdata)
## y is retained in the model.frame:
exmodframe
## model.frame keeps the same column classes:
sapply(exmodframe, class)

## Create an example model.matrix:
exmodmat1 <- model.matrix(exformula, exdata)
## y is dropped from model.matrix:
exmodmat1
## all columns are converted to numeric class:
apply(exmodmat1, 2, class)

## It does not make a difference if we create a model.matrix directly from the
## data, instead of from a model.frame of the data:
exmodmat2 <- model.matrix(exformula, exmodframe)
exmodmat2
apply(exmodmat2, 2, class)
## TRUE if every entry of the two matrices is equal:
all(exmodmat1 == exmodmat2)
```
