| % File src/library/stats/man/step.Rd |
| % Part of the R package, https://www.R-project.org |
| % Copyright 1995-2014 R Core Team |
| % Distributed under GPL 2 or later |
| |
| \name{step} |
| \alias{step} |
| \title{ |
| Choose a model by AIC in a Stepwise Algorithm |
| } |
| \description{ |
| Select a formula-based model by AIC. |
| } |
| \usage{ |
| step(object, scope, scale = 0, |
| direction = c("both", "backward", "forward"), |
| trace = 1, keep = NULL, steps = 1000, k = 2, \dots) |
| } |
| \arguments{ |
| \item{object}{ |
| an object representing a model of an appropriate class (mainly |
| \code{"lm"} and \code{"glm"}). |
| This is used as the initial model in the stepwise search. |
| } |
| \item{scope}{ |
| defines the range of models examined in the stepwise search. |
| This should be either a single formula, or a list containing |
| components \code{upper} and \code{lower}, both formulae. See the |
| details for how to specify the formulae and how they are used. |
| } |
| \item{scale}{ |
| used in the definition of the AIC statistic for selecting the models, |
| currently only for \code{\link{lm}}, \code{\link{aov}} and |
| \code{\link{glm}} models. The default value, \code{0}, indicates |
| the scale should be estimated: see \code{\link{extractAIC}}. |
| } |
| \item{direction}{ |
| the mode of stepwise search, can be one of \code{"both"}, |
| \code{"backward"}, or \code{"forward"}, with a default of \code{"both"}. |
| If the \code{scope} argument is missing the default for |
| \code{direction} is \code{"backward"}. Values can be abbreviated. |
| } |
| \item{trace}{ |
| if positive, information is printed during the running of \code{step}. |
| Larger values may give more detailed information. |
| } |
| \item{keep}{ |
| a filter function whose input is a fitted model object and the |
| associated \code{AIC} statistic, and whose output is arbitrary. |
| Typically \code{keep} will select a subset of the components of |
| the object and return them. The default is not to keep anything. |
| } |
| \item{steps}{ |
| the maximum number of steps to be considered. The default is 1000 |
| (essentially as many as required). It is typically used to stop the |
| process early. |
| } |
| \item{k}{ |
| the multiple of the number of degrees of freedom used for the penalty. |
| Only \code{k = 2} gives the genuine AIC: \code{k = log(n)} is sometimes |
| referred to as BIC or SBC. |
| } |
| \item{\dots}{ |
| any additional arguments to \code{\link{extractAIC}}. |
| } |
| } |
| \value{ |
| the stepwise-selected model is returned, with up to two additional |
| components. There is an \code{"anova"} component corresponding to the |
| steps taken in the search, as well as a \code{"keep"} component if the |
| \code{keep=} argument was supplied in the call. The |
| \code{"Resid. Dev"} column of the analysis of deviance table refers |
| to a constant minus twice the maximized log likelihood: it will be a |
| deviance only in cases where a saturated model is well-defined |
| (thus excluding \code{lm}, \code{aov} and \code{survreg} fits, |
| for example). |
| } |
| \details{ |
| \code{step} uses \code{\link{add1}} and \code{\link{drop1}} |
| repeatedly; it will work for any method for which they work, and that |
| is determined by having a valid method for \code{\link{extractAIC}}. |
| When the additive constant can be chosen so that AIC is equal to |
| Mallows' \eqn{C_p}{Cp}, this is done and the tables are labelled |
| appropriately. |
| |
| The set of models searched is determined by the \code{scope} argument. |
| The right-hand-side of its \code{lower} component is always included |
| in the model, and right-hand-side of the model is included in the |
| \code{upper} component. If \code{scope} is a single formula, it |
| specifies the \code{upper} component, and the \code{lower} model is |
| empty. If \code{scope} is missing, the initial model is used as the |
| \code{upper} model. |
| |
| Models specified by \code{scope} can be templates to update |
| \code{object} as used by \code{\link{update.formula}}. So using |
| \code{.} in a \code{scope} formula means \sQuote{what is |
| already there}, with \code{.^2} indicating all interactions of |
| existing terms. |
| |
| There is a potential problem in using \code{\link{glm}} fits with a |
| variable \code{scale}, as in that case the deviance is not simply |
| related to the maximized log-likelihood. The \code{"glm"} method for |
| function \code{\link{extractAIC}} makes the |
| appropriate adjustment for a \code{gaussian} family, but may need to be |
| amended for other cases. (The \code{binomial} and \code{poisson} |
| families have fixed \code{scale} by default and do not correspond |
| to a particular maximum-likelihood problem for variable \code{scale}.) |
| } |
| \note{ |
| This function differs considerably from the function in S, which uses a |
| number of approximations and does not in general compute the correct AIC. |
| |
| This is a minimal implementation. Use \code{\link[MASS]{stepAIC}} |
| in package \CRANpkg{MASS} for a wider range of object classes. |
| } |
| \section{Warning}{ |
| The model fitting must apply the models to the same dataset. This |
| may be a problem if there are missing values and \R's default of |
| \code{na.action = na.omit} is used. We suggest you remove the |
| missing values first. |
| |
| Calls to the function \code{\link{nobs}} are used to check that the |
| number of observations involved in the fitting process remains unchanged. |
| } |
| \seealso{ |
| \code{\link[MASS]{stepAIC}} in \CRANpkg{MASS}, \code{\link{add1}}, |
| \code{\link{drop1}} |
| } |
| \references{ |
| Hastie, T. J. and Pregibon, D. (1992) |
| \emph{Generalized linear models.} |
| Chapter 6 of \emph{Statistical Models in S} |
| eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole. |
| |
| Venables, W. N. and Ripley, B. D. (2002) |
| \emph{Modern Applied Statistics with S.} |
| New York: Springer (4th ed). |
| } |
| \author{ |
| B. D. Ripley: \code{step} is a slightly simplified version of |
| \code{\link[MASS]{stepAIC}} in package \CRANpkg{MASS} (Venables & |
| Ripley, 2002 and earlier editions). |
| |
| The idea of a \code{step} function follows that described in Hastie & |
| Pregibon (1992); but the implementation in \R is more general. |
| } |
| \examples{\donttest{ |
| ## following on from example(lm) |
| \dontshow{utils::example("lm", echo = FALSE)} |
| step(lm.D9) |
| |
| summary(lm1 <- lm(Fertility ~ ., data = swiss)) |
| slm1 <- step(lm1) |
| summary(slm1) |
| slm1$anova |
| }} |
| \keyword{models} |