mlogit {mlogit}    R Documentation

Multinomial logit model

Description

Estimation by maximum likelihood of the multinomial logit model, with alternative-specific and/or individual-specific variables.

Usage

mlogit(formula, data, subset, weights, na.action, start = NULL,
       alt.subset = NULL, reflevel = NULL, 
       nests = NULL, heterosc = FALSE, rpar = NULL,
       R = 40, correlation = FALSE, halton = NULL,
       random.nb = NULL, estimate = TRUE, ...)
## S3 method for class 'mlogit':
print(x, digits = max(3, getOption("digits") - 2),
    width = getOption("width"), ...)
## S3 method for class 'mlogit':
summary(object, ...)
## S3 method for class 'summary.mlogit':
print(x, digits = max(3, getOption("digits") - 2),
    width = getOption("width"), ...)
## S3 method for class 'mlogit':
logLik(object, ...)
## S3 method for class 'mlogit':
vcov(object, ...)
## S3 method for class 'mlogit':
residuals(object, outcome = TRUE, ...)
## S3 method for class 'mlogit':
fitted(object, outcome = TRUE, ...)
## S3 method for class 'mlogit':
df.residual(object, ...)
## S3 method for class 'mlogit':
terms(x, ...)
## S3 method for class 'mlogit':
model.matrix(object, ...)
## S3 method for class 'mlogit':
update(object, new, ...)

Arguments

x, object an object of class mlogit
formula a symbolic description of the model to be estimated,
new an updated formula for the update method,
data the data: an mlogit.data object or an ordinary data.frame,
subset an optional vector specifying a subset of observations,
weights an optional vector of weights,
na.action a function which indicates what should happen when the data contains 'NA's,
start a vector of starting values,
alt.subset a vector of character strings containing the subset of alternatives on which the model should be estimated,
reflevel the base alternative (the one for which the coefficients of individual-specific variables are normalized to 0),
estimate a boolean indicating whether the model should be estimated or not: if not, the model.frame is returned,
nests a named list of character vectors, each name being a nest and the corresponding vector being the set of alternatives that belong to this nest,
heterosc a boolean, if TRUE, the heteroscedastic logit model is estimated,
rpar a named vector whose names are the random parameters and whose values are the distributions: 'n' for normal, 'l' for log-normal, 't' for truncated normal, 'u' for uniform,
R the number of function evaluations for the Gaussian quadrature method used if heterosc=TRUE, or the number of draws of pseudo-random numbers if rpar is not NULL,
correlation only relevant if rpar is not NULL, if TRUE, the correlation between random parameters is taken into account,
halton only relevant if rpar is not NULL, if not NULL, Halton sequences are used instead of pseudo-random numbers. If halton=NA, default values are used for the primes of the sequence (the primes are used in order) and for the number of elements dropped. Otherwise, halton should be a list with elements prime (the primes used) and drop (the number of elements dropped).
random.nb only relevant if rpar is not NULL, a user-supplied matrix of random numbers,
digits the number of digits,
width the width of the printing,
outcome a boolean which indicates, for the fitted and the residuals methods, whether a matrix (for each choice, one value for each alternative) or a vector (for each choice, only the value for the chosen alternative) should be returned,
... further arguments passed to mlogit.data or mlogit.optim.

Details

Let J be the number of alternatives. The formula may include alternative-specific and individual-specific variables. For the latter, J-1 coefficients are estimated for each variable. Alternative-specific and individual-specific variables are separated by a |. For example, if x1 and x2 are alternative-specific and z1 and z2 are individual-specific, the formula y~x1+x2|z1+z2 describes a model with one coefficient each for x1 and x2 and J-1 coefficients each for z1 and z2. J-1 intercepts are also estimated. Models with only alternative-specific or only individual-specific variables are respectively estimated by y~x1+x2 and y~0|z1+z2. In multinomial logit models, the intercepts are alternative-specific. By default, intercepts are included. To obtain a model without intercepts, use -1 or +0 in the second part of the formula: y~x1+x2|z1+z2-1. For models with only alternative-specific variables, this can be done in the single part, i.e. y~x1+x2-1 is similar to y~x1+x2|0.
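The coefficient count described above can be illustrated with a small base R sketch. It does not call mlogit; all numbers and names (beta_x, alpha, gamma_z) are hypothetical and chosen only to show how one coefficient for an alternative-specific variable and J-1 coefficients for an individual-specific variable enter the standard logit choice probabilities.

```r
# Hypothetical numbers for one individual facing J = 3 alternatives.
J <- 3
beta_x  <- -0.5             # alternative-specific variable x: one coefficient
alpha   <- c(0, 0.2, -0.1)  # intercepts: J - 1 free, reference fixed at 0
gamma_z <- c(0, 0.3, 0.8)   # individual-specific variable z: J - 1 free
x <- c(1.0, 2.0, 1.5)       # x varies across alternatives
z <- 2                      # z is constant across alternatives

# Systematic utility of each alternative and the logit choice probabilities.
v <- alpha + beta_x * x + gamma_z * z
p <- exp(v) / sum(exp(v))

# Free coefficients: 1 for x, J - 1 intercepts, J - 1 for z.
n_coef <- 1 + (J - 1) + (J - 1)
print(round(p, 3))
print(n_coef)
```

Because z takes the same value for every alternative, a common coefficient on z would cancel out of the probabilities, which is why one coefficient per non-reference alternative is needed.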

The data argument may be an ordinary data.frame. In this case, some supplementary arguments should be provided and are passed to mlogit.data. Note that it is not necessary to indicate the choice argument as it is deduced from the formula.

The model is estimated using the mlogit.optim function.

The basic multinomial logit model and three important extensions of this model may be estimated.

If heterosc=TRUE, the heteroscedastic logit model is estimated. J-1 extra coefficients are estimated that represent the scale parameters for J-1 alternatives, the scale parameter for the reference alternative being normalized to 1. The probabilities do not have a closed form; they are evaluated using a Gaussian quadrature method.
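To give an idea of what a Gaussian quadrature involves, here is a minimal base R sketch. This page does not say which rule mlogit uses, so the Gauss-Laguerre rule below (for integrals of the form f(t) * exp(-t) over the positive half-line) is an assumption made for illustration, and gauss_laguerre is a hypothetical helper, not a function of the package. Nodes and weights are obtained with the Golub-Welsch eigenvalue method.

```r
# Hypothetical helper: n-point Gauss-Laguerre nodes and weights computed
# from the eigen-decomposition of the tridiagonal Jacobi matrix.
gauss_laguerre <- function(n) {
  i <- seq_len(n)
  jac <- diag(2 * i - 1, nrow = n)   # diagonal entries 1, 3, 5, ...
  if (n > 1) {
    k <- seq_len(n - 1)
    jac[cbind(k, k + 1)] <- k        # off-diagonal entries 1, 2, 3, ...
    jac[cbind(k + 1, k)] <- k
  }
  e <- eigen(jac, symmetric = TRUE)
  # Nodes are the eigenvalues; weights are the squared first components
  # of the normalized eigenvectors.
  list(nodes = e$values, weights = e$vectors[1, ]^2)
}

# Approximate the integral of t^2 * exp(-t) over [0, Inf), whose exact
# value is 2, with 10 function evaluations.
q <- gauss_laguerre(10)
approx_int <- sum(q$weights * q$nodes^2)
print(approx_int)
```

In this sketch the number of nodes plays the role that the R argument plays when heterosc=TRUE: more evaluations give a more accurate approximation at a higher computing cost.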

If nests is not NULL, the nested logit model is estimated.

If rpar is not NULL, the random parameter model is estimated. The probabilities are approximated using simulations with R draws, and Halton sequences are used if halton is not NULL. Pseudo-random numbers are drawn from a standard normal distribution and the relevant transformations are performed to obtain numbers drawn from a normal, log-normal, censored-normal or uniform distribution. If correlation=TRUE, the correlation between the random parameters is taken into account by estimating the components of the Cholesky decomposition of the covariance matrix. With G random parameters, G standard deviations are estimated without correlation, and G * (G + 1) / 2 coefficients are estimated with correlation.
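The count of G * (G + 1) / 2 coefficients follows from the lower-triangular shape of a Cholesky factor. The base R sketch below illustrates this under stated assumptions: the values in theta are hypothetical, and the code shows the general technique of turning standard normal draws into correlated draws, not the internal code of mlogit.

```r
set.seed(1)
G <- 3                                  # number of random parameters
n_free <- G * (G + 1) / 2               # free elements of the Cholesky factor

# Hypothetical values for the free elements, filled column by column
# into a lower-triangular matrix.
theta <- c(1.0, 0.5, 0.8, 0.7, 0.2, 0.6)
chol_factor <- matrix(0, G, G)
chol_factor[lower.tri(chol_factor, diag = TRUE)] <- theta

# Covariance matrix implied by the factor.
Sigma <- chol_factor %*% t(chol_factor)

# Transform independent standard normal draws into correlated draws:
# each row of 'draws' has covariance Sigma.
n_draws <- 1000
eta <- matrix(rnorm(n_draws * G), nrow = n_draws)
draws <- eta %*% t(chol_factor)
print(Sigma)
print(round(cov(draws), 2))
```

Without correlation, only the G diagonal elements would be free, which corresponds to the G standard deviations mentioned above.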

Value

An object of class "mlogit", a list with elements:

coefficients the named vector of coefficients,
logLik the value of the log-likelihood,
hessian the hessian of the log-likelihood at convergence,
gradient the gradient of the log-likelihood at convergence,
call the matched call,
est.stat some information about the estimation (time used, optimisation method),
freq the frequency of choice,
residuals the residuals,
fitted.values the fitted values,
formula the formula (a logitform object),
expanded.formula the formula (a formula object),
model the model frame used,
index the index of the choice and of the alternatives.

Author(s)

Yves Croissant

References

McFadden, D. (1973) Conditional Logit Analysis of Qualitative Choice Behavior, in P. Zarembka (ed.), Frontiers in Econometrics, New York: Academic Press.

McFadden, D. (1974) "The Measurement of Urban Travel Demand", Journal of Public Economics, 3, pp. 303-328.

Train, K. (2003) Discrete Choice Methods with Simulation, Cambridge University Press.

See Also

mlogit.data to shape the data. multinom from package nnet performs the estimation of the multinomial logit model with individual specific variables. mlogit.optim for details about the optimization function.

Examples


## Cameron and Trivedi's Microeconometrics p.493
## There are two alternative-specific variables: pr (price) and ca (catch)
## and four fishing modes: beach, pier, boat, charter

data("Fishing", package = "mlogit")
Fish <- mlogit.data(Fishing, varying = c(4:11), shape = "wide", choice = "mode")

## a pure "conditional" model without intercepts

summary(mlogit(mode ~ pr + ca - 1, data = Fish))

## a pure "multinomial model"

summary(mlogit(mode ~ 0 | income, data = Fish))

## which can also be estimated using multinom (package nnet)

library(nnet)
summary(multinom(mode ~ income, data = Fishing))

## a "mixed" model

m <- mlogit(mode ~ pr + ca | income, data = Fish)
summary(m)

## same model with charter as the reference level

m <- mlogit(mode ~ pr + ca | income, data = Fish, reflevel = "charter")

## same model with a subset of alternatives: charter, pier, beach

m <- mlogit(mode ~ pr + ca | income, data = Fish,
            alt.subset = c("charter", "pier", "beach"))



## A heteroscedastic logit model

data("TravelMode", package = "AER")
hl <- mlogit(choice ~ wait + travel + vcost, TravelMode,
             shape = "long", id.var = "individual", alt.var = "mode",
             method = "bfgs", heterosc = TRUE, tol = 10)

## A nested logit model

TravelMode$avincome <- with(TravelMode, income * (mode == "air"))
TravelMode$time <- with(TravelMode, travel + wait)/60
TravelMode$timeair <- with(TravelMode, time * I(mode == "air"))
TravelMode$income <- with(TravelMode, income / 10)
## Heiss p.231
nl <- mlogit(choice ~ time + timeair | income, TravelMode, shape = "long",
             alt.var = "mode", print.level = 0, method = "bfgs",
             nests = list(public = c("train", "bus"), other = c("air","car")),
             tol = 1)

## a mixed logit model

rpl <- mlogit(mode ~ pr + ca | income, Fishing, varying = 4:11, shape = 'wide',
              rpar = c(pr = 'n', ca = 'n'), correlation = TRUE, halton = NA,
              R = 10, tol = 10, print.level = 0)
summary(rpl)


[Package mlogit version 0.1-4 Index]