lm.dibp {bivpois}    R Documentation

General Diagonal Inflated Bivariate Poisson Model

Description

Produces a list object giving details of the fit of a diagonal inflated bivariate Poisson regression model of the form

  $(X_i, Y_i) \sim DIBP( \lambda_{1i}, \lambda_{2i}, \lambda_{3i}, D(\theta) )$, which is equivalent to

    $(X_i, Y_i) \sim (1-p) BP( x_i, y_i | \lambda_{1i}, \lambda_{2i}, \lambda_{3i} )$ if $x_i \ne y_i$,

    $(X_i, Y_i) \sim (1-p) BP( x_i, y_i | \lambda_{1i}, \lambda_{2i}, \lambda_{3i} ) + p D( x_i | \theta )$ if $x_i = y_i$,

  for $i = 1, 2, ..., n$, with

    $\log \underline{\lambda}_1 = {\bf w}_1 \underline{\beta}$, $\log \underline{\lambda}_2 = {\bf w}_2 \underline{\beta}$ and $\log \underline{\lambda}_3 = {\bf w}_3 \underline{\beta}_3$;

where

$n$ is the sample size.
$\underline{\lambda}_k = ( \lambda_{k1}, \lambda_{k2}, ..., \lambda_{kn} )^T$ for $k=1,2,3$ are vectors of length $n$ containing the estimated lambda for each observation.
${\bf w}_1$, ${\bf w}_2$ are $n \times p$ data matrices containing explanatory variables for $\lambda_1$ and $\lambda_2$.
${\bf w}_3$ is an $n \times p_2$ data matrix containing explanatory variables for $\lambda_3$.
$\underline{\beta}$ is a vector of length $p$ which is common for $\lambda_1$ and $\lambda_2$ in order to allow for common effects.
$\underline{\beta}_3$ is a vector of length $p_2$ containing the coefficients for $\lambda_3$.
$D(\theta)$ is a discrete distribution with parameter vector $\theta$ used to inflate the diagonal.
$p$, where it appears in the mixture above, is the mixing proportion (not the column dimension of ${\bf w}_1$ and ${\bf w}_2$).
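
The two-case definition above can be written out as a probability mass function. The following is a minimal base R sketch, not the package's implementation: the names dbivpois_sketch, ddibp_sketch and the ddiag argument are hypothetical, the functions handle scalar counts only, and they assume that lambda1 and lambda2 are strictly positive.

```r
# Bivariate Poisson pmf P(X = x, Y = y) for scalar counts x and y.
dbivpois_sketch <- function(x, y, l1, l2, l3) {
  k <- 0:min(x, y)
  terms <- choose(x, k) * choose(y, k) * factorial(k) * (l3 / (l1 * l2))^k
  exp(-(l1 + l2 + l3)) * l1^x / factorial(x) * l2^y / factorial(y) * sum(terms)
}

# Diagonal inflated pmf: (1 - p) * BP everywhere, plus p * D on the diagonal.
# ddiag is a hypothetical argument: any function returning P(D = x).
ddibp_sketch <- function(x, y, l1, l2, l3, p, ddiag) {
  bp <- (1 - p) * dbivpois_sketch(x, y, l1, l2, l3)
  if (x == y) bp + p * ddiag(x) else bp
}

# Check with a Poisson(theta = 1) diagonal distribution: summed over a
# sufficiently large grid, the pmf should be very close to 1.
grid <- expand.grid(x = 0:40, y = 0:40)
total <- sum(mapply(ddibp_sketch, grid$x, grid$y,
                    MoreArgs = list(l1 = 1, l2 = 2, l3 = 0.5, p = 0.2,
                                    ddiag = function(x) dpois(x, 1))))
```

Passing the diagonal distribution as a function keeps the mixture code identical for the discrete, Poisson and geometric cases; only ddiag changes.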

Usage

lm.dibp( l1, l2, l1l2=NULL, l3=~1, data, common.intercept=FALSE, 
         zeroL3 = FALSE, distribution = "discrete", jmax = 2, maxit = 300, 
         pres = 1e-08, verbose=getOption("verbose") )

Arguments

l1 Formula of the form "$x \sim X_1+...+X_p$" for the parameters of $\log\lambda_1$.
l2 Formula of the form "$y \sim X_1+...+X_p$" for the parameters of $\log\lambda_2$.
l1l2 Formula of the form "$\sim X_1+...+X_p$" for the common parameters of $\log\lambda_1$ and $\log\lambda_2$. If an explanatory variable also appears in l1 and/or l2, a model with interaction-type parameters is fitted: one parameter common to both predictors (main effect) and a difference from it for the other predictor (interaction-type effect). Special terms of the form "c(X1,X2)" can also be used here. These terms imply common parameters of $\lambda_1$ and $\lambda_2$ on different variables. For example, c(x1,x2) uses the same beta for the effect of $x_1$ on $\log\lambda_1$ and the effect of $x_2$ on $\log\lambda_2$. For details see example 4 (dataset ex4.ita91).
l3 Formula of the form "$\sim X_1+...+X_p$" for the parameters of $\log\lambda_3$.
data Data frame containing the variables in the model.
common.intercept Logical argument specifying whether a common intercept on $\log\lambda_1$ and $\log\lambda_2$ should be used. The default value is FALSE.
zeroL3 Logical argument controlling whether $\lambda_3$ should be set equal to zero (in which case a double Poisson model is fitted). The default value is FALSE.
distribution Specifies the type of inflated distribution: "discrete" for Discrete(J=jmax), "poisson" for Poisson($\theta$), "geometric" for Geometric($\theta$). The default value is "discrete".
jmax Number of parameters used in the Discrete distribution. This argument is ignored when the Poisson or the Geometric distribution is used to inflate the diagonal. The default value is 2.
maxit Maximum number of EM steps. Default value is 300 iterations.
pres Precision used in stopping the EM algorithm. The algorithm stops when the relative log-likelihood difference is lower than the value of pres.
verbose Logical argument controlling whether beta parameters will be printed while EM runs. The default value is taken from options()$verbose. If verbose=FALSE then only the iteration number, the log-likelihood and its relative difference from the previous iteration are printed. If verbose=TRUE then the model parameters $\beta_1$, $\beta_2$ and $\beta_3$ are additionally printed.
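
The interplay of maxit and pres can be illustrated with a short base R sketch. It is only an illustration of a relative log-likelihood stopping rule: the helper rel_loglik_diff and the log-likelihood values are hypothetical, and dividing by the previous iteration's value is an assumption, since this page does not state the exact formula used by lm.dibp.

```r
# Relative change in log-likelihood between consecutive EM iterations.
# Dividing by the previous value is an assumption made for this sketch.
rel_loglik_diff <- function(loglik_new, loglik_old) {
  abs((loglik_new - loglik_old) / loglik_old)
}

# Hypothetical log-likelihood path, one value per EM iteration.
loglik_path <- c(-612.40, -598.10, -597.95, -597.949999)
pres  <- 1e-08
maxit <- 300

# Scan the path and record the first iteration at which the rule fires;
# stop_at stays NA if the rule never fires within maxit iterations.
stop_at <- NA
for (it in 2:min(length(loglik_path), maxit)) {
  if (rel_loglik_diff(loglik_path[it], loglik_path[it - 1]) < pres) {
    stop_at <- it
    break
  }
}
```

With a small maxit, as in the examples of this page, the loop can end before the precision criterion is met, so the reported estimates need not have converged.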

Value

A list object with the following components.

coefficients Estimates of the model parameters $\beta_1$, $\beta_2$, $\beta_3$, $p$ and $\theta$.
fitted.values Data frame with $n$ rows and 2 columns containing the fitted values for $x$ and $y$.
residuals Data frame with $n$ rows and 2 columns containing the residuals of the model for $x$ and $y$, given by $x-E(x)$ and $y-E(y)$ respectively, where $E(x)$ and $E(y)$ are given by fitted.values.
beta1, beta2, beta3 Vectors $\beta_1$, $\beta_2$ and $\beta_3$ containing the coefficients involved in the linear predictors of $\lambda_1$, $\lambda_2$ and $\lambda_3$ respectively. When zeroL3=TRUE, beta3 is not calculated.
lambda1, lambda2 Vectors of length $n$ containing the estimated $\lambda_1$ and $\lambda_2$ for each observation.
lambda3 Vector containing the values of $\lambda_3$. If zeroL3=TRUE then $\lambda_3$ is equal to zero and lambda3 is not provided.
loglikelihood Maximized log-likelihood of the fitted model. This is given in a vector form (one value per iteration). Using this vector we can monitor the log-likelihood evolution in each EM step.
AIC, BIC AIC and BIC of the model. Values are also provided for the double Poisson model and the saturated model.
diagonal.distribution Label for the diagonal inflated distribution used.
p Mixing proportion.
theta Parameter vector of the diagonal distribution. For the discrete distribution, theta has length equal to jmax with $\theta_i =$ theta[i] and $\theta_0 = 1-\sum_{i=1}^{jmax}\theta_i$; for the Poisson distribution, theta is the mean; for the Geometric distribution, theta is the success probability.
parameters Number of parameters.
iterations Number of iterations.
call The exact call used to invoke the lm.dibp function.
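
The convention for theta under the discrete distribution can be made concrete with a small base R sketch. The function name ddiag_discrete and the numeric values are hypothetical; the code only restates the rule that theta[i] is the probability of the value i and that the remaining mass is placed on zero.

```r
# P(D = x) for the discrete diagonal distribution on 0, 1, ..., J,
# where J = length(theta) and theta_0 = 1 - sum(theta).
ddiag_discrete <- function(x, theta) {
  J <- length(theta)
  probs <- c(1 - sum(theta), theta)   # theta_0, theta_1, ..., theta_J
  idx <- pmax(pmin(x, J), 0) + 1      # clamp so that indexing stays valid
  ifelse(x >= 0 & x <= J, probs[idx], 0)
}

theta <- c(0.3, 0.2)                  # hypothetical values with jmax = 2
diag_probs <- ddiag_discrete(0:3, theta)

# With an empty theta (jmax = 0) all inflation mass sits on the (0, 0) cell.
zero_only <- ddiag_discrete(0:2, numeric(0))
```

Under this reading, jmax=0 corresponds to the zero inflated model used as Model 2 in the Examples.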

Author(s)

1. Dimitris Karlis, Department of Statistics, Athens University of Economics and Business, Athens, Greece, karlis@aueb.gr .

2. Ioannis Ntzoufras, Department of Statistics, Athens University of Economics and Business, Athens, Greece, ntzoufras@aueb.gr .

References

1. Karlis, D. and Ntzoufras, I. (2005). Bivariate Poisson and Diagonal Inflated Bivariate Poisson Regression Models in R. Journal of Statistical Software (to appear).

2. Karlis, D. and Ntzoufras, I. (2003). Analysis of Sports Data Using Bivariate Poisson Models. Journal of the Royal Statistical Society, D, (Statistician), 52, 381 - 393.

See Also

pbivpois, simple.bp, lm.bp.

Examples

data(ex2.sim)
#
# Model 1: BivPois
ex2.m1<-lm.bp( x~z1 , y~z1+z5, l1l2=~z3, l3=~.-z5, data=ex2.sim )
# Model 2: Zero Inflated BivPois 
ex2.m2<-lm.dibp( x~z1 , y~z1+z5, l1l2=~z3, l3=~.-z5, data=ex2.sim , jmax=0)
#
# for models 3-10, the maximum number of iterations is set to 2
#
# Model 3: Diagonal Inflated BivPois with DISCRETE(1) diagonal distribution
ex2.m3<-lm.dibp( x~z1 , y~z1+z5, l1l2=~z3, l3=~.-z5, data=ex2.sim , jmax=1, maxit=2)
# Model 4: Diagonal Inflated BivPois with DISCRETE(2) diagonal distribution
ex2.m4<-lm.dibp( x~z1 , y~z1+z5, l1l2=~z3, l3=~.-z5, data=ex2.sim , jmax=2, maxit=2)
# Model 5: Diagonal Inflated BivPois with DISCRETE(3) diagonal distribution
ex2.m5<-lm.dibp( x~z1 , y~z1+z5, l1l2=~z3, l3=~.-z5, data=ex2.sim , jmax=3, maxit=2)
# Model 6: Diagonal Inflated BivPois with DISCRETE(4) diagonal distribution
ex2.m6<-lm.dibp( x~z1 , y~z1+z5, l1l2=~z3, l3=~.-z5, data=ex2.sim , jmax=4, maxit=2)
# Model 7: Diagonal Inflated BivPois with DISCRETE(5) diagonal distribution
ex2.m7<-lm.dibp( x~z1 , y~z1+z5, l1l2=~z3, l3=~.-z5, data=ex2.sim , jmax=5, maxit=2)
# Model 8: Diagonal Inflated BivPois with DISCRETE(6) diagonal distribution
ex2.m8<-lm.dibp( x~z1 , y~z1+z5, l1l2=~z3, l3=~.-z5, data=ex2.sim , jmax=6, maxit=2)
# Model 9: Diagonal Inflated BivPois with POISSON diagonal distribution
ex2.m9<-lm.dibp( x~z1 , y~z1+z5, l1l2=~z3, l3=~.-z5, data=ex2.sim , 
                 distribution="poisson", maxit=2)
# Model 10: Diagonal Inflated BivPois with GEOMETRIC diagonal distribution
ex2.m10<-lm.dibp( x~z1 , y~z1+z5, l1l2=~z3, l3=~.-z5, data=ex2.sim , 
                  distribution="geometric", maxit=2)
#
# printing parameters of model 7
ex2.m7$beta1
ex2.m7$beta2
ex2.m7$beta3
ex2.m7$p
ex2.m7$theta
#
# printing parameters of model 9
ex2.m9$beta1
ex2.m9$beta2
ex2.m9$beta3
ex2.m9$p
ex2.m9$theta
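
The components documented under Value can also be combined after fitting. The sketch below computes AIC and BIC from the final log-likelihood value and the number of parameters using the textbook definitions; whether the AIC and BIC components returned by lm.dibp follow exactly these conventions is not stated on this page. So that the snippet runs without the package, a hypothetical stand-in list with made-up numbers is used in place of a fitted object.

```r
# Stand-in for a fitted object; with bivpois installed, a list returned by
# lm.dibp() would take its place. All numbers here are made up.
fit <- list(
  loglikelihood = c(-520.3, -501.2, -500.9),
  parameters    = 9,
  fitted.values = data.frame(x = rep(1.2, 100), y = rep(0.9, 100))
)

# The log-likelihood is stored once per EM iteration; use the last value.
loglik <- fit$loglikelihood[length(fit$loglikelihood)]
n_obs  <- nrow(fit$fitted.values)

# Textbook definitions of the two information criteria.
aic <- -2 * loglik + 2 * fit$parameters
bic <- -2 * loglik + log(n_obs) * fit$parameters
```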

[Package bivpois version 0.50-3]