meanscore {meanscore}    R Documentation

Mean Score Method for Missing Covariate Data in Logistic Regression Models

Description

Weighted logistic regression using the Mean Score method

Usage

        meanscore(x=x,y=y,z=z,factor=NULL,print.all=FALSE)

Arguments

x matrix of predictor variables, one column of which contains some missing values (NA)
y response variable (binary 0-1)
z matrix of the surrogate or auxiliary variables which must be categorical

OPTIONAL ARGUMENTS
print.all logical value; if TRUE, all output is printed. The default is FALSE.
factor factor variables; if the columns of the matrix of predictor variables have names, supply these names, otherwise supply the column numbers. The function will fit separate coefficients for each level of the factor variables.

Details

The response, predictor and surrogate variables must be numeric. The function automatically calls the coding function to recode the z matrix into a vector new.z that takes a unique value for each combination of z values (type help(coding) for further details), as follows:
z1  z2  z3  new.z
 0   0   0      1
 1   0   0      2
 0   1   0      3
 1   1   0      4
 0   0   1      5
 1   0   1      6
 0   1   1      7
 1   1   1      8

The values of this recoded variable are reported in the output as new.z (see coding).
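
The recoding tabulated above can be sketched in base R for the special case of binary (0/1) surrogates. The helper recode_z below is hypothetical and written only for illustration; it is not the package's coding function, which may treat non-binary categories differently.

```r
# All combinations of three binary surrogates, with z1 varying fastest,
# in the same row order as the table above
z <- as.matrix(expand.grid(z1 = 0:1, z2 = 0:1, z3 = 0:1))

# Hypothetical helper (an assumption, not part of the meanscore package):
# treat each row of 0/1 values as a binary number and add 1
recode_z <- function(z) {
  z <- as.matrix(z)
  weights <- 2^(seq_len(ncol(z)) - 1)   # 1, 2, 4, ...
  as.vector(z %*% weights) + 1
}

new.z <- recode_z(z)
cbind(z, new.z)
```

Each distinct row of z receives its own value, which is all the stratification needs; the particular numbering is a convention.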

Value

A list called "parameters" containing the following will be returned:

est the vector of estimates of the regression coefficients
se the vector of standard errors of the estimates
z Wald statistic for each coefficient
pvalue 2-sided p-value (H0: coeff=0)

When print.all = TRUE, the following are also returned:
Ihat the Fisher information matrix
varsi variance of the score for each (ylevel,zlevel) stratum
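
As a rough illustration of how the z and pvalue components relate to est and se, the sketch below recomputes them from the estimates printed for the simulated dataset in the Examples section. It assumes the Wald statistic is referred to a standard normal distribution, which is consistent with the printed values but is not confirmed by the package source.

```r
# Estimates and standard errors copied from the simNA example output
est <- c("(Intercept)" = 0.0493998, x = 1.0188437)
se  <- c("(Intercept)" = 0.07155138, x = 0.10187094)

# Wald statistic: estimate divided by its standard error
z <- est / se

# Two-sided p-value for H0: coeff = 0 (assumes a standard normal reference)
pvalue <- 2 * pnorm(-abs(z))

round(cbind(est, se, z, pvalue), 4)
```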

References

Reilly, M. and M.S. Pepe. 1995. A mean score method for missing and auxiliary
covariate data in regression models. Biometrika 82:299-314.

See Also

ms.nprev, coding, ectopic, simNA, glm.

Examples

# THE SIMULATED DATASET EXAMPLE

# We use the simulated dataset, which is stored in the matrix simNA.
# You can load the dataset using:

data(simNA) 

help(simNA)
# gives a detailed description of the data.
      
# To analyze these data using the meanscore function:

meanscore(y=simNA[,1],z=simNA[,2],x=simNA[,3])

# This will give the following:

## Not run: 
[1] "For calls to ms.nprev, input n1 or prev in the following order!!"
     ylevel z new.z  n1  n2
[1,]      0 0     0 310 150
[2,]      0 1     1 166  85
[3,]      1 0     0 177  86
[4,]      1 1     1 347 179

$parameters
                  est         se          z    pvalue
(Intercept) 0.0493998 0.07155138  0.6904103 0.4899362
x           1.0188437 0.10187094 10.0013188 0.0000000
## End(Not run)
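
The stratum table printed above can be mimicked in base R. The sketch below uses freshly simulated data (not simNA) and assumes that n1 counts all observations in each (y, z) stratum while n2 counts those with x observed; that reading is consistent with the n2 column summing to the 500 complete cases, but it is an interpretation rather than something taken from the package source. All variable names here are illustrative.

```r
set.seed(1)
n <- 200

# Illustrative data: binary surrogate z, covariate x related to z, binary response y
z_sim <- rbinom(n, 1, 0.5)
x_sim <- rnorm(n, mean = z_sim)
y_sim <- rbinom(n, 1, plogis(x_sim))

# Make x missing for 120 randomly chosen observations
x_sim[sample(n, 120)] <- NA

# Fix the factor levels so both tables always have the same 2 x 2 shape
yf <- factor(y_sim, levels = 0:1)
zf <- factor(z_sim, levels = 0:1)
observed <- !is.na(x_sim)

n1 <- table(ylevel = yf, z = zf)                      # all observations
n2 <- table(ylevel = yf[observed], z = zf[observed])  # x observed only

# One row per stratum, in the same (ylevel, z) layout as the printed table
strata <- cbind(as.data.frame(n1, responseName = "n1"),
                n2 = as.vector(n2))
strata[order(strata$ylevel, strata$z), ]
```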
# If you extract the complete cases (n=500) to a matrix called
# "complete", using

complete=simNA[!is.na(simNA[,3]),]

# then

summary(glm(complete[,1]~complete[,3], family="binomial"))

# gives the following results:

## Not run: 
Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)    0.05258    0.09879   0.532    0.595    
complete[, 3]  1.01942    0.12050   8.460   <2e-16 ***
## End(Not run)

# Notice that the Mean Score estimates above have smaller
# standard errors, reflecting the additional information
# in the incomplete observations used in the analysis.
# Also note that since z is a surrogate for x, it is not
# used in the complete case analysis.
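
To put a number on the comparison of standard errors, the following sketch takes the values printed in the two outputs above and computes their ratio and its square. Treating the squared ratio as an approximate relative efficiency is a common convention assumed here, not something computed by the package.

```r
# Standard errors copied from the two printed outputs above
se_meanscore <- c("(Intercept)" = 0.07155138, x = 0.10187094)
se_complete  <- c("(Intercept)" = 0.09879,    x = 0.12050)

# Ratio above 1 means the complete-case standard error is larger
se_ratio <- se_complete / se_meanscore

# Squared ratio: approximate relative efficiency of Mean Score
# versus the complete-case analysis (an assumed convention)
rel_eff <- se_ratio^2

round(rbind(se_ratio, rel_eff), 2)
```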
 

# THE ECTOPIC DATASET EXAMPLE

# This is a real-data example of an application of Mean Score
# to a case-control study of the association between ectopic
# pregnancy and sexually transmitted diseases (see Reilly and
# Pepe, 1995). To learn more about the dataset, type help(ectopic).

# The data frame called "ectopic" is in the data subfolder
# of the meanscore package. You can load the data by typing:
data(ectopic)

# The following lines will reproduce the results presented in Table 3
# of Reilly & Pepe (1995)

# use gonnorhoea, contracept and sexpatr as auxiliary variables
ectopic.z=ectopic[,3:5]

# the auxiliary variables defined above and the chlamydia antibody status 
# are the predictor variables in the logistic regression model          
ectopic.x=ectopic[,2:5]    

meanscore(x=ectopic.x,z=ectopic.z,y=ectopic[,1])


[Package meanscore version 1.0-6 Index]