coding {meanscore}    R Documentation

combines two or more surrogate/auxiliary variables into a vector

Description

Recodes a matrix of categorical variables into a vector that takes a unique value for each combination of their values.

BACKGROUND

From the matrix Z of first-stage covariates, this function creates a vector which takes a unique value for each combination as follows:

z1 z2 z3 new.z
0 0 0 1
1 0 0 2
0 1 0 3
1 1 0 4
0 0 1 5
1 0 1 6
0 1 1 7
1 1 1 8

If some of the combinations do not occur in the data, the function adjusts the coding accordingly: for example, if the combination (0,1,1) is absent above, then (1,1,1) will be coded as 7.
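
The scheme above can be sketched in base R. This is only an illustration of the coding rule, not the code used inside coding(); the helper name recode_combos is hypothetical and does not exist in the meanscore package. It assumes the ordering shown in the table, with the first column varying fastest.

```r
# Hypothetical helper (not part of meanscore): illustrates the coding rule only.
recode_combos <- function(z) {
  cols <- unname(as.list(as.data.frame(z)))
  # One string key per row, identifying its combination of values.
  key <- do.call(paste, c(cols, sep = "_"))
  # Sort rows so the last column varies slowest and the first fastest,
  # matching the table above.
  ord <- do.call(order, rev(cols))
  # Distinct combinations in that order; absent combinations are skipped,
  # so the codes stay consecutive.
  combos <- unique(key[ord])
  match(key, combos)
}

# All eight combinations of three binary variables.
z <- expand.grid(z1 = 0:1, z2 = 0:1, z3 = 0:1)
new.z <- recode_combos(z)
cbind(z, new.z)

# Drop the combination (0,1,1); the code assigned to (1,1,1) shifts down by one.
z_missing <- z[!(z$z1 == 0 & z$z2 == 1 & z$z3 == 1), ]
new.z_missing <- recode_combos(z_missing)
cbind(z_missing, new.z = new.z_missing)
```

Matching each row against the sorted list of distinct combinations is what keeps the codes consecutive when a combination is absent.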

The values of this vector are reported as new.z in the printed output (see Value below).

This function should be run on second-stage data before using the ms.nprev function, because its output shows the order in which ms.nprev expects the first-stage sample sizes to be provided.

Usage

coding(x=x,y=y,z=z,return=FALSE)

Arguments

REQUIRED ARGUMENTS

y response variable (should be binary 0-1)
x matrix of predictor variables for regression model
z matrix of any surrogate or auxiliary variables

OPTIONAL ARGUMENTS
return logical value; if TRUE (T), the original surrogate or auxiliary variables and the recoded auxiliary variable are returned. The default is FALSE (F).

Value

This function does not return a value unless return=TRUE.

If used with only second-stage (i.e. complete) data, it will print the following:

ylevel the distinct values (or levels) of y
z1 ... zi the distinct values of the first-stage variables z1 ... zi
new.z recoded first-stage variables. Each value represents a unique combination of first-stage variable values.
n2 second-stage sample sizes in each (ylevel,new.z) stratum.

If used with combined first- and second-stage data (i.e. with NA for missing values), the function will print the following in addition to the above items:

n1 first-stage sample sizes in each (ylevel,new.z) stratum.
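
As a rough illustration of what n1 and n2 count, the following base R sketch tabulates a small made-up data set. The data are invented for this example, and the rule used to identify second-stage subjects (a non-missing value of x) is an assumption made for the sketch; it is not taken from the coding() source.

```r
# Toy combined data (invented for illustration): x is NA for subjects
# observed at the first stage only.
y     <- c(0, 0, 0, 1, 1, 1, 1, 0)
new.z <- c(1, 1, 2, 1, 2, 2, 2, 2)
x     <- c(0.3, NA, 1.2, NA, 0.7, NA, 2.1, NA)

# Assumption: a subject belongs to the second stage if x is observed.
complete <- !is.na(x)

# Use common factor levels so both tables have the same strata,
# including strata that are empty at the second stage.
yf <- factor(y)
zf <- factor(new.z)

n1 <- table(ylevel = yf, new.z = zf)                      # all subjects
n2 <- table(ylevel = yf[complete], new.z = zf[complete])  # complete cases only

# One row per (ylevel, new.z) stratum, similar in layout to the printed output.
strata <- as.data.frame(n1, responseName = "n1")
strata$n2 <- as.data.frame(n2, responseName = "n2")$n2
strata
```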

See Also

meanscore, ms.nprev, ectopic, simNA, glm.

Examples


# The ectopic data set has 3 categorical first-stage variables in columns
# 3 to 5, which together with column 2 are the predictor variables of the
# dichotomous outcome in column 1 (see help(ectopic) for further details). Typing
data(ectopic)
coding(x=ectopic[,2:5],y=ectopic[,1], z=ectopic[,3:5])

# gives the following coding scheme and first-stage and second-stage
# sample sizes (n1 and n2 respectively):

## Not run: 
 ylevel gonnorhoea contracept sexpatr new.z  n1 n2
      0          0          0       0     1  56 13
      0          0          1       0     2 146 36
      0          0          0       1     3 119 33
      0          1          0       1     4  19  8
      0          0          1       1     5 344 93
      0          1          1       1     6  31  9
      1          0          0       0     1  26 11
      1          0          1       0     2   9  5
      1          0          0       1     3 160 79
      1          1          0       1     4  29 18
      1          0          1       1     5  35 20
      1          1          1       1     6   5  2
## End(Not run)
