coding {meanscore} | R Documentation |
Recodes a matrix of categorical variables into a vector which takes a unique value for each combination.
BACKGROUND
From the matrix Z of first-stage covariates, this function creates a vector which takes a unique value for each combination as follows:
z1 | z2 | z3 | new.z |
0 | 0 | 0 | 1 |
1 | 0 | 0 | 2 |
0 | 1 | 0 | 3 |
1 | 1 | 0 | 4 |
0 | 0 | 1 | 5 |
1 | 0 | 1 | 6 |
0 | 1 | 1 | 7 |
1 | 1 | 1 | 8 |
If some of the combinations do not exist, the function will adjust
accordingly: for example if the combination (0,1,1) is absent above,
then (1,1,1) will be coded as 7.
The recoded values are reported as new.z in the printed output (see value below).
This function should be run on second-stage data before using the ms.nprev function, because its output shows the order in which the call to ms.nprev expects the first-stage sample sizes to be provided.
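The coding scheme described above can be sketched in base R as follows. This is an illustration only, not the code used by coding(): the helper name recode_combinations is hypothetical, and the sketch assumes z is a matrix or data frame of categorical values with no missing entries.

```r
# Hypothetical helper (not part of the meanscore package): assigns a
# consecutive integer code to each combination of values that occurs in z,
# with the first column varying fastest, as in the table above.
recode_combinations <- function(z) {
  cols <- unname(as.list(as.data.frame(z)))
  # Sort rows by the last column first, so the first column varies fastest.
  ord <- do.call(order, rev(cols))
  # One string key per row; the separator keeps e.g. (1, 11) and (11, 1) apart.
  key <- do.call(paste, c(cols, sep = "\r"))
  # Only combinations that are present receive a code, so absent
  # combinations leave no gaps in the numbering.
  match(key, unique(key[ord]))
}

# All eight combinations of three binary variables, in the order of the table.
z_full <- expand.grid(z1 = 0:1, z2 = 0:1, z3 = 0:1)
recode_combinations(z_full)

# Dropping (0,1,1) shifts (1,1,1) from code 8 to code 7.
recode_combinations(z_full[-7, ])
```

Sorting before numbering is what makes the codes depend only on which combinations are present, not on the order of the rows in the data.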
coding(x=x,y=y,z=z,return=FALSE)
REQUIRED ARGUMENTS
y |
response variable (should be binary 0-1) |
x |
matrix of predictor variables for regression model |
z |
matrix of any surrogate or auxiliary variables |
OPTIONAL ARGUMENTS
return |
logical value; if TRUE (T), the original surrogate or auxiliary variables and the recoded auxiliary variables will be returned. The default is FALSE (F). |
VALUE
This function does not return any values unless return=T.
If used with only second stage (i.e. complete) data, it will print the
following:
ylevel |
the distinct values (or levels) of y |
z1 ... zi |
the distinct values of first stage variables z1 ... zi |
new.z |
recoded first stage variables. Each value represents a unique combination of first stage variable values. |
n2 |
second stage sample sizes in each (ylevel, new.z) stratum. |
If used with combined first and second stage data (i.e. with NA for missing values), in addition to the above items, the function will also print the following:
n1 |
first-stage sample sizes in each (ylevel, new.z) stratum. |
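The stratum counts n1 and n2 described above can be illustrated with a short base R sketch on made-up data. It is not the implementation used by coding(), and it assumes that observations outside the second-stage sample are marked by NA in the predictor x while y and new.z are observed for everyone; the variable names and values are invented for illustration.

```r
# Toy data (invented): y and new.z observed for all first-stage subjects,
# x observed only for second-stage subjects (NA otherwise; an assumption).
y     <- factor(c(0, 0, 0, 0, 1, 1, 1, 1))
new.z <- factor(c(1, 1, 2, 2, 1, 1, 2, 2))
x     <- c(0.5, NA, 1.2, 0.3, NA, NA, 0.7, 0.1)

complete <- !is.na(x)

# First-stage counts use every row; second-stage counts use complete rows.
# Factors keep all levels, so both tables have the same layout even when
# a stratum has no second-stage observations.
n1 <- table(ylevel = y, new.z = new.z)
n2 <- table(ylevel = y[complete], new.z = new.z[complete])

strata <- as.data.frame(n1, responseName = "n1")
strata$n2 <- as.vector(n2)
print(strata)
```

Declaring y and new.z as factors before subsetting is a deliberate choice: an empty stratum then appears with a count of 0 instead of silently dropping out of the second-stage table.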
SEE ALSO
meanscore, ms.nprev, ectopic, simNA, glm.
The ectopic data set has 3 categorical first-stage variables in columns 3 to 5, which together with column 2 are the predictor variables of the dichotomous outcome in column 1 (see help(ectopic) for further details). Typing

data(ectopic)
coding(x=ectopic[,2:5],y=ectopic[,1], z=ectopic[,3:5])

gives the following coding scheme and first-stage and second-stage sample sizes (n1 and n2 respectively):

ylevel gonnorhoea contracept sexpatr new.z  n1  n2
     0          0          0       0     1  56  13
     0          0          1       0     2 146  36
     0          0          0       1     3 119  33
     0          1          0       1     4  19   8
     0          0          1       1     5 344  93
     0          1          1       1     6  31   9
     1          0          0       0     1  26  11
     1          0          1       0     2   9   5
     1          0          0       1     3 160  79
     1          1          0       1     4  29  18
     1          0          1       1     5  35  20
     1          1          1       1     6   5   2