create.fused {StatMatch} | R Documentation |
Creates a synthetic data frame after the statistical matching of two data sources at micro level.
create.fused(data.rec, data.don, mtc.ids, z.vars, dup.x=FALSE, match.vars=NULL)
data.rec |
A matrix or data frame that has been used as recipient in the statistical matching application. |
data.don |
A matrix or data frame that has been used as the donor in the statistical matching application. |
mtc.ids |
A matrix with two columns. Each row must contain the name or the index of the recipient record (row) in data.rec and the name or the index of the corresponding donor record (row) in data.don. Note that this type of matrix is returned by the functions NND.hotdeck and RANDwNND.hotdeck.
|
z.vars |
A character vector with the names of the variables available only in data.don that should be “donated” to data.rec.
|
dup.x |
Logical. When TRUE, the values of the matching variables in data.don are also “donated” to data.rec. The names of the matching variables have to be specified with the argument match.vars. To avoid confusion, the matching variables added to data.rec are renamed by adding the suffix “don”. By default dup.x=FALSE.
|
match.vars |
A character vector with the names of the matching variables. It has to be specified only when dup.x=TRUE.
|
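As a rough illustration of the shape expected for mtc.ids, the base R sketch below builds such a matrix by hand, once with row names and once with row indexes. The toy data frames, their values and the column labels rec.id and don.id are assumptions made for this illustration only.

```r
# Toy data frames (hypothetical values, for illustration only)
rec <- data.frame(x = c(1, 2, 3), row.names = c("r1", "r2", "r3"))
don <- data.frame(x = c(1.1, 2.9), z = c(100, 300),
                  row.names = c("d1", "d2"))

# Pairing given as row names: column 1 = recipient, column 2 = donor
ids.names <- cbind(rec.id = c("r1", "r2", "r3"),
                   don.id = c("d1", "d2", "d2"))

# The same pairing given as row indexes
ids.index <- cbind(rec.id = 1:3, don.id = c(1, 2, 2))

# Both forms pick out the same donor values of z
same <- identical(don[ids.names[, "don.id"], "z"],
                  don[ids.index[, "don.id"], "z"])
same
```

A donor may appear in more than one row, as "d2" does here, when the same donor record is matched to several recipients.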
This function creates the synthetic (or fused) data set after statistical matching has been applied in a micro framework. For details see D'Orazio et al. (2006).
The data.rec data frame with the z.vars filled in and, when dup.x=TRUE, with the values of the matching variables for the donor records.
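The returned object can be pictured with the following base R sketch. It is not the package's implementation, only a minimal imitation of the behaviour described above. The toy data, the pairing matrix mtc and the "." separator placed before the suffix "don" are all assumptions.

```r
# Toy recipient and donor data (hypothetical values)
rec <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30),
                  row.names = c("r1", "r2", "r3"))
don <- data.frame(x = c(1.1, 2.9), z = c(100, 300),
                  row.names = c("d1", "d2"))

# Matched pairs: recipient row name, donor row name
mtc <- cbind(rec.id = c("r1", "r2", "r3"),
             don.id = c("d1", "d2", "d2"))
match.vars <- "x"

# Start from the recipient records and fill in z from the matched donors
fused <- rec[mtc[, "rec.id"], , drop = FALSE]
fused$z <- don[mtc[, "don.id"], "z"]

# Imitate dup.x=TRUE: also copy the donors' matching variables,
# renamed with the suffix "don" (the "." separator is assumed)
for (v in match.vars) {
  fused[[paste0(v, ".don")]] <- don[mtc[, "don.id"], v]
}
fused
```

Keeping the donor's values of the matching variables next to the recipient's makes it easy to check how close each matched pair is.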
Marcello D'Orazio madorazi@istat.it
D'Orazio, M., Di Zio, M. and Scanu, M. (2006). Statistical Matching: Theory and Practice. Wiley, Chichester.
lab <- c(1:15, 51:65, 101:115)
iris.rec <- iris[lab, c(1:3,5)]     # recipient data.frame
iris.don <- iris[-lab, c(1:2,4:5)]  # donor data.frame

# Now iris.rec and iris.don have the variables
# "Sepal.Length", "Sepal.Width" and "Species"
# in common.
# "Petal.Length" is available only in iris.rec
# "Petal.Width" is available only in iris.don

# find the closest donors using NND hot deck;
# distances are computed on "Sepal.Length" and "Sepal.Width"
out.NND <- NND.hotdeck(data.rec=iris.rec, data.don=iris.don,
                       match.vars=c("Sepal.Length", "Sepal.Width"),
                       don.class="Species")

# create synthetic data.set, without the duplication of the matching variables
fused.0 <- create.fused(data.rec=iris.rec, data.don=iris.don,
                        mtc.ids=out.NND$mtc.ids, z.vars="Petal.Width")

# create synthetic data.set, with the "duplication" of the matching variables
fused.1 <- create.fused(data.rec=iris.rec, data.don=iris.don,
                        mtc.ids=out.NND$mtc.ids, z.vars="Petal.Width",
                        dup.x=TRUE, match.vars=c("Sepal.Length", "Sepal.Width"))