cluster {sampling} | R Documentation |
Cluster sampling with equal/unequal probabilities.
cluster(data, clustername, size, method=c("srswor","srswr","poisson","systematic"), pik, description=FALSE)
data |
data frame or data matrix; its number of rows is N, the population size. |
clustername |
the name of the clustering variable. |
size |
sample size, that is, the number of clusters to select. |
method |
method to select clusters; the following methods are implemented: simple random sampling without replacement (srswor), simple random sampling with replacement (srswr), Poisson sampling (poisson), systematic sampling (systematic); if the method is not specified, by default the method is "srswor". |
pik |
vector of selection probabilities, or auxiliary information used to compute them; this argument is only used for unequal probability sampling (Poisson, systematic). If auxiliary information is provided, the function uses the inclusionprobabilities function to compute these probabilities. If the method is "srswr" and the sample size is larger than the population size, this vector is normalized to one. |
description |
a message is printed if its value is TRUE; the message gives the number of selected clusters, the number of units in the population and the number of selected units. By default, the value is FALSE. |
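The pik argument may hold auxiliary information rather than ready-made probabilities. As a rough illustration of what that conversion usually involves, the base-R sketch below computes inclusion probabilities proportional to a positive size measure. The helper pik_from_size and the cluster sizes are hypothetical, and this is not the source of the package's inclusionprobabilities function.

```r
# Hypothetical helper (not part of the sampling package): inclusion
# probabilities proportional to a positive auxiliary variable x, for a
# fixed number n of clusters to select.
pik_from_size <- function(x, n) {
  stopifnot(all(x > 0), n >= 1, n <= length(x))
  pik <- n * x / sum(x)
  # Clusters whose probability would exceed 1 are taken with certainty;
  # the remaining probabilities are recomputed until none exceeds 1.
  while (any(pik > 1)) {
    certain <- pik >= 1
    pik[certain] <- 1
    rest <- !certain
    pik[rest] <- (n - sum(certain)) * x[rest] / sum(x[rest])
  }
  pik
}

x <- c(10, 20, 30, 40, 100, 5, 15)  # made-up auxiliary values, one per cluster
pik <- pik_from_size(x, n = 3)
pik       # the fifth cluster is selected with certainty
sum(pik)  # the probabilities sum to the number of clusters to select
```

A vector built this way has one entry per cluster, which is the shape the second example on this page passes as pik.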
The cluster object contains the following information: the selected clusters, the identifiers of the units in the selected clusters, and the final inclusion probabilities of the units (these are equal for units coming from the same cluster). If method is "srswr", the number of replicates is also given.
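To make the structure of this result concrete, here is a minimal base-R sketch of simple random sampling of clusters without replacement. The toy population, the variable names and the Prob column are made up for illustration; the code mimics the idea only and does not call the package.

```r
set.seed(1)

# Made-up population: 12 units grouped into 4 clusters of unequal size.
pop <- data.frame(
  unit   = 1:12,
  region = rep(c("A", "B", "C", "D"), times = c(2, 4, 3, 3))
)

clusters <- unique(pop$region)
n <- 2                           # number of clusters to select
chosen <- sample(clusters, n)    # equal-probability draw without replacement

# Every unit of a selected cluster enters the sample.
smp <- pop[pop$region %in% chosen, ]

# Under this design each cluster, and hence each of its units, is
# included with probability n / (number of clusters).
smp$Prob <- n / length(clusters)
smp
```

Every row of smp carries the same probability here because all clusters are equally likely to be drawn; under Poisson or systematic sampling the value would differ between clusters but stay constant within one.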
############
## Example 1
############
# Uses the swissmunicipalities data to draw a sample of clusters
data(swissmunicipalities)
# the variable 'REG' has 7 categories in the population; it is used as clustering variable
# the sample size is 3; the method is simple random sampling without replacement
cl=cluster(swissmunicipalities,clustername=c("REG"),size=3,method="srswor")
# extracts the observed data
# the order of the columns is different from the order in the swissmunicipalities database
getdata(swissmunicipalities, cl)
############
## Example 2
############
# the same data as in Example 1
# the sample size is 3; the method is systematic sampling
# the pik vector is randomly generated using the U(0,1) distribution
cl_sys=cluster(swissmunicipalities,clustername=c("REG"),size=3,method="systematic",pik=runif(7))
# extracts the observed data
getdata(swissmunicipalities,cl_sys)
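Example 2 selects clusters by systematic sampling with unequal probabilities. The selection step of that design can be sketched in base R as follows, assuming the probabilities are at most 1 and sum to the number of clusters to select. The function systematic_pick and its inputs are hypothetical; this is the textbook procedure, not necessarily how the package implements it.

```r
# Hypothetical helper: systematic selection with unequal probabilities.
# pik: inclusion probabilities (each <= 1) summing to an integer n.
# u:   random start in (0, 1).
systematic_pick <- function(pik, u = runif(1)) {
  upper <- cumsum(pik)              # right end of each cluster's interval
  lower <- c(0, upper[-length(upper)])  # left end of each cluster's interval
  # A cluster is selected when one of the points u, u + 1, u + 2, ...
  # falls inside its interval (lower, upper].
  which(floor(upper - u) > floor(lower - u))
}

pik <- c(0.5, 0.25, 0.75, 0.5, 1)   # made-up probabilities, sum is 3
systematic_pick(pik, u = 0.3)       # selects clusters 1, 3 and 5
```

With the start fixed at 0.3, the points 0.3, 1.3 and 2.3 fall in the intervals of the first, third and fifth clusters, so exactly three clusters are selected.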