HT {TeachingSampling} | R Documentation |
Computes the Horvitz-Thompson estimator of the population total for several variables of interest
HT(y, Pik)
y |
Vector, matrix or data frame containing the observed values of the variables of interest for every unit in the selected sample |
Pik |
A vector containing the inclusion probabilities for each unit in the selected sample |
The Horvitz-Thompson estimator of the population total is given by
\hat{t}_{y,\pi} = \sum_{k \in s} \frac{y_k}{\pi_k}
where s is the selected sample, y_k is the value of the variable of interest for the kth unit, and \pi_k is its inclusion probability. The estimator can be used for without-replacement designs as well as for with-replacement designs, as long as the inclusion probabilities implied by the design are supplied.
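As a rough illustration of the formula, the following base-R sketch computes the estimate by hand for one variable. The numbers and the names `y_s`, `pik_s`, `pk_wr` and `m` are made up for this illustration, and the code is not the package's internal implementation. The last lines show how inclusion probabilities are obtained from per-draw selection probabilities under with-replacement sampling, using the same expression as the examples on this page.

```r
# Hypothetical sample of two units: observed values and inclusion probabilities
y_s   <- c(32, 46)
pik_s <- c(0.4, 0.5)

# Horvitz-Thompson estimate of the population total: sum of y_k / pi_k
t_hat <- sum(y_s / pik_s)
print(t_hat)

# With-replacement design (assumed setup): m independent draws, where unit k
# is selected with probability pk_wr[k] at each draw
pk_wr <- c(0.9, 0.025, 0.025, 0.025, 0.025)
m     <- 2

# Probability that unit k appears at least once in the m draws
pik_wr <- 1 - (1 - pk_wr)^m
print(pik_wr)
```

Each sampled value is weighted by 1/\pi_k, so units that were unlikely to be selected count for more in the estimated total.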
The function returns a vector containing the estimated population total of each variable of interest.
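To make the shape of the result concrete, here is a minimal stand-in written in base R. The function `ht_total` is hypothetical and only mimics the documented behaviour (one estimated total per variable); it is not the source of `HT` and may differ from it in details such as the class of the returned object.

```r
# Hand-rolled stand-in for illustration only; not the TeachingSampling source
ht_total <- function(y, pik) {
  y <- as.matrix(y)                  # a vector becomes a one-column matrix
  stopifnot(nrow(y) == length(pik))  # one inclusion probability per sampled unit
  colSums(y / pik)                   # row k is divided by pik[k], then columns are summed
}

# Hypothetical sample of two units measured on two variables
y_s   <- data.frame(y1 = c(32, 46), y2 = c(1, 1))
pik_s <- c(0.4, 0.5)

totals <- ht_total(y_s, pik_s)
print(totals)
```

The division relies on R's column-wise recycling, which is why the sketch checks that the number of rows equals the length of the probability vector.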
Hugo Andrés Gutiérrez Rojas hugogutierrez@usantotomas.edu.co
Särndal, C.-E., Swensson, B. and Wretman, J. (1992), Model Assisted Survey Sampling. Springer.
Gutiérrez, H. A. (2009), Estrategias de muestreo: Diseño de encuestas y estimación de parámetros.
Editorial Universidad Santo Tomás.
############
## Example 1
############
# Without replacement sampling
# Vector U contains the labels of a population of size N=5
U <- c("Yves", "Ken", "Erik", "Sharon", "Leslie")
# Vectors y1 and y2 are the values of the variables of interest
y1 <- c(32, 34, 46, 89, 35)
y2 <- c(1, 1, 1, 0, 0)
y3 <- cbind(y1, y2)
# The population size is N=5
N <- length(U)
# The sample size is n=2
n <- 2
# The sample membership matrix for fixed size without replacement sampling designs
Ind <- Ik(N, n)
# p is the probability of selection of every possible sample
p <- c(0.13, 0.2, 0.15, 0.1, 0.15, 0.04, 0.02, 0.06, 0.07, 0.08)
# Computation of the inclusion probabilities
inclusion <- Pik(p, Ind)
# Selection of a random sample
sam <- sample(5, 2)
# The selected sample
U[sam]
# The inclusion probabilities for these two units
inclusion[sam]
# The values of the variables of interest for the units in the sample
y1[sam]
y2[sam]
y3[sam, ]
# The Horvitz-Thompson estimator
HT(y1[sam], inclusion[sam])
HT(y2[sam], inclusion[sam])
HT(y3[sam, ], inclusion[sam])

############
## Example 2
############
# Following Example 1... With replacement sampling
# The population size is N=5
N <- length(U)
# The sample size is m=2
m <- 2
# pk is the probability of selection of every single unit
pk <- c(0.9, 0.025, 0.025, 0.025, 0.025)
# Computation of the inclusion probabilities
Pik <- 1 - (1 - pk)^m
# Selection of a random sample with replacement
sam <- sample(5, 2, replace = TRUE, prob = pk)
# The selected sample
U[sam]
# The inclusion probabilities for these two units
# (taken from Pik, computed above for this design, not from Example 1)
Pik[sam]
# The values of the variables of interest for the units in the sample
y1[sam]
y2[sam]
y3[sam, ]
# The Horvitz-Thompson estimator
HT(y1[sam], Pik[sam])
HT(y2[sam], Pik[sam])
HT(y3[sam, ], Pik[sam])

############
## Example 3
############
# Uses the Marco and Lucy data to draw a simple random sample without replacement
data(Marco)
data(Lucy)
N <- dim(Marco)[1]
n <- 400
sam <- sample(N, n)
# The vector of inclusion probabilities for each unit in the sample
Pik <- rep(n / N, n)
# The information about the units in the sample is stored in an object called data
data <- Lucy[sam, ]
attach(data)
names(data)
# The variables of interest are: Income, Employees and Taxes
# This information is stored in a data frame called estima
estima <- data.frame(Income, Employees, Taxes)
HT(estima, Pik)

############
## Example 4
############
# Uses the Marco and Lucy data to draw a simple random sample with replacement
data(Marco)
data(Lucy)
N <- dim(Marco)[1]
m <- 400
sam <- sample(N, m, replace = TRUE)
# The vector of selection probabilities of units in the sample
pk <- rep(1 / N, m)
# Computation of the inclusion probabilities
Pik <- 1 - (1 - pk)^m
# The information about the units in the sample is stored in an object called data
data <- Lucy[sam, ]
attach(data)
names(data)
# The variables of interest are: Income, Employees and Taxes
# This information is stored in a data frame called estima
estima <- data.frame(Income, Employees, Taxes)
HT(estima, Pik)
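The design in Example 1 is small enough to enumerate, which allows a quick check that the estimator is design-unbiased: averaging the estimate over all possible samples, weighted by their selection probabilities, should recover the true total. The sketch below uses only base R. It lists the samples with `combn`, whose column order is assumed (not verified) to match the row order of `Ik(N, n)`; the names `samples`, `pik`, `ht_by_sample` and `expected_ht` are introduced here for illustration.

```r
# Population values and sample-selection probabilities from Example 1
y1 <- c(32, 34, 46, 89, 35)
p  <- c(0.13, 0.2, 0.15, 0.1, 0.15, 0.04, 0.02, 0.06, 0.07, 0.08)
N  <- length(y1)
n  <- 2

# All 10 possible samples of size 2, one per column
samples <- combn(N, n)

# First-order inclusion probabilities: add p(s) over the samples containing unit k
pik <- sapply(seq_len(N), function(k) sum(p[colSums(samples == k) > 0]))

# Hand-computed Horvitz-Thompson estimate for every possible sample
ht_by_sample <- apply(samples, 2, function(s) sum(y1[s] / pik[s]))

# Expectation of the estimator under the design, next to the true total
expected_ht <- sum(p * ht_by_sample)
print(c(expected = expected_ht, true_total = sum(y1)))
```

Individual estimates in `ht_by_sample` can be far from the true total; only their design-weighted average matches it.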