pga {pga}    R Documentation

Parallel Genetic Algorithm for Variable Selection

Description

Runs the parallel genetic algorithm (PGA) for variable selection, as described in Zhu and Chipman (2006), Technometrics 48, page 494, Table 6.

Usage

pga(y, X, m, N, B=100, mutation=1/ncol(X), start = NULL, prior = 0.35)
pga(y, X, N=8, B=25)

Arguments

y an n-by-1 response vector.
X an n-by-p matrix; each column is a candidate predictor variable.
m population size in each universe. If missing, default = ncol(X) if ncol(X) is even, or ncol(X)+1 if ncol(X) is odd.
N number of generations to evolve in each universe. Keep this fairly small so that the evolutionary paths do not converge. For an example of how to determine the proper N, see sga.
B number of parallel paths or parallel universes.
mutation mutation rate; this can be a vector of length N if a different mutation rate is needed for each generation t=1,2,...,N; default = 1/p for all t=1,2,...,N.
start the starting population; rarely needed, default = NULL.
prior prior probability that controls the density of 1's in the initial population; default = 0.35. If prior information suggests that the number of relevant variables is large, a higher value (e.g., prior=0.7) can be more efficient.
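
As a rough illustration of the defaults described above, the following base-R sketch computes the default m, expands a scalar mutation rate to one value per generation, and draws an initial 0/1 population whose density of 1's is governed by prior. The toy matrix X and the name init_pop are assumptions made for this sketch, and the even/odd rule for m is one reading of the description; this is not the package's internal code.

```r
# Toy design matrix; the dimensions are chosen only for illustration.
set.seed(1)
X <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)
p <- ncol(X)
N <- 8

# Default population size as read from the description of `m`:
# p if p is even, otherwise p + 1 (an assumed interpretation).
m <- if (p %% 2 == 0) p else p + 1

# Default mutation rate 1/p, expanded to one rate per generation t = 1..N.
mutation <- rep(1 / p, length.out = N)

# Initial population: an m-by-p 0/1 matrix whose entries equal 1 with
# probability `prior` (an assumed construction, not taken from the package).
prior <- 0.35
init_pop <- matrix(rbinom(m * p, size = 1, prob = prior), nrow = m, ncol = p)
```

With p = 5 candidate predictors, this reading gives m = 6. Raising prior toward 0.7 makes the starting subsets denser, which matches the advice for problems with many relevant variables.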

Value

Returns a B-by-p matrix. Element (b,j) of the matrix is the frequency with which variable j “shows up” in the last-generation population of universe b.
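
One natural way to summarize such a matrix is to average each column over the B universes and rank the variables by that average. The sketch below does this on a small made-up matrix; the name freq and its values are hypothetical, and it assumes the entries are proportions between 0 and 1, which the description above does not state explicitly.

```r
# Hypothetical 3-by-4 result (B = 3 universes, p = 4 variables). The
# values are made up for illustration and are not output from pga().
freq <- matrix(c(1.0, 0.5, 0.0, 0.5,
                 1.0, 0.0, 0.5, 0.0,
                 0.5, 0.5, 0.0, 1.0),
               nrow = 3, byrow = TRUE)

# Average each column over the universes to get one score per variable.
importance <- colMeans(freq)

# Order the variables from most to least frequently selected.
ranking <- order(importance, decreasing = TRUE)
```

Variables with a clearly higher average frequency than the rest are the natural candidates to keep; where to draw the line is a judgment call that this sketch does not address.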

Note

Please see pga-package for an introductory overview and examples.

Author(s)

Dandi Qiao and Mu Zhu, University of Waterloo, Canada.

References

Zhu M, Chipman HA (2006). Darwinian evolution in parallel universes: A parallel genetic algorithm for variable selection. Technometrics, 48(4), 491–502.

Zhu M (2008). Kernels and ensembles: Perspectives on statistical learning. The American Statistician, 62(2), 97–109.

See Also

sga


[Package pga version 0.1-1 Index]