pga {pga} | R Documentation |
Runs the parallel genetic algorithm (PGA) for variable selection, as described in Zhu and Chipman (2006), Technometrics 48, page 494, Table 6.
pga(y, X, m, N, B=100, mutation=1/ncol(X), start = NULL, prior = 0.35)
pga(y, X, N=8, B=25)
y |
an n-by-1 response vector. |
X |
an n-by-p matrix; each column is a candidate predictor variable. |
m |
population size in each universe. If missing, the default is
ncol(X) when ncol(X) is even and ncol(X)+1 when ncol(X) is odd. |
N |
number of generations to evolve in each universe; N should be kept
fairly small so that the individual evolutionary paths do not converge. For an
example of how to determine a suitable N, see sga. |
B |
number of parallel paths or parallel universes. |
mutation |
mutation rate; this can be a vector of length N if
a different mutation rate is needed for each generation
t=1,2,...,N; default = 1/p for all t=1,2,...,N, where p = ncol(X). |
start |
the starting population; rarely needed, default = NULL. |
prior |
prior probability that controls the density of 1's in the
initial population, default = 0.35; if there is prior
information that the number of relevant variables is large, it can be
more efficient to use a higher prior, e.g., prior=0.7. |
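The argument conventions described above can be sketched in plain R. This is only an illustration of the documented behavior, not the package's internal code: the helper name `default_m`, the toy data, the linearly decaying mutation schedule, and the Bernoulli initialisation are all assumptions made for the example.

```r
set.seed(1)

# Toy data: n = 50 observations, p = 7 candidate predictors (hypothetical).
n <- 50
p <- 7
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- X[, 1] - 2 * X[, 3] + rnorm(n)

# Documented default for m: ncol(X) if even, ncol(X) + 1 if odd.
# default_m() is a hypothetical helper, not a function in the package.
default_m <- function(X) {
  p <- ncol(X)
  if (p %% 2 == 0) p else p + 1
}
m <- default_m(X)

# mutation may be a vector of length N, one rate per generation.
# Assumed schedule: decay linearly from 2/p down to the default 1/p.
N <- 8
mutation <- seq(2 / p, 1 / p, length.out = N)

# prior controls the density of 1's in the initial population.
# Assumption: each entry is an independent Bernoulli(prior) draw;
# the package may initialise its population differently.
prior <- 0.35
init_pop <- matrix(rbinom(m * p, size = 1, prob = prior), nrow = m, ncol = p)

# With the package installed, these values would be passed as, for example:
# fit <- pga(y, X, m = m, N = N, B = 25, mutation = mutation, prior = prior)
```

Raising `prior` in this sketch simply makes each row of `init_pop` start with more selected variables, which is the effect the `prior` argument is documented to have.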
Returns a B-by-p matrix. Element (b,j) of the matrix is the frequency with which variable j “shows up” in the last-generation population of universe b.
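One way to use the returned matrix is to average each column over the B universes, which gives a single score per variable that can then be ranked. The sketch below uses a simulated matrix in place of real `pga` output; the object names `freq`, `score`, and `ranking` are assumptions for illustration.

```r
set.seed(2)

# Stand-in for the B-by-p matrix returned by pga():
# B = 25 universes, p = 6 variables, entries in [0, 1].
B <- 25
p <- 6
freq <- matrix(runif(B * p), nrow = B, ncol = p)

# Make variables 2 and 5 appear more often, capped at 1.
freq[, c(2, 5)] <- pmin(freq[, c(2, 5)] + 0.5, 1)

# Average over universes: one ensemble score per variable.
score <- colMeans(freq)

# Order variables from most to least frequently selected.
ranking <- order(score, decreasing = TRUE)
print(round(score[ranking], 2))
```

Variables whose average frequency stands clearly above the rest are the natural candidates to keep; where to draw that line is a judgment call that this sketch does not address.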
Please see pga-package for an introductory overview and examples.
Dandi Qiao and Mu Zhu, University of Waterloo, Canada.
Zhu M, Chipman HA (2006). Darwinian evolution in parallel universes: A parallel genetic algorithm for variable selection. Technometrics, 48(4), 491-502.
Zhu M (2008). Kernels and ensembles: Perspectives on statistical learning. The American Statistician, 62(2), 97-109.