AsciiGridImpute {yaImpute} | R Documentation |
AsciiGridImpute finds nearest neighbor reference observations for each point
in the input grid maps and outputs maps of selected Y-variables in a set of
output grid maps.
AsciiGridPredict applies a predict function to each point in the input grid
maps and outputs maps of the prediction(s) in one or more output grid maps
(see Details).
One row of each grid map is read and processed at a time, thereby avoiding the need to build the huge objects in R that would be necessary if all the rows of all the maps were processed together.
AsciiGridImpute(object,xfiles,outfiles,xtypes=NULL,ancillaryData=NULL,
                ann=NULL,lon=NULL,lat=NULL,rows=NULL,cols=NULL,
                nodata=NULL,myPredFunc=NULL,...)
AsciiGridPredict(object,xfiles,outfiles,xtypes=NULL,lon=NULL,lat=NULL,
                 rows=NULL,cols=NULL,nodata=NULL,myPredFunc=NULL,...)
object |
An object of class yai, or any other object for which
a predict function is defined. See Details. |
xfiles |
A list of input file names where there is one
grid file for each X-variable. List elements must be given the same names
as the X-variables they correspond with and there must be one file for
each X-variable used when object was built. |
outfiles |
One of these two forms:
|
xtypes |
A list of data type names that corresponds exactly to the data types of the
maps listed in xfiles. Each value can be one of:
"logical", "integer", "numeric", "character". If NULL,
or if a type is missing for a member of xfiles, type "numeric" is used. |
ancillaryData |
A data frame of Y-variables that may not have been used in
the original call to yai. There must be one row for
each reference observation, no missing data, and row names must match those used
in the original reference observations. |
ann |
if NULL, the value is taken from object. When TRUE, ann is
used to find neighbors, and when FALSE a slow exact search is used (ignored when
method randomForest was used to create the original yai object). |
lon |
if NULL, the value of cols is used. Otherwise, a 2-element
vector giving the range of longitudes (horizontal distance) desired for the output. |
lat |
if NULL, the value of rows is used. Otherwise, a 2-element
vector giving the range of latitudes (vertical distance) desired for the output. |
rows |
if NULL, all rows from the input grids are used. Otherwise, rows is a 2-element
vector giving the rows desired for the output. If the second element is greater than
the number of rows, the header value YLLCORNER is adjusted accordingly. Ignored
if lat is specified. |
cols |
if NULL, all columns from the input grids are used. Otherwise, cols is a 2-element
vector giving the columns desired for the output. If the first element is greater than
one, the header value XLLCORNER is adjusted accordingly. Ignored
if lon is specified. |
nodata |
the NODATA_VALUE for the output. If NULL, the value is taken from the
input grids. |
myPredFunc |
A function called to predict output using the object and newdata from
the xfiles. Two arguments are required: the first is object and the
second is a data frame of the new predictor variables. If NULL, the generic
predict function is called for object. |
... |
passed to myPredFunc or predict . |
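As an illustration of the myPredFunc contract described above, a minimal sketch might look like the following. The model, the function name clippedPredict, and the clipping rule are hypothetical; they are chosen only to show the two required arguments and how additional arguments pass through.

```r
# Hypothetical model: any object with a predict method would do.
fit <- lm(Sepal.Length ~ Petal.Length + Petal.Width, data = iris)

# The first argument is the object, the second is the data frame of new
# predictor values; anything else arrives through '...'.
clippedPredict <- function(object, newdata, ...) {
  p <- predict(object, newdata, ...)
  pmax(p, 0)   # illustrative post-processing: no negative predictions
}

# Stand-in for the data frame that would be built from one map row.
newdata <- data.frame(Petal.Length = c(1.4, 4.7), Petal.Width = c(0.2, 1.4))
pred <- clippedPredict(fit, newdata)
```

A function of this shape could then be supplied as myPredFunc=clippedPredict in a call to AsciiGridPredict.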
The input maps are assumed to be Asciigrid maps with 6-line headers
containing the following tags: NCOLS, NROWS, XLLCORNER, YLLCORNER,
CELLSIZE and NODATA_VALUE (case insensitive). The headers should be
identical; a warning is issued if they are not. It is critical that NODATA_VALUE
is the same on all input maps.
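To make the header layout concrete, the following sketch writes a tiny grid file and reads its six header lines into a named vector. The helper name readGridHeader and the grid values are hypothetical, and this is not necessarily how the package itself parses headers.

```r
# Write a tiny example grid to a temporary file (values are made up).
gridFile <- tempfile(fileext = ".txt")
writeLines(c("NCOLS 3", "nrows 2", "XLLCORNER 1", "YLLCORNER 1",
             "CELLSIZE 1", "NODATA_VALUE -9999",
             "1 2 3", "4 -9999 6"), gridFile)

# Hypothetical helper: read the six header lines into a named numeric vector.
readGridHeader <- function(file) {
  lines <- readLines(file, n = 6)
  parts <- strsplit(trimws(lines), "[[:space:]]+")
  vals <- as.numeric(vapply(parts, `[`, character(1), 2))
  # Tags are case insensitive, so normalize them to upper case.
  names(vals) <- toupper(vapply(parts, `[`, character(1), 1))
  vals
}

hdr <- readGridHeader(gridFile)
```

Headers read this way from several input maps can be compared with identical() to detect a mismatch before processing.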
The function builds data frames from the input maps one row at a time and builds predictions using those data frames as newdata. Each row of the input maps is processed in sequence so that the entire maps are not stored in memory. The function works by opening all the input files and reading one line (row) at a time from each. The output file(s) are created one line at a time as the input maps are processed.
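The row-at-a-time approach can be sketched with base R connections. This is only an illustration under simple assumptions (two made-up input grids, a sum as a stand-in for the prediction) and is not the package's actual code.

```r
# Two small made-up grids that share one header.
header <- c("NCOLS 3", "NROWS 2", "XLLCORNER 1", "YLLCORNER 1",
            "CELLSIZE 1", "NODATA_VALUE -9999")
f1 <- tempfile(); f2 <- tempfile(); outFile <- tempfile()
writeLines(c(header, "1 2 3", "4 -9999 6"), f1)
writeLines(c(header, "10 20 30", "40 50 60"), f2)

nodata <- -9999
in1 <- file(f1, "r"); in2 <- file(f2, "r"); out <- file(outFile, "w")
writeLines(readLines(in1, n = 6), out)   # copy the header to the output
invisible(readLines(in2, n = 6))         # skip the second header

for (i in 1:2) {                         # NROWS is 2 in this example
  # Build newdata from one line (row) of each input map.
  newdata <- data.frame(a = scan(in1, nlines = 1, quiet = TRUE),
                        b = scan(in2, nlines = 1, quiet = TRUE))
  pred <- newdata$a + newdata$b          # stand-in for a real prediction
  # Cells that are NODATA on any input map are NODATA on the output.
  pred[newdata$a == nodata | newdata$b == nodata] <- nodata
  writeLines(paste(pred, collapse = " "), out)
}
close(in1); close(in2); close(out)

result <- readLines(outFile)
```

Only one row per map is held in memory at any time, which is the point of the design.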
Use AsciiGridImpute for objects built with yai; otherwise use
AsciiGridPredict. When AsciiGridPredict is used, the following rules apply.
First, when myPredFunc is not NULL, it is called with the arguments
object, newdata, ... where newdata is the data frame built from the input
maps; otherwise the generic predict function is called with these same
arguments. When object and myPredFunc are both NULL, a copy of newdata is
used as the prediction. This is useful when lat, lon, rows, or cols are used
to subset the maps.
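The rules above can be summarized in a short sketch. The helper predictRow is hypothetical and only mirrors the described order of precedence; it is not taken from the package source.

```r
# Hypothetical helper that mirrors the rules described above.
predictRow <- function(object, newdata, myPredFunc = NULL, ...) {
  if (!is.null(myPredFunc)) {
    myPredFunc(object, newdata, ...)   # a user-supplied function takes precedence
  } else if (is.null(object)) {
    newdata                            # both NULL: newdata is the prediction
  } else {
    predict(object, newdata, ...)      # otherwise use the generic predict
  }
}

newdata <- data.frame(Petal.Length = c(1.4, 4.7))
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)

viaPredict <- predictRow(fit, newdata)
viaCustom  <- predictRow(fit, newdata,
                         myPredFunc = function(object, newdata, ...)
                           round(predict(object, newdata, ...)))
passThru   <- predictRow(NULL, newdata)
```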
The NODATA_VALUE is output for every NODATA_VALUE found on any
grid cell on any one of the input maps (the predict function is not called for
these grid cells). NODATA_VALUE is also output for any grid cell where
the predict function returns an NA. If factors are used as X-variables in
object, the levels found in the map data are checked against those used in
building the object. If new levels are found, the corresponding output
map grid point is set to NODATA_VALUE; the predict function is not called
for these cells, as most predict functions will fail in these circumstances.
Checking on factors depends on object containing a meaningful member
named xlevels, as is done for model objects produced by lm. Note that
objects produced by randomForest version 4.5-19 and prior do not contain
xlevels; use function addXlevels to add the necessary element if one or
more factors are used.
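The level check can be illustrated with an lm object, which stores factor levels in its xlevels member. The map codes and the NODATA value below are made up, and the bookkeeping is a simplified sketch rather than the package's own logic.

```r
# A model with a factor predictor; lm stores the factor levels in 'xlevels'.
fit <- lm(Sepal.Length ~ Species, data = iris)
lev <- fit$xlevels$Species

# Made-up map codes for one grid row; 4 has no matching level.
codes <- c(1, 3, 2, 4, 1)
nodata <- -9999

# Cells with codes outside the known levels are not passed to predict.
legal <- codes %in% seq_along(lev)
pred <- rep(nodata, length(codes))
newdata <- data.frame(Species = factor(lev[codes[legal]], levels = lev))
pred[legal] <- predict(fit, newdata)

# Count how often each illegal code occurred.
illegalCounts <- table(codes[!legal])
```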
Asciigrid maps do not contain character data, only numbers. The numbers in the
maps are matched to the xlevels by subscript (the first entry in a level corresponds
to the numeric value 1 in the Asciigrid maps, the second to the number 2, and so
on). Care must be taken by the user to ensure that the coding scheme used in
building the maps is identical to that used in building the object.
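A small sketch of the subscript matching, using the iris species as a stand-in factor, shows why the coding scheme matters. The variable names are illustrative only.

```r
lev <- levels(iris$Species)         # "setosa" "versicolor" "virginica"
codes <- as.numeric(iris$Species)   # the codes one would write to a map
decoded <- lev[codes]               # match codes back to levels by subscript

# A factor built with a different level order yields different codes,
# so maps coded from it would be decoded incorrectly.
reordered <- factor(iris$Species, levels = rev(lev))
sameCoding <- identical(as.numeric(reordered), codes)
```

Here sameCoding is FALSE, which is the kind of mismatch between map coding and object coding that the user needs to rule out.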
An invisible list containing the following named elements:
unexpectedNAs |
A data frame listing the map row numbers and the number
of NA values generated by the predict function for each row. If none
are generated for a row, the row is not reported; if none are generated for any row,
the data frame is NULL. |
illegalLevels |
A data frame listing levels found in the maps that
were not found in the xlevels for the object. The row names
are the illegal levels, the column names are the variable names, and the
values are the number of grid cells where the illegal levels were found. |
outputLegend |
A data frame showing the relationship between levels in
the output maps and those found in object. The row names are
level index values, the column names are variable names, and the values
are the levels. NULL if no factors are output. |
inputLegend |
A data frame showing the relationship between levels found in
the input maps and those found in object. The row names are
level index values (this function assumes they correspond to numeric values
on the maps), the column names are variable names, and the values
are the levels. NULL if no factors are input. This information is basically
the same as that in xlevels. |
Nicholas L. Crookston ncrookston@fs.fed.us
Andrew O. Finley finleya@msu.edu
yai, impute, and newtargets
## These commands write new files to your working directory

library(yaImpute)
library(randomForest)

# Use the iris data
data(iris)

# Section 1: Imagine that the iris are planted in a planting bed.
# The following set of commands create Asciigrid map
# files for four attributes to illustrate the planting layout.

# Change species from a character factor to numeric (the sp classes
# can not handle character data).

sLen <- matrix(iris[,1],10,15)
sWid <- matrix(iris[,2],10,15)
pLen <- matrix(iris[,3],10,15)
pWid <- matrix(iris[,4],10,15)
spcd <- matrix(as.numeric(iris[,5]),10,15)

# Make maps of each variable.

header = c("NCOLS 15","NROWS 10","XLLCORNER 1","YLLCORNER 1",
           "CELLSIZE 1","NODATA_VALUE -9999")
cat(file="slen.txt",header,sep="\n")
cat(file="swid.txt",header,sep="\n")
cat(file="plen.txt",header,sep="\n")
cat(file="pwid.txt",header,sep="\n")
cat(file="spcd.txt",header,sep="\n")

write.table(sLen,file="slen.txt",append=TRUE,col.names=FALSE,
            row.names=FALSE)
write.table(sWid,file="swid.txt",append=TRUE,col.names=FALSE,
            row.names=FALSE)
write.table(pLen,file="plen.txt",append=TRUE,col.names=FALSE,
            row.names=FALSE)
write.table(pWid,file="pwid.txt",append=TRUE,col.names=FALSE,
            row.names=FALSE)
write.table(spcd,file="spcd.txt",append=TRUE,col.names=FALSE,
            row.names=FALSE)

# Section 2: Create functions to predict species

# set the random number seed so that example results are consistent
# normally, leave out this command
set.seed(12345)

# sample the data
refs <- sample(rownames(iris),50)
y <- data.frame(Species=iris[refs,5],row.names=rownames(iris[refs,]))

# build a yai imputation for the reference data.
# set the random number seed so that example results are consistent
# normally, one would leave out this next command
set.seed(1)
rfNN <- yai(x=iris[refs,1:4],y=y,method="randomForest")

# make lists of input and output map files.
xfiles <- list(Sepal.Length="slen.txt",Sepal.Width="swid.txt",
               Petal.Length="plen.txt",Petal.Width="pwid.txt")
outfiles1 <- list(distance="dist.txt",Species="spOutrfNN.txt")

# map the imputation-based predictions for the input maps
AsciiGridImpute(rfNN,xfiles,outfiles1,ancillaryData=iris)

# build a randomForest predictor
rf <- randomForest(x=iris[refs,1:4],y=iris[refs,5])

# map the predictions for the input maps
outfiles2 <- list(predict="spOutrf.txt")
AsciiGridPredict(rf,xfiles,outfiles2,xtypes=NULL,rows=NULL)

# read the species maps and plot them using the sp package classes
if (require(sp))
{
  spOrig <- read.asciigrid("spcd.txt")
  sprfNN <- read.asciigrid("spOutrfNN.txt")
  sprf   <- read.asciigrid("spOutrf.txt")
  dist   <- read.asciigrid("dist.txt")

  par(mfcol=c(2,2))
  image(spOrig,col=c(1,2,3))
  title("Original")
  # sprfNN holds the AsciiGridImpute output, sprf the AsciiGridPredict output
  image(sprfNN,col=c(1,2,3))
  title("Using Impute")
  image(sprf,col=c(1,2,3))
  title("Using Predict")
  image(dist)
  title("Neighbor Distances")
}