AsciiGridImpute {yaImpute}    R Documentation

Imputes/Predicts data for Ascii Grid maps

Description

AsciiGridImpute finds nearest neighbor reference observations for each point in the input grid maps and outputs maps of selected Y-variables in a set of output grid maps.

AsciiGridPredict applies a predict function to each point in the input grid maps and outputs maps of the prediction(s) in one or more output grid maps.

One row of each grid map is read and processed at a time, thereby avoiding the need to build the huge objects in R that would be necessary if all the rows of all the maps were processed together.

Usage

AsciiGridImpute(object,xfiles,outfiles,xtypes=NULL,ancillaryData=NULL,
                ann=NULL,rows=NULL,cols=NULL,
                nodata=NULL,myPredFunc=NULL,...)

AsciiGridPredict(object,xfiles,outfiles,xtypes=NULL,rows=NULL,
                cols=NULL,nodata=NULL,myPredFunc=NULL,...)

Arguments

object An object of class yai, or any other object for which a predict function is defined. When myPredFunc is specified, object may optionally be NULL (but not missing).
xfiles A list of input file names where there is one grid file for each X-variable. List elements must be given the same names as the X-variables they correspond with and there must be one file for each X-variable used in argument object.
outfiles One of these two forms:
  • (1) A file name that is understood to correspond to the single prediction returned by myPredFunc (generally applies to AsciiGridPredict), or
  • (2) A list of output file names where there is one grid file for each desired output variable. While there may be many Y-variables in the object or ancillaryData, only those you want mapped need to be specified. In addition to variable names, the following two special names can be coded when the object class is yai: with distance="filename" a map of the distances is output, and with useid="filename" a map of useid's is output. When myPredFunc is not NULL and it returns a vector, an additional special name of predict="filename" is used.
xtypes A list of data type names that correspond exactly to the data types of the maps listed in xfiles. Each value can be one of: "logical", "integer", "numeric", "character". If NULL, or if a type is missing for a member of xfiles, "numeric" is assigned.
ancillaryData A data frame of Y-variables that may not have been used in the original call to yai. There must be one row for each reference observation, no missing data, and row names must match those used in the reference observations.
ann If NULL, the value is taken from object. When TRUE, ann is used to find neighbors, and when FALSE a slow exact search is used.
rows If NULL, all rows from the input grids are used. Otherwise, rows is a 2-element vector giving the rows desired for the output. If the first element is greater than one, the header value YLLCORNER is adjusted accordingly.
cols If NULL, all columns from the input grids are used. Otherwise, cols is a 2-element vector giving the columns desired for the output. If the first element is greater than one, the header value XLLCORNER is adjusted accordingly.
nodata The NODATA_VALUE for the output. If NULL, the value is taken from the input grids.
myPredFunc A function called to predict output using the object and newdata from the xfiles. Two arguments are required: the first is object and the second is a data frame of the new predictor variables. If NULL, the generic predict function is called.
... Passed to myPredFunc or predict.
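
The two-argument form required of myPredFunc can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function name prob_pred, the glm fitted to the mtcars data, and the choice to return probabilities are not part of yaImpute.

```r
# A stand-in model: probability of a manual transmission given weight.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Hypothetical custom prediction function. The first argument is the
# object, the second is a data frame of new predictor values.
prob_pred <- function(object, newdata, ...) {
  as.numeric(predict(object, newdata, type = "response", ...))
}

# Called directly here on three new observations.
newdata <- data.frame(wt = c(2, 3, 4))
p <- prob_pred(fit, newdata)
p

# With grid files in place it would be passed along these lines:
# AsciiGridPredict(fit, xfiles, outfiles, myPredFunc = prob_pred)
```

Because the function returns a single vector, a single output file name (form 1 of outfiles) would presumably be the natural fit.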

    Details

    The input maps are assumed to be Ascii Grid maps with 6-line headers containing the following tags: NCOLS, NROWS, XLLCORNER, YLLCORNER, CELLSIZE and NODATA_VALUE (case insensitive).
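
As an illustration of the header layout described above, the following base R sketch writes a tiny grid and parses its six header lines. The helper read_grid_header is hypothetical and is not a function in yaImpute; the sketch also assumes one tag and one value per header line.

```r
# Write a tiny grid with the six header lines.
grid_file <- tempfile(fileext = ".txt")
writeLines(c("NCOLS 3", "NROWS 2", "XLLCORNER 1", "YLLCORNER 1",
             "CELLSIZE 1", "NODATA_VALUE -9999",
             "1 2 3", "4 5 -9999"), grid_file)

# Hypothetical helper: return the header as a named list of numbers.
read_grid_header <- function(path) {
  lines  <- readLines(path, n = 6)
  parts  <- strsplit(trimws(lines), "[[:space:]]+")
  # Tags are case insensitive, so normalize them to upper case.
  tags   <- toupper(vapply(parts, `[`, character(1), 1))
  values <- as.numeric(vapply(parts, `[`, character(1), 2))
  setNames(as.list(values), tags)
}

hdr <- read_grid_header(grid_file)
str(hdr)
```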

    The function builds a data frame from the input maps and builds predictions using that data frame as new data. Each row of the input maps is processed in sequence so that the entire maps are not stored in memory. The function works by opening all the input files and reading one line at a time from each. The output file(s) are created as the input maps are processed.
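
The row-at-a-time approach can be sketched in base R as follows. This is an illustration of the idea only, not the package's code: the file names, the lm stand-in model, and the assumption that NROWS is the second header line are all made up for the example.

```r
# Two small input grids (hypothetical names and values).
header <- c("NCOLS 3", "NROWS 2", "XLLCORNER 1", "YLLCORNER 1",
            "CELLSIZE 1", "NODATA_VALUE -9999")
x1_file  <- tempfile(fileext = ".txt")
x2_file  <- tempfile(fileext = ".txt")
out_file <- tempfile(fileext = ".txt")
writeLines(c(header, "1 2 3", "4 5 6"), x1_file)
writeLines(c(header, "10 20 30", "40 50 60"), x2_file)

# A stand-in model; any object with a predict method would do.
train <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(10, 30, 20, 60))
train$y <- train$x1 + train$x2
fit <- lm(y ~ x1 + x2, data = train)

# Open every input once and keep the connections open.
cons <- lapply(list(x1 = x1_file, x2 = x2_file), file, open = "r")
out  <- file(out_file, open = "w")

# Read past the header of each input and copy one header to the output.
hdrs <- lapply(cons, readLines, n = 6)
writeLines(hdrs[[1]], out)
nrows <- as.integer(strsplit(hdrs[[1]][2], "[[:space:]]+")[[1]][2])

for (i in seq_len(nrows)) {
  # One line from each grid becomes one column of the new data frame.
  newdata <- as.data.frame(lapply(cons, function(con)
    scan(con, nlines = 1, quiet = TRUE)))
  pred <- predict(fit, newdata)
  writeLines(paste(round(pred, 4), collapse = " "), out)
}

invisible(lapply(cons, close))
close(out)
readLines(out_file)
```

Only one map row per input is held in memory at any time, which is the point of the design.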

    If factors are used as X-variables in object, the levels found in the new data are checked against those used in building the object. If new levels are found, the corresponding map pixel is considered a missing value. This is done even though most predict functions would simply fail in these circumstances.

    The methods used for checking depend on object containing a meaningful member list named xlevels, as is done in model objects produced by lm. Note that objects produced by randomForest do not contain xlevels; use function addXlevels to add the necessary element if one or more factors are used.
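
The kind of check described above can be sketched with an lm object, which does carry an xlevels member. This is a base R illustration under assumed data (a subset of iris), not the package's internal code, and it does not call addXlevels.

```r
# Fit a model with a factor predictor on data that lacks one level.
train <- droplevels(subset(iris, Species != "virginica"))
fit <- lm(Sepal.Length ~ Sepal.Width + Species, data = train)
fit$xlevels   # the levels used to build the model

# New data (think of one row of map pixels) with an unseen level.
newdata <- iris[c(1, 51, 101, 102), c("Sepal.Width", "Species")]
newdata$Species <- as.character(newdata$Species)

# Flag values that are not among the levels used in the model.
unseen <- !(newdata$Species %in% fit$xlevels$Species)
table(newdata$Species[unseen])   # counts per offending level

# Treat the offending pixels as missing rather than letting predict fail.
newdata$Species <- factor(newdata$Species, levels = fit$xlevels$Species)
ok <- !is.na(newdata$Species)
pred <- rep(NA_real_, nrow(newdata))
pred[ok] <- predict(fit, newdata[ok, , drop = FALSE])
pred
```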

    Missing data are first removed from the data frame (if all values are missing, an output frame of just missing data is built). Once neighbors have been found for the observations in the input data frame, imputed values for the Y-variables are computed using impute.yai with its default arguments. Missing data, if any, are then added back into the output, which is written to the grid maps.
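
The remove-then-restore handling of missing data can be sketched as below. An lm fit stands in for the yai object and impute.yai, and the values and the nodata code are assumptions chosen for the example.

```r
nodata <- -9999

# One hypothetical row of map values for two X-variables.
newdata <- data.frame(x1 = c(1, nodata, 3, 4),
                      x2 = c(10, 20, nodata, 40))
newdata[newdata == nodata] <- NA

# Stand-in model (y is exactly x1 + x2 in the training data).
train <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(10, 30, 20, 60),
                    y  = c(11, 32, 23, 64))
fit <- lm(y ~ x1 + x2, data = train)

# Predict only for complete rows; the rest keep the nodata value.
ok  <- complete.cases(newdata)
out <- rep(nodata, nrow(newdata))
if (any(ok)) out[ok] <- predict(fit, newdata[ok, , drop = FALSE])
out
```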

    Value

    A list of factor levels that are in the maps but were not used to build the object, or NULL if there are none. The list includes the number of map grid points at which each offending level was found.

    Author(s)

    Nicholas L. Crookston ncrookston@fs.fed.us
    Andrew O. Finley afinley@stat.umn.edu

    See Also

    yai, impute, and newtargets

    Examples

    
    ## These commands write new files to your working directory
    require(yaImpute)
    require(randomForest)
    
    # Use the iris data
    data(iris)
    
    # Change species from a character factor to numeric (the sp classes
    # cannot handle character data).
    
    iris[,5]<-as.factor(as.numeric(iris[,5]))
    sLen <- matrix(iris[,1],10,15)
    sWid <- matrix(iris[,2],10,15)
    pLen <- matrix(iris[,3],10,15)
    pWid <- matrix(iris[,4],10,15)
    spcd <- matrix(as.numeric(iris[,5]),10,15)
    
    # Make a "map" of each variable.
    
    header = c("NCOLS 15","NROWS 10","XLLCORNER 1","YLLCORNER 1",
               "CELLSIZE 1","NODATA_VALUE -9999")
    cat(file="slen.txt",header,sep="\n")
    cat(file="swid.txt",header,sep="\n")
    cat(file="plen.txt",header,sep="\n")
    cat(file="pwid.txt",header,sep="\n")
    cat(file="spcd.txt",header,sep="\n")
    
    write.table(sLen,file="slen.txt",append=TRUE,col.names=FALSE,
                row.names=FALSE)
    write.table(sWid,file="swid.txt",append=TRUE,col.names=FALSE,
                row.names=FALSE)
    write.table(pLen,file="plen.txt",append=TRUE,col.names=FALSE,
                row.names=FALSE)
    write.table(pWid,file="pwid.txt",append=TRUE,col.names=FALSE,
                row.names=FALSE)
    write.table(spcd,file="spcd.txt",append=TRUE,col.names=FALSE,
                row.names=FALSE)
    
    # sample the data
    refs <- sample(rownames(iris),50)
    y <- data.frame(Species=iris[refs,5],row.names=rownames(iris[refs,]))
    
    # build a yai imputation for the reference data.
    rfNN <- yai(x=iris[refs,1:4],y=y,method="randomForest")
    
    xfiles <- list(Sepal.Length="slen.txt",Sepal.Width="swid.txt",
                   Petal.Length="plen.txt",Petal.Width="pwid.txt")
    outfiles1 <- list(distance="dist.txt",Species="spOutrfNN.txt")
    
    # map the imputation-based predictions for the input maps
    AsciiGridImpute(rfNN,xfiles,outfiles1,ancillaryData=iris)
    
    # build a randomForest predictor
    rf <- randomForest(x=iris[refs,1:4],y=iris[refs,5])
    
    # map the predictions for the input maps
    outfiles2 <- list(predict="spOutrf.txt")
    AsciiGridPredict(rf,xfiles,outfiles2,xtypes=NULL,rows=NULL)
    
    # read the species maps and plot them using the sp package classes
    if (require(sp)) {
       spOrig <- read.asciigrid("spcd.txt")
       sprfNN <- read.asciigrid("spOutrfNN.txt")
       sprf <- read.asciigrid("spOutrf.txt")
       dist <- read.asciigrid("dist.txt")
    
       par(mfcol=c(2,2))
       image(spOrig,col=c(1,2,3))
       title("Original")
       image(sprfNN,col=c(1,2,3))
       title("Using Impute")
       image(sprf,col=c(1,2,3))
       title("Using Predict")
       image(dist)
       title("Neighbor Distances")
    }
    

    [Package yaImpute version 0.0-4 Index]