daply {plyr}    R Documentation

Split data frame, apply function, and return results in an array

Description

For each subset of a data frame, apply a function, then combine the results into an array.

Usage

daply(.data, .variables, .fun = NULL, ..., .progress = "none", .drop = TRUE)

Arguments

.data data frame to be processed
.variables variables to split data frame by, as quoted variables, a formula or character vector
.fun function to apply to each piece
... other arguments passed on to .fun
.progress name of the progress bar to use, see create_progress_bar
.drop should extra dimensions of length 1 be dropped, simplifying the output. Defaults to TRUE

Details

All plyr functions use the same split-apply-combine strategy: they split the input into simpler pieces, apply .fun to each piece, and then combine the pieces into a single data structure. This function splits data frames by variable and combines the result into an array. If there are no results, then this function will return a vector of length 0 (vector()).
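
The three steps can be sketched by hand in base R. This is only an illustration of the strategy on a toy data frame (the column names g and x are made up for the example), not plyr's actual implementation:

```r
# Toy data; the column names g and x are hypothetical.
df <- data.frame(g = c("a", "a", "b", "b", "b"),
                 x = c(1, 3, 2, 4, 6))

# Split: one data frame per distinct value of g.
pieces <- split(df, df$g)

# Apply: compute one value per piece.
results <- lapply(pieces, function(piece) mean(piece$x))

# Combine: collapse the list of results into a named numeric vector.
combined <- unlist(results)
combined
```

daply wraps these steps in a single call and, when there are several splitting variables, arranges the results along one array dimension per variable.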

daply with a function that operates column-wise is similar to aggregate.
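
A small side-by-side sketch of that similarity, assuming plyr is installed and using a made-up data frame (g, x and y are hypothetical names). Both calls compute per-group column means; they differ mainly in the shape of what they return:

```r
library(plyr)

# Toy data; the column names g, x and y are hypothetical.
df <- data.frame(g = c("a", "a", "b", "b"),
                 x = c(1, 3, 2, 6),
                 y = c(10, 20, 30, 50))

# daply: column means of each piece, combined into an array
# with one row per value of g.
by_daply <- daply(df, .(g), function(piece) colMeans(piece[, c("x", "y")]))

# aggregate: the same means, returned as a data frame with g as a column.
by_aggregate <- aggregate(df[, c("x", "y")], by = list(g = df$g), FUN = mean)

by_daply
by_aggregate
```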


Value

If the results are atomic with the same type and dimensionality, a vector, matrix or array; otherwise, a list-array (a list with dimensions).
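
The two cases can be told apart by inspecting the result. The following is a minimal sketch, assuming plyr is installed and using a made-up data frame (g and x are hypothetical names); the exact dimensions of the list-array may depend on the plyr version:

```r
library(plyr)

# Toy data; the column names g and x are hypothetical.
df <- data.frame(g = c("a", "a", "b", "b"),
                 x = c(1, 3, 2, 6))

# Atomic results of the same type and size: an atomic vector.
means <- daply(df, .(g), function(piece) mean(piece$x))
is.numeric(means)

# Non-atomic results (here, a list per piece): a list with dimensions.
summaries <- daply(df, .(g), function(piece) {
  list(n = nrow(piece), range = range(piece$x))
})
is.list(summaries)
dim(summaries)
```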

Author(s)

Hadley Wickham <h.wickham@gmail.com>

Examples

daply(baseball, .(year), nrow)

# Several different ways of summarising by variables that should not be
# included in the summary. colMeans computes the mean of each column
# of a piece.

daply(baseball[, c(2, 6:9)], .(year), colMeans)
daply(baseball[, 6:9], .(baseball$year), colMeans)
daply(baseball, .(year), function(df) colMeans(df[, 6:9]))

[Package plyr version 0.1.5 Index]