ddply {plyr}    R Documentation

Split data frame, apply function, and return results in a data frame

Description

For each subset of a data frame, apply a function, then combine the results into a data frame.

Usage

ddply(.data, .variables, .fun = NULL, ..., .progress = "none", .drop = TRUE)

Arguments

.data       data frame to be processed
.variables  variables to split the data frame by, as quoted variables, a formula, or a character vector
.fun        function to apply to each piece
...         other arguments passed on to .fun
.progress   name of the progress bar to use; see create_progress_bar
.drop
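
The three forms accepted by .variables can be compared side by side. This is a minimal sketch, assuming a current plyr release is installed; the data frame df and the helper count_rows are hypothetical and exist only for illustration.

```r
library(plyr)

# Toy data (hypothetical, not from the plyr documentation)
df <- data.frame(g = c("a", "a", "b"), x = c(1, 3, 5))

# Hypothetical helper: number of rows in each piece
count_rows <- function(piece) data.frame(n = nrow(piece))

by_quoted    <- ddply(df, .(g), count_rows)  # quoted variables
by_formula   <- ddply(df, ~ g, count_rows)   # formula
by_character <- ddply(df, "g", count_rows)   # character vector
```

All three calls are expected to split df on the same column, so the choice between them is mostly a matter of convenience, for example a character vector when the column names are computed at run time.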

Details

All plyr functions use the same split-apply-combine strategy: they split the input into simpler pieces, apply .fun to each piece, and then combine the pieces into a single data structure. This function splits data frames by variables and combines the result into a data frame. If there are no results, then this function will return a data frame with zero rows and columns (data.frame()).
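
The split-apply-combine strategy described above can be sketched by hand in base R. This is an illustration of the idea only, not plyr's implementation; the data frame df and the helper summarise_piece are hypothetical.

```r
# Toy data (hypothetical, not from the plyr documentation)
df <- data.frame(g = c("a", "a", "b", "b", "b"),
                 x = c(1, 3, 2, 4, 6),
                 stringsAsFactors = FALSE)

# Split: one data frame per distinct value of g
pieces <- split(df, df$g)

# Apply: summarise each piece as a one-row data frame
summarise_piece <- function(piece) {
  data.frame(g = piece$g[1], mean_x = mean(piece$x),
             stringsAsFactors = FALSE)
}
results <- lapply(pieces, summarise_piece)

# Combine: stack the per-piece results into a single data frame
combined <- do.call(rbind, results)
rownames(combined) <- NULL
```

ddply wraps these three steps in a single call, so the splitting variables and the function are the only things the caller has to supply.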

The least ambiguous behaviour results when .fun returns a data frame: the pieces are then combined with rbind.fill. If .fun returns an atomic vector of fixed length, the vectors are combined with rbind and the result is converted to a data frame. Any other return value results in an error.
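
The two supported return types can be compared on a small input. This is a minimal sketch, assuming a current plyr release is installed; the data frame df is hypothetical, and the names of the columns created from an atomic vector may differ between plyr versions.

```r
library(plyr)

# Toy data (hypothetical, not from the plyr documentation)
df <- data.frame(g = c("a", "a", "b"), x = c(1, 3, 5))

# .fun returns a data frame: the pieces are stacked and the
# splitting variable g is added as a column
as_df <- ddply(df, .(g), function(piece) {
  data.frame(mean_x = mean(piece$x))
})

# .fun returns an atomic vector of fixed length (here 2):
# each piece contributes one row, each element one column
as_vec <- ddply(df, .(g), function(piece) range(piece$x))
```

Returning a data frame keeps control of the column names in .fun itself, which is why it is the less ambiguous of the two options.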

Value

a data frame

Author(s)

Hadley Wickham <h.wickham@gmail.com>

Examples

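# Mean RBI per year; .fun receives each yearly piece as a data frame
mean_rbi <- function(df) mean(df$rbi, na.rm = TRUE)
rbi <- ddply(baseball, .(year), mean_rbi)
# The unnamed result of .fun is stored in column V1
with(rbi, plot(year, V1, type = "l"))

# Same computation, with splat() passing the columns as named arguments
mean_rbi <- function(rbi, ...) mean(rbi, na.rm = TRUE)
rbi <- ddply(baseball, .(year), splat(mean_rbi))

# Mean of every numeric column per year
ddply(baseball, .(year), numcolwise(mean), na.rm = TRUE)

# Add each player's career year (1 = first season)
base2 <- ddply(baseball, .(id), function(df) {
  transform(df, career_year = year - min(year) + 1)
})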

[Package plyr version 0.1.5 Index]