Baseball batting {plyr} | R Documentation
This data frame contains batting statistics for a subset of players collected from http://www.baseball-databank.org/. There are a total of 21,699 records, covering 1,228 players from 1871 to 2007. Only players with more than 15 seasons of play are included.
See the baseball case study in the introductory vignette (vignette("intro", "plyr")) for more details and an example of how you might explore this data.
Variables:
data(baseball)
A 21699 x 22 data frame
http://www.baseball-databank.org/
baberuth <- subset(baseball, id == "ruthba01")
baberuth$cyear <- baberuth$year - min(baberuth$year) + 1

calculate_cyear <- function(df) {
  transform(df,
    cyear = year - min(year),
    cpercent = (year - min(year)) / (max(year) - min(year))
  )
}

baseball <- ddply(baseball, .(id), calculate_cyear)
baseball <- subset(baseball, ab >= 25)

model <- function(df) {
  lm(rbi / ab ~ cyear, data=df)
}
model(baberuth)
models <- dlply(baseball, .(id), model)
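The example ends with a list of fitted models, one per player. A natural next step is to collapse that list back into a data frame for inspection. The following is a minimal sketch of one way to do this with ldply(), not part of the original example. The helper fit_summary and the names bb, fits, intercept, slope and rsquare are hypothetical, chosen here for illustration. The block rebuilds the models so that it can be run on its own.

```r
library(plyr)   # provides the baseball data, ddply(), dlply() and ldply()
data(baseball)

# Rebuild the per-player models, following the example above.
calculate_cyear <- function(df) {
  transform(df, cyear = year - min(year))
}
bb <- ddply(baseball, .(id), calculate_cyear)
bb <- subset(bb, ab >= 25)
models <- dlply(bb, .(id), function(df) lm(rbi / ab ~ cyear, data = df))

# Hypothetical helper: coefficients and R-squared for one fitted model.
fit_summary <- function(m) {
  c(intercept = unname(coef(m)[1]),
    slope     = unname(coef(m)[2]),
    rsquare   = summary(m)$r.squared)
}

# One row per player: the player id followed by the three summary values.
fits <- ldply(models, fit_summary)
head(fits)
```

Returning a named numeric vector from the helper lets ldply() bind the results row-wise, with the player id carried over from the split labels stored by dlply().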