sanfrancisco.home.sales {nutshell}    R Documentation
This data set contains information on homes sold in San Francisco between 2/13/2008 and 7/14/2009.
data(sanfrancisco.home.sales)
A data frame with 3281 observations on the following 15 variables.
line
county: San Francisco County
street
city: San Francisco
zip
date
price
bedrooms
squarefeet
lotsize
year
latitude
longitude
month
neighborhood
This data set was assembled from a variety of sources, including two Bay Area newspapers (the San Jose Mercury News and the San Francisco Chronicle), Yahoo Maps, and Zillow Neighborhood Boundaries.
This data set is used as an example in the book "R in a Nutshell" from O'Reilly Media. In the book, we took separate samples for training and testing. Indices for observations in each sample are included in sanfrancisco.home.sales.testing.indices and sanfrancisco.home.sales.training.indices.
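As a rough sketch of how those index vectors might be used, the snippet below splits a data frame into training and testing subsets. It assumes the indices are integer row positions, which this page does not state. The helper split.by.indices is hypothetical, and the small toy data frame is made up so the example runs without the nutshell package.

```r
# Hypothetical helper: split a data frame by two vectors of row positions.
# Assumption: the index vectors hold integer row numbers (not confirmed here).
split.by.indices <- function(df, training.indices, testing.indices) {
  list(training = df[training.indices, , drop = FALSE],
       testing  = df[testing.indices, , drop = FALSE])
}

# Made-up stand-in data so the sketch runs on its own.
toy <- data.frame(price      = c(500000, 750000, 620000, 910000, 430000),
                  squarefeet = c(900, 1400, 1100, 1800, 800))

toy.split <- split.by.indices(toy,
                              training.indices = c(1, 3, 5),
                              testing.indices  = c(2, 4))
nrow(toy.split$training)
nrow(toy.split$testing)
```

If the assumption holds, the same helper could be called with sanfrancisco.home.sales and the two index objects named above in place of the toy values.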
Data was assembled from a variety of sources, including:
http://www.sfgate.com
http://www.mercurynews.com
http://www.zillow.com/howto/api/neighborhood-boundaries.htm
data(sanfrancisco.home.sales)
library(lattice)
trellis.par.set(fontsize=list(text=7))

# Overall mean price per square foot; used as the slope of the reference line
dollars.per.squarefoot <- mean(
  sanfrancisco.home.sales$price / sanfrancisco.home.sales$squarefeet,
  na.rm=TRUE)

# Price against square footage, one panel per neighborhood
xyplot(price~squarefeet|neighborhood,
  data=sanfrancisco.home.sales,
  pch=19, cex=.2,
  subset=(zip!=94100 & zip!=94104 & zip!=94108 &
          zip!=94111 & zip!=94133 & zip!=94158 &
          price<4000000 &
          ifelse(is.na(squarefeet),FALSE,squarefeet<6000)),
  strip=strip.custom(strip.levels=TRUE, horizontal=TRUE,
                     par.strip.text=list(cex=.8)),
  panel=function(...) {
    panel.abline(a=0,b=dollars.per.squarefoot)
    panel.xyplot(...)
  }
)
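The reference line in the example above uses a single overall mean price per square foot. A per-group version of that figure could be sketched as follows. The helper price.per.sqft.by.group is hypothetical, and toy is invented data used so the snippet runs without the package.

```r
# Hypothetical helper: mean price per square foot within each group.
# Rows where the ratio is NA (e.g. missing square footage) are ignored.
price.per.sqft.by.group <- function(price, squarefeet, group) {
  tapply(price / squarefeet, group, mean, na.rm = TRUE)
}

# Invented stand-in data; the last row has no recorded square footage.
toy <- data.frame(
  price        = c(400000, 600000, 900000, 1200000),
  squarefeet   = c(1000, 1000, 1500, NA),
  neighborhood = c("A", "A", "B", "B"))

by.hood <- price.per.sqft.by.group(toy$price, toy$squarefeet,
                                   toy$neighborhood)
by.hood
```

Applied to the real data, the arguments would be the price, squarefeet, and neighborhood columns of sanfrancisco.home.sales.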