data.table {data.table}    R Documentation

Enhanced data.frame

Description

A data.table has the same internal structure as a data.frame (i.e. a list of vectors) but provides enhanced extract methods.

Usage

data.table(..., keep.rownames=FALSE, check.names=TRUE, key=NULL)
DT(...)

Arguments

... Just as ... in data.frame. These arguments are of either the form value or tag = value. Column names are created based on the tag (if present) or the deparsed argument itself.
The data.table can be formed from one or more of (mixed type) vector, matrix, data.frame, data.table, etc. Usual recycling rules are applied to vectors of different lengths.
make.names is used to resolve any column name conflicts in the union of the column names of these objects. Where a tag is supplied, the tag is prefixed to the column name (e.g. tag A and column b give A.b).
keep.rownames If ... is a matrix or data.frame, TRUE will retain the rownames of that object in a column named 'rn'. If the matrix or data.frame already has a column called 'rn', it will be renamed 'rn.1' via make.names. If more than one matrix or data.frame appears in ..., the rownames from all of those objects will be kept in multiple rn.<n> columns.
check.names Just as check.names in data.frame(). For example, this replaces spaces in column names with periods and ensures that column names are valid R object names.
key Character vector of length 1 containing one or more column names separated by commas. It is passed to setkey.
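
The name checking described for check.names can be previewed with base R alone, since data.frame() applies make.names when check.names=TRUE. The following is a minimal sketch using only base R; the column names are made up for illustration, and it does not call data.table itself.

```r
# Hypothetical column names, chosen only to illustrate the rules
raw <- c("my col", "a", "a", "2nd")

# make.names() replaces invalid characters with ".", prefixes names that
# start with a digit, and (with unique = TRUE) de-duplicates with a suffix
fixed <- make.names(raw, unique = TRUE)
print(fixed)

# data.frame() applies the same rules when check.names = TRUE (its default)
DF <- data.frame(`my col` = 1:2, check.names = TRUE)
print(names(DF))
```

Passing check.names=FALSE to data.frame() would leave the name "my col" untouched.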

Details

data.table creates a data.table from its arguments just as data.frame does. DT() is an alias for data.table() and is often used instead of as.data.table().

A data.table is a list of vectors, just like a data.frame. However:

  1. it never has rownames. Instead it may have an optional key of one or more columns, set using setkey. This key can be used for row indexing instead of rownames.
  2. when the data.table has over 20 rows, the print method displays column names at the bottom as well as at the top, to save scrolling up at the console.
  3. character vectors may be passed in, but they are automatically converted to factor. A data.table does not allow character columns, for time and space reasons.
  4. the main difference, though, is the enhanced functionality in [.data.table, where most documentation for this package lives.
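
The "list of vectors" structure, and the rownames that a data.frame carries but a data.table does not, can be inspected on a plain data.frame. This is a minimal sketch using base R only; DF and its columns are illustrative data, not part of the package.

```r
# Base R only; DF and its columns are made-up illustration data
DF <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))

# A data.frame is a list with one equal-length vector per column
is_list <- is.list(DF)
col_lengths <- vapply(DF, length, integer(1))
print(is_list)
print(col_lengths)

# A data.frame always carries rownames (automatic ones here)
rn <- rownames(DF)
print(rn)
```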

Value

A data.table.

Note

keep.rownames and check.names, if supplied, must be written in full because they appear after ... in the argument list, and R does not partially match argument names that follow .... For example, data.table(DF,keep=TRUE) will create a column called 'keep' containing TRUE, and this is correct behaviour. Most likely data.table(DF,keep.rownames=TRUE) was intended.
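
The matching rule can be seen with any function whose named arguments follow .... The sketch below uses a hypothetical function f, written only for this illustration; it is not part of data.table and merely reports where each argument ended up.

```r
# f is a hypothetical stand-in for any function with arguments after ...
f <- function(..., keep.rownames = FALSE) {
  list(dots = names(list(...)), keep.rownames = keep.rownames)
}

# Abbreviated name: 'keep' is not matched to keep.rownames, so it is
# captured by ... as an extra tagged argument
partial <- f(1:3, keep = TRUE)
print(partial)

# Full name: matched to keep.rownames as intended
full <- f(1:3, keep.rownames = TRUE)
print(full)
```

In the first call keep.rownames stays at its default of FALSE, which mirrors how data.table(DF,keep=TRUE) ends up with a column called 'keep'.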

Author(s)

Matthew Dowle

See Also

data.frame, tables, setkey, [.data.table

Examples

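# Build the same two columns as a data.frame and as a data.table
DF = data.frame(a=1:5, b=letters[1:5])
DT = data.table(a=1:5, b=letters[1:5])
# Compare conversion from a data.frame with direct construction
identical(as.data.table(DF), DT)
# DT(DF) calls the function DT(), even though a data.table named DT exists,
# because R looks for a function when a name is called
identical(DT(DF), DT)
identical(dim(DT),dim(DF))
identical(DF$a, DT$a)
DT
# List the data.tables that currently exist
tables()
# Combine data.tables by column and by row
identical(DT(DT,DT), cbind(DT,DT))
DT2=rbind(DT,DT)
# Tags A and B are prefixed to the column names, so the key column is A.b
DT3 = data.table(A=DT, B=DT, key="A.b")
tables()
# Run the package's test suite
test.data.table()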
## Not run: 
example("[.data.table")
## End(Not run)

[Package data.table version 1.2 Index]