sqlite.data.frame {SQLiteDF} | R Documentation |
Creates a SQLite data frame (SDF) from an ordinary data frame.
sqlite.data.frame(x, name=NULL)
x |
The object to be coerced into a data frame, which is then stored in a
SQLite database. as.data.frame is called on x before the SDF database
is created. |
name |
The internal name of the SDF. If none is provided, a generic name data<n> is used (e.g. data1, data2, etc.). Each SDF should have a unique internal name that is also a valid R symbol. Numbers are appended to the name in case of duplicates: if the name argument is iris and an SDF named iris already exists, the new SDF is named iris1; if iris1 also exists, it is named iris2, and so on. |
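The naming rule described for the name argument can be sketched in base R. This is only an illustration of the rule as documented, not SQLiteDF's actual code: the helper unique_sdf_name and its existing argument are hypothetical, and the use of make.names to obtain a valid R symbol is an assumption.

```r
# Hypothetical helper, not part of SQLiteDF: pick a free internal name.
# `existing` is assumed to hold the internal names already in use.
unique_sdf_name <- function(name = NULL, existing = character()) {
  if (is.null(name)) {
    # No name given: generic names data1, data2, ...
    base <- "data"
    n <- 1
    candidate <- paste0(base, n)
  } else {
    # Assumption: coerce the requested name to a valid R symbol.
    base <- make.names(name)
    n <- 0
    candidate <- base
  }
  # Append increasing numbers until the name is unused.
  while (candidate %in% existing) {
    n <- n + 1
    candidate <- paste0(base, n)
  }
  candidate
}

unique_sdf_name("iris", existing = c("iris", "iris1"))
unique_sdf_name(NULL, existing = "data1")
```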
SQLite data frames (SDFs) are data frames whose data are stored in a SQLite database. SQLite is an open source, lightweight database engine that is powerful considering its size. It stores each database (composed of tables, indices, etc.) in a single file. Since a single SDF occupies a whole database, each SDF is contained in a single file.
Each SDF file contains the following tables:
SDFs are managed in a workspace separate from R's. When SQLiteDF is loaded, it searches for the file workspace.db inside the subdirectory .SQLiteDF in the current working directory. This file contains a list of the SDFs created or used in the previous session (i.e. SQLiteDF sessions are saved automatically), including their full and relative paths and attach information. The workspace is managed using the SQLite engine by opening workspace.db as the main database and then attaching (SQLite's attach) the SDFs.
Unfortunately, the number of attached databases is limited to 31 (actually 32, but 1 is reserved for the temp db). Therefore, each SDF is scored according to the number of times it has been used. When the maximum number of attachments is reached, the least used attached SDF is detached and the needed one is attached in its place. When the package is compiled, the configure script modifies the bundled SQLite source so that the constant controlling the maximum number of attachments is set to 31 (the default is 10).
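The use-count eviction policy described here can be sketched in base R as follows. This is a simplified model under stated assumptions, not the package's implementation: the function use_sdf is hypothetical, the attached SDFs are modeled as a named integer vector of use counts, and ties between equally little-used SDFs are broken by taking the first one.

```r
# Hypothetical model of the attachment policy, not SQLiteDF's own code.
# `attached` is a named integer vector: names are SDF internal names,
# values are the number of times each attached SDF has been used.
use_sdf <- function(attached, name, max_attached = 31L) {
  if (name %in% names(attached)) {
    # Already attached: just bump its score.
    attached[name] <- attached[name] + 1L
    return(attached)
  }
  if (length(attached) >= max_attached) {
    # Limit reached: detach the least used SDF (first one on ties).
    victim <- names(attached)[which.min(attached)]
    attached <- attached[names(attached) != victim]
  }
  # Attach the needed SDF and count this first use.
  attached[name] <- 1L
  attached
}

attached <- c(iris = 3L, cars = 1L, trees = 2L)
use_sdf(attached, "rock", max_attached = 3L)
```

With a limit of 3 in this toy call, cars has the lowest score, so it is the one replaced by rock.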
Returning to what happens when SQLiteDF is loaded: after workspace.db is opened, the SDFs stored in the list are sorted according to their number of uses in the previous session, and the first 30 are attached. The relative path is used to find the SQLite file. If the file cannot be found, the SDF is deleted from the SQLiteDF workspace (with a warning message). The scores are then all reset.
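The load-time steps above can be sketched roughly in base R. Everything in this sketch is an assumption made for illustration: the function restore_workspace, the data frame layout (columns name, rel_path and uses), and the choice to drop entries with missing files before picking the first 30 are not taken from SQLiteDF itself.

```r
# Hypothetical sketch of the load-time procedure, not SQLiteDF's own code.
# `ws` is assumed to be a data frame with columns:
#   name (internal name), rel_path (relative path), uses (previous score).
restore_workspace <- function(ws, max_restore = 30L) {
  # The relative path is used to find each SQLite file.
  found <- file.exists(ws$rel_path)
  for (nm in ws$name[!found]) {
    warning("SDF file for '", nm, "' not found; removing it from the workspace")
  }
  ws <- ws[found, , drop = FALSE]
  # Sort by number of uses in the previous session, most used first.
  ws <- ws[order(ws$uses, decreasing = TRUE), , drop = FALSE]
  # Mark the first `max_restore` entries as attached.
  ws$attached <- seq_len(nrow(ws)) <= max_restore
  # Finally, reset all scores.
  ws$uses <- rep(0L, nrow(ws))
  ws
}
```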
A sqlite.data.frame object is a list with a single element:
and the following attributes:
c("sqlite.data.frame", "data.frame")
All SDFs created in the session have their SQLite file stored in the subdirectory .SQLiteDF in the current working directory. SDFs created in another session, whose files may reside anywhere in the file system, can be imported/attached to the current SDF workspace using attachSdf.
An S3 object representing the SDF. The SDF database will be created in the same directory, with a file name derived by appending the extension db to the internal name passed, or to the default internal name if none is provided.
The full path is used to avoid attaching the same db twice, since its relative path may differ if the user changes directory after loading SQLiteDF (see attachSdf).
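Why the full path matters can be illustrated with base R's normalizePath. The helper is_already_attached below is hypothetical and only shows the idea of comparing full paths; it is not how SQLiteDF performs the check internally.

```r
# Hypothetical helper, not part of SQLiteDF: compare by full path so that
# the same file is recognized regardless of the current working directory.
is_already_attached <- function(path, attached_full_paths) {
  full <- normalizePath(path, mustWork = FALSE)
  full %in% normalizePath(attached_full_paths, mustWork = FALSE)
}

# Toy demonstration in a temporary directory standing in for .SQLiteDF.
d <- tempfile("sdfdemo")
dir.create(d)
f <- file.path(d, "iris.db")
file.create(f)
attached <- normalizePath(f)

old <- setwd(d)
is_already_attached("iris.db", attached)  # relative path, same file
setwd(old)
```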
Miguel A. R. Manese
lsSdf
getSdf
attachSdf
detachSdf
library(datasets)
iris.sdf <- sqlite.data.frame(iris)
names(iris.sdf)
class(iris.sdf)
iris.sdf$Petal.Length[1:10]
iris.sdf[["Petal.Length"]][1:10]
iris.sdf[1,c(TRUE,FALSE)]
#apply(iris.sdf[1:4], 2, mean)