pkg-trackObjs {trackObjs} R Documentation

Overview of trackObjs package

Description

The trackObjs package sets up a link between R objects in memory and files on disk so that objects are automatically resaved to files when they are changed. R objects in files are read in on demand and do not consume memory prior to being referenced. The trackObjs package also tracks times when objects are created and modified, and caches some basic characteristics of objects to allow for fast summaries of objects.

Each object is stored in a separate RData file in the standard format used by save(), so that objects can be manually picked out of or added to the trackObjs database if needed.
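Because each file is an ordinary save() file, a single object can be inspected with base R alone. The following is a minimal sketch, not part of the package: the directory demo_db and the object are made up for illustration, and the file name x.rda simply mirrors the directory listing shown later on this page. Loading into a fresh environment avoids overwriting anything in the workspace.

```r
# Hypothetical stand-in for a tracking directory (a temp dir, for illustration).
db_dir <- file.path(tempdir(), "demo_db")
dir.create(db_dir, showWarnings = FALSE)

# Write one object to its own file in the standard save() format.
x <- 123
save(x, file = file.path(db_dir, "x.rda"))

# Read it back into a separate environment rather than the workspace.
e <- new.env()
loaded <- load(file.path(db_dir, "x.rda"), envir = e)
loaded               # names of the objects stored in the file
get("x", envir = e)  # the stored value
```

Files added to a real tracking directory by hand may leave the database incomplete until it is repaired with the recovery functions this page mentions.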

Tracking works by replacing a tracked variable with an active binding (see makeActiveBinding). When the variable is accessed, the binding looks up information in an associated 'tracking environment', reads or writes the corresponding RData file, and/or gets or assigns the variable in the tracking environment.
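The mechanism can be sketched with base R's makeActiveBinding(). This is an illustration under stated assumptions, not the package's implementation: the names track_env, obj_file, accessor and demo_env are hypothetical, and the sketch always reads from disk and records only a modification time.

```r
# Illustrative sketch only: NOT the trackObjs implementation.
# 'track_env', 'obj_file', 'accessor' and 'demo_env' are hypothetical names.
track_env <- new.env()                                   # stands in for the tracking environment
obj_file  <- file.path(tempdir(), "x_binding_demo.rda")  # one file for one object

# An active binding calls this function with no argument on a read
# and with one argument on an assignment.
accessor <- function(value) {
  if (missing(value)) {
    # Read: load the object from its file on demand.
    tmp <- new.env()
    load(obj_file, envir = tmp)
    get("x", envir = tmp)
  } else {
    # Write: resave the file and note the modification time.
    x <- value
    save(x, file = obj_file)
    assign("modified", Sys.time(), envir = track_env)
    invisible(value)
  }
}

demo_env <- new.env()   # environment holding the "tracked" variable
makeActiveBinding("x", accessor, demo_env)

assign("x", 1:3, envir = demo_env)  # triggers the write branch
file.exists(obj_file)               # the file now exists on disk
get("x", envir = demo_env)          # triggers the read branch
```

The variable itself holds no data in this sketch. Every access goes through the function, which is what lets an object stay on disk until it is referenced.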

Details

There are three main reasons to use the trackObjs package:

There is an option to control whether tracked objects are cached in memory as well as being stored on disk. By default, objects are not cached. To save time when working with collections of objects that will all fit in memory, turn on caching with track.options(cache=TRUE), or start tracking with track.start(..., cache=TRUE).
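The trade-off behind the cache option can be illustrated with base R alone. This is a hypothetical sketch, not the package's code: make_reader and its cache argument are invented for the illustration, and the counter only shows how often the file is read in each mode.

```r
# Illustrative sketch only; 'make_reader' is a hypothetical helper,
# not a trackObjs function.
make_reader <- function(file, name, cache = FALSE) {
  store <- new.env()   # holds the in-memory copy when caching is on
  reads <- 0L          # number of times the file has been loaded
  list(
    get = function() {
      if (cache && exists(name, envir = store, inherits = FALSE)) {
        return(get(name, envir = store))   # served from memory
      }
      tmp <- new.env()
      load(file, envir = tmp)              # served from disk
      reads <<- reads + 1L
      val <- get(name, envir = tmp)
      if (cache) assign(name, val, envir = store)
      val
    },
    reads = function() reads
  )
}

f <- file.path(tempdir(), "y_cache_demo.rda")
y <- matrix(1:6, ncol = 2)
save(y, file = f)

uncached <- make_reader(f, "y", cache = FALSE)
cached   <- make_reader(f, "y", cache = TRUE)
for (i in 1:3) { uncached$get(); cached$get() }
uncached$reads()   # one disk read per access
cached$reads()     # one disk read, later accesses come from memory
```

Without caching, memory use stays low at the cost of a file read on every access. With caching, repeated accesses are fast but every object that has been touched stays in memory.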

Here is a brief example of tracking some variables in the global environment:

> library(trackObjs)
> track.start("tmp1")
> x <- 123                  # Not yet tracked
> track(x)                  # Variable 'x' is now tracked
> track(y <- matrix(1:6, ncol=2)) # 'y' is assigned & tracked
> z1 <- list("a", "b", "c")
> z2 <- Sys.time()
> track(list=c("z1", "z2")) # Track a bunch of variables
> track.summary(size=F)     # See a summary of tracked vars
            class    mode extent length            modified TA TW
x         numeric numeric    [1]      1 2007-09-07 08:50:58  0  1
y          matrix numeric  [3x2]      6 2007-09-07 08:50:58  0  1
z1           list    list  [[3]]      3 2007-09-07 08:50:58  0  1
z2 POSIXt,POSIXct numeric    [1]      1 2007-09-07 08:50:58  0  1
> # (TA="total accesses", TW="total writes")
> ls(all=TRUE)
[1] "x"  "y"  "z1" "z2"
> track.stop()              # Stop tracking
> ls(all=TRUE)
character(0)
>
> # Restart using the tracking dir -- the variables reappear
> track.start("tmp1") # Start using the tracking dir again
> ls(all=TRUE)
[1] "x"  "y"  "z1" "z2"
> track.summary(size=F)
            class    mode extent length            modified TA TW
x         numeric numeric    [1]      1 2007-09-07 08:50:58  0  1
y          matrix numeric  [3x2]      6 2007-09-07 08:50:58  0  1
z1           list    list  [[3]]      3 2007-09-07 08:50:58  0  1
z2 POSIXt,POSIXct numeric    [1]      1 2007-09-07 08:50:58  0  1
> track.stop()
>
> # the files in the tracking directory:
> list.files("tmp1", all=TRUE)
[1] "."                    ".."
[3] "filemap.txt"          ".trackingSummary.rda"
[5] "x.rda"                "y.rda"
[7] "z1.rda"               "z2.rda"
>

There are several points to note:

List of basic functions and common calling patterns

Six functions cover the majority of common usage of the trackObjs package:

Complete list of functions and common calling patterns

The trackObjs package provides many additional functions for controlling how tracking is performed (e.g., whether or not tracked variables are cached in memory), examining the state of tracking (showing which variables are tracked, untracked, orphaned, masked, etc.), and repairing tracking environments and databases that have become inconsistent or incomplete. Such inconsistencies may result from resource limitations (e.g., being unable to write a save file due to lack of disk space) or from manual tinkering (e.g., dropping a new save file into a tracking directory).

The functions that can be used to set up and take down tracking are:

Functions for tracking and stopping tracking variables:

Functions for getting status of tracking and summaries of variables:

The remaining functions allow the user to more closely manage variable tracking, but are less likely to be of use to new users.

Functions for getting status of tracking and summaries of variables:

Functions for managing tracking and tracked variables:

Functions for recovering from errors:

Design and internals of tracking:

Author(s)

Tony Plate <tplate@acm.org>

References

Roger D. Peng. Interacting with data using the filehash package. R News, 6(4):19-24, October 2006. http://cran.r-project.org/doc/Rnews and http://sandybox.typepad.com/software

David E. Brahm. Delayed data packages. R News, 2(3):11-12, December 2002. http://cran.r-project.org/doc/Rnews

See Also

Design of the trackObjs package.

Potential future features of the trackObjs package.

Documentation for save and load (in 'base' package).

Documentation for makeActiveBinding and related functions (in 'base' package).

Inspiration from the packages g.data and filehash.


[Package trackObjs version 0.8-3 Index]