ff {ff}                                                  R Documentation

Flat file database designed for large data vectors

Description

The function ff and its methods handle data stored in a flat file with memory-mapped pages. ff is the constructor for ff objects, which are numeric vectors stored in a flat file. The maximum size of the flat file is 16 GB on 32-bit platforms; the file system may impose a lower limit.

Usage

  ff(file, length = 0, pagesize = getdefaultpagesize(), readonly = FALSE)
  ## S3 method for class 'ff':
  x[index]
  ## S3 method for class 'ff':
  x[index] <- value
  ## S3 method for class 'ff':
  dim(x)
  ## S3 method for class 'ff':
  length(x)
  ## S3 method for class 'ff':
  sample(x, size, replace = FALSE, prob = NULL)
  ## S3 method for class 'ff':
  print(x, ...)
  

Arguments

file      character string giving the name of the file to load or create.
length    length of the double vector if the object is to be created or re-created.
pagesize  page size (in multiples of the system page size, see getpagesize).
readonly  logical indicating whether the flat file should be opened read-only.
x         an ff object.
index     indices specifying the elements to extract or replace.
value     a suitable replacement value or vector of values.
size      non-negative integer giving the number of items to choose.
replace   should sampling be with replacement?
prob      a vector of probability weights for obtaining the elements of the vector being sampled. The argument prob is ignored in the sample method for ff.
...       further arguments passed to or from other methods.

Details

On 32-bit R platforms, an index cannot exceed 2^31-1. Storing the data as a multi-dimensional array overcomes this limitation and allows more elements than a single vector can index (see ffm).
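
The index limit can be checked with base R alone. The sketch below assumes that each element is stored as an 8-byte double (the length argument refers to a double vector) and shows that 2^31-1 such elements occupy just under 16 GB. This matches the maximum file size quoted in the Description, although the connection between the two limits is an inference and is not stated by the package.

```r
# Largest value a 32-bit integer index can take in R.
max_index <- .Machine$integer.max
max_index == 2^31 - 1                        # TRUE

# Assumption: one element occupies 8 bytes (an R double).
bytes_per_element <- 8
max_bytes <- max_index * bytes_per_element   # computed in double precision
max_bytes / 2^30                             # size in GB (1 GB = 2^30 bytes)
```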

As ff objects are held by external pointers, assigning one to another variable copies the reference, not the data. The lifetime of an ff object's connection and of its implementation part (written in C++) is controlled by the garbage collector (see gc). To close an ff object explicitly, first delete the reference to it and then call the garbage collector. ff depends on the facilities of the operating system and the file system; for example, files larger than 4 GB cannot be created on FAT32 systems.
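
As a sketch of the closing procedure described above, the following lines create an ff object, copy the reference, and then release the file. It assumes the ff() signature given in the Usage section of this page. The file name is arbitrary, and removing both a and b before calling gc() follows from the general behaviour of R external pointers rather than from anything stated here.

```r
library(ff)

f <- file.path(tempdir(), "foo.ff")  # arbitrary file name for illustration
a <- ff(f, 8192)                     # signature as documented on this page
b <- a                               # copies the reference, not the data

rm(a, b)                             # delete every reference to the object
invisible(gc())                      # let the garbage collector close the file
```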

The following table gives an overview of file size limits for common file systems (see http://en.wikipedia.org/wiki/Comparison_of_file_systems for further details):

File system   File size limit
FAT16         2 GB
FAT32         4 GB
NTFS          16 GB
ext2/3/4      16 GB to 2 TB
ReiserFS      4 GB (up to version 3.4) / 8 TB (from version 3.5)
XFS           8 EB
JFS           4 PB
HFS           2 GB
HFS Plus      16 GB
UFS1          4 GB to 256 TB
UFS2          512 GB to 32 PB
UDF           16 EB
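
To relate these limits to vector lengths, the sketch below estimates how many elements fit in a single file on a given file system. It assumes 8 bytes per element, 1 GB = 2^30 bytes, and no header overhead in the file; max_ff_length is a hypothetical helper and not part of the ff package.

```r
# Hypothetical helper: number of elements that fit in a file of limit_gb GB,
# assuming each element is an 8-byte double and the file holds nothing else.
max_ff_length <- function(limit_gb, bytes_per_element = 8) {
  floor(limit_gb * 2^30 / bytes_per_element)
}

max_ff_length(2)   # FAT16 limit from the table
max_ff_length(4)   # FAT32 limit from the table
```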

Examples

  a <- ff("foo.ff", 8192)        # create a big vector
  a[1:10] <- rnorm(10)           # set data cells
  a[1:10]                        # get data cells
  

[Package ff version 1.0-1 Index]