Distributed Data Structures in R


Documentation for package ‘ddR’ version 0.1.2

Help Pages

$-method Extracts elements of a distributed object matching the name.
as.darray Convert input matrix into a distributed array.
as.dframe Convert input matrix or data.frame into a distributed data.frame.
as.DList Creates a distributed list from the input.
as.dlist Creates a distributed list from the input.
cbind Column binds the objects.
cbind-method Column binds the objects.
collect Fetch partition(s) of 'darray', 'dframe' or 'dlist' from remote workers.
colMeans-method Gets the column means for a distributed array or data.frame.
colnames-method Gets the colnames for the distributed object.
colSums-method Gets the column sums for a distributed array or data.frame.
combine Combines a list of partitions into a single distributed object. A frontend wrapper can implement this without actually combining the data in storage.
combine-method Combines a list of partitions into a single distributed object. A frontend wrapper can implement this without actually combining the data in storage.
DArray Creates a distributed array with the specified partitioning and contents.
darray Creates a distributed array with the specified partitioning and contents.
ddR Distributed Data-structures in R
ddRDriver-class The base S4 class for backend driver classes to extend.
DFrame Creates a distributed data.frame with the specified partitioning and data.
dframe Creates a distributed data.frame with the specified partitioning and data.
dimnames-method Gets the dimnames for the distributed object.
dimnames<--method Sets the dimnames for the distributed object.
dlapply Distributed version of 'lapply'. Similar to 'dmapply', but permits only one iterable argument, and output.type is always 'dlist'.
DList Creates a distributed list with the specified partitioning and data.
dlist Creates a distributed list with the specified partitioning and data.
dmapply Distributed version of mapply. Similar to R's 'mapply', it allows a multivariate function, FUN, to be applied to several inputs. Unlike standard mapply, it always returns a distributed object.
DObject-class The baseline distributed object class to be extended by each backend driver. Backends may elect to extend it once for all distributed object types ('dlist', 'darray', 'dframe', etc.) or once per type, depending on needs.
do_collect Backend-implemented function to move data from storage to the calling context (node).
do_collect-method Backend-implemented function to move data from storage to the calling context (node).
do_dmapply Backend-specific dmapply logic. This is a required override for all backends to implement so dmapply works.
do_dmapply-method Backend-specific dmapply logic. This is a required override for all backends to implement so dmapply works.
getBestOutputPartitioning An overridable function that determines the output partitioning scheme of a dlapply or dmapply call. It determines the 'ideal' nparts for the output when nparts is not supplied. To keep the API standard enforced, overriding this is not recommended.
getBestOutputPartitioning.ddRDriver An overridable function that determines the output partitioning scheme of a dlapply or dmapply call. It determines the 'ideal' nparts for the output when nparts is not supplied. To keep the API standard enforced, overriding this is not recommended.
getPartitionIdsAndOffsets Gets the internal set of partitions, and the offsets within each partition, for a set of 1d or 2d subset indices of a distributed object.
get_parts Gets the partitions of a distributed object, given an index.
get_parts-method Gets the partitions of a distributed object, given an index.
init Called when the backend driver is initialized.
init-method Called when the backend driver is initialized.
is.DArray Returns whether the input is a darray.
is.darray Returns whether the input is a darray.
is.DFrame Returns whether the input is a dframe.
is.dframe Returns whether the input is a dframe.
is.DList Returns whether the input is a dlist.
is.dlist Returns whether the input is a dlist.
is.DObject Returns whether the input entity is a DObject.
is.dobject Returns whether the input entity is a DObject.
is.sparse_darray Returns whether the input is a sparse_darray.
mean-method Gets the mean value of the elements within the object.
names<--method Sets the names of a distributed object.
nparts Returns a 2d-vector giving the number of partitions along each dimension of the distributed object, where the vector equals c(partitions_per_column, partitions_per_row). For a dlist, the value is equivalent to c(totalParts(dobj), 1).
parallel The default parallel driver.
parts Retrieves, as a list of independent objects, pointers to each individual partition of the input.
psize Return sizes of each partition of the input distributed object.
rbind Row binds the arguments.
rbind-method Row binds the arguments.
repartition Repartitions a distributed object. This function takes two inputs, a distributed object and a skeleton. These inputs must both be distributed objects of the same type and same dimension. If 'dobj' and 'skeleton' have different internal partitioning, this function will return a new distributed object with the same internal data as in 'dobj' but with the partitioning scheme of 'skeleton'.
repartition.DObject Repartitions a distributed object. This function takes two inputs, a distributed object and a skeleton. These inputs must both be distributed objects of the same type and same dimension. If 'dobj' and 'skeleton' have different internal partitioning, this function will return a new distributed object with the same internal data as in 'dobj' but with the partitioning scheme of 'skeleton'.
rowMeans-method Gets the row means for a distributed array or data.frame.
rownames-method Gets the rownames for the distributed object.
rowSums-method Gets the row sums for a distributed array or data.frame.
shutdown Called when the backend driver is shut down.
shutdown-method Called when the backend driver is shut down.
sum-method Gets the sum of the objects.
totalParts Returns the total number of partitions of the distributed object. The result is the same as prod(nparts(dobj)).
useBackend Sets the active backend driver. Functions exported by the 'ddR' package are dispatched to the backend driver. Backend-specific initialization parameters may be passed through the ellipsis (...) argument.
[ Extract parts of a distributed object.
[-method Extract parts of a distributed object.
[[-method Extracts a single element of a distributed object.
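
Example

The entries above can be combined into a short session. The following is a minimal sketch, not taken from the package documentation: it assumes 'ddR' is installed, that the 'parallel' driver accepts an 'executors' argument through the ellipsis of useBackend, and that darray accepts 'dim', 'psize' and 'data' arguments. The input values and variable names are illustrative only.

```r
library(ddR)

# Assumption: 'executors' is a backend-specific parameter of the
# 'parallel' driver, passed through the ellipsis of useBackend().
useBackend(parallel, executors = 2)

# dlapply: one iterable argument; the result is always a dlist.
squares <- dlapply(1:4, function(x) x^2, nparts = 2)
squares_local <- collect(squares)  # move the partitions to the calling context

# dmapply: a multivariate function applied over several inputs;
# it also returns a distributed object rather than a local one.
sums <- dmapply(function(x, y) x + y, 1:4, 5:8, nparts = 2)
sums_local <- collect(sums)

# Assumption: darray() takes the overall dimensions, the size of each
# partition, and a fill value.
a <- darray(dim = c(4, 4), psize = c(2, 2), data = 1)

parts_per_dim <- nparts(a)      # number of partitions along each dimension
parts_total   <- totalParts(a)  # same as prod(nparts(a))
```

collect is the only step here that brings data back to the calling context; 'squares', 'sums' and 'a' themselves remain distributed objects.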