nmf-methods {NMF}    R Documentation

Main Interface to run NMF algorithms

Description

This method implements the main interface to launch NMF algorithms within the framework defined in package NMF. It allows NMF algorithms to be combined with seeding methods. The returned object can be passed directly to visualisation or comparison methods.

For a tutorial on how to use the interface, please see the package's vignette: vignette('NMF')

Usage


## S4 method for signature 'matrix, numeric, function':
nmf(x, rank, method, name, objective='euclidean', model='NMFstd', mixed=FALSE, ...)
## S4 method for signature 'matrix, numeric, character':
nmf(x, rank, method=nmf.getOption("default.algorithm"), ...)
## S4 method for signature 'matrix, numeric, NMFStrategy':
nmf(x, rank, method, seed=nmf.getOption('default.seed'), nrun=1, model=list(), .options=list(), ...)

Arguments

method The algorithm to use to perform NMF on x. Several formats are allowed: a character string, a function, or a list of algorithms. See section Methods for more details on how each format is used.
mixed Boolean that states whether the algorithm requires a nonnegative input matrix (mixed=FALSE, the default) or accepts mixed-sign input matrices (mixed=TRUE). An error is thrown if the sign requirement is not fulfilled. This parameter is useful to plug in algorithms such as semi-NMF, which typically do not impose nonnegativity constraints on both the input and the basis component matrices.
model When method is a function, argument model must be a single character string or a list that specifies the NMF model to use. An NMF model is defined by an S4 class that extends class NMF. As a single character string, argument model must be the name of the class that defines the NMF model. As a list, it must contain at least the name of the class that defines the NMF model, either in element model$class or in the first element (if not named). The other elements of the list are used to initialize the model's slots.
When method is a single character string, argument model must be a list. It is used to initialize the slots in the NMF model associated with the NMF strategy of name method. Note that the model associated with an NMF strategy cannot be changed via argument model.
Note also that values to initialize the NMF model's slots can be passed via the ... argument. However, values passed via argument model take priority over those passed in .... This is designed to handle the situation where one wants to pass a parameter to the NMF algorithm that has the same name as a slot in the NMF model, or vice versa. If a variable appears in both argument model and ..., the former will be used to initialize the NMF model and the latter will be passed to the NMF algorithm. See code examples for an illustration of this situation.
name A character string to be used as a name for the custom NMF algorithm.
nrun The number of runs of the algorithm to perform. This argument is useful to achieve stability when using a random seeding method.
objective Used when method is a function. It must be a character string giving the name of a built-in distance method, or a function to be used as the objective function. It is used to compute the residuals between the target matrix and its NMF estimate.
.options This argument is used to set some runtime options. It can be a list containing the named options and their values or, when only boolean options need to be set, a character string that specifies which options are turned on or off. The string must be composed of characters that each correspond to an option. Characters '+' and '-' explicitly switch options on and off respectively. E.g. .options='tv' will toggle on options track and verbose, while .options='t-v' will toggle on option track and toggle off option verbose. Note that '+' and '-' apply to all option characters found after them. By default, .options is assumed to start with a '+'.
The following options are available (note the characters that correspond to each option, to be used when .options is passed as a string):

debug - d
Toggle debug mode. Like option verbose but with more information displayed.

keep.all - k
Used when performing multiple runs (nrun>1): if toggled on, all factorizations are saved and returned; otherwise only the factorization achieving the minimum residuals is returned.

track - t
Enables (resp. disables) error tracking. When on, the returned object's slot residuals contains the trajectory of the objective values. This tracking functionality is available for all built-in algorithms.

verbose - v
Toggle verbosity. If on, messages about the configuration and the state of the current run(s) are displayed.


rank The factorization rank to achieve (i.e. a single positive numeric value).
seed The seeding method to use to compute the starting point passed to the algorithm. See section Seeding methods for more details on the possible classes and types for argument seed.
x The target object to estimate. It can be a matrix, a data.frame, or an ExpressionSet object (the latter requires the Biobase package to be installed). See section Methods for more details.
... Extra parameters passed to the NMF algorithm's run method. When there is no conflict with the slot names in the NMF model class, values for model slots can also be passed in .... See argument model.
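
The string form of argument .options described above can be sketched in base R as follows. This is only an illustration of the documented rules ('+' and '-' apply to every option character after them, with an implicit leading '+'); the helper parse_options_string is hypothetical and is not how package NMF itself parses the argument.

```r
# Hypothetical helper (NOT part of package NMF): expands an option string
# such as 't-v' into a named list of logical values.
parse_options_string <- function(s) {
  # option characters and the option names they stand for, as listed above
  keys <- c(d = "debug", k = "keep.all", t = "track", v = "verbose")
  opts <- list()
  state <- TRUE  # the string is assumed to start with an implicit '+'
  for (ch in strsplit(s, "", fixed = TRUE)[[1]]) {
    if (ch == "+") {
      state <- TRUE            # following options are switched on
    } else if (ch == "-") {
      state <- FALSE           # following options are switched off
    } else if (ch %in% names(keys)) {
      opts[[keys[[ch]]]] <- state
    } else {
      stop("unknown option character: '", ch, "'")
    }
  }
  opts
}

print(parse_options_string("tv"))    # track and verbose on
print(parse_options_string("t-v"))   # track on, verbose off
```

Under these assumptions, the list returned for 't-v' corresponds to what one would pass explicitly as .options=list(track=TRUE, verbose=FALSE).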

Value

The returned value depends on the run mode:

Single run: An object that inherits from class NMF.
Multiple runs: When nrun > 1 or when method is a list, this method returns an object of class NMFSet.

Methods

x = "ANY", rank = "ANY", method = "list"
Performs NMF on object x for each algorithm defined in method.
x = "ANY", rank = "ANY", method = "missing"
Performs default NMF algorithm on object x.
x = "data.frame", rank = "ANY", method = "ANY"
Performs NMF on a data.frame: the target matrix is the data.frame converted with as.matrix(x).
x = "ExpressionSet", rank = "ANY", method = "ANY"
Performs NMF on an ExpressionSet: the target matrix is the expression matrix exprs(x).

This method requires the Biobase package to be installed. Special methods for bioinformatics are provided in an optional layer, which is automatically loaded when Biobase is installed. See NMF-bioc.

x = "matrix", rank = "numeric", method = "character"
Performs NMF on a matrix using an algorithm whose name is given by parameter method. The name provided must partially match the name of a registered algorithm. See section NMF Algorithms below or the package's vignette.

x = "matrix", rank = "numeric", method = "function"
Performs NMF using a custom algorithm defined by a function. It must have signature (x=matrix, start=NMF, ...) and return an object that inherits from class NMF. It should use its argument start as a starting point.
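
The data.frame conversion described above, together with the sign requirement controlled by argument mixed, can be sketched in base R. The helper as_target_matrix is hypothetical and only mimics the documented behaviour; it is not the package's own code and its error messages are made up for the illustration.

```r
# Hypothetical helper (NOT part of package NMF): converts the target to a
# numeric matrix and enforces the sign requirement described for 'mixed'.
as_target_matrix <- function(x, mixed = FALSE) {
  if (is.data.frame(x)) x <- as.matrix(x)
  if (!is.matrix(x) || !is.numeric(x))
    stop("target must be a numeric matrix or a data.frame with numeric columns")
  if (!mixed && any(x < 0, na.rm = TRUE))
    stop("target contains negative entries, but mixed = FALSE")
  x
}

# a nonnegative data.frame is converted with as.matrix()
df <- data.frame(s1 = c(1, 0, 3), s2 = c(2, 5, 0.5))
V <- as_target_matrix(df)
print(dim(V))

# a mixed-sign target is rejected unless mixed = TRUE
df_neg <- df
df_neg$s1[1] <- -1
res <- tryCatch(as_target_matrix(df_neg), error = function(e) conditionMessage(e))
print(res)
V_mixed <- as_target_matrix(df_neg, mixed = TRUE)
```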

NMF Algorithms

The following algorithms are available:

lee
Standard NMF. Based on euclidean distance, it uses simple multiplicative updates. See Lee and Seung (2000).
brunet
Standard NMF. Based on Kullback-Leibler divergence, it uses simple multiplicative updates from Lee and Seung (2000), enhanced to avoid numerical underflow. See Brunet et al. (2004).
lnmf
Local Nonnegative Matrix Factorization. Based on a regularized Kullback-Leibler divergence, it uses a modified version of Lee and Seung's multiplicative updates. See Li et al. (2001).
nsNMF
Nonsmooth NMF. Uses a modified version of Lee and Seung's multiplicative updates for Kullback-Leibler divergence to fit an extension of the standard NMF model. See Pascual-Montano et al. (2006).
offset
Uses a modified version of Lee and Seung's multiplicative updates for euclidean distance to fit an NMF model that includes an intercept. See Badea (2008).
snmf/r, snmf/l
Alternating Least Squares (ALS) approach from Kim and Park (2007).
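
To make the idea of "simple multiplicative updates" concrete, here is a minimal base-R sketch of Lee and Seung's updates for the euclidean objective. It is an illustration under stated assumptions, not the implementation behind algorithm lee: the function name lee_updates, the random starting point, the fixed number of iterations and the small constant eps (added to avoid division by zero) are all choices made for this example.

```r
# Base-R sketch of Lee and Seung's multiplicative updates for the
# euclidean objective; NOT the code used by package NMF.
lee_updates <- function(V, rank, maxit = 200, eps = 1e-9) {
  n <- nrow(V); p <- ncol(V)
  # random nonnegative starting point (assumption of this sketch)
  W <- matrix(runif(n * rank), n, rank)
  H <- matrix(runif(rank * p), rank, p)
  err <- numeric(maxit)
  for (i in seq_len(maxit)) {
    # update the coefficient matrix H, then the basis matrix W;
    # both stay nonnegative because every factor is nonnegative
    H <- H * crossprod(W, V) / (crossprod(W) %*% H + eps)
    W <- W * tcrossprod(V, H) / (W %*% tcrossprod(H) + eps)
    # squared Frobenius norm of the residuals after this iteration
    err[i] <- sum((V - W %*% H)^2)
  }
  list(W = W, H = H, residuals = err)
}

set.seed(1)
# toy nonnegative target of exact rank 3: 20 features, 12 samples
V <- matrix(runif(20 * 3), 20, 3) %*% matrix(runif(3 * 12), 3, 12)
fit <- lee_updates(V, rank = 3)
print(range(fit$residuals))
```

The vector fit$residuals plays the same role as the trajectory of objective values recorded when option track is on.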

Seeding methods

The purpose of seeding methods is to compute initial values for the factor matrices in a given NMF model. This initial guess will be used as a starting point by the chosen NMF algorithm.

The seeding method to use in combination with the algorithm can be passed to interface nmf through argument seed. The following formats are supported:

a character string:
giving the name of a registered seeding method. The corresponding method will be called to compute the starting point.
a list:
giving the name of a registered seeding method and, optionally, extra parameters to pass to it.
an object that inherits from NMF:
it will be passed directly to the algorithm's method, via its argument start.
a function:
that computes the starting point. It must have signature (object=NMF, target=matrix, ...) and return an object that inherits from class NMF. Argument object should be used as a template for the returned object.
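
The interplay between a seeding method, the algorithm and multiple runs can be sketched in base R. Plain lists stand in for NMF objects here, and the helpers random_start, fit_from and best_of are hypothetical names introduced for this illustration only. The sketch assumes a random seeding method and euclidean multiplicative updates, and keeps the fit with the lowest residuals, as described for nrun > 1 without option keep.all.

```r
# Base-R sketch; plain lists stand in for NMF objects and none of these
# helper names belong to package NMF.

# seeding method: random nonnegative factors drawn from [0, max(V)]
random_start <- function(V, rank) {
  list(W = matrix(runif(nrow(V) * rank, 0, max(V)), nrow(V), rank),
       H = matrix(runif(rank * ncol(V), 0, max(V)), rank, ncol(V)))
}

# algorithm: euclidean multiplicative updates, started from 'start'
fit_from <- function(V, start, maxit = 100, eps = 1e-9) {
  W <- start$W; H <- start$H
  for (i in seq_len(maxit)) {
    H <- H * crossprod(W, V) / (crossprod(W) %*% H + eps)
    W <- W * tcrossprod(V, H) / (W %*% tcrossprod(H) + eps)
  }
  list(W = W, H = H, residuals = sum((V - W %*% H)^2))
}

# multiple runs: one new random start per run, keep the lowest residuals
best_of <- function(V, rank, nrun = 5, seed = random_start) {
  fits <- lapply(seq_len(nrun), function(i) fit_from(V, seed(V, rank)))
  res <- vapply(fits, function(f) f$residuals, numeric(1))
  list(best = fits[[which.min(res)]], all.residuals = res)
}

set.seed(42)
# toy nonnegative target of exact rank 2: 15 features, 10 samples
V <- matrix(runif(15 * 2), 15, 2) %*% matrix(runif(2 * 10), 2, 10)
out <- best_of(V, rank = 2, nrun = 5)
print(out$all.residuals)
```

Because each run starts from a different random point, the residuals generally differ between runs, which is why several runs are useful with a random seeding method.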

Author(s)

Renaud Gaujoux renaud@cbio.uct.ac.za

References

Lee, D. D. and Seung, H. S. (2000). Algorithms for non-negative matrix factorization. In NIPS, 556–562.

Brunet, J. P., Tamayo, P., Golub, T. R., and Mesirov, J. P. (2004). Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci U S A, 101(12), 4164–4169.

Pascual-Montano, A., Carazo, J. M., Kochi, K., Lehmann, D., and Pascual-Marqui, R. D. (2006). Nonsmooth nonnegative matrix factorization (nsNMF). IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(3), 403–415.

Kim, H. and Park, H. (2007). Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis. Bioinformatics.

Badea, L. (2008). Extracting Gene Expression Profiles Common To Colon And Pancreatic Adenocarcinoma Using Simultaneous Nonnegative Matrix Factorization. In Pacific Symposium on Biocomputing, 13, 279–290.

Li, S., Hou, X., and Zhang, H. (2001). Learning spatially localized, parts-based representation. In Proc. CVPR, 2001.

See Also

class NMF, NMF-utils, package's vignette

Examples


# generate a synthetic dataset with known classes: 100 features, 23 samples (10+5+8)
n <- 100
counts <- c(10, 5, 8)
V <- syntheticNMF(n, counts, noise=TRUE)

# build the class factor: class i is repeated counts[i] times
groups <- factor(rep(seq_along(counts), counts))

# run default algorithm
res <- nmf(V, 3)
res
summary(res, class=groups)

# run default algorithm multiple times (only keep the best fit)
res <- nmf(V, 3, nrun=20)
res
summary(res, class=groups)

# run nonsmooth NMF algorithm
res <- nmf(V, 3, 'nsNMF')
res
summary(res, class=groups)

# compare some NMF algorithms
res <- nmf(V, 3, list('brunet', 'lee', 'nsNMF'))
res
compare(res, class=groups)

# run on an ExpressionSet (requires package Biobase)
## Not run: 
data(esGolub)
nmf(esGolub, 3)
## End(Not run)


[Package NMF version 0.2.4 Index]