selectPrototypes {GOSim}    R Documentation

Heuristic selection of prototypes and dimensionality reduction of feature vectors.

Description

  • Heuristic selection of prototypes
  • Dimensionality reduction of feature vectors

    Usage

    selectPrototypes(n = 250, method = "frequency", data = NULL, verbose = TRUE)
    

    Arguments

    n        number of prototypes to select, or maximum number of clusters
    method   method used to select prototypes or to perform dimensionality reduction (see Details)
    data     data matrix (l x d) of feature vectors (l = number of genes, d = number of features)
    verbose  if TRUE, print out progress information

    Details

    The following heuristics for automatic selection of prototypes are implemented:

    "frequency"
    select the n genes with the highest number of GO annotations in the currently selected ontology
    "random"
    select n genes uniformly at random from all genes with annotations in the currently selected ontology
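The two heuristics can be illustrated with a minimal sketch in base R. It is not the GOSim implementation: the `annotations` list and the helper names `select_by_frequency` and `select_at_random` are hypothetical and stand in for the package's internal annotation data.

```r
# Hypothetical toy annotations: gene ID -> GO terms (not real GOSim data).
annotations <- list(
  "207"  = c("GO:0006915", "GO:0007165", "GO:0008283"),
  "208"  = c("GO:0006915", "GO:0007165"),
  "7494" = c("GO:0006355"),
  "1111" = character(0)   # gene without annotations
)

# Number of annotations per gene; keep only genes with at least one.
n_terms   <- lengths(annotations)
annotated <- names(annotations)[n_terms > 0]

# "frequency": the n genes with the most annotations.
select_by_frequency <- function(n) {
  counts <- sort(n_terms[annotated], decreasing = TRUE)
  names(counts)[seq_len(min(n, length(counts)))]
}

# "random": n genes drawn uniformly without replacement.
select_at_random <- function(n) {
  sample(annotated, size = min(n, length(annotated)))
}

top2 <- select_by_frequency(2)
rnd2 <- select_at_random(2)
```

Either helper returns a character vector of gene IDs, which matches the kind of value described for prototype selection.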

    The following methods for dimensionality reduction are implemented:

  • "PCA": dimensionality reduction via principal component analysis; the number of principal components is chosen such that at least 95% of the total variance is explained
  • "clustering": EM clustering in feature space
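As a rough illustration of the PCA option, the sketch below picks the smallest number of components whose cumulative share of variance reaches a threshold. It assumes the threshold of 95 means 95 percent of the total variance, uses `prcomp` from base R rather than GOSim's own code, and runs on a made-up feature matrix `feat`.

```r
set.seed(1)
# Hypothetical feature matrix: 20 genes (rows) x 5 features (columns).
feat <- matrix(rnorm(100), nrow = 20, ncol = 5)

pca <- prcomp(feat, center = TRUE, scale. = FALSE)
lambda_all <- pca$sdev^2                        # eigenvalues of the covariance matrix
explained  <- cumsum(lambda_all) / sum(lambda_all)
k <- which(explained >= 0.95)[1]                # smallest k reaching the threshold

# List layout mirrors the item names used in the Value section (assumed shapes).
result <- list(
  features = pca$x[, seq_len(k), drop = FALSE],        # data projected on first k PCs
  pcs      = pca$rotation[, seq_len(k), drop = FALSE], # one principal component per column
  lambda   = lambda_all[seq_len(k)]                    # first k eigenvalues
)
```

Using `drop = FALSE` keeps the results as matrices even when a single component is enough.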

    Value

    If the function is called to automatically select prototypes, a character vector of gene IDs is returned.
    If the function is called to perform dimensionality reduction via PCA, the result is a list with the following items:

    "features" original data projected on the first k principal components
    "pcs" l x k matrix of principal components. Each column is one principal component
    "lambda" first k eigenvalues


    If the function is called to perform clustering in feature space, the cluster centers are returned in an l x k matrix (each column is one cluster center). The clustering is performed by the "Mclust" function from the package "mclust". The optimal number of clusters in the range 2,...,n is chosen using the BIC.
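The clustering step can be sketched as follows. This is only an illustration of calling `Mclust` with a candidate range of cluster numbers, not the GOSim code. The data `x` are simulated, and the sketch assumes the "mclust" package is installed.

```r
if (requireNamespace("mclust", quietly = TRUE)) {
  library(mclust)   # attach the package so Mclust finds its BIC helper
  set.seed(1)
  # Hypothetical data: two well-separated groups of 2-D feature vectors.
  x <- rbind(matrix(rnorm(60, mean = 0), ncol = 2),
             matrix(rnorm(60, mean = 6), ncol = 2))
  n <- 4
  fit <- Mclust(x, G = 2:n)          # BIC selects the number of clusters in 2..n
  centers <- fit$parameters$mean     # matrix with one column per cluster center
}
```

The selected number of clusters is available as `fit$G`, and `centers` holds one column per cluster, in line with the column-wise layout described above.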

    Note

    The result depends on the currently set ontology ("BP","MF","CC").

    Author(s)

    Holger Froehlich

    References

    [1] H. Froehlich, N. Speer, C. Spieth, and A. Zell, Kernel Based Functional Gene Grouping, Proc. Int. Joint Conf. on Neural Networks (IJCNN), pp. 6886-6891, 2006
    [2] N. Speer, H. Froehlich, and A. Zell, Functional Grouping of Genes Using Spectral Clustering and Gene Ontology, Proc. Int. Joint Conf. on Neural Networks (IJCNN), pp. 298-303, 2005

    See Also

    getGeneFeaturesPrototypes, getGeneSimPrototypes, setOntology

    Examples

    ## Not run: 
     # takes too much time in R CMD check
     # select the 50 genes with the highest number of annotations (character vector)
     proto <- selectPrototypes(n = 50)
     # compute feature vectors for three genes with respect to the prototypes
     feat <- getGeneFeaturesPrototypes(c("207", "208", "7494"), prototypes = proto, pca = FALSE)
     # project the feature vectors onto the principal components
     selectPrototypes(data = feat$features, method = "pca")
     
    ## End(Not run)
    

    [Package GOSim version 1.0.2 Index]