corpus_sample {lsa} | R Documentation |
Generate a random sample of a document collection.
corpus_sample(filelist, samplesize, index.return = FALSE)
filelist |
a vector containing (relative or absolute) filenames. |
samplesize |
the desired number of files to be returned. |
index.return |
if set to TRUE, the positions of the sampled files in filelist are returned as well. |
Creates a random sample of size samplesize from the specified filelist.
x |
The random sample; a vector with filenames (returned when index.return is FALSE). |
x, ix |
If index.return is set to TRUE, a list is returned: x contains the filenames and ix contains the positions of the sampled files in the original filelist. |
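The two return shapes described above can be illustrated with a minimal base-R sketch. It is not the lsa source code: the function name sample_files_sketch, the use of sample() without replacement, and the filenames are all assumptions made for illustration only.

```r
# Hypothetical stand-in for corpus_sample(); NOT the lsa implementation.
# Assumption: files are drawn without replacement via base R's sample().
sample_files_sketch <- function(filelist, samplesize, index.return = FALSE) {
  # draw `samplesize` distinct positions from the file list
  ix <- sample(seq_along(filelist), samplesize)
  if (index.return) {
    # list form: sampled filenames plus their positions in `filelist`
    list(x = filelist[ix], ix = ix)
  } else {
    # plain form: just the sampled filenames
    filelist[ix]
  }
}

files <- c("corpus/D1", "corpus/D2", "corpus/D3")  # made-up filenames
set.seed(42)                                       # only for repeatability

s <- sample_files_sketch(files, 2, index.return = TRUE)

# the positions in ix map back to the sampled filenames in x
stopifnot(identical(files[s$ix], s$x))
```

Under these assumptions, ix is mainly useful for selecting the matching entries of any other vector that is aligned with filelist, such as document labels.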
Fridolin Wild fridolin.wild@wu-wien.ac.at
library(lsa)

# create some files
td = tempfile()
dir.create(td)
write( c("dog", "cat", "mouse"), file=paste(td, "D1", sep="/") )
write( c("hamster", "mouse", "sushi"), file=paste(td, "D2", sep="/") )
write( c("dog", "monster", "monster"), file=paste(td, "D3", sep="/") )

# draw two of the three files at random and keep their positions
s = corpus_sample(dir(td, full.names=TRUE), 2, index.return=TRUE)
textmatrix(s$x)

# clean up (recursive=TRUE is needed because td is a directory)
unlink(td, recursive=TRUE)