generate.data {tileHMM}    R Documentation

Generate Simulated Dataset

Description

Generate simulated data based on real data and the results of a previous analysis.

Usage

generate.data(data, group, pos.range = c(1, 10), 
    num.seq = 100, gap = 35, split.gap = 1000, min.len = 2)

Arguments

data A data.frame with information about genomic coordinates of probes (chromosome and position) in the first two columns. Subsequent columns contain probe measurements of individual samples.
group Information used to assign each probe to one of two classes. Either a logical vector or the name of a GFF file. In the latter case all probes in annotated regions are considered to be ‘positive’.
pos.range Range of values for the number of positive regions generated per observation sequence. The actual number for each sequence is sampled uniformly from this range.
num.seq Number of observation sequences to generate.
gap Gap between probes. Used to generate artificial probe coordinates.
split.gap Gap between sequences.
min.len Minimum number of probes per region.
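
The input formats described above can be sketched in base R. This is an illustration only: the column names (chromosome, position, sample1, sample2), the simulated values, and the helper draw_region_counts() are hypothetical and not part of tileHMM. The helper merely mimics the uniform sampling described for pos.range, assuming both ends of the range are included; it does not reproduce the package's internal code.

```r
set.seed(1)

# Hypothetical input in the layout described for `data`:
# chromosome and position first, then one column per sample.
# The spacing of 35 mirrors the default `gap`; real probe spacing may differ.
n <- 200
probes <- data.frame(
    chromosome = rep("chr1", n),
    position   = seq(from = 1, by = 35, length.out = n),
    sample1    = rnorm(n),
    sample2    = rnorm(n),
    stringsAsFactors = FALSE
)

# Hypothetical logical `group` vector: TRUE marks probes treated as 'positive'.
group <- rep(FALSE, n)
group[51:80] <- TRUE

# Illustrative helper (not a tileHMM function): draw one region count per
# sequence, uniformly from the integers in pos.range. Indexing through
# sample.int() avoids the sample(x) pitfall when the range has one value.
draw_region_counts <- function(pos.range = c(1, 10), num.seq = 100) {
    candidates <- seq.int(pos.range[1], pos.range[2])
    candidates[sample.int(length(candidates), num.seq, replace = TRUE)]
}

counts <- draw_region_counts(c(1, 10), 100)
range(counts)
```

A logical group vector needs one entry per row of data, which is why the sketch builds it with length n.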

Value

A list with components

observation A data.frame with the same format as data.
regions A list of state sequences.
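
A call and a first look at the returned components might look like the following sketch. It assumes that the tileHMM package is installed and that generate.data is accessible as tileHMM::generate.data; the toy inputs (column names, values, and the shift applied to 'positive' probes) are hypothetical, and whether such random data are adequate input is an assumption. The call is skipped when the package is not available.

```r
set.seed(42)

# Hypothetical toy inputs; names and values are illustrative only.
n <- 500
probes <- data.frame(
    chromosome = rep("chr1", n),
    position   = seq(from = 1, by = 35, length.out = n),
    sample1    = rnorm(n),
    sample2    = rnorm(n)
)

# One block of 100 'positive' probes flanked by 'negative' probes.
group <- rep(c(FALSE, TRUE, FALSE), times = c(200, 100, 200))

# Shift the measurements of 'positive' probes so the two classes differ.
probes$sample1[group] <- probes$sample1[group] + 2
probes$sample2[group] <- probes$sample2[group] + 2

sim <- NULL
if (requireNamespace("tileHMM", quietly = TRUE)) {
    # Argument names are taken from the Usage section.
    sim <- tileHMM::generate.data(probes, group,
                                  pos.range = c(1, 5), num.seq = 10)
    str(sim$observation)   # data.frame in the same format as `data`
    length(sim$regions)    # list of state sequences
}
```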

Author(s)

Peter Humburg


[Package tileHMM version 1.0-2 Index]