seqdef {TraMineR} | R Documentation |
Creates a sequence object to be passed to other functions provided by the TraMineR package. There are specific methods for plotting and printing sequence objects.
seqdef(data, var=NULL, informat="STS", stsep="-", alphabet=NULL, states=NULL, start=1, left=NA, right="DEL", gaps=NA, missing=NA, void="%", nr="*", cnames=NULL, cpal=NULL, missing.color="darkgrey", labels=NULL, ...)
data |
a data frame or matrix containing sequence data. |
var |
the list of columns containing the sequences. Defaults to NULL, i.e. all the columns. Whether the sequences are in the compressed (successive states in a character string) or extended format is detected automatically. |
informat |
format of the original data. Default is 'STS'. Available formats are: STS, SPS, SPELL. See the TraMineR user manual (Gabadinho et al., 2008) for a description of the formats. |
stsep |
the character used as separator in the original data if input format is successive states in a character string. By default, "-". |
alphabet |
optional vector containing the alphabet (the list of all possible states). Use this option if some states in the alphabet do not appear in the data or if you want to reorder the states in the alphabet. The specified vector MUST contain AT LEAST all the states appearing in the data, as they are labelled there, and may contain additional states not appearing in the data. If left to NULL, the alphabet is set to the distinct states appearing in the data, in the same order as returned by the seqstatl function. |
states |
an optional vector containing the labels for the states. Must have a length equal to the number of states in the data, and the labels must be sorted according to the output of the seqstatl function. |
start |
starting time. For instance, if your sequences begin at age 15, you can specify 15. At this stage, used only for labelling column names. |
left |
the behavior for missing values appearing in the left part of the sequences, i.e. the part before the first (leftmost) valid state. See Gabadinho et al. (2008) for more details on the options for handling missing values when defining sequence objects. By default, missing values in this part are treated as 'real' missing values and converted to the internal code for missing values defined by the nr option. |
right |
the behavior for missing values appearing in the right part of the sequences, i.e. the part after the last (rightmost) valid state. See Gabadinho et al. (2008) for more details on the options for handling missing values when defining sequence objects. By default, missing values in this part are treated as 'void' elements and converted to the internal code for void values defined by the void option. |
gaps |
the behavior for missing values appearing in the central part of the sequences, i.e. the part after the first (leftmost) valid state and before the last (rightmost) valid state. See Gabadinho et al. (2008) for more details on the options for handling missing values when defining sequence objects. By default, missing values in this part are treated as 'real' missing values and converted to the internal code for missing values defined by the nr option. |
missing |
the code for the missing values appearing in the input data. If specified, all cells containing this value are replaced by NA, the internal R code for missing values. If 'missing' is not specified, cells containing NA are considered to be missing values. |
void |
the internal code used by TraMineR for the void elements in the sequences. Defaults to "%". |
nr |
the internal code used by TraMineR for the missing elements in the sequences. Defaults to "*". |
cnames |
optional names for the columns composing the sequence data. These names are used by default as axis labels in the graphics. If not specified, names are taken from the original column names in the data. |
cpal |
an optional color palette for representing the states in the graphics. If not specified, a color palette is created with the RColorBrewer package, using the "Accent" palette. Note that the maximum number of colors in this palette is 8. If the number of states in the data is greater, you have to specify your own palette. The list of available colors is displayed by the colors function. Alternatively, you can use other palettes from the RColorBrewer package. |
missing.color |
alternative color for representing missing values inside the sequences. Defaults to "darkgrey". |
labels |
labels for the states, to appear in the graphics' legend. |
... |
options passed to the seqformat function to handle input data not in STS format. |
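The left, gaps and right arguments each apply to a different region of a sequence. The following base R sketch illustrates how the positions of one sequence fall into those regions. It is only an illustration of the definitions given above, not TraMineR's implementation; the helper name classify_missing is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of TraMineR): label each position of a single
# sequence as a valid "state" or as a missing value in the "left", "gap" or
# "right" region, following the definitions of the left, gaps and right arguments.
classify_missing <- function(x) {
  valid <- which(!is.na(x))          # positions holding a valid state
  if (length(valid) == 0) {
    # Assumption of this sketch: an all-missing sequence is left unclassified.
    return(rep(NA_character_, length(x)))
  }
  pos <- seq_along(x)
  out <- rep("state", length(x))
  out[is.na(x) & pos < min(valid)] <- "left"    # before the first valid state
  out[is.na(x) & pos > max(valid)] <- "right"   # after the last valid state
  out[is.na(x) & pos > min(valid) & pos < max(valid)] <- "gap"  # in between
  out
}

s <- c(NA, "A", NA, "B", "B", NA, NA)
classify_missing(s)
# "left" "state" "gap" "state" "state" "right" "right"
```

With the default settings described above, the positions labelled "left" and "gap" would be coded with the nr symbol, and those labelled "right" with the void symbol.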
Subscripting a sequence object (e.g. seq[,1:5] or seq[1:10,]) returns a sequence object in which some attributes are preserved (alphabet, missing) and others are adapted (start, column names), unless only one column is selected, in which case a factor is returned.
An object of class stslist. There are methods for print, summary, and subscripting sequence objects. Sequence objects are required as arguments to other functions, such as the plotting functions (seqdplot, seqiplot or seqfplot) and the functions that compute distances (seqdist).
Gabadinho, A., G. Ritschard, M. Studer and N. S. Müller (2008). Mining Sequence Data in R with TraMineR: A user's guide. Department of Econometrics and Laboratory of Demography, University of Geneva.
library(TraMineR)

## Creating a sequence object with the columns 13 to 24
## in the 'actcal' example data set
data(actcal)
actcal.seq <- seqdef(actcal, 13:24,
    labels=c("> 37 hours", "19-36 hours", "1-18 hours", "no work"))

## Displaying the first 10 rows of the sequence object
actcal.seq[1:10,]

## Displaying the first 10 rows of the sequence object
## in SPS format
print(actcal.seq[1:10,], format="SPS")

## Frequency plot for the months June to September
seqfplot(actcal.seq[,6:9])

## Re-ordering the alphabet
actcal.seq <- seqdef(actcal, 13:24, alphabet=c("B","A","D","C"))
alphabet(actcal.seq)

## Adding a state not appearing in the data to the
## alphabet
actcal.seq <- seqdef(actcal, 13:24, alphabet=c("A","B","C","D","E"))
alphabet(actcal.seq)

## Adding a state not appearing in the data to the
## alphabet and changing the state labels
actcal.seq <- seqdef(actcal, 13:24,
    alphabet=c("A","B","C","D","E"),
    states=c("FT","PT","LT","NO","TR"))
alphabet(actcal.seq)
actcal.seq[1:10,]