seqient {TraMineR}    R Documentation

Within sequences entropy

Description

Computes the within-sequence Shannon entropy of each sequence in a sequence object.

Usage

 seqient(seqdata, norm=TRUE)

Arguments

seqdata a sequence object as returned by the seqdef function.
norm if TRUE (the default), the entropy is normalized, i.e. divided by the maximum entropy. The maximum entropy is the entropy of the alphabet, i.e. that of a hypothetical sequence in which every state of the alphabet occupies the same number of positions. Note that this theoretical maximum cannot always be observed in the data, for example when the sequence length is odd and the number of states in the alphabet is even.
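The remark about the unattainable maximum can be checked by hand. The sketch below uses base R only (it does not call seqient) and assumes a hypothetical sequence of length 13 over a 4-state alphabet, with the natural logarithm; the most even split possible is 4, 3, 3, 3 positions per state, so the normalized entropy stays just under 1.

```r
# Hypothetical case: 13 positions spread as evenly as possible over 4 states
counts <- c(4, 3, 3, 3)
p <- counts / sum(counts)          # state proportions within the sequence

h     <- -sum(p * log(p))          # entropy of this sequence (natural log)
h_max <- log(length(counts))       # entropy of the alphabet, i.e. log(4)

h_norm <- h / h_max                # normalized entropy, slightly below 1
h_norm
```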

Details

The seqient function returns the Shannon entropy of each sequence in seqdata. The entropy of a sequence is computed using the formula

$$h(\pi_1,\ldots,\pi_s) = -\sum_{i=1}^{s} \pi_i \log \pi_i$$

where $s$ is the size of the alphabet, $\pi_i$ is the proportion of occurrences of the $i$th state in the considered sequence, and $\log$ is the natural logarithm (the maximum value quoted below, 1.386294, equals $\log 4$). The entropy can be interpreted as the 'uncertainty' of predicting the states in a given sequence. If all states in the sequence are the same, the entropy is equal to 0. Without normalization (norm=FALSE), the maximum entropy for a sequence of length 12 with an alphabet of 4 states is 1.386294 and is attained when each of the four states appears 3 times.
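As an illustration, the formula can be reproduced in a few lines of base R. This is a minimal sketch, not the TraMineR implementation: seq_entropy is a hypothetical helper, it works on a plain character vector rather than a sequence object, and it assumes the natural logarithm, which reproduces the value 1.386294 quoted above.

```r
# Hypothetical helper (not part of TraMineR): Shannon entropy of one sequence,
# given as a character vector, using the natural logarithm.
seq_entropy <- function(states, alphabet, norm = FALSE) {
  # Proportion of positions occupied by each state of the alphabet
  p <- table(factor(states, levels = alphabet)) / length(states)
  p <- p[p > 0]                         # convention: 0 * log(0) = 0
  h <- -sum(p * log(p))
  # Optional normalization by the entropy of the alphabet
  if (norm) h <- h / log(length(alphabet))
  h
}

alphabet <- c("A", "B", "C", "D")
balanced <- rep(alphabet, each = 3)     # length 12, each state appears 3 times
constant <- rep("A", 12)                # a single state throughout

h_balanced      <- seq_entropy(balanced, alphabet)               # log(4)
h_constant      <- seq_entropy(constant, alphabet)               # 0
h_balanced_norm <- seq_entropy(balanced, alphabet, norm = TRUE)  # 1
```

The balanced sequence reaches the maximum log(4), about 1.386294, and the constant sequence has entropy 0, matching the two cases described above.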

Another measure of entropy is available: seqstatd returns the entropy of the distribution of states for each time unit.
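The difference between the two measures can be sketched on toy data with base R. The matrix and the entropy helper below are made up for illustration and are not how seqient or seqstatd are implemented: rows stand for sequences and columns for time units, so row-wise entropy is the within-sequence measure and column-wise entropy is the per-time-unit measure.

```r
# Toy data (made up): 3 sequences (rows) observed at 4 time units (columns)
toy <- matrix(c("A", "A", "B", "B",
                "A", "B", "B", "B",
                "A", "A", "A", "A"),
              nrow = 3, byrow = TRUE)

# Hypothetical helper: Shannon entropy (natural log) of the states in x
entropy <- function(x) {
  p <- table(x) / length(x)   # only observed states, so no zero proportions
  -sum(p * log(p))
}

within_seq <- apply(toy, 1, entropy)  # one value per sequence
cross_sect <- apply(toy, 2, entropy)  # one value per time unit
```

Here the third sequence never changes state, so its within-sequence entropy is 0, while the first time unit has entropy 0 because all three sequences start in the same state.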

Value

a vector whose number of elements is the number of sequences in seqdata, containing the entropy value of each sequence.

References

Gabadinho, A., G. Ritschard, M. Studer and N. S. Muller (2008). Mining Sequence Data in R with TraMineR: A user's guide. Department of Econometrics and Laboratory of Demography, University of Geneva.

See Also

seqstatd.

Examples

library(TraMineR)
data(actcal)
## Build a sequence object from columns 13 to 24 of actcal
actcal.seq <- seqdef(actcal, 13:24)

## Summarize and plot a histogram
## of the within-sequence entropy
actcal.ient <- seqient(actcal.seq)
summary(actcal.ient)
hist(actcal.ient)

[Package TraMineR version 1.1 Index]