dutchSpeakersDist {languageR} | R Documentation
A distance matrix for the conversations of 165 speakers in the
Spoken Dutch Corpus. Metadata on the speakers are available in
a separate dataset, dutchSpeakersDistMeta.
data(dutchSpeakersDist)
A data frame holding a 165 by 165 matrix of between-speaker distances.
http://lands.let.kun.nl/cgn/
Data collected and analyzed in collaboration with Patrick Juola.
Juola, P. (2003) The time course of language change, Computers and the Humanities, 37, 77-96.
Juola, P. and Baayen, R. H. (2005) A Controlled-corpus Experiment in Authorship Identification by Cross-entropy, Literary and Linguistic Computing, 20, 59-67.
## Not run: 
data(dutchSpeakersDist)

# convert the data frame to a dist object and run classical MDS
dutchSpeakersDist.d = as.dist(dutchSpeakersDist)
dutchSpeakersDist.mds = cmdscale(dutchSpeakersDist.d, k = 3)

# combine the MDS coordinates with the speaker metadata
data(dutchSpeakersDistMeta)
dat = data.frame(dutchSpeakersDist.mds,
  Sex = dutchSpeakersDistMeta$Sex,
  Year = dutchSpeakersDistMeta$AgeYear,
  EduLevel = dutchSpeakersDistMeta$EduLevel)
dat = dat[!is.na(dat$Year),]

# dimension 1 against year of birth, dimension 3 by sex
par(mfrow=c(1,2))
plot(dat$Year, dat$X1, xlab="year of birth",
  ylab = "dimension 1", type = "p")
lines(lowess(dat$Year, dat$X1))
boxplot(dat$X3 ~ dat$Sex, ylab = "dimension 3")
par(mfrow=c(1,1))

cor.test(dat$X1, dat$Year, method="sp")
t.test(dat$X3~dat$Sex)

## End(Not run)
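
The as.dist() and cmdscale() steps in the example above can be tried without the languageR data. The following is a minimal sketch on a made-up toy data frame; the names pts, toy, toy.d, toy.mds and recovered are hypothetical and not part of the package. It assumes only base R and the stats package. Because the four toy points lie on a line, a one-dimensional solution should reproduce their pairwise distances up to reflection and translation.

```r
# Toy stand-in for dutchSpeakersDist: a small symmetric data frame of
# pairwise distances (values are invented for illustration only).
pts <- c(a = 0, b = 1, c = 3, d = 7)
toy <- as.data.frame(as.matrix(dist(pts)))

# Convert the data frame to a dist object; the explicit as.matrix()
# call is not required here, it only makes the conversion visible.
toy.d <- as.dist(as.matrix(toy))

# Classical multidimensional scaling, keeping a single dimension
toy.mds <- cmdscale(toy.d, k = 1)

# Distances between the MDS coordinates, for comparison with the input
recovered <- as.matrix(dist(toy.mds))
print(round(recovered, 6))
```

With the real data, k = 3 is used instead, and the resulting columns are the X1, X2 and X3 coordinates that the example combines with the speaker metadata.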