varPed {MasterBayes}    R Documentation

Transforms Variables for a Multinomial Log-Linear Model

Description

Creates offspring-specific design matrices, the columns of which refer to the explanatory variables of the linear model.

Usage

varPed(x, gender=NULL, lag=c(0,0), relational=FALSE, 
  lag_relational=c(0,0), restrict=NULL, keep=FALSE, 
  USvar=NULL, merge=FALSE, ...)

Arguments

x predictor variable; numeric or factor
gender the gender of the parent to which x applies
lag numeric vector of length 2. The time interval over which x is evaluated relative to a record of the offspring.
relational a character string or FALSE. If "OFFSPRING", calculates the distance between x in the parents and x in the offspring. If "MATE", calculates the distance between x in the two parental sexes. If FALSE (the default), x is untransformed.
lag_relational numeric vector of length 2. If relational is not FALSE then the time interval over which x is evaluated in the relational category, relative to the offspring record, can be changed.
restrict designates parents with a zero prior probability of parentage; logical or character string. If relational="OFFSPRING" and x is a factor then restrict may be TRUE or FALSE; if TRUE, only parents with zero distance (i.e. factor levels matching those of the offspring) are retained as plausible candidates. If restrict is a character string, only parents for which x matches that string are retained.
keep logical; if TRUE then the design matrices for parents excluded using the argument restrict are retained in the estimation of beta
USvar if NULL, unsampled parents take the population mean value of the parameter. If x is a factor then USvar can be a level of that factor to which unsampled parents belong. If x is numeric then USvar can be the value for unsampled parents. This is not implemented for relational="MATE" or for interactions between male and female variables.
merge logical; if TRUE then beta is the log odds ratio of an offspring's parent belonging to category A compared to category B, where A and B are levels of x. If FALSE then beta is the log odds ratio of an individual belonging to category A being the parent of an offspring compared to an individual of category B. This is not implemented for relational="MATE" or for interactions between male and female variables, and is only valid for categorical variables.
... further arguments to be passed

Details

The design matrix for each offspring represents the state of each parental (dam/sire) combination for each explanatory variable. The number of rows in the design matrix (the number of parental combinations) is free to vary across offspring, but the number of explanatory variables remains the same. As with standard generalised linear modelling, the columns of the design matrices take on numerical values for continuous variables and indicator values for categorical variables. When relational=FALSE, elements of the design matrices referring to specific parental combinations will not vary across offspring (unless longitudinal data are being used) and the associated vector of parameters will relate the explanatory variables to overall fecundity. For these variables the model is essentially the multinomial analogue of the more familiar Poisson model often used to analyse such data. However, the counts of the multinomial are not known with certainty because uncertainty exists around the maternity and/or paternity of each offspring.
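
As a rough illustration of the layout described above (not the package's internal code), the base R sketch below builds a design matrix for a single offspring with one row per dam/sire combination. The candidate identifiers, the covariate values and the column names are all hypothetical.

```r
# Hypothetical candidate parents for one offspring, each with a continuous covariate x
dams  <- data.frame(id = c("d1", "d2"), x = c(1.0, 2.5))
sires <- data.frame(id = c("s1", "s2", "s3"), x = c(0.5, 1.5, 3.0))

# One row per dam/sire combination (dam index varies fastest)
combos <- expand.grid(dam = seq_len(nrow(dams)), sire = seq_len(nrow(sires)))

# Columns hold the dam covariate and the sire covariate for each combination
X <- cbind(x_dam = dams$x[combos$dam], x_sire = sires$x[combos$sire])
rownames(X) <- paste(dams$id[combos$dam], sires$id[combos$sire], sep = ":")
X
```

A different offspring could have a different set of candidates and therefore a different number of rows, while the columns stay the same.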

Additional variables can be fitted that relate specific parental combinations to specific offspring, or specific dams to specific sires. Elements of the design matrices referring to specific parental combinations are then free to vary across offspring. The most obvious variable of this type is the Mendelian transition probability obtained from the genetic data themselves. However, by specifying relational="OFFSPRING" or relational="MATE", non-genetic variables are also free to vary across offspring. When x is numeric, the Euclidean distances between parents and offspring, or between mates, enter into the design matrix. When x is a factor, an indicator variable is set up indicating whether the factor levels of parent and offspring, or of the two mates, match. Often, each offspring will have a variable number of candidate parents, as some parents may be excluded a priori. When x is a factor and both relational="OFFSPRING" and restrict=TRUE, only those potential parents that have factor levels matching the offspring factor level are retained. When relational=FALSE, restrict can take on factor levels which exclude parents that have non-matching factor levels.
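
The relational transformations can be sketched in base R as follows. This is only an illustration of the distance and matching rules described above, with made-up values for one offspring and three candidate dams; it does not call any MasterBayes function.

```r
# Numeric x: distance between each candidate dam and the offspring
# (for a univariate x the Euclidean distance is the absolute difference)
x_off  <- 2.0
x_dams <- c(1.0, 2.0, 4.5)
dist_offspring <- abs(x_dams - x_off)

# Factor x: indicator of whether dam and offspring levels match
f_off  <- "north"
f_dams <- c("north", "south", "north")
match_offspring <- as.numeric(f_dams == f_off)

# Analogue of restrict=TRUE: keep only dams whose level matches the offspring
retained <- which(f_dams == f_off)
```

Here `dist_offspring` or `match_offspring` would form a column of the offspring's design matrix, and `retained` indexes the candidates that remain plausible.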

If a time variable (timevar) is not passed to PdataPed, the data are assumed to be cross-sectional and each individual is only represented once. If a time variable (timevar) is passed to PdataPed then lag and lag_relational can be set so that time-specific covariates are used. lag designates time units relative to the offspring record when relational=FALSE; for example, if lag=c(0,0) the value of x is taken for that parent during the same time period as the offspring record. If relational="OFFSPRING" or relational="MATE" then lag determines the time units relative to the record of the offspring or mate to which the focal individual is being compared. This record can be specified by using lag_relational, which is always relative to the offspring record. Negative lags refer to previous time intervals (e.g. lag=c(-1,-1) takes x from the previous time step), and if the elements of lag or lag_relational differ then the average value of x during this period is taken (e.g. lag=c(-1,0) averages x over the record matching the offspring record and the record preceding it). Averaging is not applicable when x is a factor unless restrict=TRUE, in which case parents are retained when factor levels match at any time in the specified interval.
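
The lag rule for a numeric x can be illustrated with a small base R helper. The function name `lagged_x` and the data are hypothetical, and the sketch assumes a record exists for every time period in the window; how the package treats missing records is not covered here.

```r
# Hypothetical longitudinal records of x for one candidate dam, named by time period
x_by_time <- c("1" = 10, "2" = 14, "3" = 18)

# Average x over the window [t + lag[1], t + lag[2]],
# where t is the time period of the offspring record
lagged_x <- function(x_by_time, t, lag = c(0, 0)) {
  window <- as.character(seq(t + lag[1], t + lag[2]))
  mean(x_by_time[window])
}

same_period     <- lagged_x(x_by_time, t = 3, lag = c(0, 0))    # x at time 3
previous_period <- lagged_x(x_by_time, t = 3, lag = c(-1, -1))  # x at time 2
averaged        <- lagged_x(x_by_time, t = 3, lag = c(-1, 0))   # mean of times 2 and 3
```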

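Below are models that can be fitted using varPed, where x is a univariate continuous variable, i indexes candidate dams, j indexes candidate sires, o denotes the offspring and t denotes the time period of the offspring record: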

varPed(x, gender="Female")

p(i,j) = exp(b*x(i)...)

varPed(x, gender="Male")

p(i,j) = exp(b*x(j)...)

varPed(x)

p(i,j) = exp(b*(x(i)+x(j))...)

varPed(x, gender="Female", relational="OFFSPRING")

p(i,j) = exp(b*abs(x(i)-x(o))...)

varPed(x, gender="Female", relational="MATE")

p(i,j) = exp(b*abs(x(i)-x(j))...)

varPed(x, gender="Female", lag=c(-1,-1))

p(i,j) = exp(b*x(i,t-1)...)

varPed(x, gender="Female", lag=c(-1,-1), relational="OFFSPRING")

p(i,j) = exp(b*abs(x(i,t-1)-x(o,t))...)

varPed(x, gender="Female", lag=c(0,0), relational="MATE", lag_relational=c(-1,-1))

p(i,j) = exp(b*abs(x(i,t)-x(j,t-1))...)

For a categorical variable with two levels (A and B) the model specified by varPed(x, gender="Female") takes on the form

p(i,j) = exp(b*I(i)...)

where I(i) is an indicator variable taking the value 1 if x(i) is equal to the first level of x and zero otherwise. b (beta in the argument descriptions) is then the log odds ratio of the two levels of x with respect to maternity. If merge=TRUE is specified then b may vary across offspring, and b_m is estimated. b_m is related to b:

b_m = logit((theta*N_A)/(N_A*theta+N_B*(1-theta)))

where theta is the inverse logit transformation of b, and N_A and N_B are the number of potential mothers that have level A and B for x. If N_A and N_B are invariant over offspring the models are functionally equivalent.
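
The relationship between b and b_m can be checked numerically with base R's `plogis` (inverse logit) and `qlogis` (logit). The helper name `b_m_from_b` and the values of b, N_A and N_B are hypothetical. Algebraically the expression reduces to b + log(N_A/N_B), which the sketch confirms.

```r
# b_m as a function of b and the numbers of potential mothers in levels A and B
b_m_from_b <- function(b, N_A, N_B) {
  theta <- plogis(b)  # inverse logit of b
  qlogis((theta * N_A) / (N_A * theta + N_B * (1 - theta)))
}

b <- 0.7
equal_counts   <- b_m_from_b(b, N_A = 5,  N_B = 5)  # equals b when N_A == N_B
unequal_counts <- b_m_from_b(b, N_A = 10, N_B = 5)  # equals b + log(10/5)
```

When N_A and N_B are the same for every offspring, b_m differs from b only by the constant log(N_A/N_B).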

The denominator of the multinomial likelihood is the sum of the exponentiated linear predictors of all possible parents (after setting up a contrast with the baseline parents). Designating the first set of parents as baseline, the contrast for each set of parents is simply:

eta(i,j)=log(p(i,j)/p(1,1))

and the likelihood of b is

Pr(x| b) = prod(no)(exp(eta(d,s))/sum(ni*nj)(exp(eta(i,j))))

where no, ni and nj are the number of offspring, the number of potential mothers for offspring o, and the number of potential fathers for offspring o, respectively. d and s are the actual parents of offspring o. The set of possible parents in the denominator of the multinomial likelihood is made up of those that are not excluded using the argument restrict. However, if the argument keep=TRUE is used then the denominator of the likelihood will include excluded parents despite the fact that they cannot be the actual parents (d!=i and s!=j).
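
For a single offspring with known parents, the contribution to the likelihood above can be sketched in base R. The function name `loglik_offspring`, the covariate values and the choice of the third combination as the true parents are hypothetical; in practice the parents are uncertain, which this sketch ignores.

```r
# Log-likelihood contribution of one offspring:
# eta holds the linear predictors of all retained parental combinations,
# true is the index of the actual dam/sire combination
loglik_offspring <- function(eta, true) {
  eta[true] - log(sum(exp(eta)))
}

b <- 0.5
x_pairs <- c(0, 1, 2, 3)           # hypothetical covariate per parental combination
eta <- b * (x_pairs - x_pairs[1])  # contrast with the baseline (first) combination

ll <- loglik_offspring(eta, true = 3)

# Probabilities over all combinations sum to one
probs <- exp(sapply(seq_along(eta), function(k) loglik_offspring(eta, k)))
```

Because the same constant is subtracted from every linear predictor, the choice of baseline combination does not change the resulting probabilities.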

Value

list containing the design matrix for variable x, the identity of retained parents and the gender of the parents

Author(s)

Jarrod Hadfield j.hadfield@sheffield.ac.uk

References

Hadfield J.D. et al (2006) Molecular Ecology 15 3715-31

See Also

MCMCped


[Package MasterBayes version 2.0 Index]