boolean {boolean}    R Documentation

Partial-Observability Logit or Probit Models for Testing Boolean Hypotheses

Description

Boolean logit and probit are a family of partial-observability n-variate models designed to permit researchers to model causal complexity, or multiple causal "paths" to a given outcome.

Usage

boolean(structure, method, maxoptions = "", optimizer="nlm",
        safety=1, bootstrap=FALSE, bootsize=100, popsize=5000)

Arguments

structure Structure of equation to be estimated, in standard y ~ f(x) form, using & to represent the Boolean operator "and" and | to represent the Boolean operator "or." (Note that the syntax requires that constants be entered explicitly; see the entry for boolprep for details.) Be sure to enter the correct functional form and balance parentheses; if in doubt, or just for convenience, use the boolprep command to prepare structure prior to estimation.
method Either "logit" or "probit".
maxoptions Maximization options (see nlm or optim for details).
optimizer Either "nlm", "optim", or "genoud".
safety Number of search attempts. The likelihood functions implied by Boolean procedures can become quite convoluted; in such cases, multiple searches can be run from different starting points. Takes effect only when optimizer="nlm".
bootstrap If TRUE, bootstraps standard errors.
bootsize Number of iterations if bootstrap=TRUE.
popsize Population size if optimizer="genoud".
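
The idea behind the safety argument, repeating the search from several starting points and keeping the best result, can be illustrated in base R. This is a minimal sketch on simulated data, not the package's internal code: the function multistart, the toy likelihood negll, and the random starting values are all assumptions made for illustration.

```r
# Simulate a toy Boolean "or" probit with two one-variable paths.
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
p  <- 1 - (1 - pnorm(-1 + x1)) * (1 - pnorm(-1 + x2))
y  <- as.numeric(runif(n) < p)

# Negative log-likelihood; b = (cons1, slope1, cons2, slope2).
negll <- function(b) {
  p1 <- pnorm(b[1] + b[2] * x1)
  p2 <- pnorm(b[3] + b[4] * x2)
  pr <- 1 - (1 - p1) * (1 - p2)
  pr <- pmin(pmax(pr, 1e-10), 1 - 1e-10)  # guard against log(0)
  -sum(y * log(pr) + (1 - y) * log(1 - pr))
}

# Hypothetical helper: run nlm from several random starts, keep the best fit.
multistart <- function(f, npar, attempts = 5) {
  best <- NULL
  for (i in seq_len(attempts)) {
    fit <- try(suppressWarnings(nlm(f, p = rnorm(npar))), silent = TRUE)
    if (inherits(fit, "try-error")) next
    if (is.null(best) || fit$minimum < best$minimum) best <- fit
  }
  best
}

fit <- multistart(negll, npar = 4, attempts = 5)
fit$estimate  # parameter values at the lowest negative log-likelihood found
```

If different starts settle on clearly different minima, that is a sign that the likelihood surface is irregular and that more attempts may be worthwhile.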

Details

boolean permits estimation of Boolean logit and probit models (see Braumoeller 2003 for derivation). Boolean logit and probit are a family of partial-observability n-variate models designed to permit researchers to model causal complexity, or multiple causal "paths" to a given outcome. The various "paths" are modeled as latent dependent variables that are multiplied together in a manner determined by the logic of their (Boolean) interaction.

If, for example, we wanted to model a situation in which diet OR smoking causes heart failure, we would use one set of independent variables (caloric intake, fat intake, etc.) to predict the latent probability of diet-related coronary failure (y1*), use another set of variables (cigarettes smoked per day, exposure to second-hand smoke, etc.) to predict the latent probability of smoking-related coronary failure (y2*), and model the observed outcome (y, or coronary failure) as a function of the Boolean interaction of the two: Pr(y=1) = 1 - ([1 - y1*] x [1 - y2*]).

Independent variables that have an impact on both latent dependent variables can be included in both paths. Any combination of ANDs and ORs can be posited, and the interaction of any number of latent dependent variables can be modeled, although the procedure becomes exponentially more data-intensive as the number of latent dependent variables increases.
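
The Boolean combination of latent probabilities described above can be written out in a few lines of base R. This is only an illustrative sketch: the helper names bool_or and bool_and and the probability values are made up for the example and are not part of the package.

```r
# Combine latent path probabilities under Boolean "or" and "and".
bool_or  <- function(p1, p2) 1 - (1 - p1) * (1 - p2)
bool_and <- function(p1, p2) p1 * p2

# Made-up latent probabilities for the two paths.
p_diet    <- 0.10
p_smoking <- 0.20

pr_or  <- bool_or(p_diet, p_smoking)   # 1 - 0.9 * 0.8 = 0.28
pr_and <- bool_and(p_diet, p_smoking)  # 0.1 * 0.2 = 0.02

# Nested structure: path1 | (path2 & path3), with a made-up third path.
p_third   <- 0.50
pr_nested <- bool_or(p_diet, bool_and(p_smoking, p_third))  # 1 - 0.9 * 0.9 = 0.19

c(or = pr_or, and = pr_and, nested = pr_nested)
```

Under "or", the outcome probability is at least as large as the larger of the two paths; under "and", it is no larger than the smaller of the two.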

Value

Returns an object of class booltest, with slots @Calculus, @LogLik, @Variables, @Coefficients, @StandardErrors, @Iterations, @Hessian, @Gradient, @Zscore, @Probz, @Conf95lo, @Conf95hi, @pstructure, and @method (note that some slots may be left empty if the relevant information is not furnished by the maximizer).
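
Slots such as @Zscore, @Probz, @Conf95lo, and @Conf95hi correspond to quantities that are conventionally derived from coefficients and standard errors. The sketch below shows the usual Wald calculations in base R on made-up numbers; that the package computes its slots in exactly this way is an assumption, not something this page confirms.

```r
# Made-up coefficient estimates and standard errors.
coefs <- c(cons = -2.00, x1 = 0.33, x2 = 0.66)
ses   <- c(cons =  0.25, x1 = 0.11, x2 = 0.12)

z     <- coefs / ses                   # Wald z statistics
p_z   <- 2 * pnorm(-abs(z))            # two-sided p-values
ci_lo <- coefs - qnorm(0.975) * ses    # lower 95% confidence bound
ci_hi <- coefs + qnorm(0.975) * ses    # upper 95% confidence bound

round(cbind(coefs, ses, z, p_z, ci_lo, ci_hi), 3)
```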

Note

Examining profile likelihoods with boolprof is highly recommended. Boolean logit and probit are partial observability models, which are generically starved for information; as a result, maximum likelihood estimation can encounter problems with plateaus in likelihood functions even with very large n.
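
To make the idea of a profile likelihood concrete, the following base R sketch fixes one parameter at each value on a grid and maximizes the likelihood over the remaining parameters. It uses simulated data and a toy two-path "or" probit; it does not call boolprof, and the names negll, grid, and profile_ll are assumptions made for the example.

```r
# Simulate a toy Boolean "or" probit with two one-variable paths.
set.seed(2)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
p  <- 1 - (1 - pnorm(-1 + x1)) * (1 - pnorm(-1 + x2))
y  <- as.numeric(runif(n) < p)

# Negative log-likelihood; b = (cons1, slope1, cons2, slope2).
negll <- function(b) {
  p1 <- pnorm(b[1] + b[2] * x1)
  p2 <- pnorm(b[3] + b[4] * x2)
  pr <- 1 - (1 - p1) * (1 - p2)
  pr <- pmin(pmax(pr, 1e-10), 1 - 1e-10)  # guard against log(0)
  -sum(y * log(pr) + (1 - y) * log(1 - pr))
}

# Profile the slope on x1: hold it at each grid value, optimize the rest.
grid <- seq(0, 2, by = 0.25)
profile_ll <- sapply(grid, function(g) {
  f <- function(rest) negll(c(rest[1], g, rest[2], rest[3]))
  -optim(c(0, 0, 0), f, method = "BFGS")$value
})

cbind(grid, profile_ll)  # a long flat stretch would indicate a plateau
```

A well-behaved profile has a single clear peak; a profile that stays nearly constant over a wide range of values suggests that the data carry little information about that parameter.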

Author(s)

Bear F. Braumoeller, Harvard University, bfbraum@fas.harvard.edu
Jacob Kline, Harvard University, jkline@fas.harvard.edu

References

Braumoeller, Bear F. (2003) "Causal Complexity and the Study of Politics." Political Analysis 11(3): 209-233.

See Also

boolprep to prepare structure of equation, boolfirst to graph first differences after estimation, and boolprof to produce profile likelihoods after estimation.

Examples

library("boolean")
set.seed(50)
x1<-rnorm(1000)
x2<-rnorm(1000)
x3<-rnorm(1000)
x4<-rnorm(1000)
x5<-rnorm(1000)
x6<-rnorm(1000)
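## Pr(y=1) for path1 | (path2 & path3): 1 - (1 - p1) * (1 - p2 * p3)
y<-1-(1-pnorm(-2+0.33*x1+0.66*x2+1*x3))*(1-(pnorm(1+1.5*x4-0.25*x5)*pnorm(1+0.2*x6)))
## Draw the binary outcome from those probabilities
y <- y>runif(1000)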
answer <- boolean(y ~( ((cons+x1+x2+x3)|((cons+x4+x5)&(cons+x6))) ), method="probit")

## Examine coefficients, standard errors, etc.
summary(answer)

## Examine "summary" output plus Hessian, gradient, etc.
show(answer)

## Plot first differences for model
plot(answer)

## Plot profiles
plot(answer, panel="boolprof")

[Package boolean version 1.06 Index]