Factanal {FAiR} | R Documentation |
This function is intended for users and estimates a factor analysis model that has been
set up previously with a call to make_manifest
and a call to
make_restrictions
.
Factanal(manifest, restrictions, scores = "none", seeds = 12345, lower = sqrt(.Machine$double.eps), analytic = TRUE, reject = TRUE, NelderMead = TRUE, impatient = FALSE, ...)
manifest |
An object that inherits from manifest-class and is
typically produced by make_manifest . | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
restrictions |
An object that inherits from restrictions-class and
is typically produced by make_restrictions . | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
scores |
Type of factor scores to produce, if any. The default is "none" .
Other valid choices (which can be partially matched) are "regression" ,
"Bartlett" , "Thurstone" , "Ledermann" ,
"Anderson-Rubin" , "McDonald" ,"Krinjen" ,
"Takeuchi" , and "Harman" . See Beauducel (2007) for
formulae for these factor scores as well as proofs that all but
"regression" and "Harman" produce the same
correlation matrix. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
seeds |
A vector of length one or two to be used as the random
number generator seeds corresponding to the unif.seed and
int.seed arguments to genoud respectively.
If seeds is a single number, this seed is used for both
unif.seed and int.seed . These seeds override the defaults
for genoud and make it easier to replicate
an analysis exactly. If NULL , the default arguments for
unif.seed and int.seed as specified in genoud
are used. NULL should be used in simulations or else they will be
horribly wrong. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
lower |
A lower bound. In exploratory factor analysis, lower is the
minimum uniqueness and corresponds to the 'lower' element of the list
specified for control in factanal . Otherwise, lower
is the lower bound used for singular values when checking for positive-definiteness
and ranks of matrices. If the unlikely event that you get errors referencing
positive definiteness, try increasing the value of lower slightly. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
analytic |
A logical (default to TRUE ) indicating whether analytic gradients
should be used as much as possible. If FALSE , then numeric gradients will be
calculated, which are slower and slightly less accurate but are necessary in some
situations and useful for debugging analytic gradients. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
reject |
Logical indicating whether to reject starting values that fail the
constraints required by the model; see create_start | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
NelderMead |
Logical indicating whether to call optim with
method = "Nelder-Mead" when the genetic algorithm has finished to further
polish the solution. This option is not relevant or necessary for exploratory factor
analysis models. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
impatient |
Logical that defaults to FALSE . If restrictions is of
restrictions.factanal-class , setting it to TRUE will cause
factanal to be used for optimization instead of genoud .
In all other situations, setting it to TRUE will use factanal to
to generate initial communality estimates instead of the slower default mechanism. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
... |
Further arguments that are passed to genoud .
The following arguments to genoud are hard-coded and
cannot be changed because they are logically required by the factor analyis
estimator:
The following arguments to genoud default to values
that differ from those documented at genoud but can
be overridden by specifying them explicitly in the ... :
Other arguments to genoud will take the documented
default values unless explicitly specified. In particular, you may want to
change wait.generations and solution.tolerance . Also, if informative
bounds were placed on any of the parameters in the call to make_restrictions
it is usually preferable to specify that boundary.enforcement = 2 to use
constrained optimization in the internal calls to optim . However,
the "L-BFGS-B" optimizer is less robust than the default "BFGS" optimizer
and occasionally causes fatal errors, largly due to misfortune. |
The call to Factanal
is somewhat of a formality in the sense that most of the
difficult decisions were already made in the call to make_restrictions
and the call to make_manifest
. The most important remaining detail is
the specification of the values for the starting population in the genetic algorithm.
It is not necessary to provide starting values, since there are methods for this
purpose; see create_start
. Also, if starting.values = NA
, then
a population of starting values will be created using the typical mechanism in
genoud
, namely random uniform draws from the domain of the
parameter.
Otherwise, if reject = TRUE
, starting values that fail one or more constraints
are rejected and new vectors of starting values are generated until the population is
filled with admissable starting values. In some cases, the constraints are quite difficult
to satisfy by chance, and it may be more practical to specify reject = FALSE
or to
supply starting values explicitly. If starting values are supplied, it is helpful if at
least one member of the genetic population satisfies all the constraints imposed on the
model. Note the rownames of restrictions@Domains
, which indicate the proper order
of the free parameters.
A matrix (or vector) of starting values can be passed as starting.values
.
(Also, it is possible to pass an object of FA-class
to
starting.values
, in which case the estimates from the previous call to
Factanal
are used as the starting values.) If a matrix, it should have
columns equal to the number of rows in restrictions@Domains
in the specified
order and one or more rows up to the number of genetic individuals in the population.
If starting.values
is a vector, its length can be equal to the number of rows
in restrictions@Domains
in which case it is treated as a one-row matrix, or its
length can be equal to the number of manifest variables, in which case it is passed
to the start
argument of create_start
as a vector of initial
communality estimes, thus avoiding the sometimes time-consuming process of generating
good initial communality estimates. This process can also be accelerated by specifying
impatient = TRUE
.
An object of that inherits from FA-class
.
The underlying genetic algorithm can print a variety of output as it progresses. On Windows, you either have to move the scrollbar periodically to flush the output to the screen or disable buffering by either going to the Misc menu or by clicking Control+W. The output will, by default, look something like this
Generation | First | Second | ... | Last | Discrepancy |
number | constraint | constraint | constraint | function | |
0 | -1.0 | -1.0 | ... | -1.0 | double |
1 | -1.0 | -1.0 | ... | -1.0 | double |
... | ... | ... | ... | ... | ... |
42 | -1.0 | -1.0 | ... | -1.0 | double |
The integer on the far left indicates the generation number. If it appears to
skip one or more generations, that signifies that the best individual in the
“missing” generation was no better than the best individual in the
previous generation. The sequence of -1.0 indicates that various constraints
are being satisfied by the best individual in the generation. Some of these
constraints are hard-coded, some are added by the choices the user makes in the call
to make_restrictions
. The curious are referred to the source code,
but for the most part users need not worry about them provided they are -1.0.
If any but the last are not -1.0 after the first few generations, there is a
major problem because no individual is satisfying all the constraints.
The last number is a double-precision number indicating the value of the discrepancy
function. This number will decrease, sometimes painfully slowly, sometimes intermittently,
over the generations since the discrepancy function is being minimized, subject to the
aforementioned constraints.
Ben Goodrich
Barthlomew, D. J. and Knott, M. (1990) Latent Variable Analysis and Factor Analysis. Second Edition, Arnold.
Beauducel, A. (2007) In spite of indeterminancy, many common factor score estimates yield an identical reproduced covariance matrix. Psychometrika, 72, 437–441.
Smith, G. A. and Stanley G. (1983) Clocking g: relating intelligence and measures of timed performance. Intelligence, 7, 353–368.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
make_manifest
, make_restrictions
, and
Rotate
## Example from Venables and Ripley (2002, p. 323) ## Previously from Bartholomew and Knott (1999, p. 68--72) ## Originally from Smith and Stanley (1983) ## Replicated from example(ability.cov) man <- make_manifest(covmat = ability.cov) ## Not run: ## Here is the easy way to set up a SEFA model, which uses pop-up menus res <- make_restrictions(manifest = man, factors = 2, model = "SEFA") ## End(Not run) ## This is the hard way to set up a restrictions object without pop-up menus beta <- matrix(NA_real_, nrow = nrow(cormat(man)), ncol = 2) rownames(beta) <- rownames(cormat(man)) free <- is.na(beta) beta <- new("parameter.coef.SEFA", x = beta, free = free, num_free = sum(free)) Phi <- diag(2) free <- lower.tri(Phi) Phi <- new("parameter.cormat", x = Phi, free = free, num_free = sum(free)) res <- make_restrictions(manifest = man, beta = beta, Phi = Phi) # This is how to make starting values where Phi is the correlation matrix # among factors, beta is the matrix of coefficients, and the scales are # the logarithm of the sample standard deviations. It is also the MLE. starts <- c( 4.46294498156615e-01, # Phi_{21} 4.67036349420035e-01, # beta_{11} 6.42220238211291e-01, # beta_{21} 8.88564379236454e-01, # beta_{31} 4.77779639176941e-01, # beta_{41} -7.13405536379741e-02, # beta_{51} -9.47782525342137e-08, # beta_{61} 4.04993872375487e-01, # beta_{12} -1.04604290549591e-08, # beta_{22} -9.44950629176182e-03, # beta_{32} 2.63078925240678e-04, # beta_{42} 9.38038168787216e-01, # beta_{52} 8.43618801925473e-01, # beta_{62} log(man@sds)) # log manifest standard deviations sefa <- Factanal(manifest = man, restrictions = res, # NOTE: Do NOT specify any of the following tiny values in a # real research situation; it is done here solely for speed starting.values = starts, pop.size = 2, max.generations = 6, wait.generations = 1) nsim <- 101 # number of simulations, also too small for real work show(sefa) summary(sefa, nsim = nsim) model_comparison(sefa, nsim = nsim) stuff <- list() # output list for various methods stuff$model.matrix <- model.matrix(sefa) # sample correlation matrix stuff$fitted <- fitted(sefa, reduced = TRUE) # reduced covariance matrix stuff$residuals <- residuals(sefa) # difference between model.matrix and fitted stuff$rstandard <- rstandard(sefa) # normalized residual matrix stuff$weights <- weights(sefa) # (scaled) approximate weights for residuals stuff$influence <- influence(sefa) # weights * residuals stuff$cormat <- cormat(sefa, matrix = "RF") # reference factor correlations stuff$uniquenesses <- uniquenesses(sefa, standardized = FALSE) # uniquenesses stuff$FC <- loadings(sefa, matrix = "FC") # factor contribution matrix stuff$draws <- FA2draws(sefa, nsim = nsim) # draws from sampling distribution if(require(nFactors)) screeplot(sefa) # Enhanced scree plot profile(sefa) # profile plots of non-free parameters pairs(sefa) # Thurstone-style plot if(require(Rgraphviz)) plot(sefa) # DAG