Short Reference for logmult

Supported Models and Options

The logmult package currently supports these model families via separate functions:

  • UNIDIFF (a.k.a. log-multiplicative layer effect model): unidiff function.
  • RC(M) (a.k.a. Goodman Type II) row-column association models: rc function.
  • RC(M)-L row-column association models with layer effect: rcL function.
  • Skew-symmetric row-column association model (van der Heijden & Mooijaart): hmskew function.
  • Skew-symmetric row-column association model with layer effect (extension of van der Heijden & Mooijaart): hmskewL function.
  • Skew-symmetric row-column association model (Yamaguchi RC-SK): yrcskew function.

Please refer to the inline documentation for each function (e.g. ?unidiff) for more details and classic examples.

These functions take as their first argument a table, typically obtained via the table or xtabs function. Arrays of counts without row, column and layer names will have letters assigned automatically; use rownames, colnames and/or dimnames to change these names.
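
As a minimal sketch of this calling convention, the following fits two of the models listed above. It assumes the HairEyeColor table shipped with base R as stand-in data (it is not a logmult dataset), and that the third dimension of a three-way table is treated as the layer.

```r
library(logmult)

# Two-way table of counts: hair colour by eye colour, collapsed over sex
tab <- margin.table(HairEyeColor, 1:2)

# One-dimensional RC association model on the two-way table
model <- rc(tab, nd = 1, verbose = FALSE)
print(model$assoc)   # normalized scores and intrinsic association coefficient

# UNIDIFF model on the full three-way table (Sex acts as the layer here)
layered <- unidiff(HairEyeColor, verbose = FALSE)
print(layered$unidiff$layer)   # layer coefficients
```

Any table built with table or xtabs can be substituted for tab.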

Main options common to several models include:

  • No weighting, uniform weighting or marginal weighting when normalizing scores: weighting argument.
  • Symmetric (a.k.a. homogeneous) scores for rows and columns: symmetric argument.
  • Homogeneous scores and association coefficients for all layers, homogeneous scores only (a.k.a. “simple homogeneous”), or heterogeneous scores and association coefficients: layer.effect, layer.effect.symm and layer.effect.skew arguments.
  • Number of dimensions: nd, nd.symm and nd.skew arguments.
  • Diagonal-specific parameters (“quasi-” models), either stable or varying over layers: diagonal argument.
  • Jackknife and bootstrap standard errors: se and nreplicates arguments.
  • Supplementary rows and columns: rowsup and colsup arguments.
  • Fully random or precomputed (semi-random) starting values: start argument.
  • Fitting control via arguments passed to gnm: tolerance criterion (tolerance), maximum number of iterations (iterMax), progress output (trace and verbose), faster fitting by not estimating uninteresting parameters (elim).
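
A few of these options can be combined in a single call. The sketch below is illustrative only: the data (HairEyeColor from base R) and the particular values are assumptions chosen to keep the example small, not recommended settings.

```r
library(logmult)

tab <- margin.table(HairEyeColor, 1:2)

model <- rc(tab,
            nd = 2,                 # two dimensions
            weighting = "uniform",  # uniform weighting when normalizing scores
            tolerance = 1e-8,       # convergence criterion, passed on to gnm
            iterMax = 5000,         # maximum number of iterations
            verbose = FALSE)        # no progress output

# One intrinsic association coefficient per dimension
print(model$assoc$phi)
```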

Custom models which cannot be obtained via the standard options can be fitted manually by calling gnm directly. Association coefficients can then be extracted by calling one of the assoc.* functions on the model: assoc.rc, assoc.rcL, assoc.rcL.symm, assoc.hmskew, assoc.hmskewL, assoc.rc.symm or assoc.yrcskew. Since these functions are not exported, you need to fully qualify them to call them, e.g. logmult:::assoc.rc(model). The resulting objects (of class assoc) can be passed to plot and support the same options as models.
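
The manual route might look like the sketch below. It is a hedged illustration: the model formula is a plain RC(1) model chosen for simplicity, HairEyeColor is stand-in data, and because assoc.rc is an internal function its behaviour may differ between logmult versions.

```r
library(gnm)
library(logmult)

tab <- margin.table(HairEyeColor, 1:2)

# Fit an RC(1) model by hand; the table itself is passed as data
gnm_model <- gnm(Freq ~ Hair + Eye + Mult(Hair, Eye),
                 family = poisson, data = tab, verbose = FALSE)

# Extract normalized scores and association coefficients.
# assoc.rc is not exported, hence the triple colon.
ass <- logmult:::assoc.rc(gnm_model)
print(class(ass))
```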

Models of the “quasi-” type, i.e. excluding some cells of a table, can be fitted by setting the corresponding cells of the input table to NA. Reported degrees of freedom will be correct (contrary to what often happens when setting zero weights for these cells).
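
For instance, a model excluding the diagonal could be sketched as follows, assuming the square occupationalStatus table from base R as example data:

```r
library(logmult)

# 8 x 8 mobility table; exclude the diagonal cells by setting them to NA
tab <- occupationalStatus
diag(tab) <- NA

model <- rc(tab, verbose = FALSE)

# Residual degrees of freedom account for the excluded cells
print(model$df.residual)
```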

Plotting

The package supports rich plotting features for each model family.

For the UNIDIFF model the layer coefficient can be plotted by simply calling plot on the fitted model. See ?plot.unidiff for details and examples.

For association models, one- and multi-dimensional scores plots can be drawn, again by calling plot on the fitted model. For models with a layer effect, a given layer can be chosen via the layer argument, or an average of association coefficients can be used (for models with homogeneous scores only). Several arguments allow tweaking the display, including:

  • Which dimensions to plot: dim argument.
  • Whether to plot the symmetric or skew-symmetric part of the association (when applicable): what argument.
  • Whether to show rows, columns or both: what argument.
  • Which specific rows/columns to represent: which argument.
  • Whether to draw confidence intervals/ellipses (when jackknife/bootstrap were enabled for fitting): conf.int and replicates arguments.
  • Whether the size of symbols should vary according to their frequencies: mass argument.
  • Whether the luminosity of symbols should vary according to the strength of the association: luminosity argument.
  • Whether to reverse the axes: rev.axes argument.
  • Standard arguments allow choosing the title (main), axis labels (xlab, ylab), axis limits (xlim, ylim), symbol size (cex) and type (pch), draw onto an existing plot (add).

See ?plot.assoc for the full reference.
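
A short sketch of a two-dimensional scores plot using some of the arguments listed above; the data (HairEyeColor from base R) and the argument values are illustrative assumptions.

```r
library(logmult)

tab <- margin.table(HairEyeColor, 1:2)
model <- rc(tab, nd = 2, verbose = FALSE)

plot(model,
     dim = c(1, 2),               # dimensions to plot
     what = "both",               # show rows and columns
     mass = TRUE,                 # symbol size reflects frequencies
     rev.axes = c(FALSE, TRUE),   # reverse the second axis only
     main = "RC(2) scores")
```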

Notes About LEM

Results provided by logmult should generally be consistent with LEM, and have been checked against it when possible. Some models are known not to work correctly in LEM, though.

  • UNIDIFF layer coefficients are consistent with those computed by LEM, including when diagonal cells are excluded (using the wei commands or diagonal-specific parameters). Row-column interaction coefficients obtained with weighting="none" or weighting="uniform" are consistent with LEM (coefficients reported by LEM exclude the last row and column).
  • RC(1) scores and intrinsic association coefficients computed by LEM are consistent with logmult; signs may be flipped, but this does not affect the results.
  • Multidimensional RC(M) models can be fitted in LEM, but their association parameters are not identified; however fit statistics agree with logmult.
  • RC(M)-L model scores and intrinsic association coefficients computed by LEM are consistent with logmult; signs may be flipped, but this does not affect the results.

Even when models are supposed to be consistent between LEM and logmult, it can happen that different results are obtained. There are several possible reasons for this:

  • Several local optima may exist. Since logmult uses random starting values, running the model many times will allow checking whether another solution with a lower deviance exists. This can be achieved with LEM by adding ran at the end of the mod line.
  • Convergence may appear to have been reached when it has not. This risk is particularly common with LEM, whose default tolerance criterion is not very strict. Add a cri 0.00000001 line (or use an even lower value if time permits) to use a stricter criterion. Even then, check that changing the criterion does not substantially affect the estimated coefficients: if it does, they may not be reliable.
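
On the logmult side, the check for local optima can be sketched by refitting the same model a few times and comparing deviances. This assumes that start = NULL requests fully random starting values (as it does in gnm); the data and the number of runs are arbitrary choices for illustration.

```r
library(logmult)

tab <- margin.table(HairEyeColor, 1:2)

# Refit the same model from several random starting points
fits <- replicate(5,
                  rc(tab, nd = 2, start = NULL, verbose = FALSE),
                  simplify = FALSE)

# Drop runs that failed to converge (returned as NULL), if any
fits <- Filter(Negate(is.null), fits)

# Deviances should agree across runs if a single optimum was reached
devs <- sapply(fits, function(m) m$deviance)
print(devs)
```

If one run yields a clearly lower deviance than the others, the other runs stopped at a local optimum.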

When unsure whether parameters of a model are identified in LEM, add ran at the end of the mod line to use random starting values. Unidentified coefficients will then be different at every run; only identified coefficients will remain the same. logmult only reports identifiable parameters. On the other hand, gnm returns unidentified parameters from coef, but these have NA standard errors when calling summary(asGnm(model)); since random starting values are used by default, unidentified parameters will also be different when re-fitting a model.
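
The gnm side of this can be inspected as in the sketch below, which lists the parameters whose standard error is NA. The data are again HairEyeColor from base R, used only for illustration.

```r
library(gnm)
library(logmult)

tab <- margin.table(HairEyeColor, 1:2)
model <- rc(tab, verbose = FALSE)

# Summary of the underlying gnm fit; unidentified parameters get NA standard errors
s <- summary(asGnm(model))
coefs <- s$coefficients
unidentified <- rownames(coefs)[is.na(coefs[, "Std. Error"])]
print(unidentified)
```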

When using null weights, LEM reports incorrect degrees of freedom, as zero-weight cells are still considered as free. With logmult, instead of using null weights, set corresponding cells to NA in the input table; this will report the same results as LEM, but with correct degrees of freedom.

logmult/gnm Limitations Compared With LEM

gnm and logmult do not always work well with effects coding ("contr.sum"). Models may fail to converge and parameter extraction will not always work. Using dummy coding ("contr.treatment") is recommended, and gives the same log-multiplicative parameters as when using effects coding (which only affects linear parameters).
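
Dummy coding is the R default, but it can be set explicitly for a session with the standard contrasts option, for example:

```r
# Select dummy coding for unordered factors (and the usual polynomial
# contrasts for ordered ones); options() returns the previous settings.
old <- options(contrasts = c("contr.treatment", "contr.poly"))
current <- getOption("contrasts")
print(current)

# ... fit logmult or gnm models here ...

# Restore the previous settings afterwards
options(old)
```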