R2MLwiN: issues specifying formula for a categorical var

autarkie
Posts: 4
Joined: Fri Jan 17, 2014 9:01 am

Post by autarkie »

Dear all,

I am modeling a continuous dependent variable with a set of predictors, one of which is categorical. In R, that categorical predictor is a factor with two levels. My baseline model runs fine with the following formula:

Code:

dependent ~ (0|cons + cont1 + cont2 + categ) + (1|cons) + (2|cons)

Now, when I try to add the categorical predictor to the second level like this:

Code:

dependent ~ (0|cons + cont1 + cont2 + categ) + (1|cons) + (2|cons + categ)

MLwiN throws an error at "SETV 2 categ" while executing the generated batch file. I tried specifying a reference category and removing the (1|cons) part, as shown in some of the R examples, but neither helps. I can run the baseline model without problems and then add the categorical variable at the 2nd level inside MLwiN (using the debugmode=TRUE argument). I could also recode the "categ" variable as a 0/1 dummy in this particular case, but that is a workaround, and I am looking for an actual solution. In MLwiN the categorical variable looks normal, coded as 1 and 2 with corresponding labels. Am I doing something wrong syntax-wise?
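
For reference, the 0/1 recoding mentioned above can be sketched in base R as follows. This is only an illustration on a made-up two-level factor (the level names "a" and "b" and the new variable names categ01 and categ_b are hypothetical, not from the actual data), and whether the recoded variable then behaves as wanted in the random part is an assumption, not something verified here.

```r
# Hypothetical two-level factor standing in for 'categ'
categ <- factor(c("a", "b", "a", "b", "b"))

# Integer level codes are 1 and 2; subtracting 1 gives
# 0 for the first (reference) level and 1 for the second
categ01 <- as.integer(categ) - 1L

# Equivalent form that names the non-reference level explicitly
categ_b <- as.integer(categ == levels(categ)[2])

# The dummy would then replace the factor in the formula, e.g. (untested):
# dependent ~ (0|cons + cont1 + cont2 + categ01) + (1|cons) + (2|cons + categ01)
```

Because the dummy is an ordinary numeric column, it is passed to MLwiN like any continuous predictor, which the thread reports as working at the second level.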

Thanks in advance.
Regards,

Maxim
autarkie
Posts: 4
Joined: Fri Jan 17, 2014 9:01 am

Re: R2MLwiN: issues specifying formula for a categorical var

Post by autarkie »

Addendum:

This is the MLwiN output related to the error:

Code:

error while obeying batch file C:/Users/path/macrofile_2bf8134e3937.txt at line number 28:
SETV  2   'categ'

Undefined explanatory variable referenced

When I try to do something with that variable (without changing anything else):

Code:

->AVERage 1  'categ'

           N     Missing    Mean       s.d.
categ   18841       0     1.1474    0.35456

So, the name is properly recognized outside the batch file. Also, when I specify a continuous variable at the second level, the formula is recognized properly.
richardparker
Posts: 58
Joined: Fri Oct 23, 2009 1:49 pm

Re: R2MLwiN: issues specifying formula for a categorical var

Post by richardparker »

Hi Maxim,

Thanks for raising this issue. You're right: because of the way MLwiN handles dummy variables, there can be issues when fitting a model (via R2MLwiN) with a categorical predictor in both the fixed and random parts of the model, so I'm afraid a workaround is needed.

To this end, R2MLwiN has an Untoggle function to create dummy variables on the user's behalf. However, we've just spotted a bug in how it assigns variable names for factors, so in the meantime it's necessary to first convert your variable (assuming it's a factor) to a continuous variable, as in the example below. We'll update this post when a new version of R2MLwiN is released with this bug fixed.
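
The effect of the conversion step can be checked in base R, without MLwiN installed. The sketch below uses a small made-up factor (not the tutorial data) with the same three level names and builds the indicator columns by hand; it illustrates the idea only and is not R2MLwiN's Untoggle implementation. The column naming pattern is an assumption chosen to match the names used in this post.

```r
# Hypothetical toy factor with the same three level names as schgend
schgend <- factor(c("mixedsch", "boysch", "girlsch", "boysch"),
                  levels = c("mixedsch", "boysch", "girlsch"))

# as.numeric() on a factor returns the level codes, in level order:
# mixedsch = 1, boysch = 2, girlsch = 3
codes <- as.numeric(schgend)

# Build one 0/1 indicator column per code, named <variable>_<code>
code_values <- sort(unique(codes))
dummies <- sapply(code_values, function(k) as.integer(codes == k))
colnames(dummies) <- paste0("schgend_", code_values)

print(dummies)
```

Each row has exactly one indicator set to 1, and the numeric suffix follows the factor's level order, which is why schgend_2 corresponds to boysch and schgend_3 to girlsch.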

Code:

library("R2MLwiN")
data("tutorial")
IDa <- c("school", "student")
## change path as appropriate:
MLwiN <- "C:/Program Files (x86)/MLwiN v2.27/"

## By default, schgend is a factor with 3 levels (mixedsch, boysch, girlsch):
summary(tutorial)

## R2MLwiN's Untoggle function creates a separate dummy vector for each level of a categorical variable.
## A bug in how it names those vectors for factors means that, until it is fixed,
## the factor must first be converted to a numeric variable:

tutorial$schgend <- as.numeric(tutorial$schgend)
tutorial <- cbind(tutorial, Untoggle(tutorial[["schgend"]], "schgend"))

## new vectors schgend_1 (mixedsch), schgend_2 (boysch), schgend_3 (girlsch) now added to the dataset:
summary(tutorial)

F1 <- "normexam ~ (0|cons + standlrt + schgend_2 + schgend_3) + (2|cons + schgend_2 + schgend_3) + (1|cons)"
(mymodel2 <- runMLwiN(Formula = F1, levID = IDa, indata = tutorial, MLwiNPath = MLwiN))

## Will post again when a new version of R2MLwiN is released with the bug fix.

Hope that answers your question?

Best wishes,

Richard
autarkie
Posts: 4
Joined: Fri Jan 17, 2014 9:01 am

Re: R2MLwiN: issues specifying formula for a categorical var

Post by autarkie »

I believe it does, thank you.
richardparker
Posts: 58
Joined: Fri Oct 23, 2009 1:49 pm

Re: R2MLwiN: issues specifying formula for a categorical var

Post by richardparker »

Hi - just confirming that this bug has been fixed in R2MLwiN version 0.1-8, now available on CRAN.
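
A quick way to check whether an installed copy is at or above that version is sketched below. The helper name has_fix is hypothetical (it is not part of R2MLwiN); the comparison relies only on base R's package_version and packageVersion.

```r
# Version in which this thread reports the fix
fixed_in <- package_version("0.1-8")

# TRUE if a given version string is at or above the fixed release
has_fix <- function(v) package_version(v) >= fixed_in

# Check the installed copy only if R2MLwiN is actually available
if (requireNamespace("R2MLwiN", quietly = TRUE)) {
  installed <- as.character(packageVersion("R2MLwiN"))
  message("Installed R2MLwiN ", installed, " has the fix: ", has_fix(installed))
}
```

If the check reports FALSE, updating the package from CRAN with install.packages("R2MLwiN") should bring in the fixed release.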

Best wishes,

Richard