
R2MLwiN: issues specifying formula for a categorical var

Posted: Fri Jan 17, 2014 9:19 am
by autarkie
Dear all,

I am modeling a continuous dependent variable with a set of predictors, one of which is categorical. In R, that categorical predictor is a factor with two levels. My baseline model runs fine with the following formula:

Code:

dependent ~ (0|cons + cont1 + cont2 + categ) + (1|cons) + (2|cons)

Now, when I try to add the categorical predictor to the second level like this:

Code:

dependent ~ (0|cons + cont1 + cont2 + categ) + (1|cons) + (2|cons + categ)

MLwiN throws an error at "SETV 2 categ" while executing the generated batch file. I have tried specifying a reference category and removing the (1|cons) part, as shown in some of the R examples, but neither helps. The baseline model runs without problems, and I can then add the categorical variable to the 2nd level inside MLwiN itself (using the debugmode=TRUE argument). In this particular case I could also recode the "categ" variable as a 0/1 dummy, but that is a workaround, and I am looking for an actual solution. In MLwiN the categorical variable looks normal: it is coded as 1 and 2, with corresponding labels. Am I doing something wrong syntax-wise?
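
The 0/1 recoding mentioned above can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame "dat" and the new column name "categ01" are hypothetical, and the first factor level is assumed to be the reference category.

```r
## Hypothetical data: 'categ' is a two-level factor, as described in the post.
dat <- data.frame(categ = factor(c("a", "b", "a", "b", "b")))

## 0/1 indicator: 0 for the first (reference) level, 1 for the second level.
dat$categ01 <- as.integer(dat$categ == levels(dat$categ)[2])

## Cross-tabulate to check that each level maps to exactly one code.
table(dat$categ, dat$categ01)
```

The numeric column would then presumably replace the factor in both parts of the formula, e.g. (0|cons + cont1 + cont2 + categ01) and (2|cons + categ01).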

Thanks in advance.
Regards,

Maxim

Re: R2MLwiN: issues specifying formula for a categorical var

Posted: Sat Jan 18, 2014 3:55 pm
by autarkie
Addendum:

This is the MLwiN output related to the error:

Code:

error while obeying batch file C:/Users/path/macrofile_2bf8134e3937.txt at line number 28:
SETV  2   'categ'

Undefined explanatory variable referenced

When I then query that variable directly in MLwiN (without changing anything else):

Code:

->AVERage 1  'categ'  

            N    Missing    Mean      s.d.
categ   18841          0    1.1474    0.35456

So, the name is properly recognized outside the batch file. Also, when I specify a continuous variable at the second level, the formula is recognized properly.

Re: R2MLwiN: issues specifying formula for a categorical var

Posted: Mon Jan 20, 2014 1:41 pm
by richardparker
Hi Maxim,

Thanks for raising this issue. You're right: because of the way MLwiN handles dummy variables, there can be issues when fitting a model (via R2MLwiN) with a categorical predictor in both the fixed and random parts of the model, so I'm afraid a workaround is needed.

To this end, R2MLwiN has an Untoggle function to create dummy variables on the user's behalf. However, we've just spotted a bug in how it assigns variable names for factors, so in the meantime it's necessary to first convert your variable (assuming it's a factor) to a continuous variable, as in the example below. We'll update this post when a new version of R2MLwiN is released with this bug fixed.

Code:

library("R2MLwiN")
data("tutorial")
IDa <- c("school", "student")
## change path as appropriate:
MLwiN <- "C:/Program Files (x86)/MLwiN v2.27/"

## By default, schgend is a factor with 3 levels (mixedsch, boysch, girlsch):
summary(tutorial)

## R2MLwiN's Untoggle function is designed to create separate vectors for each level of a categorical variable
## ...however we've just discovered a bug in how it assigns variable names for factors,
## ...so in the meantime (whilst we resolve the bug) please first convert to a continuous variable:

tutorial$schgend <- as.numeric(tutorial$schgend)
tutorial <- cbind(tutorial, Untoggle(tutorial[["schgend"]], "schgend"))

## new vectors schgend_1 (mixedsch), schgend_2 (boysch), schgend_3 (girlsch) all added to dataset:
summary(tutorial)

F1 <- "normexam ~ (0|cons + standlrt + schgend_2 + schgend_3) + (2|cons + schgend_2 + schgend_3) + (1|cons)"
(mymodel2 <- runMLwiN(Formula = F1, levID = IDa, indata = tutorial, MLwiNPath = MLwiN))

## Will post again when new version R2MLwiN released with bug fix.

Hope that answers your question?
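
For readers who want to see what the resulting indicator columns look like without running MLwiN, here is a rough base R sketch. It is not R2MLwiN's Untoggle implementation; it only mimics the column names used in the example above (schgend_1, schgend_2, schgend_3), and the small "schgend" vector is a hypothetical stand-in for tutorial$schgend.

```r
## Hypothetical stand-in for tutorial$schgend (same level order as in the thread).
schgend <- factor(c("mixedsch", "boysch", "girlsch", "mixedsch"),
                  levels = c("mixedsch", "boysch", "girlsch"))

## Convert the factor to its integer codes (1, 2, 3), as in the workaround.
codes <- as.integer(schgend)
vals  <- sort(unique(codes))

## One 0/1 column per code; each row has exactly one 1.
dummies <- sapply(vals, function(k) as.integer(codes == k))
colnames(dummies) <- paste0("schgend_", vals)

dummies
```

Leaving one column out of the formula (schgend_1 in the example above) makes that level the reference category.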

Best wishes,

Richard

Re: R2MLwiN: issues specifying formula for a categorical var

Posted: Wed Jan 29, 2014 1:54 pm
by autarkie
I believe it does, thank you.

Re: R2MLwiN: issues specifying formula for a categorical var

Posted: Tue Feb 04, 2014 5:35 pm
by richardparker
Hi, just confirming that this bug has been fixed in R2MLwiN version 0.1-8, which is now available on CRAN.
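
To check whether an installed copy predates that release, something along these lines may help. It is a minimal sketch: "needs_workaround" is a hypothetical helper, not part of R2MLwiN, and it relies only on base R's version comparison.

```r
## Hypothetical helper (not part of R2MLwiN): TRUE if the given version
## predates the release that this thread says contains the fix.
needs_workaround <- function(installed, fixed = "0.1-8") {
  package_version(installed) < package_version(fixed)
}

## Only query the installed version if the package is actually available.
if (requireNamespace("R2MLwiN", quietly = TRUE)) {
  print(needs_workaround(as.character(utils::packageVersion("R2MLwiN"))))
}
```

If this prints TRUE, the conversion to a numeric variable shown earlier in the thread is still needed before calling Untoggle.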

Best wishes,

Richard