check the homoscedasticity assumption in logistic models
check the homoscedasticity assumption in logistic models
Hi
I would like to check the homoscedasticity assumption in my logistic multilevel model. According to this post viewtopic.php?f=1&t=2006&e=1&view=unread#p4567, “You can assess the homoskedasticity assumption by plotting the predicted random effects against the covariates.”.
In the MLwiN envirioment, I assume this is achieved by plotting “the standardised residual x fixed part prediction” for the second level. I wonder how I can achieve this in R2mlwin? I cannot find anything on this in the manual.
Thanks in advance
I would like to check the homoscedasticity assumption in my logistic multilevel model. According to this post viewtopic.php?f=1&t=2006&e=1&view=unread#p4567, “You can assess the homoskedasticity assumption by plotting the predicted random effects against the covariates.”.
In the MLwiN envirioment, I assume this is achieved by plotting “the standardised residual x fixed part prediction” for the second level. I wonder how I can achieve this in R2mlwin? I cannot find anything on this in the manual.
Thanks in advance

 Posts: 1282
 Joined: Mon Oct 19, 2009 10:34 am
Re: check the homoscedasticity assumption in logistic models
The following is an example of how to replicate this residuals plot within R:
Code: Select all
library(R2MLwiN)
library(doBy)
data(tutorial)
(mymodel < runMLwiN(normexam ~ 1 + standlrt + (1  school) + (1  student), estoptions = list(resi.store = TRUE, resioptions = "standardised"), data = tutorial))
pred = data.frame(school=tutorial$school, predexam=predict(mymodel))
predexam.mean < summaryBy(predexam~school, data=pred, FUN=mean)$predexam.mean
plot(predexam.mean, mymodel@residual$lev_2_std_resi_est_Intercept, xlab="pred. val.", ylab="std(Intercept)")

 Posts: 1282
 Joined: Mon Oct 19, 2009 10:34 am
Re: check the homoscedasticity assumption in logistic models
If I am interpreting what George said correctly he is suggesting plotting the residuals vs each of the predictors in the model (I assume averaged within level2 units). The R code for this would be the same as above, except that the prediction would be replaced by the variable being checked.
Re: check the homoscedasticity assumption in logistic models
Thanks for the clarifications, Chris, and some additional questions:
1)
Does the predict(mymodel) function takes the mean of the covariates or the actual data values for each observation (and plug them into the model)? [it looks like it is taking the actual values…]
What if I want to set some covariates to their mean, and but usual actual values for one?
2)
library(R2MLwiN)
library(doBy)
data(tutorial)
(mymodel < runMLwiN(normexam ~ 1 + standlrt + sex + (1  school) + (1  student), estoptions = list(resi.store = TRUE, resioptions = "standardised"), data = tutorial))
# predicting on standlrt
pred = data.frame(school=tutorial$school, predexam=predict(mymodel, terms = "standlrt"))
predexam.mean < summaryBy(predexam~school, data=pred, FUN=mean)$predexam.mean
plot(predexam.mean, mymodel@residual$lev_2_std_resi_est_Intercept, xlab="pred. val.", ylab="std(Intercept)")
# predicting on sex
pred = data.frame(school=tutorial$school, predexam=predict(mymodel, terms = "sex"))
predexam.mean < summaryBy(predexam~school, data=pred, FUN=mean)$predexam.mean
plot(predexam.mean, mymodel@residual$lev_2_std_resi_est_Intercept, xlab="pred. val.", ylab="std(Intercept)")
1)
Does the predict(mymodel) function takes the mean of the covariates or the actual data values for each observation (and plug them into the model)? [it looks like it is taking the actual values…]
What if I want to set some covariates to their mean, and but usual actual values for one?
2)
I am refitting the model with two variables, to get predictions on each variable. However, I am getting the exact same predictions back (same plots). Can you see what I am doing wrong?If I am interpreting what George said correctly he is suggesting plotting the residuals vs each of the predictors in the model (I assume averaged within level2 units). The R code for this would be the same as above, except that the prediction would be replaced by the variable being checked.
library(R2MLwiN)
library(doBy)
data(tutorial)
(mymodel < runMLwiN(normexam ~ 1 + standlrt + sex + (1  school) + (1  student), estoptions = list(resi.store = TRUE, resioptions = "standardised"), data = tutorial))
# predicting on standlrt
pred = data.frame(school=tutorial$school, predexam=predict(mymodel, terms = "standlrt"))
predexam.mean < summaryBy(predexam~school, data=pred, FUN=mean)$predexam.mean
plot(predexam.mean, mymodel@residual$lev_2_std_resi_est_Intercept, xlab="pred. val.", ylab="std(Intercept)")
# predicting on sex
pred = data.frame(school=tutorial$school, predexam=predict(mymodel, terms = "sex"))
predexam.mean < summaryBy(predexam~school, data=pred, FUN=mean)$predexam.mean
plot(predexam.mean, mymodel@residual$lev_2_std_resi_est_Intercept, xlab="pred. val.", ylab="std(Intercept)")

 Posts: 1282
 Joined: Mon Oct 19, 2009 10:34 am
Re: check the homoscedasticity assumption in logistic models
1) The predict function by default will give you the sum of the covariates multiplied by their parameter estimates. It does however have an option to supply you own data, so you could give it a data frame where you have replaced the values with their means.
2) If you wanted to replicate the behaviour of the MLwiN residuals window where it plots residuals by an average for variable instead of a fixedpart prediction the code would just be something like the following:
2) If you wanted to replicate the behaviour of the MLwiN residuals window where it plots residuals by an average for variable instead of a fixedpart prediction the code would just be something like the following:
Code: Select all
library(R2MLwiN)
library(doBy)
data(tutorial)
(mymodel < runMLwiN(normexam ~ 1 + standlrt + (1  school) + (1  student), estoptions = list(resi.store = TRUE, resioptions = "standardised"), data = tutorial))
standlrt.mean < summaryBy(standlrt~school, data=tutorial, FUN=mean)$standlrt.mean
plot(standlrt.mean, mymodel@residual$lev_2_std_resi_est_Intercept, xlab="standlrt", ylab="std(Intercept)")
Re: check the homoscedasticity assumption in logistic models
Thanks, again, Chris.
Regarding the behavior of the predict() function.
1) does it take the sample data stored in the mymodel object?
2) not getting what the difference between the terms= argument and the params= argument (from the manual and my own experimentation)? Could you please help me understand this?
3) Beyond testing homoscedasticity, I would like to produce a couple of prediction holding a set of covariates at their mean, and only vary one variable (the range specified by me, and not the sample). [is this where you propose I feed the predict() with new data?]
Many thanks
Regarding the behavior of the predict() function.
1) does it take the sample data stored in the mymodel object?
2) not getting what the difference between the terms= argument and the params= argument (from the manual and my own experimentation)? Could you please help me understand this?
3) Beyond testing homoscedasticity, I would like to produce a couple of prediction holding a set of covariates at their mean, and only vary one variable (the range specified by me, and not the sample). [is this where you propose I feed the predict() with new data?]
Many thanks

 Posts: 1282
 Joined: Mon Oct 19, 2009 10:34 am
Re: check the homoscedasticity assumption in logistic models
1) Unless you specify an alternative data frame with the newdata= argument in the function call, then yes this is the data that it uses.
2) The terms= argument allows you to choose by a subset of data via the variable names, params= allows to to choose via a subset of data via the parameter names (FP_...). The terms= parameter is only used if you have also specified type="terms", however in this case it takes precedence over the params= argument. The params= parameter applies to all prediction types.
3) Yes, my suggestion here was to create a new data frame which has the same variable names as the original data, but has the variables not being investigated held at their means. You would then pass this to the predict function via the newdata= argument.
2) The terms= argument allows you to choose by a subset of data via the variable names, params= allows to to choose via a subset of data via the parameter names (FP_...). The terms= parameter is only used if you have also specified type="terms", however in this case it takes precedence over the params= argument. The params= parameter applies to all prediction types.
3) Yes, my suggestion here was to create a new data frame which has the same variable names as the original data, but has the variables not being investigated held at their means. You would then pass this to the predict function via the newdata= argument.
Re: check the homoscedasticity assumption in logistic models
Hi, I know this is the R2MLwiN user forum, but i was wondering if you could please point me at the guidelines on how to check for homoscedasticity assumption in logistic models (so for level 2) using MLwiN software, as i don't use R. If this post is inappropriate here, i will post on the MLwiN forum. Thanks in advance.

 Posts: 59
 Joined: Fri Oct 23, 2009 1:49 pm
Re: check the homoscedasticity assumption in logistic models
Hi  you can assess the homoscedasticity assumption of the level2 random effect(s) by plotting the predicted random effects, at that level, against the covariates or fixed part prediction (such plots are commonlyused to visually inspect this assumption).
Since your question concerned using MLwiN directly, rather than R, then using the Model > Residuals option in MLwiN, you can first request that the level 2 residuals be calculated via the Settings tab (specifying the relevant level via the dropdown box at the bottom), and then request to plot the resulting (e.g. standardised) residuals against the (level 2 averages for the) covariate(s) of interest, or fixed part prediction, via the Plots tab (the last option under the "single" heading).
Since your question concerned using MLwiN directly, rather than R, then using the Model > Residuals option in MLwiN, you can first request that the level 2 residuals be calculated via the Settings tab (specifying the relevant level via the dropdown box at the bottom), and then request to plot the resulting (e.g. standardised) residuals against the (level 2 averages for the) covariate(s) of interest, or fixed part prediction, via the Plots tab (the last option under the "single" heading).
Re: check the homoscedasticity assumption in logistic models
Thank you, Richard!