Complex surveys with MI and replicate weights

GKonyarov
Posts: 1
Joined: Fri Apr 17, 2020 1:54 pm

Complex surveys with MI and replicate weights

Post by GKonyarov »

Hi everyone,

I am trying to analyse data from the PISA and PIAAC surveys. If you are not familiar with the data, these two surveys evaluate competencies in subjects such as mathematics and reading, for high school students and teachers respectively. Both are complex surveys, and to work with them correctly you have to follow some rules. Test scores for students and for teachers are each reported as 10 plausible values (PVs), which are a form of multiple imputation (MI). In addition, each observation has 80 replicate weights and 1 final weight, which have to be used as well. As far as I know, you have to fit the same model to each of the 10 PVs and then combine the results with Rubin's rules to get the correct point estimates. For the correct standard errors, however, you also have to use the replicate weights and the final weight. I have read some threads in the forum that discuss MI, e.g. https://www.cmm.bristol.ac.uk/forum/vie ... d4ea57043f , but I have not found any posts that discuss the use of the weights I mentioned. Is there a way to do such an analysis?
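
The combination step described above can be sketched in base R. This is only an illustration under stated assumptions: the function names fay_brr_var and pool_pv are hypothetical, all numbers are made up, and the Fay factor of 0.5 with 80 replicates is the PISA convention. PIAAC, as far as I understand, uses jackknife variants that differ by country, so its replicate variance formula would need to be swapped in.

```r
# Sampling variance of one statistic from Fay-adjusted BRR replicates.
# theta_full: estimate obtained with the final weight
# theta_rep:  estimates obtained with each replicate weight (80 in PISA)
# k:          Fay factor (0.5 is the PISA convention; an assumption elsewhere)
fay_brr_var <- function(theta_full, theta_rep, k = 0.5) {
  sum((theta_rep - theta_full)^2) / (length(theta_rep) * (1 - k)^2)
}

# Rubin's rules across M plausible values.
# theta_pv:    final-weight estimate for each plausible value
# samp_var_pv: sampling variance for each plausible value
pool_pv <- function(theta_pv, samp_var_pv) {
  M <- length(theta_pv)
  within <- mean(samp_var_pv)   # average sampling variance
  between <- var(theta_pv)      # variance between plausible values
  c(estimate = mean(theta_pv),
    se = sqrt(within + (1 + 1 / M) * between))
}

# Made-up estimates standing in for 10 fitted models
set.seed(42)
theta_pv <- c(10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 10.0, 10.4, 9.7, 10.1)
samp_var_pv <- sapply(theta_pv, function(th) {
  fay_brr_var(th, th + rnorm(80, sd = 0.2))  # fake replicate estimates
})
pooled <- pool_pv(theta_pv, samp_var_pv)
print(pooled)
```

In a real analysis each entry of theta_pv would come from a model fitted with the final weight, and each set of replicate estimates from refitting the same model with the replicate weights.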

Regarding the software, I have looked for solutions in R, and several packages deal with such complex surveys. The "survey" package handles PVs effectively with the command withPV, similar to what the pv module in Stata does, but you cannot do a multilevel analysis and you cannot account for the weights there. The "BIFIEsurvey" package comes closest to the aim of my analysis, but it is restricted to 2 levels. Lastly, the "intsvy" package allows analysis using PVs and follows the rules of the two surveys to estimate statistics correctly, but it is not designed for multilevel analysis. That is why I would like to use MLwiN through R, as it is specifically designed for multilevel analyses, and I would be very thankful to anyone who can help me sort out this problem or has any suggestions. Below I give an illustrative example of what my model looks like. Unfortunately, I cannot upload the dput output since the file is too big, but if you need further clarification, I would be happy to answer. This is the model:

Code:

# Three-level model with predictors at all levels for MATH
library(R2MLwiN)
mod1.2 <- PVMATH ~ GENDER + ESCS + IMMIG + SCHLTYPE + SCHSIZE + STRATIO + EDUSHORT + STAFFSHORT + PVNUM +
  (1 | CNTRYID) + (1 | CNTSCHID) + (1 | CNTSTUID)
# The outer parentheses print the fitted model after assignment
(VarCompModel1.2 <- runMLwiN(Formula = mod1.2, data = mypisa))
I have students nested in schools, nested in countries. The first 3 independent variables relate to the students, the next 5 are school variables, and PVNUM is the mean teacher score that I obtained from the PIAAC data using the "intsvy" command "piaac.mean.pv". The dependent variable PVMATH stands for the plausible values PV1MATH to PV10MATH, with the results combined using Rubin's rules as mentioned above. Thank you in advance!
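
One way to organise the "same model for each PV" step is a loop that rebuilds the formula with a different response each time and then pools the results. The sketch below is an assumption-laden stand-in, not the R2MLwiN workflow itself: it uses simulated data, only two of the predictors, and lm() as a placeholder for runMLwiN() so that it runs without MLwiN installed. Survey weights are not handled here.

```r
# Simulated stand-in data: column names mimic PISA, values are made up.
set.seed(1)
n <- 200
toy <- data.frame(ESCS = rnorm(n), GENDER = rbinom(n, 1, 0.5))
pv_names <- paste0("PV", 1:10, "MATH")
for (pv in pv_names) {
  toy[[pv]] <- 500 + 30 * toy$ESCS + rnorm(n, sd = 80)
}

# Fit the same right-hand side once per plausible value.
# lm() is only a placeholder; in the real analysis this call would be
# runMLwiN() with the random-effects terms from the model above.
fits <- lapply(pv_names, function(pv) {
  f <- reformulate(c("GENDER", "ESCS"), response = pv)
  lm(f, data = toy)
})

# Collect the ESCS coefficient and its sampling variance from each fit
est <- sapply(fits, function(m) coef(m)[["ESCS"]])
u <- sapply(fits, function(m) vcov(m)["ESCS", "ESCS"])

# Rubin's rules: within-PV variance plus inflated between-PV variance
M <- length(pv_names)
pooled_est <- mean(est)
pooled_se <- sqrt(mean(u) + (1 + 1 / M) * var(est))
print(c(estimate = pooled_est, se = pooled_se))
```

With the real data, the per-PV sampling variance u would come from the replicate weights rather than from the model-based variance used in this placeholder.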
billb
Posts: 157
Joined: Fri May 21, 2010 1:21 pm

Re: Complex surveys with MI and replicate weights

Post by billb »

Dear GKonyarov,
Thanks for the post. I have read it a couple of times, and although you describe your models, I wasn't sure exactly what question you are asking here. Apologies that in lockdown I have been a bit slow to look at the forums. Sadly, we have lost our colleague Harvey Goldstein, who died last month; his papers looked at including survey weights in multilevel modelling, and he might have been the best person to send your question on to.
Best wishes,
Bill.
plopblurt
Posts: 9
Joined: Tue Jan 03, 2023 4:59 am

Re: Complex surveys with MI and replicate weights

Post by plopblurt »

I have given it a number of reads, and despite the fact that you describe your models, I am still not entirely clear on the question that you are trying to pose here.
lisamassa
Posts: 2
Joined: Wed Oct 18, 2023 4:25 am

Re: Complex surveys with MI and replicate weights

Post by lisamassa »

GKonyarov wrote: Mon Apr 20, 2020 9:17 am
Hi everyone, I am trying to analyse data from the PISA and PIAAC surveys. [...]
It seems like you're looking for assistance with analyzing data from the PISA and PIAAC surveys. The surveys involve complex data with multiple imputations and replicate weights, and you're interested in using MLwiN through R to perform a multilevel analysis, because the "survey", "BIFIEsurvey" and "intsvy" packages each have limitations for this purpose. You've also provided an illustrative example of the model you're working with. It's clear that you've put in a lot of effort, and I'd be happy to help you with this. If you have any specific questions or need further clarification, feel free to ask.