www.cmm.bristol.ac.uk/forum

Posted: **Wed Sep 05, 2018 2:00 pm**

Hi,

I've been trying to use the nlevel imputation template to impute missing data for a four-level model with 1775 respondents, and I'm running into a few problems. My data has been processed in R and saved as a Stata file, and contains a mixture of user-collected test score data combined with linked variables from an NPD extract (FSMever, KS1 scores, and school-level KS1 and KS2 averages). Whenever I enter binary covariates such as treatment, fsm and sex of respondent into the imputation model, it throws an error and won't let me run the algorithm. If I exclude these or specify them as ordered categories, the algorithm does run and converges to reasonably sensible values (similar to those I can impute on the fly in RStan). Does this sound like a bug, or is something potentially going wrong with the way I'm preparing the data? When I impute with a two-level model of pupils clustered in schools and specify the binary covariates correctly, the algorithm does run, which makes me think there is an underlying bug.

My other problem is that when I add a school-level covariate or two to the imputation model, the algorithm runs for 20-30 seconds or so and then crashes with a runtime error. I've tried it with a two-level model and it runs flawlessly, but I can't get past those initial 20-30 seconds when the algorithm is burning in and adapting. Any help or suggestions would be very much appreciated.

Many thanks,

Posted: **Wed Sep 05, 2018 3:00 pm**

This could indeed be a bug. Can you please provide more details regarding the error messages that you are receiving (preferably copy/pasted output from the software)? Could you also confirm whether all variables that contain missing values in your model are included as responses in the imputation model?

Posted: **Thu Sep 06, 2018 1:08 pm**

Hi Chris,

Thanks for the very quick reply. Yes, all the variables with missing data are specified as responses in the imputation model. My model of interest (MOI) takes a subset of these variables. Here is the command I used; a screenshot with the error messages is attached:

Command: RunStatJR(template='NLevelImpute', dataset='data1.dta', invars = {'want2': 'No', 'iterations': '10000', 'bin1_3': 'Normal', 'bin1_2': 'Binary', 'bin1_1': 'Normal', 'bin1_6': 'Normal', 'bin1_5': 'Binary', 'bin1_4': 'Binary', 'MOIslope': 'No', 'condmarg': 'Yes', 'x': 'cons,treatment:cat,pre', 'burnin': '5000', 'ximp1_1': 'cons', 'ximp1_0': 'cons', 'ximp1_3': 'cons', 'ximp1_2': 'cons', 'ximp1_5': 'cons', 'ximp1_4': 'cons', 'C3': 'group', 'C2': 'school', 'C1': 'ta', 'higherlevs': 'Yes', 'MOIRespHighLev': 'No', 'imputefirst': '10000', 'NumLevs': '3', 'yimp1': 'post,treatment,pre,gender,fsm,ks1', 'numimpute': '10', 'imputeevery': '1000', 'MOIdist': 'Normal', 'y': 'post', 'MOIlevel': 'yes'}, estoptions = {})

I've since been able to run an imputation model based on the above, with additional school-level covariates, without it crashing. However, at the end of the run it doesn't go on to fit the model to the imputed and complete case datasets. The command for this is below:

Command: RunStatJR(template='NLevelImpute', dataset='data1.dta', invars = {'want2': 'Yes', 'iterations': '10000', 'imputefirst': '10000', 'bin1_3': 'Normal', 'bin1_2': 'Binary', 'bin1_1': 'Normal', 'bin1_6': 'Normal', 'bin1_5': 'Binary', 'bin1_4': 'Binary', 'anyL3resp': 'Yes', 'anyL4resp': 'No', 'MOIslope': 'No', 'condmarg': 'Yes', 'x': 'cons,treatment:cat,pre,schoolks1,schoolks2', 'bin3_1': 'Normal', 'bin3_2': 'Normal', 'ximp3_1': 'cons', 'ximp3_0': 'cons', 'ximp1_1': 'cons', 'ximp1_0': 'cons', 'ximp1_3': 'cons', 'ximp1_2': 'cons', 'ximp1_5': 'cons', 'ximp1_4': 'cons', 'C3': 'group', 'C2': 'school', 'C1': 'ta', 'higherlevs': 'Yes', 'MOIRespHighLev': 'No', 'anyL2resp': 'No', 'burnin': '5000', 'NumLevs': '3', 'yimp3': 'schoolks1,schoolks2', 'yimp1': 'post,treatment,pre,gender,fsm,ks1', 'numimpute': '10', 'imputeevery': '1000', 'MOIdist': 'Normal', 'y': 'post', 'MOIlevel': 'yes'}, estoptions = {})
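Since the two commands share most of their inputs, one way to see exactly what changed between the runs is to diff the two `invars` dictionaries. The following is only a minimal sketch: `diff_invars` is a hypothetical helper (not part of Stat-JR), and the dictionaries are short excerpts of the commands above rather than the full inputs.

```python
def diff_invars(first, second):
    """Compare two invars dictionaries and return (added, removed, changed).

    Hypothetical helper for illustration; it is not part of Stat-JR.
    """
    added = {k: second[k] for k in sorted(second.keys() - first.keys())}
    removed = {k: first[k] for k in sorted(first.keys() - second.keys())}
    changed = {
        k: (first[k], second[k])
        for k in sorted(first.keys() & second.keys())
        if first[k] != second[k]
    }
    return added, removed, changed


# Excerpts only: a few inputs copied from the two commands above.
first_run = {
    'want2': 'No',
    'x': 'cons,treatment:cat,pre',
    'NumLevs': '3',
}
second_run = {
    'want2': 'Yes',
    'x': 'cons,treatment:cat,pre,schoolks1,schoolks2',
    'NumLevs': '3',
    'yimp3': 'schoolks1,schoolks2',
    'anyL3resp': 'Yes',
}

added, removed, changed = diff_invars(first_run, second_run)
print('added:', added)
print('removed:', removed)
print('changed:', changed)
```

Pasting the full dictionaries from both commands in place of the excerpts would list every input that differs between the run that crashed and the run that completed.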

Posted: **Thu Sep 06, 2018 1:36 pm**

If you look at the output you will see the error:

ERROR: Invalid value: Binary valid options are ['Normal', 'Binomial', 'Ordered', 'Unordered']

Looking a few lines above confirms that the response type is indeed "Binary". On the same line we can see that the template these inputs are being passed to is called "1LevelMVMixedResponsecc". The error message suggests that this sub-template is expecting the response type to be specified as "Binomial" rather than "Binary". After checking the content of this template in the zip file downloadable from http://www.bristol.ac.uk/cmm/research/m ... jr_missing, it appears that the most recent version does want "Binary" as the response type, so it is possible that this has changed since you originally downloaded the files. Could you please try replacing your copy of this file with the currently downloadable version and see whether you still get the same error?
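To illustrate the kind of check that appears to be failing, here is a rough sketch that scans an `invars` dictionary for response-type entries and reports any value outside a list of valid options. The option list is copied from the error message above; the helper name `find_invalid_response_types` and the `bin…` key pattern are assumptions made for illustration and are not Stat-JR's own validation code.

```python
import re

# Options reported as valid by the error message quoted above.
VALID_OPTIONS = ['Normal', 'Binomial', 'Ordered', 'Unordered']


def find_invalid_response_types(invars, valid_options=VALID_OPTIONS):
    """Return {key: value} for response-type inputs whose value is not valid.

    Hypothetical helper: assumes response-type keys look like 'bin1_2' or
    'bin3', as in the RunStatJR command quoted earlier in the thread.
    """
    pattern = re.compile(r'^bin\d+(_\d+)?$')
    return {
        key: value
        for key, value in sorted(invars.items())
        if pattern.match(key) and value not in valid_options
    }


# A few inputs taken from the command posted earlier in the thread.
invars = {
    'bin1_1': 'Normal',
    'bin1_2': 'Binary',
    'bin1_4': 'Binary',
    'burnin': '5000',
}
print(find_invalid_response_types(invars))
```

Passing `['Normal', 'Binary', 'Ordered', 'Unordered']` as `valid_options` instead would mimic a copy of the sub-template that accepts "Binary".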

Posted: **Fri Sep 07, 2018 10:45 am**

Hi Chris,

Unfortunately, I still get the same error message.

Posted: **Fri Sep 07, 2018 11:03 am**

Thanks for checking. Could you open the file "1LevelMVMixedResponsecc.py" and confirm that the first few input lines are as follows?

for i in range(0, len(y)):
    context['bin'+str(i+1)] = Text('Response Type for response ' + y[i] + ': ', ['Normal', 'Binary', 'Ordered', 'Unordered'])
    if(context['bin'+str(i+1)] == 'Ordered'):
        context['ncats'+y[i]] = Integer('Number of categories for ' + y[i] + ': ')
    if(context['bin'+str(i+1)] == 'Unordered'):
        context['ncats'+y[i]] = Integer('Number of categories for ' + y[i] + ': ')
        for j in range(0, context['ncats'+y[i]]-1):
            context['x'+str(i+1)+'_'+str(j)] = DataMatrix('Explanatory variables for category ' + str(j) + ': ', allow_cat = True)
    else:
        context['x'+str(i+1)] = DataMatrix('Explanatory variables for response ' + y[i] + ': ', allow_cat = True)

If this is the case could you try selecting the template from within the Stat-JR TREE interface and check that the questions it asks still correspond?

Posted: **Mon Sep 10, 2018 11:47 am**

The code is identical, but when I select the template (1LevelMVMixedResponsecc.py) in the TREE interface, the response type options for each response variable list "Binomial" rather than the "Binary" found in the n-level imputation template. This might not matter, but it's the only difference I can find so far.

Posted: **Mon Sep 10, 2018 12:18 pm**

That is odd, as the input text should come directly from the template. Could you please have a look on your machine to see whether you have a second copy of the template that might be interfering (for example in the C:\Stat-JR\templates directory)?
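If searching by hand is awkward, something along the following lines could list every copy of the template under a set of folders and show which spelling of the response type each copy contains. This is only a sketch: the function names are hypothetical, and the folders to search are assumptions that will depend on where Stat-JR and the downloaded templates were unpacked.

```python
from pathlib import Path


def find_template_copies(roots, filename='1LevelMVMixedResponsecc.py'):
    """Return every path under the given root folders with a matching file name."""
    matches = []
    for root in roots:
        root = Path(root)
        if root.is_dir():  # silently skip folders that do not exist
            matches.extend(sorted(root.rglob(filename)))
    return matches


def option_spelling(path):
    """Report which response-type spellings appear as quoted strings in a copy."""
    text = Path(path).read_text(encoding='utf-8', errors='replace')
    return [name for name in ('Binary', 'Binomial') if f"'{name}'" in text]
```

For example, `find_template_copies([r'C:\Stat-JR\templates'])` returns the copies under that directory, and calling `option_spelling` on each result shows whether that copy offers 'Binary' or 'Binomial'. More than one match, or a copy offering 'Binomial', would point to a stale file being picked up.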

Posted: **Mon Sep 10, 2018 2:30 pm**

That's definitely helped, thanks. I've now uninstalled Stat-JR, deleted all the files, and reinstalled it along with the missing data templates from a fresh download. This seems to have fixed the first issue, the invalid value error for "Binary", and the template now runs a four-level imputation of my data, producing plausible results. However, once school-level variables are included, the template finishes running the algorithm but still doesn't seem to provide the imputation and complete case analysis results. Any ideas?

Posted: **Mon Sep 10, 2018 3:04 pm**

Could you check the output window again to see whether there are any error/warning messages given? It may be that one of the later result combining templates has failed to run. Are you able to provide a rough list of the outputs that do appear (they should be similar to those referred to for 2LevelImpute in the "Inspecting the results" section of http://www.bristol.ac.uk/cmm/media/soft ... statjr.pdf)?

Binary covariate and run time errors using the nlevel imputation template