Dear Forum users,
My data have 2 levels: individuals (level 1) are nested within households (level 2; the identifier is household_id, which ranges from 1 to several tens of thousands).
One of the response variables in my imputation model is a level 2 variable (household income), which should have a constant value within a household.
Household income in the observed data is indeed constant within a household, and if it is missing, it is missing for all household members.
However, I found that the imputed values varied across individuals within a household.
Is there a way to impute a constant value across individuals in the same household?
I prepared the data with Stata's 'realcomimpute'. Response variables were ordered with level 1 first, followed by level 2 (as instructed).
Predictor variables included continuous and categorical variables at both level 1 and level 2.
I would highly appreciate your input.
Thank you in advance for your time.
Imputation to level 2 variable: vary within group?
-
- Posts: 1371
- Joined: Mon Oct 19, 2009 10:34 am
Re: Imputation to level 2 variable: vary within group?
Realcom-Impute works out which responses are at each level by checking whether they vary within the provided level identifier. You could check that the file generated by the Stata 'realcomimpute' command is correct by comparing it with the example on page 6 of the Realcom-Impute guide. The level identifier column should follow directly after the columns that contain the response variables. You should also be able to check which responses are being used at each level by looking at the equation within Realcom-Impute as well as the output in the command window. If these don't look correct, I would suggest checking the data again to ensure that the values really do not vary within each group. It is probably worth doing this with the files generated by the Stata command in case an error has been introduced when the files were exported.
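For anyone wanting to run that within-group check outside Realcom-Impute, here is a minimal Python sketch. It is not the Realcom-Impute code, only an illustration of the same idea: a column counts as level 2 if it never takes more than one value within a group. The function name, the toy rows and the column order are all made up for the example.

```python
from collections import defaultdict

def columns_varying_within_group(rows, id_index):
    """Return sorted indices of columns with more than one distinct value inside any group."""
    # group id -> column index -> set of distinct values seen in that group
    seen = defaultdict(lambda: defaultdict(set))
    for row in rows:
        gid = row[id_index]
        for j, value in enumerate(row):
            if j != id_index:
                seen[gid][j].add(value)
    varying = set()
    for cols in seen.values():
        for j, values in cols.items():
            if len(values) > 1:
                varying.add(j)
    return sorted(varying)

# Hypothetical toy data: age (level 1), household income (level 2), household_id.
rows = [
    (34, 52000, 1),
    (36, 52000, 1),
    (61, 31000, 2),
    (59, 31500, 2),  # income differs within household 2, so column 1 is flagged
]
print(columns_varying_within_group(rows, id_index=2))  # [0, 1]
```

If a variable that is meant to be level 2 shows up in the returned list, it would be treated as varying within the identifier.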
Re: Imputation to level 2 variable: vary within group?
Dear Chris,
Thank you for your swift and helpful reply. I was not aware of the updated manual, so thank you for referring to it.
I have checked the Stata code for realcomimpute and the .txt file generated by it. Both appeared correct. Also, when household income was observed, it was constant within households.
I ran 2 imputations (different cross-sectional survey waves, at year xxxx and yyyy, so I imputed them separately). In one of the imputations, the imputed level 2 values were correctly constant within households. In the other imputation, they were inconsistent within households.
The only difference between the two datafiles was that in the datafile with the failed imputation, the household IDs were large, ranging from 80,000 to 120,000, whereas in the other one they remained below 100,000. When I changed the household IDs in the dataset with the failed imputation to range from 0 to 50,000, the imputed household income became consistent within each household.
Thank you for your time, once again.
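The renumbering workaround described in this post can be sketched in Python as follows. This is only an illustration under assumptions: the function name is hypothetical, the IDs are toy values, and the new IDs start at 1 rather than 0. The mapping is kept so that imputed values can be merged back onto the original household IDs afterwards.

```python
def renumber_ids(ids):
    """Map each distinct ID to a consecutive integer starting at 1, in order of first appearance."""
    mapping = {}
    new_ids = []
    for old in ids:
        if old not in mapping:
            mapping[old] = len(mapping) + 1
        new_ids.append(mapping[old])
    return new_ids, mapping

# Hypothetical household IDs in the problematic range.
household_id = [80001, 80001, 119999, 80002, 119999]
new_ids, mapping = renumber_ids(household_id)
print(new_ids)  # [1, 1, 2, 3, 2]
```

Members of the same household still share one ID after renumbering, so the grouping structure is unchanged.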
Re: Imputation to level 2 variable: vary within group?
Thank you for confirming the cause of the error. I am curious as to where this apparent truncation is taking place. Would you be able to check the household id column in the file exported from Stata to see whether these high-valued IDs match the original data there? The file is loaded into Realcom-Impute using the Matlab importdata function so they should be read with enough precision to hold the whole record.
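One way to do that comparison without eyeballing the file is sketched below in Python. The layout is an assumption (whitespace-delimited rows, no header lines, ID column position known), and the function name and toy values are hypothetical. IDs are compared as text so that a rounded or reformatted value is flagged as well.

```python
import io

def mismatched_ids(exported_lines, original_ids, id_index):
    """Return (row number, exported token, original id) for rows whose ID text differs."""
    mismatches = []
    for n, (line, original) in enumerate(zip(exported_lines, original_ids), start=1):
        token = line.split()[id_index]
        # Text comparison: a value like 1.2e+05 is reported even though it looks "close".
        if token != str(original):
            mismatches.append((n, token, original))
    return mismatches

# Hypothetical exported rows; the third ID has been written in a lossy format.
exported = io.StringIO("34 52000 80001\n36 52000 80001\n61 31000 1.2e+05\n")
original_ids = [80001, 80001, 120004]
print(mismatched_ids(exported, original_ids, id_index=2))
# [(3, '1.2e+05', 120004)]
```

An empty list would mean the exported ID column matches the original data row for row.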
Re: Imputation to level 2 variable: vary within group?
Dear Chris,
I am sorry for taking so long to reply.
I have checked the .txt file generated by Stata, and the high-value IDs appeared intact. Just to be sure (and out of curiosity), I imported the .txt file into Excel, and those high-value IDs remained correct in Excel too.
Thank you for thinking about this.
Re: Imputation to level 2 variable: vary within group?
Thank you very much for confirming this. It does sound like the problem is somewhere within the Matlab code, then. The Matlab documentation I linked previously suggests that the data read should remain in double precision. However, this documentation is for Matlab 2021a, whereas Realcom runs under the 2012b runtime, so it could be that the function has changed behaviour over the years.
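As a side note on precision, integers are stored exactly in double precision up to 2^53 and in single precision up to 2^24, so IDs in the range 80,000 to 120,000 fit in either. The short Python check below illustrates this general floating-point fact. It says nothing about what the Matlab code in Realcom-Impute actually does, and the helper name is hypothetical.

```python
import struct

def survives_single_precision(value):
    """True if an integer is unchanged after a round trip through a 32-bit float."""
    return struct.unpack("f", struct.pack("f", float(value)))[0] == float(value)

# Every ID in the reported range is exactly representable even in single precision.
print(all(survives_single_precision(i) for i in range(80_000, 120_001)))  # True

# Exactness is only lost above 2**24 for single precision ...
print(survives_single_precision(2**24 + 1))  # False

# ... and above 2**53 for double precision (Python floats are doubles).
print(float(2**53) == float(2**53 + 1))  # True
```

So if the IDs are being altered somewhere, a step other than plain single or double storage, such as a text formatting or parsing step, may be worth looking at, though that is a guess.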