MCMC is not taking the same starting values but for different datasets

ChrisCharlton
Posts: 1351
Joined: Mon Oct 19, 2009 10:34 am

Re: MCMC is not taking the same starting values but for different datasets

Post by ChrisCharlton »

When you supply starting values, what should happen is that before the MCMC run starts the model is run for two iterations with IGLS, which should set up the constraints and other aspects of the model correctly. In your case the constraints column is being left empty (you'd have to run the model with IGLS to see whether there are any informative messages as to why). Because the column is empty, the code for creating the starting residuals decides that it is usable and therefore instructs MLwiN to save the residuals there. During the residuals calculation the constraints column is recreated in the column allocated to it, which is why you get both sets of values in the output column.

I believe that as long as your ID structure is set up correctly you should be able to turn on cross-classified in your model specification, in which case the starting residuals will not be specified and you shouldn't run into the problem that you are seeing.
adeldaoud
Posts: 63
Joined: Sat Aug 15, 2015 4:00 pm

Re: MCMC is not taking the same starting values but for different datasets

Post by adeldaoud »

1) I am re-running the model in the development version currently. I will come back as soon as I have some new results.
The model now runs without errors on the large data set, which is great. However, both the random and the fixed part of the model show abnormally large discrepancies when I compare an IGLS model with an MCMC model (with and without the IGLS starting values).

This is the IGLS estimation:
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MLwiN (version: unknown or >2.32)  multilevel model (Binomial)
                          N  min      mean  max
country                  73   NA        NA   NA
CountryClusterHouse  567344    1  3.422499  303
Estimation algorithm:  IGLS MQL1        Elapsed time : 92.82s
Number of obs:  1941734 (from total 1941734)        The model converged after 5 iterations.
Log likelihood:      NA
Deviance statistic:  NA
---------------------------------------------------------------------------------------------------
The model formula:
logit(AbsolutDep, cons) ~ 1 + (1 | country) + (1 | CountryClusterHouse)
Level 3: country     Level 2: CountryClusterHouse     Level 1: l1id
---------------------------------------------------------------------------------------------------
The fixed part estimates:
              Coef.  Std. Err.      z   Pr(>|z|)       [95% Conf.  Interval]
Intercept  -0.49115    0.12858  -3.82  0.0001336  ***    -0.74316   -0.23914
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
---------------------------------------------------------------------------------------------------
The random part estimates at the country level:
                  Coef.  Std. Err.
var_Intercept   1.10715    0.19139
---------------------------------------------------------------------------------------------------
The random part estimates at the CountryClusterHouse level:
                  Coef.  Std. Err.
var_Intercept   1.04919    0.00459
---------------------------------------------------------------------------------------------------
The random part estimates at the l1id level:
                Coef.  Std. Err.
var_bcons_1   1.00000    0.00000
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-

And this is the MCMC estimation without IGLS starting values:

-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MLwiN (version: unknown or >2.32)  multilevel model (Binomial)
                          N  min      mean  max
country                  73   NA        NA   NA
CountryClusterHouse  567344    1  3.422499  303
Estimation algorithm:  MCMC        Elapsed time : 20214.35s
Number of obs:  1941734 (from total 1941734)        Number of iter.: 15000  Burn-in: 5000
Bayesian Deviance Information Criterion (DIC)
       Dbar  D(thetabar)          pD          DIC
1094108.250   830068.562  264039.656  1358147.875
---------------------------------------------------------------------------------------------------
The model formula:
logit(AbsolutDep, cons) ~ 1 + (1 | country) + (1 | CountryClusterHouse)
Level 3: country     Level 2: CountryClusterHouse     Level 1: l1id
---------------------------------------------------------------------------------------------------
The fixed part estimates:
              Coef.  Std. Err.      z   Pr(>|z|)       [95% Cred.  Interval]    ESS
Intercept  -2.00636    0.46638  -4.30  1.694e-05  ***    -2.92470   -1.09370  15000
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
---------------------------------------------------------------------------------------------------
The random part estimates at the country level:
                  Coef.  Std. Err.  [95% Cred.  Interval]    ESS
var_Intercept  14.68141    2.62572    10.36612   20.65601  12218
---------------------------------------------------------------------------------------------------
The random part estimates at the CountryClusterHouse level:
                  Coef.  Std. Err.  [95% Cred.  Interval]    ESS
var_Intercept  11.51579    0.06701    11.38947   11.64678    192
---------------------------------------------------------------------------------------------------
The random part estimates at the l1id level:
                Coef.  Std. Err.  [95% Cred.  Interval]    ESS
var_bcons_1   1.00000    0.00000     1.00000    1.00000  15000
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-


And this is the MCMC estimation with the IGLS starting values (from the above output):

-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MLwiN (version: 2.35)  multilevel model (Binomial)
                          N   min          mean     max
country                  67  5813  28981.104478  198294
CountryClusterHouse  567344     1      3.422499     303
Estimation algorithm:  MCMC        Elapsed time : 2245.3s
Number of obs:  1941734 (from total 1941734)        Number of iter.: 10  Chains: 1  Burn-in: 10
Bayesian Deviance Information Criterion (DIC)
       Dbar  D(thetabar)          pD          DIC
1095078.750   922104.438  172974.328  1268053.125
---------------------------------------------------------------------------------------------------
The model formula:
logit(AbsolutDep, cons) ~ 1 + (1 | country) + (1 | CountryClusterHouse)
Level 3: country     Level 2: CountryClusterHouse     Level 1: l1id
---------------------------------------------------------------------------------------------------
The fixed part estimates:
              Coef.  Std. Err.      z   Pr(>|z|)       [95% Cred.  Interval]    ESS
Intercept  -1.93446    0.46992  -4.12  3.846e-05  ***    -2.50058   -1.17906     10
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
---------------------------------------------------------------------------------------------------
The random part estimates at the country level:
                  Coef.  Std. Err.  [95% Cred.  Interval]    ESS
var_Intercept  15.55423    2.55589    12.30339   19.53754     10
---------------------------------------------------------------------------------------------------
The random part estimates at the CountryClusterHouse level:
                  Coef.  Std. Err.  [95% Cred.  Interval]    ESS
var_Intercept  11.37617    0.02287    11.33832   11.40832      3
---------------------------------------------------------------------------------------------------
The random part estimates at the l1id level:
                Coef.  Std. Err.  [95% Cred.  Interval]    ESS
var_bcons_1   0.99990    0.00032     0.99922    1.00000     10
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-

NB: although this is only a 10 iteration model, it uses the IGLS starting values and the numbers are quite similar to the MCMC model without starting values.

(I am running this model with 15 000 iterations currently. That will take 9h +…)


What do you make of this? I am using the same data in every run, and it is a random intercept model.
adeldaoud
Posts: 63
Joined: Sat Aug 15, 2015 4:00 pm

Re: MCMC is not taking the same starting values but for different datasets

Post by adeldaoud »

When you supply starting values, what should happen is that before the MCMC run starts the model is run for two iterations with IGLS, which should set up the constraints and other aspects of the model correctly. In your case the constraints column is being left empty (you'd have to run the model with IGLS to see whether there are any informative messages as to why). Because the column is empty, the code for creating the starting residuals decides that it is usable and therefore instructs MLwiN to save the residuals there. During the residuals calculation the constraints column is recreated in the column allocated to it, which is why you get both sets of values in the output column.
Thanks for the clarification, Chris. So this will happen depending on the data set used, right? I was (and still am) puzzled about why the model runs with a subset of the data but not with the full data set. I guess some of my higher-level residuals are left empty for some reason.

I believe that as long as your ID structure is set up correctly you should be able to turn on cross-classified in your model specification, in which case the starting residuals will not be specified and you shouldn't run into the problem that you are seeing.
I am running a binary hierarchical model, not a cross-classified one. Do you mean I should turn the cross-classified option on even in the hierarchical case to get rid of this problem? Sorry, I did not understand this point.
adeldaoud
Posts: 63
Joined: Sat Aug 15, 2015 4:00 pm

Re: MCMC is not taking the same starting values but for different datasets

Post by adeldaoud »

NB: although this is only a 10 iteration model, it uses the IGLS starting values and the numbers are quite similar to the MCMC model without starting values.

(I am running this model with 15 000 iterations currently. That will take 9h +…)
So, even when I run the model for 15 000 iterations, the differences between the IGLS estimates and the MCMC estimates obtained with the IGLS starting values are quite large:



-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MLwiN (version: 2.35)  multilevel model (Binomial)
                          N   min          mean     max
country                  67  5813  28981.104478  198294
CountryClusterHouse  567344     1      3.422499     303
Estimation algorithm:  MCMC        Elapsed time : 16191.64s
Number of obs:  1941734 (from total 1941734)        Number of iter.: 15000  Chains: 1  Burn-in: 500
Bayesian Deviance Information Criterion (DIC)
       Dbar  D(thetabar)          pD          DIC
1094031.875   830008.125  264023.750  1358055.625
---------------------------------------------------------------------------------------------------
The model formula:
logit(AbsolutDep, cons) ~ 1 + (1 | country) + (1 | CountryClusterHouse)
Level 3: country     Level 2: CountryClusterHouse     Level 1: l1id
---------------------------------------------------------------------------------------------------
The fixed part estimates:
              Coef.  Std. Err.      z   Pr(>|z|)       [95% Cred.  Interval]    ESS
Intercept  -2.00245    0.47029  -4.26  2.063e-05  ***    -2.93054   -1.08307  13645
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
---------------------------------------------------------------------------------------------------
The random part estimates at the country level:
                  Coef.  Std. Err.  [95% Cred.  Interval]    ESS
var_Intercept  14.67061    2.61253    10.40662   20.67729  13304
---------------------------------------------------------------------------------------------------
The random part estimates at the CountryClusterHouse level:
                  Coef.  Std. Err.  [95% Cred.  Interval]    ESS
var_Intercept  11.52211    0.06693    11.39604   11.65320    228
---------------------------------------------------------------------------------------------------
The random part estimates at the l1id level:
                Coef.  Std. Err.  [95% Cred.  Interval]    ESS
var_bcons_1   1.00000      1e-05     1.00000    1.00000  15000
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-


The random part differences are quite large: the MCMC variance estimate is larger than the IGLS one by a factor of about 13 at the country level and about 11 at the household level.

How can this be? Which model should one trust?
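
For reference, the size of these discrepancies can be checked directly from the estimates printed above. The sketch below is plain Python rather than R2MLwiN code, and the dictionary and function names are made up for illustration. It computes the ratio of the MCMC to the IGLS variance estimates, and converts each intercept to the probability it implies for a unit whose random effects are both zero.

```python
import math

# Point estimates copied from the IGLS (MQL1) output and the
# 15 000-iteration MCMC output with IGLS starting values.
igls = {"intercept": -0.49115, "var_country": 1.10715, "var_household": 1.04919}
mcmc = {"intercept": -2.00245, "var_country": 14.67061, "var_household": 11.52211}


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Ratio of MCMC to IGLS variance estimates at each higher level.
ratio_country = mcmc["var_country"] / igls["var_country"]
ratio_household = mcmc["var_household"] / igls["var_household"]

# Probability implied by each intercept when both random effects are zero.
p_igls = inv_logit(igls["intercept"])
p_mcmc = inv_logit(mcmc["intercept"])

print(f"country variance ratio:   {ratio_country:.2f}")
print(f"household variance ratio: {ratio_household:.2f}")
print(f"baseline probability, IGLS: {p_igls:.3f}")
print(f"baseline probability, MCMC: {p_mcmc:.3f}")
```

On these numbers the variance ratios come out at roughly 13 (country) and 11 (household), and the implied baseline probability drops from about 0.38 under IGLS to about 0.12 under MCMC.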
ChrisCharlton
Posts: 1351
Joined: Mon Oct 19, 2009 10:34 am

Re: MCMC is not taking the same starting values but for different datasets

Post by ChrisCharlton »

The quasi-likelihood methods used for discrete models within MLwiN are known to be biased (see http://www.bristol.ac.uk/cmm/software/s ... entresults), so it's possible that this is the reason that you are seeing discrepancies. I would suggest that you try some of the other quasi-likelihood options available for IGLS (e.g. MQL2, PQL1, PQL2) and see whether their results are closer to those given by MCMC. I would also suggest visually inspecting your MCMC chains to ensure that the whole chain has converged to a distribution.
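
As a rough complement to looking at the trace plots, the effective sample size (ESS) gives a feel for how much independent information a chain carries; the household-level variance above has an ESS of only 228 from 15 000 iterations. The sketch below is plain Python on synthetic chains, not MLwiN's own ESS calculation (which may differ), and all function names are hypothetical. It estimates ESS as n / (1 + 2 × sum of autocorrelations), truncating the sum at the first non-positive autocorrelation.

```python
import random


def autocorr(chain, lag):
    """Sample autocorrelation of a chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var


def effective_sample_size(chain, max_lag=200):
    """n / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(chain)
    total = 0.0
    for lag in range(1, min(max_lag, n - 1) + 1):
        rho = autocorr(chain, lag)
        if rho <= 0:  # stop once the autocorrelation is no longer positive
            break
        total += rho
    return n / (1.0 + 2.0 * total)


def ar1_chain(n, phi, seed):
    """Synthetic chain x[t] = phi * x[t-1] + noise, standing in for MCMC output."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


well_mixed = ar1_chain(2000, 0.0, seed=1)  # independent draws
sticky = ar1_chain(2000, 0.95, seed=1)     # strongly autocorrelated draws

ess_well = effective_sample_size(well_mixed)
ess_sticky = effective_sample_size(sticky)
print(f"ESS, well-mixed chain: {ess_well:.0f}")
print(f"ESS, sticky chain:     {ess_sticky:.0f}")
```

A low ESS does not by itself mean the chain has failed to converge, but it does suggest that parameter would benefit from a longer run before its estimate is compared with the IGLS one.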

I am also puzzled as to why you were getting different behaviour with your subset than with the full data. MLwiN will be doing the same thing for both, so all I can think of is that an error occurs somewhere with the full data that prevents the constraints column from being created correctly.

My understanding is that, as long as ID numbers are not shared between higher-level units, a hierarchical model is a special case of a cross-classified one. The reason I suggested that you turn it on is that when R2MLwiN fits cross-classified models it does not attempt to use starting residuals from IGLS, as these would often not make sense. This would have caused it to skip the bit of code that was causing the error that you were seeing. As you now appear to have it running correctly, however, making this change should not be necessary (unless you want to compare results between the two).
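
The condition that ID numbers are not shared between higher-level units can be checked on the data before fitting. The following is a minimal sketch in plain Python, not part of R2MLwiN. The column names mirror the model in this thread, but the toy records and the `shared_ids` helper are invented for illustration. It lists every lower-level ID that appears under more than one higher-level unit.

```python
from collections import defaultdict


def shared_ids(rows, lower, higher):
    """Return lower-level IDs that appear under more than one higher-level unit."""
    parents = defaultdict(set)
    for row in rows:
        parents[row[lower]].add(row[higher])
    return {k: sorted(v) for k, v in parents.items() if len(v) > 1}


# Hypothetical toy records: household IDs are unique across countries.
nested = [
    {"country": 1, "CountryClusterHouse": 101},
    {"country": 1, "CountryClusterHouse": 102},
    {"country": 2, "CountryClusterHouse": 201},
]

# Same records plus one row that reuses household ID 101 in another country.
reused = nested + [{"country": 2, "CountryClusterHouse": 101}]

print(shared_ids(nested, "CountryClusterHouse", "country"))  # empty: strictly nested
print(shared_ids(reused, "CountryClusterHouse", "country"))  # ID 101 is shared
```

An empty result means the lower-level IDs are strictly nested, which is the situation in which treating the hierarchical model as cross-classified should be harmless.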