1) I am currently re-running the model in the development version and will report back as soon as I have new results.
The model now runs without errors on the large data set, which is great. However, the estimates for both the fixed and the random part differ substantially between the IGLS fit and the MCMC fits (with and without the IGLS starting values).
This is the IGLS estimation:
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MLwiN (version: unknown or >2.32) multilevel model (Binomial)
N min mean max
country 73 NA NA NA
CountryClusterHouse 567344 1 3.422499 303
Estimation algorithm: IGLS MQL1 Elapsed time : 92.82s
Number of obs: 1941734 (from total 1941734) The model converged after 5 iterations.
Log likelihood: NA
Deviance statistic: NA
---------------------------------------------------------------------------------------------------
The model formula:
logit(AbsolutDep, cons) ~ 1 + (1 | country) + (1 | CountryClusterHouse)
Level 3: country Level 2: CountryClusterHouse Level 1: l1id
---------------------------------------------------------------------------------------------------
The fixed part estimates:
Coef. Std. Err. z Pr(>|z|) [95% Conf. Interval]
Intercept -0.49115 0.12858 -3.82 0.0001336 *** -0.74316 -0.23914
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
---------------------------------------------------------------------------------------------------
The random part estimates at the country level:
Coef. Std. Err.
var_Intercept 1.10715 0.19139
---------------------------------------------------------------------------------------------------
The random part estimates at the CountryClusterHouse level:
Coef. Std. Err.
var_Intercept 1.04919 0.00459
---------------------------------------------------------------------------------------------------
The random part estimates at the l1id level:
Coef. Std. Err.
var_bcons_1 1.00000 0.00000
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
And this is the MCMC estimation without IGLS starting values:
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MLwiN (version: unknown or >2.32) multilevel model (Binomial)
N min mean max
country 73 NA NA NA
CountryClusterHouse 567344 1 3.422499 303
Estimation algorithm: MCMC Elapsed time : 20214.35s
Number of obs: 1941734 (from total 1941734) Number of iter.: 15000 Burn-in: 5000
Bayesian Deviance Information Criterion (DIC)
Dbar D(thetabar) pD DIC
1094108.250 830068.562 264039.656 1358147.875
---------------------------------------------------------------------------------------------------
The model formula:
logit(AbsolutDep, cons) ~ 1 + (1 | country) + (1 | CountryClusterHouse)
Level 3: country Level 2: CountryClusterHouse Level 1: l1id
---------------------------------------------------------------------------------------------------
The fixed part estimates:
Coef. Std. Err. z Pr(>|z|) [95% Cred. Interval] ESS
Intercept -2.00636 0.46638 -4.30 1.694e-05 *** -2.92470 -1.09370 15000
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
---------------------------------------------------------------------------------------------------
The random part estimates at the country level:
Coef. Std. Err. [95% Cred. Interval] ESS
var_Intercept 14.68141 2.62572 10.36612 20.65601 12218
---------------------------------------------------------------------------------------------------
The random part estimates at the CountryClusterHouse level:
Coef. Std. Err. [95% Cred. Interval] ESS
var_Intercept 11.51579 0.06701 11.38947 11.64678 192
---------------------------------------------------------------------------------------------------
The random part estimates at the l1id level:
Coef. Std. Err. [95% Cred. Interval] ESS
var_bcons_1 1.00000 0.00000 1.00000 1.00000 15000
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
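As a quick sanity check on the DIC block above, the sketch below recomputes pD and DIC from Dbar and D(thetabar), using the usual definitions pD = Dbar - D(thetabar) and DIC = Dbar + pD. It is plain Python for illustration only (the model itself was fitted through MLwiN), the variable names are my own, and I am assuming the small differences in the last digits come from rounding in the printed output.

```python
# Values copied from the DIC line of the MCMC output above.
dbar = 1094108.250
d_thetabar = 830068.562
pd_reported = 264039.656
dic_reported = 1358147.875

# Usual definitions: pD = Dbar - D(thetabar), DIC = Dbar + pD.
pd_check = dbar - d_thetabar
dic_check = dbar + pd_check

# Number of CountryClusterHouse units, copied from the same output.
n_house_units = 567344
pd_per_house_unit = pd_check / n_house_units

print(f"pD  recomputed: {pd_check:.3f} (reported {pd_reported:.3f})")
print(f"DIC recomputed: {dic_check:.3f} (reported {dic_reported:.3f})")
print(f"pD per CountryClusterHouse unit: {pd_per_house_unit:.3f}")
```

The last line only puts pD next to the number of level-2 units for scale; it is not a formal diagnostic.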
And this is the MCMC estimation with the IGLS starting values (from the above output):
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
MLwiN (version: 2.35) multilevel model (Binomial)
N min mean max
country 67 5813 28981.104478 198294
CountryClusterHouse 567344 1 3.422499 303
Estimation algorithm: MCMC Elapsed time : 2245.3s
Number of obs: 1941734 (from total 1941734) Number of iter.: 10 Chains: 1 Burn-in: 10
Bayesian Deviance Information Criterion (DIC)
Dbar D(thetabar) pD DIC
1095078.750 922104.438 172974.328 1268053.125
---------------------------------------------------------------------------------------------------
The model formula:
logit(AbsolutDep, cons) ~ 1 + (1 | country) + (1 | CountryClusterHouse)
Level 3: country Level 2: CountryClusterHouse Level 1: l1id
---------------------------------------------------------------------------------------------------
The fixed part estimates:
Coef. Std. Err. z Pr(>|z|) [95% Cred. Interval] ESS
Intercept -1.93446 0.46992 -4.12 3.846e-05 *** -2.50058 -1.17906 10
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
---------------------------------------------------------------------------------------------------
The random part estimates at the country level:
Coef. Std. Err. [95% Cred. Interval] ESS
var_Intercept 15.55423 2.55589 12.30339 19.53754 10
---------------------------------------------------------------------------------------------------
The random part estimates at the CountryClusterHouse level:
Coef. Std. Err. [95% Cred. Interval] ESS
var_Intercept 11.37617 0.02287 11.33832 11.40832 3
---------------------------------------------------------------------------------------------------
The random part estimates at the l1id level:
Coef. Std. Err. [95% Cred. Interval] ESS
var_bcons_1 0.99990 0.00032 0.99922 1.00000 10
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
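The ESS column for the CountryClusterHouse variance is low in both MCMC outputs (192 from 15000 iterations, and 3 from 10). A common rule of thumb is that the Monte Carlo standard error of a posterior mean is roughly the posterior SD divided by the square root of the ESS. The Python sketch below applies that approximation to the numbers printed above; the helper name `mcse` is hypothetical and not part of MLwiN or R2MLwiN.

```python
import math

def mcse(posterior_sd, ess):
    """Approximate Monte Carlo standard error: posterior SD / sqrt(ESS)."""
    return posterior_sd / math.sqrt(ess)

# (label, posterior SD, ESS, stored iterations) for the CountryClusterHouse
# variance, copied from the two MCMC outputs above.
house_variance_runs = [
    ("MCMC without IGLS starting values", 0.06701, 192, 15000),
    ("MCMC with IGLS starting values", 0.02287, 3, 10),
]

for label, sd, ess, n_iter in house_variance_runs:
    ratio = ess / n_iter
    print(f"{label}: ESS/iterations = {ratio:.4f}, MCSE ~ {mcse(sd, ess):.5f}")
```

With an ESS of 3 the second figure is only indicative; a 10-iteration chain is far too short for this approximation to mean much.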
NB: although this run has only 10 iterations (after a burn-in of 10), it uses the IGLS starting values, and its estimates are already close to those of the MCMC model without starting values.
(I am currently running this model with 15,000 iterations. That will take 9h +…)
What do you make of this? I am using the same data, and it is a random intercept model.
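To put the size of the discrepancy in one place, here is a small Python sketch that lines up the point estimates from the three outputs above. It reports each variance as a multiple of the IGLS MQL1 value and converts the intercept to a probability with the inverse logit. The dictionary keys and the helper `inv_logit` are my own naming, and the probability is the one implied for a unit whose random effects are both zero, not a population average.

```python
import math

def inv_logit(x):
    """Inverse of the logit link: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Point estimates copied from the three outputs above.
fits = {
    "IGLS MQL1": {
        "intercept": -0.49115, "var_country": 1.10715, "var_house": 1.04919},
    "MCMC, no IGLS start (15000 it.)": {
        "intercept": -2.00636, "var_country": 14.68141, "var_house": 11.51579},
    "MCMC, IGLS start (10 it.)": {
        "intercept": -1.93446, "var_country": 15.55423, "var_house": 11.37617},
}

base = fits["IGLS MQL1"]
summary = {}
for name, est in fits.items():
    summary[name] = {
        # Probability implied by the intercept when both random effects are 0.
        "p_at_zero_effects": inv_logit(est["intercept"]),
        # Variances expressed as multiples of the IGLS MQL1 estimates.
        "country_ratio": est["var_country"] / base["var_country"],
        "house_ratio": est["var_house"] / base["var_house"],
    }

for name, row in summary.items():
    print(f"{name}: p = {row['p_at_zero_effects']:.3f}, "
          f"country var x{row['country_ratio']:.1f}, "
          f"house var x{row['house_ratio']:.1f}")
```

Both MCMC runs put the two variances at more than ten times the IGLS values, which is the gap I am asking about.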