Fitting survival models using R2MLwiN

Posted: Wed Feb 15, 2017 6:43 pm
by dimension3
Hi there,

Is it possible to fit survival models using R2MLwiN? I haven't found any examples.

Many thanks in advance and kind regards,

Benjamin

Re: Fitting survival models using R2MLwiN

Posted: Fri Feb 17, 2017 10:13 am
by ChrisCharlton
There is no specific option for this; however, as described in Section 11.11 of the book "Multilevel Statistical Models" by Harvey Goldstein, you can fit these models by specifying a zero/one response (i.e. whether the unit survives the interval or not) at level 1 and then putting the actual level-1 units at level 2, and so on.
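A minimal sketch of this person-period expansion on hypothetical toy data (not from the original post). The glm() call at the end stands in for the equivalent multilevel runMLwiN(..., D = "Binomial") call, which requires an MLwiN installation:

```r
# Hypothetical person-level data: survival time in discrete periods plus an
# event indicator (1 = failed, 0 = censored).
persons <- data.frame(id = 1:4, time = c(2, 3, 1, 3), event = c(1, 0, 1, 1))

# Expand to person-period format: one 0/1 record per period at risk, with the
# response equal to 1 only in the final period of a unit that fails.
pp <- do.call(rbind, lapply(seq_len(nrow(persons)), function(i) {
  with(persons[i, ], data.frame(id = id,
                                period = seq_len(time),
                                y = c(rep(0, time - 1), event)))
}))

# Single-level stand-in: in R2MLwiN the 0/1 records would sit at level 1 and
# the persons (id) at level 2, e.g. y ~ 1 + factor(period) + (1 | id).
fit <- glm(y ~ factor(period), family = binomial, data = pp)
```

The number of expanded records equals the total time at risk, and the response sums to the number of events.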

Re: Fitting survival models using R2MLwiN

Posted: Wed Apr 26, 2017 12:57 pm
by dimension3
For the sake of completeness, here is my solution. As shown by Ma et al. (2003) and Feng et al. (2005), the likelihood function of Cox models with normal random effects is proportional to the likelihood of the corresponding random-effects Poisson models. Specifically, Cox models with normal random effects can be estimated as generalized linear models with a binary (0/1) response treated as a Poisson count and a specific offset. The approach requires each observation in the data to be split into multiple records based on the complete set of failure times in the dataset (counting-process format), with the offset equal to the logarithm of the length of each time interval. The baseline hazard is modelled as a smooth function of time, in our case a fourth-order polynomial as suggested by Rabe-Hesketh and Skrondal (2012: 797-862). See also Elghafghuf, A., Stryhn, H., and Waldner, C. (2014), who apply such a model.

In R, this boils down to:

We identify each time point at which someone fails and then split each observation into multiple records representing the intervals between the unique failure times

Code:

library(survival)
failure_times <- sort(unique(dat$survt[dat$status==1])) # unique failure times
dat <- survSplit(dat, cut = failure_times, event = "status", start = "t0", end = "survt") # split data
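As a concrete check, here is what this split does on hypothetical toy data (the data and values below are illustrative, not from the original post); the survSplit() call mirrors the one above:

```r
library(survival)

# Hypothetical data: two failures (at t = 5 and t = 8) and one censored unit.
toy <- data.frame(id = 1:3, survt = c(5, 8, 8), status = c(1, 1, 0))
failure_times <- sort(unique(toy$survt[toy$status == 1]))  # 5, 8

toy2 <- survSplit(toy, cut = failure_times, event = "status",
                  start = "t0", end = "survt")
# id 1 contributes one record, (0,5]; ids 2 and 3 contribute two records each,
# (0,5] and (5,8]; status is 1 only in the interval where the unit fails.
```

Each unit ends up with one record per failure-time interval it was at risk in, and the event indicator still sums to the original number of failures.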
We include the log of the interval length in the model as the offset

Code:

dat <- within(dat, logexposure <- log(survt-t0)) # log (interval size)
And model time as orthogonal polynomials (or splines...)

Code:

bs <- poly(dat$survt, 4) # orthogonal polynomials of time (degree = 4)
dat$bs1 <- bs[,1]
dat$bs2 <- bs[,2]
dat$bs3 <- bs[,3]
dat$bs4 <- bs[,4]
rm(bs, failure_times)
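Since splines are mentioned as an alternative, here is a minimal sketch using natural cubic splines from the base-R splines package (the data and the choice of df = 4 are illustrative assumptions, not from the original post):

```r
library(splines)

# Hypothetical interval end times; in the real data this would be dat$survt.
survt <- c(1, 2, 3, 5, 8, 13)
sp <- ns(survt, df = 4)  # natural cubic spline basis with 4 columns

# The four basis columns would be added to the data and entered in the fixed
# part exactly like bs1-bs4 above.
```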
Then we convert the data to single precision (as used by MLwiN) and fit the model

Code:

library(R2MLwiN)
dat <- double2singlePrecision(dat) # MLwiN stores data in single precision
# lp (the fixed-part formula as a string), iter and seed are defined elsewhere
mmml <- runMLwiN(as.formula(paste(lp, "+ (1 | c) + (1 | p)")),
                 D = "Poisson", data = dat,
                 estoptions = list(EstM = 1,
                                   mcmcMeth = list(burnin = iter/10, nchains = 1,
                                                   iterations = iter, thinning = 1,
                                                   seed = seed),
                                   optimat = TRUE, debugmode = FALSE,
                                   mm = list(NA,
                                             list(mmvar = list("p", "p2", "p3", "p4"),
                                                  weights = list("w1", "w2", "w3", "w4")),
                                             NA)))
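Not part of the original post, but one way to sanity-check the offset specification before the MCMC run is to fit the fixed part as a single-level Poisson GLM; glm() accepts the same log-exposure offset (the toy data below are hypothetical):

```r
# Hypothetical records in the counting-process layout: event indicator,
# log interval length (the offset), and one covariate. In the real data the
# offset would be dat$logexposure.
toy <- data.frame(status      = c(0, 1, 0, 0, 1),
                  logexposure = log(c(5, 3, 5, 3, 2)),
                  x           = c(0, 0, 1, 1, 1))
fit0 <- glm(status ~ x + offset(logexposure), family = poisson, data = toy)
# coef(fit0)["x"] approximates the log hazard ratio for x before any random
# effects are added.
```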
In Stata, the equivalent code looks like this:

Code:

stset dur, failure(event) id(gid)
drop if _st==0 // drop records that cannot be used

// stset: split data //
stsplit, at(failure) // counting-process format
gen lexposure = ln(_t-_t0) // log (interval sizes)
orthpoly _t, gen(t1-t4) degree(4) // smooth baseline hazard

// MLwiN //
global MLwiN_path C:\Program Files (x86)\MLwiN v2.36\x64\mlwin.exe
recast float t1-t4, force
gen cons = 1
sort cid pid gid

runmlwin _d t1-t4 X cons, ///
    level3(cid: cons) ///
    level2(pid: cons, mmids(p-p9) mmweights(w-w9)) ///
    level1(gid) ///
    mcmc(cc chain(5000) refresh(500) thin(10)) initsb(b) initsv(V) ///
    discrete(distribution(poisson) offset(lexposure)) nopause