Errors when using smcfcs to impute missing covariates in competing risk setting

Alexandra Vegelien

Jun 4, 2020, 5:38:16 AM

to Missing Data

Dear all,

The substantive model compatible version of fully conditional specification imputation (smcfcs) seems very promising for my situation. I have tried implementing it, but I have run into the following error and have run out of ideas for fixing it.

"Error in agsurv(y[indx, , drop = F], x[indx, , drop = F], wt[indx], risk[indx], :
  NA/NaN/Inf in foreign function call (arg 4)
In addition: Warning messages:
1: In fitter(X, Y, istrat, offset, init, control, weights = weights, :
  Loglik converged before variable 1 ; coefficient may be infinite.
2: In fitter(X, Y, istrat, offset, init, control, weights = weights, :
  Loglik converged before variable 1 ; coefficient may be infinite.
3: In fitter(X, Y, istrat, offset, init, control, weights = weights, :
  Loglik converged before variable 1 ; coefficient may be infinite."

Some background information on the dataset: I am working on a dataset of leukemia patients with 60 variables, including clinical variables (such as the type of mutation) and patient characteristics (such as age and sex). The goal is to investigate the effect of these variables on the risk of relapse, where the competing risk is death due to the treatment. Possible issues are that many variables are categorical and that some variables are treated as normal although they are actually bounded (percentages).

In the meantime, I have implemented the Resche-Rigon method described by Jonathan Bartlett and Jeremy Taylor. I understood this as including, for every competing risk, the value of its cumulative hazard function at the subject's event time. So this adds as many variables as there are competing risks. Doesn't this create a bias when the type of event is censoring? I am not aware of any imputation method for the competing risks setting other than these two.

I would be very grateful for any help!

Kind regards,

Alexandra

Jonathan Bartlett

Jun 8, 2020, 9:17:25 AM

to Missing Data

Hi Alexandra

The warnings about infinite coefficients suggest that you probably have some 'perfect prediction' between some of your covariates, i.e. two factor variables which perfectly predict each other. smcfcs doesn't yet handle such situations (unlike mice, for example). What I would suggest is to get the imputation working with just a small number of predictors, ensuring they are not perfectly correlated with each other. Then you can gradually build it up to a more complex imputation model, again ensuring that you don't add a new covariate that perfectly predicts one of those you have already included.
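One way to screen a pair of factor covariates before adding them is to cross-tabulate them and look for rows with a single non-empty cell. The following is only a minimal sketch in base R: the data frame, the column names (mutation, subtype, sex) and the helper determined_levels() are hypothetical and not part of smcfcs or mice.

```r
# Hypothetical toy data; the column names are illustrative only.
dat <- data.frame(
  mutation = factor(c("A", "A", "B", "B", "C", "C")),
  subtype  = factor(c("x", "x", "y", "y", "y", "x")),
  sex      = factor(c("f", "m", "f", "m", "f", "m"))
)

# Hypothetical helper: return the levels of `by` for which `target`
# takes only one observed value, i.e. rows of the cross-table with
# exactly one non-empty cell. Missing values are ignored.
determined_levels <- function(target, by) {
  tab <- table(by, target, useNA = "no")
  rownames(tab)[rowSums(tab > 0) == 1]
}

determined_levels(dat$subtype, dat$mutation)  # "A" "B"
determined_levels(dat$subtype, dat$sex)       # character(0)
```

Here mutation levels A and B each occur with a single subtype, so a model for subtype given mutation would be at risk of infinite coefficients, whereas sex raises no such flag. A check like this only covers pairs of factors; perfect prediction by a combination of covariates would need a different check.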

As you say, in the Resche-Rigon approach, you estimate the marginal cumulative hazard functions for each event type. For each subject you need variables for the value of the cumulative hazards for all the event types. You then include these as covariates in the imputation model, as well as the event indicator as a factor/categorical variable. If there is some censoring this is fine; censored subjects will simply be coded as 0 in the event indicator variable. If you think that the hazard of being censored is associated with the covariate(s) being imputed, then censoring should be treated as if it were another type of competing event, in both the Resche-Rigon approach and the smcfcs approach.
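The variables described above could be constructed roughly as follows. This is a sketch under stated assumptions rather than a reference implementation: the toy data and the helper nelson_aalen_at() are hypothetical, a hand-written Nelson-Aalen estimator stands in for the marginal cumulative hazard, and each event type is assumed to occur at least once.

```r
# Hypothetical toy data: d = 0 censored, 1 = relapse, 2 = treatment-related death.
dat <- data.frame(
  time = c(1, 2, 3, 4, 5, 6),
  d    = c(1, 0, 2, 1, 2, 0)
)

# Hypothetical helper: Nelson-Aalen estimate of the cumulative hazard for
# one event type, evaluated at each subject's own observed time.
# `event` is TRUE where the subject had the event type of interest.
nelson_aalen_at <- function(time, event) {
  event_times <- sort(unique(time[event]))
  # Hazard increment at each event time: events / number still at risk.
  increments <- sapply(event_times, function(s) {
    sum(time == s & event) / sum(time >= s)
  })
  # Right-continuous step function, equal to 0 before the first event.
  H <- stepfun(event_times, c(0, cumsum(increments)))
  H(time)
}

dat$H1 <- nelson_aalen_at(dat$time, dat$d == 1)  # cumulative hazard of relapse
dat$H2 <- nelson_aalen_at(dat$time, dat$d == 2)  # cumulative hazard of death
dat$event <- factor(dat$d)                       # censoring stays coded as 0
```

The columns H1, H2 and event would then be used as predictors in the imputation models for the incomplete covariates. If censoring is thought to depend on those covariates, a third column for d == 0 could be added in the same way.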

Best wishes

Jonathan

Alexandra Vegelien

Jun 19, 2020, 5:15:43 AM

to Missing Data

Dear Jonathan,

Thank you for your help! Because of your advice, I was able to get the algorithm running. As you have said, mice is able to handle perfect prediction situations and produces a log of all errors detected. I used this as a guideline to change the predictor set for each variable; this solved the problem.
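For anyone wanting to follow a similar route, a dry run of mice along these lines shows where the log can be read. This is only an illustrative sketch, not the exact code used on the leukemia data: the toy data and column names are hypothetical, and it assumes the mice package is installed.

```r
library(mice)

# Hypothetical toy data with a deliberately redundant column.
set.seed(2020)
dat <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
dat$x3 <- dat$x1        # exact copy of x1, so perfectly collinear
dat$x2[1:10] <- NA      # introduce some missing values

# Dry run (maxit = 0): mice only sets up the imputation and
# records the problems it detects along the way.
ini <- mice(dat, maxit = 0, printFlag = FALSE)

ini$loggedEvents            # problems recorded by mice, if any
pred <- ini$predictorMatrix # starting point for editing predictor sets
```

Rows of pred correspond to the variables being imputed and columns to their predictors, so setting an entry to 0 drops that predictor for that variable. An edited matrix of this shape is what the predictorMatrix argument of smcfcs expects.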

Thank you again!

Kind regards,

Alexandra
