Estimate Censoring Mechanisms — estimateCensoring • survtmle

Computes an estimate of the hazard for censoring using either glm or SuperLearner based on log-likelihood loss, then computes the censoring survival distribution from these estimates.

The structure of the function is specific to how it is called within survtmle. In particular, dataList must have a very specific structure for this function to run properly. The list should consist of data.frame objects. In the first, each observation contributes a number of rows equal to its ftime. Each subsequent entry has t0 rows per observation, with the trt column set to one value of trtOfInterest in turn. One of the columns must be named C and hold a counting process for the right-censoring variable. The function fits a regression with C as the outcome and, as predictors, functions of trt and names(adjustVars) as specified by glm.ctime or SL.ctime.
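
The long-format layout described above can be illustrated with a minimal base-R sketch. This is not the code used by survtmle (the package builds dataList itself via makeDataList); the simulated data, the coding of ftype == 0 as censored, the helper column names t and haz_C, and the inclusion of t as a predictor are all assumptions made for illustration.

```r
# Hypothetical toy data (all values simulated for illustration only).
set.seed(42)
n <- 200
dat <- data.frame(
  id    = seq_len(n),
  ftime = sample(1:5, n, replace = TRUE),  # discrete follow-up time
  ftype = rbinom(n, 1, 0.6),               # assumed coding: 0 = right-censored
  trt   = rbinom(n, 1, 0.5),
  W1    = rnorm(n)
)

# Expand to one row per subject per time point, up to that subject's ftime.
long <- dat[rep(seq_len(n), times = dat$ftime), ]
long$t <- sequence(dat$ftime)
rownames(long) <- NULL

# Censoring indicator: 1 only on the row at which the subject is censored.
long$C <- as.numeric(long$ftype == 0 & long$t == long$ftime)

# Pooled logistic regression for the discrete-time censoring hazard,
# with C as the outcome. Including t as a predictor is an assumption here.
fit <- glm(C ~ trt + W1 + t, data = long, family = binomial())
long$haz_C <- predict(fit, type = "response")
```

Each subject contributes ftime rows, and the fitted values in haz_C play the role of the estimated conditional hazard of censoring at each row.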

estimateCensoring(
  dataList,
  adjustVars,
  t0,
  SL.ctime = NULL,
  glm.ctime = NULL,
  glm.family,
  cvControl,
  returnModels = FALSE,
  verbose = TRUE,
  gtol = 0.001,
  ...
)

Arguments

dataList	A list of `data.frame` objects as described in the documentation of `makeDataList`.
adjustVars	Object of class `data.frame` that contains the variables to adjust for in the regression.
t0	The timepoint at which `survtmle` was called to evaluate. Needed only because the naming convention for the regression if `t == t0` is different than if `t != t0`.
SL.ctime	A character vector or list specification to be passed to the `SL.library` argument of `SuperLearner` for the regression of the censoring hazard. See the documentation of `SuperLearner` for more information on how to specify valid `SuperLearner` libraries. It is expected that the wrappers used in the library will play nicely with the input variables, which will be called `"trt"` and `names(adjustVars)`.
glm.ctime	A character specification of the right-hand side of the equation passed to the `formula` option of a call to `glm` for the regression of the censoring hazard. Ignored if `SL.ctime` is not `NULL`. Use `"trt"` to specify the treatment in this formula (see examples). The formula can additionally include any variables found in `names(adjustVars)`.
glm.family	The type of regression to be performed if fitting GLMs in the estimation and fluctuation procedures. The default is "binomial" for logistic regression. Only change this from the default if there are justifications that are well understood. This is inherited from the calling function (either `mean_tmle` or `hazard_tmle`).
cvControl	A `list` providing control options to be fed directly into calls to `SuperLearner`. This should match the contents of `SuperLearner.CV.control` exactly. For details, consult the documentation of the SuperLearner package. This is passed in from `mean_tmle` or `hazard_tmle` via `survtmle`.
returnModels	A `logical` indicating whether to return the `glm` or `SuperLearner` objects used to estimate the nuisance parameters. Must be set to `TRUE` to make downstream calls to `timepoints` for obtaining estimates at times other than `t0`. See documentation of `timepoints` for more information.
verbose	A `logical` indicating whether the function should print messages to indicate progress.
gtol	The truncation level of predicted censoring survival to handle positivity violations.
...	Other arguments. Not currently used.
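
As a rough illustration of the `glm.ctime` and `SL.ctime` arguments, the sketch below shows how a right-hand-side character string could be combined with the outcome column `C` to form a `glm` formula. The way the formula is assembled here is an assumption and may differ from what `estimateCensoring` does internally; the data frame `toy` and the covariate `W1` are hypothetical.

```r
# Hypothetical right-hand-side specification in the style of glm.ctime.
glm.ctime <- "trt + W1"

# A SuperLearner library is given as a character vector of wrapper names;
# SL.glm and SL.mean are wrappers shipped with the SuperLearner package.
# It is only defined here, not used, so no package needs to be loaded.
SL.ctime <- c("SL.glm", "SL.mean")

# Assumed construction: prepend the outcome column C to the right-hand side.
cens_formula <- as.formula(paste("C ~", glm.ctime))

# Toy data standing in for one long-format element of dataList.
set.seed(1)
toy <- data.frame(
  C   = rbinom(100, 1, 0.2),
  trt = rbinom(100, 1, 0.5),
  W1  = rnorm(100)
)

# Logistic regression of the censoring indicator on treatment and covariates.
fit <- glm(cens_formula, data = toy, family = "binomial")
```

Any variable named in the string must exist as a column of the data, which is why the formula is restricted to `"trt"` and `names(adjustVars)`.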

Value

The function returns a list that is exactly the same as the input dataList, but with a column named G_dC added to each data.frame. This column holds the estimated conditional survival distribution for the censoring variable, evaluated at each row of each data.frame in dataList.
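
To make the relationship between the censoring hazard, G_dC, and gtol concrete, here is a minimal sketch under stated assumptions. It treats the censoring survival at time t as the cumulative product of one minus the hazard through t and truncates from below at gtol. The exact time indexing used inside survtmle is not described on this page and may differ; the data frame and the column haz_C are hypothetical.

```r
# Hypothetical estimated censoring hazards for two subjects over three times.
long <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  t     = c(1, 2, 3, 1, 2, 3),
  haz_C = c(0.10, 0.20, 0.50, 0.999, 0.999, 0.999)
)
gtol <- 0.001

# Assumed definition: probability of remaining uncensored through time t,
# computed as a cumulative product of (1 - hazard) within each subject.
long$G_dC <- ave(1 - long$haz_C, long$id, FUN = cumprod)

# Truncate from below at gtol so that later inverse weighting by G_dC
# does not blow up when censoring is nearly certain.
long$G_dC <- pmax(long$G_dC, gtol)
```

In this toy example the second subject's survival probabilities fall below gtol and are set to gtol, while the first subject's values are left unchanged.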