This function computes an estimate of the G-computation regression at a specified time t using glm or SuperLearner. The structure of the function is specific to how it is called within mean_tmle. In particular, wideDataList must have a very specific structure for this function to run properly. It should be a list of data.frame objects. In the first, every row has trt set to its observed value. In each of the remaining ones, every row has trt set to one of the values of trtOfInterest from the survtmle call, taken in turn.

Currently the code requires each data.frame to have a named column for each name in names(adjustVars), as well as a column named trt. Each must also have columns named Nj.Y, where j ranges over the numeric values input in allJ and Y is t - 1. These are the indicators of failure due to the various causes before time t and are needed to determine who to include in the regression. Similarly, each data.frame should have a column called C.Y, where Y is again t - 1, so that right-censored observations are excluded from the regression. The function fits a regression of Qj.star.t+1 (which must also be a column in each data.frame of wideDataList) on functions of trt and names(adjustVars) as specified by glm.ftime or SL.ftime.
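The list structure described above can be sketched with toy data. This is only an illustration under stated assumptions: the covariate W1, the sample size, and all simulated values are hypothetical, and the column names follow the naming described in this page for t = 2 (so Y = 1 and the outcome column is Q1.star.3), not necessarily what mean_tmle builds internally.

```r
# Hypothetical toy inputs (not taken from the package).
set.seed(1)
n <- 6
adjustVars <- data.frame(W1 = rnorm(n))
trt <- rbinom(n, 1, 0.5)
trtOfInterest <- c(0, 1)

# Columns shared by every data.frame in the list, for t = 2 (Y = t - 1 = 1).
base <- data.frame(
  adjustVars,
  N1.1 = c(0, 0, 1, 0, 0, 0),  # failed from cause 1 before time t
  N2.1 = c(0, 0, 0, 0, 1, 0),  # failed from cause 2 before time t
  C.1  = c(0, 1, 0, 0, 0, 0),  # right-censored before time t
  Q1.star.3 = runif(n)         # outcome of the regression at t = 2, cause 1
)

# First element: observed trt. Remaining elements: trt fixed at each
# value of trtOfInterest in turn.
wideDataList <- c(
  list(data.frame(trt = trt, base)),
  lapply(trtOfInterest, function(a) data.frame(trt = rep(a, n), base))
)

str(wideDataList, max.level = 1)
```

Every element has the same rows and columns; only the trt column differs between them.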

estimateIteratedMean(
  wideDataList,
  t,
  whichJ,
  allJ,
  t0,
  adjustVars,
  SL.ftime = NULL,
  glm.ftime = NULL,
  verbose,
  cvControl,
  returnModels = FALSE,
  bounds = NULL,
  ...
)

Arguments

wideDataList

A list of data.frame objects.

t

The timepoint at which to compute the iterated mean.

whichJ

Numeric value indicating the cause of failure for which the regression should be computed.

allJ

Numeric vector indicating the labels of all causes of failure.

t0

The timepoint at which survtmle was called to evaluate the estimate. Needed only because the naming convention for the regression when t == t0 differs from the one used when t != t0.

adjustVars

Object of class data.frame that contains the variables to adjust for in the regression.

SL.ftime

A character vector or list specification to be passed to the SL.library argument in the call to SuperLearner for the outcome regression (either cause-specific hazards or conditional mean). See the documentation of SuperLearner for more information on how to specify valid SuperLearner libraries. It is expected that the wrappers used in the library will play nicely with the input variables, which will be called "trt" and names(adjustVars).

glm.ftime

A character specification of the right-hand side of the equation passed to the formula option of a call to glm for the outcome regression (either cause-specific hazards or conditional mean). Ignored if SL.ftime is not NULL. Use "trt" to specify the treatment in this formula (see examples). The formula can additionally include any variables found in names(adjustVars).

verbose

A logical indicating whether the function should print messages to indicate progress.

cvControl

A list providing control options to be fed directly into calls to SuperLearner. This should match the contents of SuperLearner.CV.control exactly. For details, consult the documentation of the SuperLearner package. This is passed in from mean_tmle or hazard_tmle via survtmle.

returnModels

A logical indicating whether to return the glm or SuperLearner objects used to estimate the nuisance parameters. Must be set to TRUE to make downstream calls to timepoints for obtaining estimates at times other than t0. See documentation of timepoints for more information.

bounds

A list of bounds to be used when performing the outcome regression (Q) with the Super Learner algorithm. NOT YET IMPLEMENTED.

...

Other arguments. Not currently used.
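As a rough illustration of the argument formats described above, the sketch below shows plausible values for glm.ftime, SL.ftime and cvControl. The covariate name W1 and the outcome name Q1.star.3 are hypothetical. The cvControl list is written out by hand with the fields that SuperLearner.CV.control returns by default; with SuperLearner installed, calling SuperLearner::SuperLearner.CV.control() should give an equivalent list.

```r
# Right-hand side of the glm formula: treatment plus a hypothetical covariate W1.
glm.ftime <- "trt + W1"

# A SuperLearner library given as a character vector of wrapper names.
# SL.glm and SL.mean are wrappers shipped with the SuperLearner package.
SL.ftime <- c("SL.glm", "SL.mean")

# Cross-validation controls, spelled out as a plain list so this sketch
# runs without SuperLearner installed.
cvControl <- list(V = 10L, stratifyCV = FALSE, shuffle = TRUE, validRows = NULL)

# How a right-hand-side string combines with an outcome name to form a
# formula (the outcome name here is an assumption for illustration).
f <- as.formula(paste("Q1.star.3", "~", glm.ftime))
print(f)
```

Only one of glm.ftime and SL.ftime is used in a given call, since glm.ftime is ignored whenever SL.ftime is supplied.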

Value

The function returns a list identical to the input wideDataList, except that each data.frame gains a column named Qj.t, which is the estimated conditional mean of Qj.star.t+1 evaluated at each row of that data.frame.
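The overall step can be sketched with glm on simulated data. This is a minimal sketch of the idea, not the package's implementation: the data, the covariate W1, and the choice of quasibinomial() (used here so that a fractional outcome in [0, 1] does not trigger warnings) are all assumptions, and the packaged function may treat previously failed or censored rows differently when predicting.

```r
# Hypothetical simulated inputs for t = 2, whichJ = 1, allJ = c(1, 2).
set.seed(42)
n <- 200
base <- data.frame(
  W1 = rnorm(n),
  N1.1 = rbinom(n, 1, 0.1),
  N2.1 = rbinom(n, 1, 0.1),
  C.1 = rbinom(n, 1, 0.1),
  Q1.star.3 = runif(n)
)
wideDataList <- list(
  data.frame(trt = rbinom(n, 1, 0.5), base),  # observed treatment
  data.frame(trt = rep(0, n), base),          # everyone set to trt = 0
  data.frame(trt = rep(1, n), base)           # everyone set to trt = 1
)
t <- 2
whichJ <- 1
allJ <- c(1, 2)
glm.ftime <- "trt + W1"

# Keep only subjects who have neither failed nor been censored before t.
obs <- wideDataList[[1]]
include <- rep(TRUE, nrow(obs))
for (j in allJ) {
  include[obs[[paste0("N", j, ".", t - 1)]] == 1] <- FALSE
}
include[obs[[paste0("C.", t - 1)]] == 1] <- FALSE

# Regress Qj.star.t+1 on trt and covariates using the observed-treatment data.
outcomeName <- paste0("Q", whichJ, ".star.", t + 1)
fit <- glm(
  as.formula(paste(outcomeName, "~", glm.ftime)),
  data = obs[include, ],
  family = quasibinomial()
)

# Add the fitted conditional mean as column Qj.t to every data.frame.
newName <- paste0("Q", whichJ, ".", t)
wideDataList <- lapply(wideDataList, function(d) {
  d[[newName]] <- as.numeric(predict(fit, newdata = d, type = "response"))
  d
})

head(wideDataList[[2]][, c("trt", "W1", newName)])
```

The regression is fit once on the observed-treatment data.frame and then used to predict in every element of the list, which is what makes the counterfactual columns comparable across treatment values.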