This function estimates the marginal cumulative incidence for failures of specified types using targeted minimum loss-based estimation.

survtmle(
  ftime,
  ftype,
  trt,
  adjustVars,
  t0 = max(ftime[ftype > 0]),
  SL.ftime = NULL,
  SL.ctime = NULL,
  SL.trt = NULL,
  glm.ftime = NULL,
  glm.ctime = NULL,
  glm.trt = NULL,
  returnIC = TRUE,
  returnModels = TRUE,
  ftypeOfInterest = unique(ftype[ftype != 0]),
  trtOfInterest = unique(trt),
  cvControl = list(V = 10L, stratifyCV = FALSE, shuffle = TRUE, validRows = NULL),
  method = "hazard",
  bounds = NULL,
  verbose = FALSE,
  tol = 1/(sqrt(length(ftime))),
  maxIter = 10,
  Gcomp = FALSE,
  gtol = 0.001,
  returnCall = TRUE
)

Arguments

ftime

An integer-valued vector of failure times. Right-censored observations should have corresponding ftype set to 0.

ftype

An integer-valued vector indicating the type of failure. Observations with ftype=0 are treated as being right-censored. Each unique value besides zero is treated as a separate type of failure.
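
As a minimal sketch of the encoding described for ftime and ftype, the snippet below turns hypothetical raw event and censoring times into the two vectors. The names event_time, censor_time, and cause, and the choice to round up to an integer grid, are assumptions made for illustration only.

```r
# Hypothetical raw data: continuous event times, censoring times, and causes.
event_time  <- c(2.3, 5.1, 0.7, 4.0)
censor_time <- c(3.0, 4.2, 6.0, 4.5)
cause       <- c(1L, 2L, 1L, 2L)

# Observed time is the earlier of the two, rounded up to an integer grid
# (the rounding rule is an assumption; use whatever discretization fits the study).
ftime <- as.integer(ceiling(pmin(event_time, censor_time)))

# Failure type is the cause when the event is observed, and 0 when censored.
ftype <- ifelse(event_time <= censor_time, cause, 0L)
```

Here the second observation is censored before its event, so it receives ftype = 0.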

trt

A numeric vector indicating observed treatment assignment. Each unique value will be treated as a different type of treatment. Currently, only two unique values are supported.

adjustVars

A data.frame of adjustment variables that will be used in estimating the conditional treatment, censoring, and failure (hazard or conditional mean) probabilities.

t0

The time at which to return cumulative incidence estimates. By default this is set to max(ftime[ftype > 0]).
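
The default can be checked by hand. The toy vectors below are hypothetical; the expression itself is the one from the usage signature.

```r
# Toy data: two observations are censored (ftype == 0) at times 5 and 6.
ftime <- c(3L, 5L, 1L, 4L, 6L)
ftype <- c(1L, 0L, 1L, 2L, 0L)

# Default t0: the largest time at which a failure of any type was observed.
# Censoring times are ignored, so this is 4 rather than 6.
t0_default <- max(ftime[ftype > 0])
```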

SL.ftime

A character vector or list specification to be passed to the SL.library argument in the call to SuperLearner for the outcome regression (either cause-specific hazards or iterated mean). See the documentation of SuperLearner for more information on how to specify valid SuperLearner libraries. The wrappers used in the library must be able to handle the input variables, which will be named "trt", names(adjustVars), and "t" (if method = "hazard").

SL.ctime

A character vector or list specification to be passed to the SL.library argument in the call to SuperLearner for the estimate of the conditional hazard for censoring. The wrappers used in the library must be able to handle the input variables, which will be named "trt" and names(adjustVars).

SL.trt

A character vector or list specification to be passed to the SL.library argument in the call to SuperLearner for the estimate of the conditional probability of treatment. The wrappers used in the library must be able to handle the input variables, which will be named names(adjustVars).

glm.ftime

A character specification of the right-hand side of the equation passed to the formula option of a call to glm for the outcome regression. Ignored if SL.ftime is not equal to NULL. Use "trt" to specify the treatment in this formula (see examples). The formula can additionally include any variables found in names(adjustVars).

glm.ctime

A character specification of the right-hand side of the equation passed to the formula option of a call to glm for the estimate of the conditional hazard for censoring. Ignored if SL.ctime is not equal to NULL. Use "trt" to specify the treatment in this formula (see examples). The formula can additionally include any variables found in names(adjustVars).

glm.trt

A character specification of the right-hand side of the equation passed to the formula option of a call to glm for the estimate of the conditional probability of treatment. Ignored if SL.trt is not equal to NULL. The formula can include any variables found in names(adjustVars).
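
To make the "right-hand side" convention concrete, the sketch below shows one way such a string can be turned into a glm call for the treatment model. It illustrates the idea only and is not the package's internal code; the data frame dat and its columns are simulated for the example.

```r
set.seed(1)
n <- 50

# Simulated treatment and adjustment variables (hypothetical data).
dat <- data.frame(
  trt = rbinom(n, 1, 0.5),
  W1  = rbinom(n, 1, 0.5),
  W2  = rbinom(n, 1, 0.5)
)

# A right-hand side such as the one supplied to glm.trt.
rhs <- "W1 + W2"

# Paste the outcome onto the right-hand side and fit a logistic regression.
trt_formula <- as.formula(paste("trt ~", rhs))
trt_fit <- glm(trt_formula, data = dat, family = binomial())
```

Because only the right-hand side is supplied, the outcome is fixed by the function and every term in the string must refer to "trt" (where allowed) or to a column of adjustVars.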

returnIC

A logical indicating whether to return vectors of influence curve estimates. These are needed for some post-hoc comparisons, so it is recommended to leave as TRUE (the default) unless the user is sure these estimates will not be needed later.

returnModels

A logical indicating whether to return the glm or SuperLearner objects used to estimate the nuisance parameters. Must be set to TRUE if the user plans to use timepoints to obtain estimates of incidence at times other than t0. See the documentation of timepoints for more information.
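
A hedged sketch of the workflow this argument enables: fit once with returnModels = TRUE, then pass the fit to timepoints. The call timepoints(object = fit, times = ...) is assumed from the timepoints documentation, and the block only runs the fit if the survtmle package is installed.

```r
# Simulate data as in the Examples section.
set.seed(1234)
n <- 200
trt <- rbinom(n, 1, 0.5)
adjustVars <- data.frame(W1 = round(runif(n)), W2 = round(runif(n, 0, 2)))
ftime <- round(1 + runif(n, 1, 4) - trt + adjustVars$W1 + adjustVars$W2)
ftype <- round(runif(n, 0, 1))

if (requireNamespace("survtmle", quietly = TRUE)) {
  # Keep the fitted nuisance models so they can be reused later.
  fit <- survtmle::survtmle(
    ftime = ftime, ftype = ftype,
    trt = trt, adjustVars = adjustVars,
    glm.trt = "W1 + W2",
    glm.ftime = "trt + W1 + W2",
    glm.ctime = "trt + W1 + W2",
    method = "mean", t0 = 6,
    returnModels = TRUE
  )

  # Cumulative incidence estimates at times 1, ..., 6 (assumed call signature).
  tp <- survtmle::timepoints(object = fit, times = seq_len(6))
}
```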

ftypeOfInterest

A vector specifying the failure types for which to compute incidence estimates. The default computes estimates for every value in unique(ftype[ftype != 0]), that is, every observed failure type excluding censoring. Can alternatively be set to a vector of values found in ftype.

trtOfInterest

An input specifying which levels of trt are of interest. The default value computes estimates for all of the values in unique(trt). Can alternatively be set to a vector of values found in trt.
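
The defaults for ftypeOfInterest and trtOfInterest can be previewed directly from the data. The toy vectors below are hypothetical, and the expand.grid call is only an illustration of which treatment-by-failure-type combinations would be estimated.

```r
# Toy data with two failure types (1 and 2), censoring (0), and two arms.
ftype <- c(1L, 0L, 2L, 1L, 0L, 2L)
trt   <- c(0L, 1L, 1L, 0L, 0L, 1L)

# Defaults taken from the usage signature.
ftypeOfInterest <- unique(ftype[ftype != 0])
trtOfInterest   <- unique(trt)

# One estimate is returned per combination of treatment level and failure type.
combos <- expand.grid(trt = trtOfInterest, ftype = ftypeOfInterest)
```

Restricting either argument, for example ftypeOfInterest = 1, shrinks this grid accordingly.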

cvControl

A list providing control options to be fed directly into calls to SuperLearner. This should match the contents of SuperLearner.CV.control exactly. For details, consult the documentation of the SuperLearner package.
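
As an illustration, a non-default control list might look like the sketch below. The element names are those shown in the usage signature; the particular values (5 folds, stratified) are arbitrary choices for the example.

```r
# Five-fold, stratified cross-validation for every internal SuperLearner call.
# All four elements are supplied, mirroring the default in the usage signature.
cv_control <- list(
  V = 5L,
  stratifyCV = TRUE,
  shuffle = TRUE,
  validRows = NULL
)
```

The list would then be supplied as cvControl = cv_control in the call to survtmle.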

method

A character specification of how the targeted minimum loss-based estimators should be computed, either "mean" or "hazard". The "mean" specification uses a closed-form targeted minimum loss-based estimator based on the G-computation formula of Bang and Robins (2005). The "hazard" specification uses an iterative algorithm based on cause-specific hazard functions. The latter specification has no guarantee of convergence in finite samples. Convergence can be influenced by the stopping criteria specified by tol and maxIter. Future versions may implement a closed-form version of this hazard-based estimator.

bounds

A data.frame of bounds on the conditional hazard function (if method = "hazard") or on the iterated conditional means (if method = "mean"). The data.frame should have a column named "t" that includes the values seq_len(t0). The other columns should be named paste0("l", j) and paste0("u", j) for each unique failure type label j, denoting lower and upper bounds, respectively. See examples.
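
A minimal sketch of a bounds data.frame with the layout described above, assuming two failure types labelled 1 and 2 and t0 = 6. The numeric bounds (0.01 and 0.75) are arbitrary placeholders.

```r
t0 <- 6
failure_types <- c(1, 2)

# Start with the required time column.
bounds <- data.frame(t = seq_len(t0))

# Add a lower ("l<j>") and an upper ("u<j>") column for each failure type j.
for (j in failure_types) {
  bounds[[paste0("l", j)]] <- rep(0.01, t0)
  bounds[[paste0("u", j)]] <- rep(0.75, t0)
}
```

The resulting columns are t, l1, u1, l2, and u2, with one row per time point up to t0.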

verbose

A logical indicating whether the function should print messages to indicate progress. If SuperLearner is called internally, this option will be passed to it.

tol

The stopping criterion when method = "hazard". The TMLE algorithm performs updates to the initial estimators until the empirical mean of the efficient influence function is smaller than tol or until maxIter iterations have been completed. The default, 1/sqrt(length(ftime)), is a sensible value. Larger values can be used in situations where convergence of the algorithm is an issue; however, this may result in large finite-sample bias.

maxIter

The maximum number of iterations for the algorithm when method = "hazard". The algorithm will iterate until either the empirical mean of the efficient influence function is smaller than tol or maxIter iterations have been completed.
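
The interplay of tol and maxIter can be sketched as a simple loop. This is a schematic of the stopping rule only, not the package's update step: mean_ic_path is a made-up sequence standing in for the empirical mean of the efficient influence function after each update, and tol uses the default from the usage signature.

```r
n <- 200
tol <- 1 / sqrt(n)   # default from the usage signature
max_iter <- 10       # plays the role of maxIter

# Hypothetical values of |mean(EIF)| after successive targeting updates.
mean_ic_path <- c(0.30, 0.12, 0.05, 0.01)

iter <- 0
repeat {
  iter <- iter + 1
  # Stand-in for "update the hazards, then recompute the mean of the EIF".
  mean_ic <- mean_ic_path[min(iter, length(mean_ic_path))]
  # Stop once the criterion is met or the iteration budget is exhausted.
  if (abs(mean_ic) < tol || iter >= max_iter) break
}
```

With these made-up values the loop stops at the third update, the first time the mean falls below tol.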

Gcomp

A logical indicating whether to compute the G-computation estimator (i.e., a substitution estimator with no targeting step). Theory does not support inference for the G-computation estimator if Super Learner is used to estimate failure and censoring distributions. The G-computation estimator is only implemented for method = "mean".

gtol

The truncation level for the predicted censoring survival probabilities. Setting gtol to a larger value can help performance in data sets with practical positivity violations.
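
One common way to apply such a truncation is to bound the predicted probabilities below by gtol before they are inverted. The sketch below illustrates that idea with made-up predictions; the package's internal handling may differ.

```r
gtol <- 0.001

# Hypothetical predicted censoring survival probabilities, two of them near zero.
cens_surv <- c(0.95, 0.40, 0.0002, 0)

# Truncate from below so that no probability is smaller than gtol.
cens_surv_trunc <- pmax(cens_surv, gtol)

# Inverse weights stay finite and are capped at 1 / gtol.
inv_weights <- 1 / cens_surv_trunc
```

A larger gtol lowers this cap and so limits the influence of observations with very small predicted censoring survival.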

returnCall

A logical specifying whether to return the function call via match.call(expand.dots = TRUE). Set to FALSE to have the call slot return only NA, which reduces memory overhead when passing in pre-computed initial estimates. Defaults to TRUE.

Value

An object of class survtmle.

call

The call to survtmle.

est

A numeric vector of point estimates -- one for each combination of ftypeOfInterest and trtOfInterest.

var

A covariance matrix for the point estimates.
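
The est and var components can be combined into Wald-type confidence intervals and contrasts. The sketch below assumes asymptotic normality and copies the Fit 1 numbers from the Examples section by hand rather than recomputing them; the object names are illustrative.

```r
# Point estimates and covariance matrix copied from the Fit 1 output.
est <- c("0 1" = 0.5660495, "1 1" = 0.7412935)
V <- matrix(
  c(0.0042149062, 0.0002205963,
    0.0002205963, 0.0023100990),
  nrow = 2, dimnames = list(names(est), names(est))
)

# 95% Wald intervals for each cumulative incidence.
z <- qnorm(0.975)
se <- sqrt(diag(V))
ci <- cbind(est = est, lower = est - z * se, upper = est + z * se)

# Risk difference (treated minus control) with its delta-method standard error.
a <- c(-1, 1)
diff_est <- sum(a * est)
diff_se <- sqrt(drop(t(a) %*% V %*% a))
diff_ci <- diff_est + c(-1, 1) * z * diff_se
```

The off-diagonal entry of var enters the contrast's variance, which is why the full covariance matrix is used rather than only its diagonal.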

meanIC

The empirical mean of the efficient influence function at the estimated, targeted nuisance parameters. Each value should be small or the user will be warned that excessive finite-sample bias may exist in the point estimates.

ic

The efficient influence function at the estimated, fluctuated nuisance parameters, evaluated on each of the observations. These are used to construct confidence intervals for post-hoc comparisons.

ftimeMod

If returnModels = TRUE, the fit object(s) for the call to glm or SuperLearner for the outcome regression models. If method = "mean", this will be a list of length length(ftypeOfInterest), each element of which has length t0 (one regression for each failure type and for each timepoint). If method = "hazard", this will be a list of length length(ftypeOfInterest) with one fit corresponding to the hazard for each cause of failure. If returnModels = FALSE, this entry will be NULL.

ctimeMod

If returnModels=TRUE the fit object for the call to glm or SuperLearner for the pooled hazard regression model for the censoring distribution. If returnModels=FALSE, this entry will be NULL.

trtMod

If returnModels = TRUE the fit object for the call to glm or SuperLearner for the conditional probability of trt regression model. If returnModels = FALSE, this entry will be NULL.

t0

The timepoint at which the function was evaluated.

ftime

The numeric vector of failure times used in the fit.

ftype

The numeric vector of failure types used in the fit.

trt

The numeric vector of treatment assignments used in the fit.

adjustVars

The data.frame of adjustment variables used in the fit.

Examples

# simulate data
set.seed(1234)
n <- 200
trt <- rbinom(n, 1, 0.5)
adjustVars <- data.frame(W1 = round(runif(n)), W2 = round(runif(n, 0, 2)))
ftime <- round(1 + runif(n, 1, 4) - trt + adjustVars$W1 + adjustVars$W2)
ftype <- round(runif(n, 0, 1))

# Fit 1
# fit a survtmle object with glm estimators for treatment, censoring, and
# failure using the "mean" method
fit1 <- survtmle(
  ftime = ftime, ftype = ftype,
  trt = trt, adjustVars = adjustVars,
  glm.trt = "W1 + W2",
  glm.ftime = "trt + W1 + W2",
  glm.ctime = "trt + W1 + W2",
  method = "mean", t0 = 6
)
fit1
#> $est
#>          [,1]
#> 0 1 0.5660495
#> 1 1 0.7412935
#>
#> $var
#>              0 1          1 1
#> 0 1 0.0042149062 0.0002205963
#> 1 1 0.0002205963 0.0023100990
#>

# Fit 2
# fit a survtmle object with SuperLearner estimators for failure and
# censoring and empirical estimators for treatment using the "mean" method
fit2 <- survtmle(
  ftime = ftime, ftype = ftype,
  trt = trt, adjustVars = adjustVars,
  SL.ftime = c("SL.mean"),
  SL.ctime = c("SL.mean"),
  method = "mean", t0 = 6
)
#> Warning: glm.trt and SL.trt not specified. Proceeding with glm.trt = '1'
#> Loading required package: nnls
fit2
#> $est
#>          [,1]
#> 0 1 0.5284170
#> 1 1 0.6584069
#>
#> $var
#>             0 1         1 1
#> 0 1 0.005393382 0.000000000
#> 1 1 0.000000000 0.004369973
#>