cv_auc.Rd
This function computes K-fold cross-validated estimates of the area under the receiver operating characteristic (ROC) curve (hereafter, AUC). This quantity can be interpreted as the probability that a randomly selected case will have a higher predicted risk than a randomly selected control.
cv_auc(
  Y,
  X,
  K = 10,
  learner = "glm_wrapper",
  nested_cv = TRUE,
  nested_K = K - 1,
  parallel = FALSE,
  max_cvtmle_iter = 10,
  cvtmle_ictol = 1/length(Y),
  prediction_list = NULL,
  ...
)
Argument | Description
---|---
Y | A numeric vector of outcomes, assumed to equal 0 or 1.
X | A data.frame or matrix of predictors.
K | The number of cross-validation folds (default is 10).
learner | A wrapper that implements the desired method for building a prediction algorithm. See the wrapper functions (e.g., glm_wrapper) for the expected format.
nested_cv | A boolean indicating whether nested cross-validation should be used to estimate the distribution of the prediction function. The default (TRUE) is recommended when aggressive learning algorithms are used.
nested_K | If nested cross-validation is used, how many inner folds should there be? Default is K - 1.
parallel | A boolean indicating whether prediction algorithms should be trained in parallel. Defaults to FALSE.
max_cvtmle_iter | Maximum number of iterations for the bias-correction step of the CV-TMLE estimator (default 10).
cvtmle_ictol | The CV-TMLE will iterate until the mean of the cross-validated efficient influence function is below this tolerance, or until max_cvtmle_iter is reached (default 1/length(Y)).
prediction_list | For power users: a list of predictions in the format produced by the cross-validated model training (see the prediction_list entry of the output), allowing the training step to be skipped. Default is NULL.
... | Other arguments, not currently used.
An object of class "cvauc" with the following entries:
est_cvtmle
cross-validated targeted minimum loss-based estimator of K-fold CV AUC
iter_cvtmle
number of iterations needed for the CV-TMLE targeting algorithm to converge
cvtmle_trace
the value of the CV-TMLE estimate at each iteration of the targeting algorithm
se_cvtmle
estimated standard error based on targeted nuisance parameters
est_init
plug-in estimate of CV AUC where nuisance parameters are estimated in the training sample
est_empirical
the standard K-fold CV AUC estimator
se_empirical
estimated standard error for the standard estimator
est_onestep
cross-validated one-step estimate of K-fold CV AUC
se_onestep
estimated standard error for the one-step estimator
est_esteq
cross-validated estimating equations estimate of K-fold CV AUC
se_esteq
estimated standard error for the estimating equations estimator (same as for one-step)
folds
list of observation indices in each validation fold
ic_cvtmle
influence function evaluated at the targeted nuisance parameter estimates
ic_onestep
influence function evaluated at the training-fold-estimated nuisance parameters
ic_esteq
influence function evaluated at the training-fold-estimated nuisance parameters
ic_empirical
influence function evaluated at the validation-fold estimated nuisance parameters
prediction_list
a list of output from the cross-validated model training; see the individual wrapper function documentation for further details
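Since each estimator is returned alongside a standard error, the components above can be pulled out of the returned list and compared directly. A minimal sketch, assuming `fit` holds an object returned by `cv_auc()` and that the returned standard errors are on the scale of the estimates:

```r
# Compare the bias-corrected estimators with the standard empirical
# K-fold CV AUC estimate, alongside their estimated standard errors.
ests <- data.frame(
  estimator = c("cvtmle", "onestep", "esteq", "empirical"),
  est = c(fit$est_cvtmle, fit$est_onestep, fit$est_esteq, fit$est_empirical),
  se  = c(fit$se_cvtmle, fit$se_onestep, fit$se_esteq, fit$se_empirical)
)
# Wald-style 95% confidence intervals from each estimate and its SE
ests$lower <- ests$est - qnorm(0.975) * ests$se
ests$upper <- ests$est + qnorm(0.975) * ests$se
```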
To estimate the AUC of a particular prediction algorithm, K-fold cross-validation is commonly used: data are partitioned into K distinct groups and the prediction algorithm is developed using K - 1 of these groups. In standard K-fold cross-validation, the AUC of this prediction algorithm is estimated using the remaining fold. This estimate can behave poorly when the number of observations is small or the number of cross-validation folds is large.
Here, we estimate relevant nuisance parameters in the training sample and use
the validation sample to perform some form of bias correction -- either through
cross-validated targeted minimum loss-based estimation, estimating equations,
or one-step estimation. When aggressive learning algorithms are applied, it is
necessary to use an additional layer of cross-validation in the training sample
to estimate the nuisance parameters. This is controlled via the nested_cv
option.
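The empirical AUC referred to above is simply the proportion of case/control pairs in which the case receives the higher predicted risk, counting ties as one half. A base-R sketch of that computation (illustrative only; cv_auc performs this internally on the validation folds):

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(x))
score <- x  # any numeric risk score

# Empirical AUC: P(score of a random case > score of a random control),
# with ties contributing 1/2.
pair_diffs <- outer(score[y == 1], score[y == 0], "-")
auc <- mean((pair_diffs > 0) + 0.5 * (pair_diffs == 0))
```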
# simulate data
n <- 200
p <- 10
X <- data.frame(matrix(rnorm(n*p), nrow = n, ncol = p))
Y <- rbinom(n, 1, plogis(X[,1] + X[,10]))

# get cv auc estimates for logistic regression
cv_auc_ests <- cv_auc(Y = Y, X = X, K = 5, learner = "glm_wrapper")

# get cv auc estimates for random forest
# using nested cross-validation for nuisance parameter estimation
fit <- cv_auc(Y = Y, X = X, K = 5, learner = "randomforest_wrapper",
              nested_cv = TRUE)