Worker function to make long form data set needed for CVTMLE targeting step when nested cv is used

.make_long_data_nested_cv(
  x,
  prediction_list,
  folds,
  gn,
  update = FALSE,
  epsilon_0 = 0,
  epsilon_1 = 0,
  tol = 0.001
)

Arguments

x	The outer validation fold
prediction_list	The full prediction list
folds	Vector of CV folds
gn	An estimate of the marginal dist. of Y
update	Boolean of whether this is called for initial construction of the long data set or as part of the targeting loop. If the former, cross-validated empirical "density" estimates are used. If the latter these are derived from the targeted cdf.
epsilon_0	If `update = TRUE`, a vector of TMLE fluctuation parameter estimates used to add the CDF and PDF of Psi(X) to the data set
epsilon_1	Ditto above
tol	A truncation level when taking logit transformations.

Value

A long form data list of a particular set up. Columns are named id (multiple per obs. in validation sample), u (if Yi = 0, these are the unique values of psi(x) in the inner validation samples for psi fit on inner training samples for obs with Y = 1, if Yi = 1, these are values of psi(x) in the inner validation samples for psi fit on inner training samples for obs. with Y = 0), Yi (this id's value of Y), Fn (cross-validation estimated value of the cdf of psi(X) given Y = Yi in the training sample), dFn (cross-validated estimate of the density of psi(X) given Y = (1-Yi) in the training sample), psi (the value of this observations Psihat(P_n,B_n^0)), gn (estimate of marginal of Y e.g., computed in whole sample), outcome (indicator that psix <= u), logit_Fn (the cdf estimate on the logit scale, needed for offset in targeting model).

Worker function to make long form data set needed for CVTMLE targeting step when nested cv is used

Arguments

Value

Contents