Computes the optimal weighted combination of multiple outcomes, together with a predictor of that combination, using Super Learning.

optWeight(Y, X, SL.library, family = "gaussian", CV.SuperLearner.V = 10,
  seed = 12345, whichAlgorithm = "SuperLearner",
  return.SuperLearner = TRUE, return.CV.SuperLearner = FALSE,
  return.IC = TRUE, parallel = FALSE, n.cores = parallel::detectCores(),
  ...)

Arguments

Y

A data.frame of outcomes, with each column representing a different outcome.

X

A data.frame that will be used to predict each outcome.

SL.library

A vector or list specifying the Super Learner library to be used for prediction. See ?SuperLearner for more details. For now, the same SL.library is used to predict every outcome.
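As a rough illustration of the two forms, the sketch below writes a library once as a character vector and once as a list. The list form, which pairs a learner with screening algorithms, follows the convention described in ?SuperLearner; the choice of "screen.corP" here is only an assumption for illustration, not something optWeight requires.

```r
# Character vector: each learner is fit using all columns of X.
lib_vector <- c("SL.glm", "SL.mean", "SL.step")

# List form: each element is a character vector whose first entry is a
# learner and whose remaining entries are screening algorithms
# ("All" means no screening).
lib_list <- list(
  c("SL.glm", "All", "screen.corP"),
  c("SL.mean", "All")
)
```

Either object could then be passed as the SL.library argument.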

family

An object of class family equal to either "gaussian" for continuous outcomes or "binomial" for binary outcomes.

CV.SuperLearner.V

The number of CV folds for the calls to CV.SuperLearner. For now, the inner calls to CV.SuperLearner always use V=10.

seed

The seed to set before each internal call to CV.SuperLearner.
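To see why a fixed seed matters, the sketch below mimics a random assignment of observations to cross-validation folds in base R. The helper assign_folds is hypothetical and is not how CV.SuperLearner builds its folds internally; it only shows that resetting the same seed before each call reproduces the same split.

```r
# Hypothetical helper: randomly assign n observations to V folds of
# (nearly) equal size after setting the seed.
assign_folds <- function(n, V, seed) {
  set.seed(seed)
  sample(rep(seq_len(V), length.out = n))
}

folds_a <- assign_folds(n = 100, V = 10, seed = 12345)
folds_b <- assign_folds(n = 100, V = 10, seed = 12345)

# Same seed, same fold assignment.
identical(folds_a, folds_b)
```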

whichAlgorithm

The algorithm for which to compute optimal predictions and R^2 values. Default is "SuperLearner".

return.SuperLearner

A boolean indicating whether to return the fitted SuperLearner objects for each outcome. Default is TRUE, as these fits are needed for later predictions.

return.CV.SuperLearner

A boolean indicating whether to return the fitted CV.SuperLearner objects.

return.IC

A boolean indicating whether to return estimated influence functions.

parallel

A boolean indicating whether to run the CV.SuperLearner calls in parallel using mclapply. Be sure to set options()$mc.cores to

n.cores

A numeric indicating how many cores to use if parallel = TRUE. Defaults to parallel::detectCores().
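A minimal sketch of the mclapply pattern that these two arguments control, assuming a hypothetical per-outcome task fit_one that stands in for a CV.SuperLearner call (a plain lm fit is used so the sketch runs without extra packages). Forking is not available on Windows, so the sketch falls back to a single core there.

```r
library(parallel)

set.seed(1234)
X <- data.frame(x1 = runif(100, 0, 5), x2 = runif(100, 0, 5))
Y <- data.frame(Y1 = rnorm(100, X$x1 + X$x2, 1),
                Y2 = rnorm(100, X$x1 + X$x2, 3))

# Hypothetical stand-in for one CV.SuperLearner call: fit a linear
# model for a single outcome and return its R^2.
fit_one <- function(y_name) {
  fit <- lm(Y[[y_name]] ~ x1 + x2, data = X)
  summary(fit)$r.squared
}

# detectCores() can return NA, and mc.cores > 1 is not supported on Windows.
cores <- detectCores()
if (is.na(cores) || .Platform$OS.type == "windows") cores <- 1L
n_cores <- min(2L, cores)

# One task per outcome column, spread over n_cores worker processes.
r2 <- mclapply(names(Y), fit_one, mc.cores = n_cores)
names(r2) <- names(Y)
```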

...

Other arguments

Value

TO DO: Add return documentation.

Examples

# Example 1 -- simple fit
set.seed(1234)
X <- data.frame(x1=runif(n=100,0,5), x2=runif(n=100,0,5))
Y1 <- rnorm(100, X$x1 + X$x2, 1)
Y2 <- rnorm(100, X$x1 + X$x2, 3)
Y <- data.frame(Y1 = Y1, Y2 = Y2)
fit <- optWeight(Y = Y, X = X, seed = 1, SL.library = c("SL.glm","SL.mean","SL.step"))

# Example 2 -- simple fit with parallelization
#system.time(
#  fit <- optWeight(Y = Y, X = X, SL.library = c("SL.glm","SL.mean","SL.step"),
#                   parallel = TRUE, n.cores = 3)
#)