3 Running slapnap
To run the slapnap container, we make use of the docker run command. Note that administrator (sudo) privileges are needed to execute this command. Additionally, note that slapnap operates in UTC+0 time; this will be important when inspecting the files generated by slapnap.
Several options must be included in this command to control the behavior of slapnap. These are discussed in separate subsections below.
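Because slapnap works in UTC+0, the date attached to an output file can differ from the local date when a run finishes near midnight. As a small sketch, the standard date utility can show both clocks side by side on the host; the format string here is only an illustration and is not claimed to match the format slapnap uses.

```shell
# Current date and time in UTC, for comparison with slapnap output.
utc_now=$(date -u '+%Y-%m-%d %H:%M')
# Current date and time in the host's local time zone.
local_now=$(date '+%Y-%m-%d %H:%M')
printf 'UTC:   %s\nlocal: %s\n' "$utc_now" "$local_now"
```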
3.1 slapnap run options
The user has control over many aspects of slapnap’s behavior. These options are passed in using the -e option [1]. Semicolon-separated strings are used to set options. For example, to provide input for the option option_name, we would use -e option_name="a;semicolon;separated;string". Note that there are no spaces between the option name and its value and no spaces after semicolons in the separated list. See Section 4 for full syntax.
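As a minimal sketch of this convention, the snippet below joins individual items with semicolons and prints the resulting -e argument. It does not start a container. The helper join_semicolon is hypothetical, and PGT121 is an illustrative second bnAb that is assumed to be an accepted value.

```shell
# Join all arguments with ';' and no spaces, as the option syntax requires.
join_semicolon() {
  old_ifs=$IFS
  IFS=';'
  joined="$*"        # "$*" joins arguments with the first character of IFS
  IFS=$old_ifs
  printf '%s\n' "$joined"
}

nab_value=$(join_semicolon VRC01 PGT121)   # bnAb names are illustrative
nab_arg="nab=$nab_value"                   # no spaces around '='
printf '%s\n' "-e $nab_arg"
```

When the argument is typed directly at a shell prompt, the value should stay quoted as in the text above, because an unquoted semicolon would end the command.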
Each description below lists the default value that is assumed if the option is not specified. Note that many of the default values are chosen simply so that naive calls to slapnap compile quickly. Proper values should be determined based on scientific context.
-e options for slapnap
nab: A semicolon-separated list of bnAbs (default = "VRC01"). A list of possible bnAbs can be found here. If multiple bnAbs are listed, it is assumed that the analysis should be of estimated outcomes for a combination of bnAbs (see Section 5.1 for details on how estimated outcomes for multiple bnAbs are computed).
outcomes: A semicolon-separated string of outcomes to include in the analysis. Possible values are "ic50" (included in default), "ic80", "iip", "sens" (included in default), "estsens", and "multsens". If only a single nab is specified, use sens to include a dichotomous endpoint. If multiple nabs are specified, use estsens and/or multsens. For detailed definitions of outcomes see Section 5.1.
combination_method: A string defining the method to use for predicting combination IC\(_{50}\) and/or IC\(_{80}\). Possible values are "additive" (the default, for the additive model defined in Wagh et al. 2016) or "Bliss-Hill" (for the Bliss-Hill model defined in Wagh et al. 2016).
binary_outcomes: A string defining the measure of neutralization to use for defining binary outcomes. Possible values are "ic50" (the default, for using IC\(_{50}\) to define sensitivity) or "ic80" (for using IC\(_{80}\) to define sensitivity).
sens_thresh: A numeric value defining the neutralization threshold for defining a sensitive versus resistant pseudovirus (default = 1). The dichotomous sensitive/resistant outcome is defined as the indicator that (estimated) IC\(_{50}\) (or IC\(_{80}\), if binary_outcomes="ic80") is greater than or equal to sens_thresh.
multsens_nab: A numeric value used for defining whether a pseudovirus is resistant to a multi-nAb cocktail. Only used if multsens is included in outcomes and more than one nab is requested. The dichotomous outcome multsens is defined as the indicator that a virus has IC\(_{50}\) (or IC\(_{80}\), if binary_outcomes="ic80") greater than sens_thresh for at least multsens_nab nAbs.
learners: A semicolon-separated string of machine learning algorithms to include in the analysis. Possible values include "rf" (random forest, default), "xgboost" (eXtreme gradient boosting), "h2oboost" (gradient boosting using H2O.ai), and "lasso" (elastic net). See Section 5.2 for details on how tuning parameters are chosen. If more than one algorithm is included, then it is assumed that a cross-validation-based ensemble (i.e., a super learner) is desired (see Section 5.3).
cvtune: A boolean string (i.e., either "TRUE" or "FALSE" [default]) indicating whether the learners should be tuned using cross validation and a small grid search. If multiple learners are specified, then the super learner ensemble includes up to three versions of each of the requested learners with different tuning parameters.
cvperf: A boolean string (i.e., either "TRUE" or "FALSE" [default]) indicating whether the learners' performance should be evaluated using cross validation. If cvtune="TRUE" or learners includes multiple algorithms, then nested cross validation is used to evaluate the performance of the cross validation-selected best value of tuning parameters for the specified algorithm or the super learner, respectively.
var_thresh: A numeric string that defines a threshold for pre-screening features. If a single positive number, all binary features with fewer than var_thresh 0’s or 1’s are removed prior to training the specified learner. If several values are included in var_thresh and a single learner is specified, then cross validation is used to select the optimal threshold. If multiple learners are specified, then each learner is included in the super learner with pre-screening based on each value of var_thresh.
nfolds: A numeric string indicating the number of folds to use in cross validation procedures (default = "2").
importance_grp: A semicolon-separated string indicating which group-level variable importance measures should be computed. Possible values are none ("", default), marginal ("marg"), and conditional ("cond"). See Section 5.4.1 for details on these measures.
importance_ind: A semicolon-separated string indicating which individual-level variable importance measures should be computed. Possible values are none ("", default), learner-level ("pred"), marginal ("marg"), and conditional ("cond"). The latter two take significant time to compute. See Sections 5.4.1 and 5.4.2 for details on these measures.
same_subset: If "FALSE" (default), all data available for each outcome will be used in the analysis. If "TRUE", when multiple outcomes are requested, the data will be subset to just those sequences that have all outcomes measured and, if iip is requested, for which iip can be computed (i.e., measured IC\(_{50}\) and IC\(_{80}\) values are different). Thus, if "TRUE", all requested outcomes will be evaluated using the same subset of the CATNAP data.
report_name: A string indicating the desired name of the output report (default = report_[_-separated list of nabs]_[date].html).
return: A semicolon-separated string of the desired output. Possible values are "report" (default), "learner" for a .rds object that contains the algorithm for each endpoint trained using the full analysis data, "data" for the analysis dataset, "figures" for all figures from the report, and "vimp" for variable importance objects.
view_port: A boolean string indicating whether the compiled report should be made viewable on localhost (default "FALSE"). If "TRUE", then the -p option should be used in the docker run command to identify the port. See the example in Section 4.2 for details.
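To make the option list concrete, here is a sketch of how several options might be combined for a two-bnAb analysis that fits a super learner and evaluates it with cross validation. The image name slapnap/slapnap, the second bnAb (PGT121), the helper add_option, and the particular option values are all assumptions chosen for illustration. The script only assembles and prints the command so that it can be inspected before being run with sudo.

```shell
image="slapnap/slapnap"   # assumed image name; check the image you pulled
docker_cmd="docker run"

# Hypothetical helper: append one -e option in the form name="value".
add_option() {
  docker_cmd="$docker_cmd -e $1=\"$2\""
}

add_option nab "VRC01;PGT121"       # two bnAbs: outcomes are estimated for the combination
add_option outcomes "ic50;estsens"  # estsens is one of the dichotomous outcomes for multiple nabs
add_option learners "rf;lasso"      # more than one learner requests a super learner
add_option cvperf "TRUE"            # evaluate performance with cross validation
add_option nfolds "5"               # illustrative value; the documented default is "2"

docker_cmd="$docker_cmd $image"
printf '%s\n' "$docker_cmd"
```

Building the command as text keeps the quotes around each semicolon-separated value, so the printed line can be pasted into a terminal unchanged.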
3.2 Returning output
At the end of a slapnap run, user-specified output will be saved (see option return in Section 3.1). To retrieve these files from the container, there are two options: mounting a local directory (Section 3.2.1) or, if the report is the only desired output, viewing and saving the report in a web browser (Section 3.2.2).
3.2.1 Mounting a local directory
To mount a local directory to the output directory in the container (/home/output/), use the -v option. Any items saved to the output directory in the container will be available in the mounted directory. Conversely, all files in the mounted local directory will be visible to programs running inside the container.
Suppose /path/to/local/dir is the file path on a local computer in which we wish to save the output files from a slapnap run. A docker run of slapnap would include the option -v /path/to/local/dir:/home/output. After a run completes, the requested output should be viewable in /path/to/local/dir. See Section 4 for full syntax.
To avoid possible naming conflicts and file overwrites in the mounted directory, we recommend mounting an empty directory to store the output.
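The mounting step can be sketched as follows. A fresh temporary directory stands in for /path/to/local/dir, the image name slapnap/slapnap is an assumption, and the command is printed rather than executed. Docker bind mounts expect an absolute host path, which mktemp -d provides on typical systems.

```shell
# Stand-in for /path/to/local/dir: a fresh, empty directory.
out_dir=$(mktemp -d)
image="slapnap/slapnap"   # assumed image name

# Warn if the directory already holds files that a run could overwrite.
if [ -n "$(ls -A "$out_dir")" ]; then
  echo "warning: $out_dir is not empty" >&2
fi

# host_path:container_path, with the container side fixed at /home/output.
mount_arg="$out_dir:/home/output"
docker_cmd="docker run -v $mount_arg -e nab=\"VRC01\" $image"
printf '%s\n' "$docker_cmd"
```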
Windows users need to enable shared drives by clicking Settings > Shared Drives in the Docker Desktop Daemon and sharing the drive that contains /path/to/local/dir.
3.2.2 Viewing report in browser
An alternative to mounting a local directory for viewing and downloading the report is to set the view_port option to "TRUE" and open a port to the container via the -p option in the docker run statement. In this case, rather than exiting upon completion of the analysis, the container will continue to run and serve the compiled report on localhost at the specified port (see the example in Section 4.2). The report can then be downloaded directly from the web browser.
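A sketch of such a call is given below. The port numbers, the image name slapnap/slapnap, and the resulting URL are assumptions for illustration; the documented syntax is in Section 4.2. The command is printed rather than executed.

```shell
host_port=80               # assumed port on the local machine
container_port=80          # assumed port on which the container serves the report
image="slapnap/slapnap"    # assumed image name

# -p maps host_port on localhost to container_port inside the container.
docker_cmd="docker run -e nab=\"VRC01\" -e view_port=\"TRUE\" -p $host_port:$container_port $image"
printf '%s\n' "$docker_cmd"
printf 'report URL (assumed): http://localhost:%s\n' "$host_port"
```

Because the container keeps running after the analysis finishes, it has to be stopped by hand once the report has been saved, for example by finding its ID with docker ps and passing that ID to docker stop.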
References
Wagh, Kshitij, Tanmoy Bhattacharya, Carolyn Williamson, Alex Robles, Madeleine Bayne, Jetta Garrity, Michael Rist, et al. 2016. “Optimal Combinations of Broadly Neutralizing Antibodies for Prevention and Treatment of HIV-1 Clade C Infection.” PLoS Pathogens 12 (3). https://doi.org/10.1371/journal.ppat.1005520.
[1] This sets an environment variable in the container environment. These variables are accessed by the various R and bash scripts in the container to dictate how the container executes code.