4 Examples

4.1 Basic call to `slapnap`

A call to slapnap with all default options can be run using the following command.

docker run -v /path/to/local/dir:/home/output slapnap/slapnap

Note that this call mounts the local directory /path/to/local/dir to receive output from the container (see Section 3.2.1).

When this command is executed, messages will print to indicate the progress of the container. The first message reports the name of the log file, which will appear in /path/to/local/dir (the name of the log file is based on the current time in UTC+0). The container then compiles an analytic data set from the CATNAP database for the default bnAb (VRC01), trains the default learner (random forest) for the default outcomes (ic50 and sens), evaluates its performance using two-fold cross validation (the default for nfolds), compiles a report detailing the results, and places the compiled report in /path/to/local/dir (the default name of the report is also based on the current time in UTC+0).
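Because the log and report names are derived from the UTC time, it can help to compare the current UTC time against the newest files in the mounted directory. The following is a minimal sketch; the helper names `utc_now` and `newest_outputs` are hypothetical and not part of slapnap, and the sketch makes no assumption about the exact file name format.

```shell
#!/bin/sh
# Hypothetical helpers (not part of slapnap).

# Print the current time in UTC, for comparison with time-based file names.
utc_now() {
    date -u +"%Y-%m-%d %H:%M:%S"
}

# List the N most recently modified entries in a directory, newest first.
# $1: directory mounted to /home/output; $2: number of entries (default 5).
newest_outputs() {
    out_dir="$1"
    n="${2:-5}"
    ls -t "$out_dir" | head -n "$n"
}
```

For example, running `utc_now` followed by `newest_outputs /path/to/local/dir` after the container finishes should show the log file and report among the most recent entries.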

4.2 Viewing report in browser

To make the results viewable in a web browser, execute the following command².

docker run -v /path/to/local/dir:/home/output \
           -e view_port="TRUE" -p 80:80 \
           slapnap/slapnap

This command opens port 80 on the container. Once the report has been compiled, the container will not close down automatically. Instead, it will continue to run, broadcasting the report to port 80. Open a web browser on your computer and navigate to localhost:80, and you should see the compiled report. Many web browsers allow the report to be downloaded (e.g., by right-clicking and selecting Save As...).

The container will continue to run until you tell it to stop. To do that, retrieve the container ID by executing docker container ps³. Copy the ID of the running container, which will be a string of numbers and letters (say a1b2c3d4) and then execute docker stop a1b2c3d4 to shut down the container.

Note that the docker run command above still mounts a local directory, which may be considered best practice in case output other than the report is desired.
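The lookup-and-stop steps described above can also be scripted. The sketch below assumes that every running container created from the slapnap/slapnap image should be stopped; the function name `stop_slapnap` is hypothetical. It relies on `docker ps -q`, which prints only container IDs, and on the `ancestor` filter, which restricts the listing to containers created from the named image.

```shell
#!/bin/sh
# Hypothetical helper (not part of slapnap): stop every running container
# that was created from the slapnap/slapnap image.
stop_slapnap() {
    # -q prints only container IDs; the ancestor filter selects by image.
    ids=$(docker ps -q --filter ancestor=slapnap/slapnap)
    if [ -z "$ids" ]; then
        echo "no running slapnap containers"
        return 0
    fi
    # IDs contain no whitespace, so word splitting on $ids is safe here.
    for id in $ids; do
        docker stop "$id"
    done
}
```

Calling `stop_slapnap` from a second shell (or from the same shell if the container was started with -d) replaces the manual copy-and-paste of the container ID.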

4.3 Super learning

If multiple learners are specified, then a super learner ensemble (van der Laan, Polley, and Hubbard 2007) is constructed based on the requested learners and a predictor that simply returns the empirical average of the outcome (i.e., ignores all features entirely). In the following command, we construct an ensemble based on a random forest (Breiman 2001) and elastic net (Zou and Hastie 2005). Note that the execution time for super learners can be considerably greater than for single learners because of the extra layer of cross validation needed to construct the ensemble.

docker run -v /path/to/local/dir:/home/output \
           -e learners="rf;lasso" \
           slapnap/slapnap

For specific details on the super learner algorithms implemented in slapnap, see Section 5.3.
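One practical detail of the command above is that the learners value is separated by semicolons, and ; is a command separator in most shells, so the value must stay quoted. A minimal sketch of building that value from a list of learner names follows; the helper name `join_learners` is hypothetical and not part of slapnap.

```shell
#!/bin/sh
# Hypothetical helper (not part of slapnap): join its arguments with ';'
# to form the value of the learners option.
# The function body runs in a subshell, so the IFS change does not leak.
join_learners() (
    IFS=';'
    # "$*" joins the arguments using the first character of IFS.
    printf '%s\n' "$*"
)

# Example of use (kept as a comment so that sourcing this file runs nothing):
#   docker run -v /path/to/local/dir:/home/output \
#              -e learners="$(join_learners rf lasso)" \
#              slapnap/slapnap
```

Keeping the command substitution inside double quotes ensures the shell passes the joined string as a single value rather than interpreting the semicolon.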

4.4 Train an algorithm

The previous commands train learners and evaluate their performance using cross validation. However, at times we may wish only to use slapnap to train a particular algorithm, while avoiding the additional computational time associated with evaluating its cross-validated performance and compiling a report. We show an example of this below using sensitivity as the outcome.

docker run -v /path/to/local/dir:/home/output \
           -e learners="rf" \
           -e return="learner" \
           -e cvperf="FALSE" \
           -e outcomes="sens" \
           slapnap/slapnap

After completion of this run, learner_sens.rds will appear in /path/to/local/dir. This file contains an R object of class ranger (the R package used by slapnap to fit random forests). You can confirm this from the command line by executing

Rscript -e "learner <- readRDS('/path/to/local/dir/learner_sens.rds'); class(learner)"

4.5 Pull and clean data

The slapnap container can also be used to return cleaned CATNAP data suitable for analyses not supported by the slapnap pipeline. In this case, the container avoids training machine learning algorithms and report generation, returning a data set and associated documentation. In the following call, return only includes "data"; thus, options pertaining to the machine learning portions of slapnap are essentially ignored by slapnap. The inputted outcomes are also irrelevant, as all outcomes are included in the resultant data set.

docker run -v /path/to/local/dir:/home/output \
           -e return="data" \
           slapnap/slapnap

Note that the data set returned by slapnap contains the outcomes used by slapnap; in other words, (estimated) IC\(_{50}\), (estimated) IC\(_{80}\), and IIP are all log-transformed (see Section 5.1 for more details).

4.6 Interactive sessions

To simply enter and explore the container, use an interactive session by including -it and overriding the container’s entry point.

docker run -it slapnap/slapnap /bin/bash

This places you in a bash terminal inside the container before any portion of the analysis has been run. This may be useful for exploring the file structure, examining versions of R packages that are included in the container, etc.

To enter the container interactively after the analysis has run, you can execute the following commands. Here we add the -d option to start the container in detached mode.

docker run -d -p 80:80 -e view_port="TRUE" slapnap/slapnap

# ...wait for analysis to finish...

# use this command to enter the container, replacing a1b2c3d4
# with the ID of the running container (from docker container ps)
docker exec -it a1b2c3d4 /bin/bash

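To close the interactive session, type exit at the container command prompt and hit Return. For a session started with docker run -it, this also stops the container. A container started in detached mode and entered with docker exec keeps running after you exit, until you stop it with docker stop.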

References

Breiman, Leo. 2001. “Random Forests.” Machine Learning 45 (1): 5–32. https://doi.org/10.1023/A:1010933404324.

van der Laan, Mark J, Eric C Polley, and Alan E Hubbard. 2007. “Super Learner.” Statistical Applications in Genetics and Molecular Biology 6 (1). https://doi.org/10.2202/1544-6115.1309.

Zou, Hui, and Trevor Hastie. 2005. “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2): 301–20. https://doi.org/10.1111/j.1467-9868.2005.00503.x.

² In this command, we use the escape character \ to break the command over multiple lines, which will work on Linux and macOS. In Windows Command Prompt, the equivalent escape character is ^; in Windows PowerShell, the equivalent escape character is `. In all cases, take care not to include a space after the escape character.

³ To execute this command, you will need to hit control + c to return to the command prompt in the current shell or open a new shell. Alternatively, you could add the -d option to the docker run command, which will run the container in detached mode.