4 Examples
4.1 Basic call to slapnap
A call to slapnap with all default options can be run using the following command. Note that this call mounts the local directory /path/to/local/dir to receive output from the container (see Section 3.2.1).
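A minimal sketch of such a call, assuming the same docker run form as in Section 4.4 with every -e option omitted so that all defaults apply. The DOCKER variable and the slapnap_basic function are hypothetical helpers, not part of slapnap: as written they only print the command, and setting DOCKER=docker in the environment runs it.

```shell
#!/bin/sh
# Hypothetical dry-run switch: prints the command unless DOCKER=docker is set.
DOCKER="${DOCKER:-echo docker}"

slapnap_basic() {
  # $1: local directory mounted at /home/output to receive the log and report
  $DOCKER run -v "$1":/home/output slapnap/slapnap
}

slapnap_basic /path/to/local/dir
```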
When this command is executed, messages are printed to indicate the progress of the container. The first message reports the name of the log file, which will appear in /path/to/local/dir (the name of the log file is based on the current time in UTC). The container then compiles an analytic data set from the CATNAP database for the default bnAb (VRC01), trains the default learner (random forest) for the default outcomes (ic50 and sens), evaluates its performance using two-fold cross validation (the default for nfolds), compiles a report detailing the results, and places the compiled report in /path/to/local/dir (the default name of the report is also based on the current time in UTC).
4.2 Viewing report in browser
To make the results viewable in a web browser, execute the following command [2].
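A sketch of such a call, assuming the view_port option and the 80:80 port mapping that appear in Section 4.6, combined with the mounted output directory. The DOCKER variable and the slapnap_view function are hypothetical helpers for illustration: by default they only print the command, and DOCKER=docker runs it.

```shell
#!/bin/sh
# Hypothetical dry-run switch: prints the command unless DOCKER=docker is set.
DOCKER="${DOCKER:-echo docker}"

slapnap_view() {
  # $1: local directory mounted to receive output
  # view_port="TRUE" keeps the container alive to serve the report;
  # -p 80:80 maps container port 80 to port 80 on the host.
  $DOCKER run -v "$1":/home/output \
    -e view_port="TRUE" \
    -p 80:80 \
    slapnap/slapnap
}

slapnap_view /path/to/local/dir
```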
This command opens port 80 on the container. Once the report has been compiled, the container does not shut down automatically; instead, it continues to run, serving the report on port 80. Open a web browser on your computer and navigate to localhost:80 to see the compiled report. Many web browsers allow the report to be downloaded (e.g., by right-clicking and selecting Save As...).
The container will continue to run until you tell it to stop. To do that, retrieve the container ID by executing docker container ps [3]. Copy the ID of the running container, which will be a string of numbers and letters (say a1b2c3d4), and then execute docker stop a1b2c3d4 to shut down the container.
Note that the above command still mounts a local directory; this may be considered best practice in case output other than the report is needed.
4.3 Super learning
If multiple learners are specified, then a super learner ensemble (van der Laan, Polley, and Hubbard 2007) is constructed based on the requested learners and a predictor that simply returns the empirical average of the outcome (i.e., ignores all features entirely). In the following command, we construct an ensemble based on a random forest (Breiman 2001) and elastic net (Zou and Hastie 2005). Note that the execution time for super learners can be considerably greater than for single learners because of the extra layer of cross validation needed to construct the ensemble.
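A sketch of such a call follows. It rests on two assumptions that should be checked against the description of the learners option: that multiple learners are given as a semicolon-separated string, and that the elastic net is requested with the name "lasso" (only "rf" is confirmed by Section 4.4). The DOCKER variable and the slapnap_ensemble function are hypothetical helpers that print the command by default; DOCKER=docker runs it.

```shell
#!/bin/sh
# Hypothetical dry-run switch: prints the command unless DOCKER=docker is set.
DOCKER="${DOCKER:-echo docker}"

slapnap_ensemble() {
  # $1: local output directory
  # $2: learner list (assumed to be semicolon-separated)
  $DOCKER run -v "$1":/home/output \
    -e learners="$2" \
    slapnap/slapnap
}

# "rf" appears in Section 4.4; "lasso" as the elastic net name is an assumption.
slapnap_ensemble /path/to/local/dir "rf;lasso"
```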
For specific details on the super learner algorithms implemented in slapnap, see Section 5.3.
4.4 Train an algorithm
The previous commands train learners and evaluate their performance using cross validation. However, at times we may wish to use slapnap only to train a particular algorithm, avoiding the additional computational time associated with evaluating its cross-validated performance and compiling a report. We show an example of this below using sensitivity as the outcome.
docker run -v /path/to/local/dir:/home/output \
-e learners="rf" \
-e return="learner" \
-e cvperf="FALSE" \
-e outcomes="sens" \
slapnap/slapnap
After completion of this run, the file learner_sens.rds will appear in /path/to/local/dir. It contains an R object of class ranger (the R package used by slapnap to fit random forests). You can confirm this from the command line by loading the file in R and checking the class of the object.
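One way to run this check is sketched below, assuming R (and therefore Rscript) is installed on the local machine. The check_learner_class function is a hypothetical helper: it verifies that the file exists and then uses the base R functions readRDS() and class() to report the class of the saved object.

```shell
#!/bin/sh
check_learner_class() {
  # $1: path to the .rds file written by the container
  if [ ! -f "$1" ]; then
    echo "file not found: $1" >&2
    return 1
  fi
  # readRDS() loads the saved object; class() reports its class
  Rscript -e "learner <- readRDS('$1'); class(learner)"
}

check_learner_class /path/to/local/dir/learner_sens.rds ||
  echo "could not inspect the learner"
```

For the run above, the reported class should include ranger.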
4.5 Pull and clean data
The slapnap container can also be used to return cleaned CATNAP data suitable for analyses not supported by the slapnap pipeline. In this case, the container skips training machine learning algorithms and generating a report, returning only a data set and associated documentation. In the following call, return includes only "data"; thus, options pertaining to the machine learning portions of slapnap are essentially ignored. The specified outcomes are also irrelevant, as all outcomes are included in the resulting data set.
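A sketch of such a call, assuming the same docker run form as in Section 4.4 with return set to "data" and no other options. The DOCKER variable and the slapnap_data function are hypothetical helpers that print the command by default; DOCKER=docker runs it.

```shell
#!/bin/sh
# Hypothetical dry-run switch: prints the command unless DOCKER=docker is set.
DOCKER="${DOCKER:-echo docker}"

slapnap_data() {
  # $1: local directory that receives the data set and its documentation
  $DOCKER run -v "$1":/home/output \
    -e return="data" \
    slapnap/slapnap
}

slapnap_data /path/to/local/dir
```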
Note that the data set returned by slapnap contains the outcomes as used by slapnap; in other words, (estimated) IC\(_{50}\), (estimated) IC\(_{80}\), and IIP are all log-transformed (see Section 5.1 for more details).
4.6 Interactive sessions
To simply enter and explore the container, start an interactive session by including -it and overriding the container’s entry point.
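One way to do this is sketched below using the --entrypoint flag of docker run, which replaces whatever start-up program the image defines. This is an assumption about how the image is configured; if the image starts its analysis through a default command rather than an entry point, appending /bin/bash after the image name has the same effect. The DOCKER variable and the slapnap_shell function are hypothetical helpers that print the command by default; DOCKER=docker runs it.

```shell
#!/bin/sh
# Hypothetical dry-run switch: prints the command unless DOCKER=docker is set.
DOCKER="${DOCKER:-echo docker}"

slapnap_shell() {
  # -it attaches an interactive terminal;
  # --entrypoint /bin/bash starts bash instead of the analysis.
  $DOCKER run -it --entrypoint /bin/bash slapnap/slapnap
}

slapnap_shell
```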
This opens a bash terminal inside the container before any portion of the analysis has run. This may be useful for exploring the file structure, examining the versions of the R packages included in the container, etc.
To enter the container interactively after the analysis has run, you can execute the following commands. Here we add the -d option to start the container in detached mode.
docker run -d -p 80:80 -e view_port="TRUE" slapnap/slapnap
# ...wait for analysis to finish...
# use this command to enter the container, replacing a1b2c3d4
# with the ID reported by docker container ps
docker exec -it a1b2c3d4 /bin/bash
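To close the interactive session, type exit at the container command prompt and hit Return. If the session was started with docker run -it, this also stops the container; a container entered with docker exec keeps running until it is stopped with docker stop.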
References
Breiman, Leo. 2001. “Random Forests.” Machine Learning 45 (1): 5–32. https://doi.org/10.1023/A:1010933404324.
van der Laan, Mark J, Eric C Polley, and Alan E Hubbard. 2007. “Super Learner.” Statistical Applications in Genetics and Molecular Biology 6 (1). https://doi.org/10.2202/1544-6115.1309.
Zou, Hui, and Trevor Hastie. 2005. “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2): 301–20. https://doi.org/10.1111/j.1467-9868.2005.00503.x.
[2] In this command, we use the escape character \ to break the command over multiple lines, which will work on Linux and Mac OS. In Windows Command Prompt, the equivalent escape character is ^; in Windows PowerShell, the equivalent escape character is `. In all cases, take care not to include a space after the escape character.
[3] To execute this command, you will need to hit control + c to return to the command prompt in the current shell or open a new shell. Alternatively, you could add the -d option to the docker run command, which will run the container in detached mode.