1 Executive summary

The broadly neutralizing antibody (bNAb) studied in this analysis is 10-1074. The analysis considered one measure of neutralization sensitivity, a binary outcome referred to as sensitivity and defined as the indicator that IC\(_{50}\) < 50. The results below are based on this specification of bNAb and outcome.

Prediction of the outcome was performed using a super learner ensemble (van der Laan, Polley, and Hubbard 2007) of several random forests (Breiman 2001), several gradient boosted trees (Chen and Guestrin 2016), and several elastic net regressions (Zou and Hastie 2005), each with varied tuning parameters, together with an intercept-only regression.

The specific algorithms used in the learning process are described in Table 1.1.

Table 1.1: Algorithms used in the super learner library
Label Description
rf_tune1 random forest with mtry equal to one-half times square root of number of predictors
rf_default random forest with mtry equal to square root of number of predictors
rf_tune2 random forest with mtry equal to two times square root of number of predictors
xgboost_default boosted regression trees with maximum depth of 4
xgboost_tune3 boosted regression trees with maximum depth of 8
xgboost_tune4 boosted regression trees with maximum depth of 12
lasso_default elastic net with \(\lambda\) selected by CV and \(\alpha\) equal to 0
lasso_tune1 elastic net with \(\lambda\) selected by 5-fold CV and \(\alpha\) equal to 0.25
lasso_tune2 elastic net with \(\lambda\) selected by 5-fold CV and \(\alpha\) equal to 0.5
lasso_tune3 elastic net with \(\lambda\) selected by 5-fold CV and \(\alpha\) equal to 0.75
mean intercept only regression
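
As a rough illustration of the three random forest settings in Table 1.1, the sketch below computes mtry for a given number of predictors. The function name `mtry_values`, the rounding rule, and the example value of 400 predictors are assumptions made for illustration; the report does not state how fractional values are rounded.

```python
import math


def mtry_values(n_predictors):
    """Return mtry for the three random forest variants listed in Table 1.1.

    Rounding down (with a minimum of 1) is an assumption; the report does not
    say how fractional values are handled.
    """
    root = math.sqrt(n_predictors)
    multipliers = {"rf_tune1": 0.5, "rf_default": 1.0, "rf_tune2": 2.0}
    return {
        label: max(1, math.floor(multiplier * root))
        for label, multiplier in multipliers.items()
    }


# Hypothetical number of predictors, chosen only to make the arithmetic easy.
print(mtry_values(400))
```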

The predictive ability of the learner was assessed using cross-validation. The estimated cross-validated area under the receiver operating characteristic curve (AUC) of the learner for predicting sensitivity is shown in Table 1.2.

Table 1.2: Estimates of 5-fold cross-validated AUC for predictions of sensitivity (n = 581).
CV-AUC Lower 95% CI Upper 95% CI
Sensitivity 0.938 0.865 0.973
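
The following is a minimal sketch of how a cross-validated AUC point estimate could be computed from held-out predictions. It assumes that the fold-specific AUCs are averaged, which the report does not state explicitly, and it does not reproduce the confidence interval calculation. The function names and the toy data are hypothetical.

```python
def auc(labels, scores):
    """AUC as the proportion of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cv_auc(labels, scores, folds):
    """Average the AUC computed separately within each validation fold."""
    per_fold = []
    for fold in sorted(set(folds)):
        idx = [i for i, f in enumerate(folds) if f == fold]
        per_fold.append(auc([labels[i] for i in idx], [scores[i] for i in idx]))
    return sum(per_fold) / len(per_fold)


# Toy held-out predictions from two folds (not data from this analysis).
labels = [1, 0, 1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.4, 0.5, 0.1]
folds = [1, 1, 1, 1, 2, 2, 2, 2]
print(cv_auc(labels, scores, folds))
```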

We define the marginal biological importance of a subgroup of features as the difference in population predictiveness between the best possible prediction function based on the features under consideration plus geographic confounders versus only geographic confounders (Williamson et al. 2020). In Table 1.3, we display the groups of variables and their ranked marginal biological variable importance for predicting sensitivity. For variable group definitions, please refer to Table 3.1.

Table 1.3: Ranked marginal variable importance of groups relative to the group of geographic confounders for predicting sensitivity. Importance is measured via AUC for sensitivity. Stars next to ranks denote groups with p-value less than 0.05 from a hypothesis test with null hypothesis of zero importance. (n = 581; for estimating the prediction functions based on the feature group of interest, n = 291; for estimating the prediction functions based on the group of geographic confounders, n = 290)
Variable group Sensitivity
gp120 V3 1*
gp120 CD4 binding sites 2
gp120 V2 3
gp41 MPER 4
Region-specific counts of PNG sites 5
Cysteine counts 6
Viral geometry 7
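
To make the definition of marginal importance concrete, the sketch below computes only the point estimate: the AUC of predictions that use the feature group plus geographic confounders, minus the AUC of predictions that use geographic confounders alone, with each AUC evaluated on its own portion of the data. The hypothesis test and confidence intervals of Williamson et al. (2020) are not reproduced here, and the function names and toy predictions are hypothetical.

```python
def auc(labels, scores):
    """Proportion of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def marginal_importance(full_labels, full_scores, reduced_labels, reduced_scores):
    """Difference in AUC between the full and the reduced prediction functions.

    'full' uses the feature group plus geographic confounders;
    'reduced' uses geographic confounders only.
    """
    return auc(full_labels, full_scores) - auc(reduced_labels, reduced_scores)


# Toy predictions on two separate halves of a sample (illustration only).
full_labels, full_scores = [1, 1, 0, 0], [0.9, 0.7, 0.3, 0.2]
reduced_labels, reduced_scores = [1, 0, 1, 0], [0.6, 0.5, 0.4, 0.3]
print(marginal_importance(full_labels, full_scores, reduced_labels, reduced_scores))
```

A value near zero would suggest that the feature group adds little beyond the geographic confounders.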

2 Results for sensitivity

2.1 Descriptive statistics

Out of the sequences with complete data, 440 were estimated to be sensitive to the bNAb, while 141 were estimated to be resistant, where sensitivity was defined as the indicator that IC\(_{50}\) was less than 50.
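
The outcome definition can be sketched as a simple threshold rule. The helper name `is_sensitive` and the example IC\(_{50}\) values are hypothetical; only the threshold of 50 and the counts 440 and 141 come from the text above.

```python
def is_sensitive(ic50, threshold=50.0):
    """Binary sensitivity indicator: True when IC50 is strictly below the threshold."""
    return ic50 < threshold


# Hypothetical IC50 values, for illustration only (not data from this analysis).
example_ic50 = [0.03, 1.2, 49.9, 50.0, 75.0]
print([is_sensitive(value) for value in example_ic50])

# Class balance implied by the counts reported above.
n_sensitive, n_resistant = 440, 141
n_total = n_sensitive + n_resistant
print(n_total, round(n_sensitive / n_total, 3))
```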

2.2 Super learner results

The weights assigned to each algorithm by the super learner for predicting sensitivity are shown in Table 2.1.

Table 2.1: Table of super learner weights for sensitivity (n = 581 observations).
Learner Weight
rf_tune1 0.00
rf_default 0.00
rf_tune2 0.00
xgboost_default 0.42
xgboost_tune3 0.00
xgboost_tune4 0.58
lasso_default 0.00
lasso_tune1 0.00
lasso_tune2 0.00
lasso_tune3 0.00
mean 0.00
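
As an illustration of how the weights in Table 2.1 could be used, the sketch below forms an ensemble prediction as a weighted sum of the individual learners' predicted probabilities. Combining on the probability scale is an assumption, since the report does not state the scale on which the super learner combines its candidates, and the per-learner predictions shown are hypothetical.

```python
# Weights from Table 2.1; learners with weight 0.00 are omitted.
weights = {"xgboost_default": 0.42, "xgboost_tune4": 0.58}


def ensemble_prediction(learner_predictions, weights):
    """Weighted sum of per-learner predicted probabilities for one sequence."""
    return sum(weights[name] * learner_predictions[name] for name in weights)


# Hypothetical predicted probabilities for a single sequence.
learner_predictions = {"xgboost_default": 0.80, "xgboost_tune4": 0.90}
print(ensemble_prediction(learner_predictions, weights))
```

Because the two nonzero weights sum to one, the ensemble prediction always lies between the two learners' predictions.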

2.3 Predictive performance

The cross-validated area under the ROC curve of super learner predictions of sensitivity relative to candidate algorithms is shown in Figure 2.1. Figure 2.2 shows cross-validated ROC curves for this endpoint.

The cross-validated area under the ROC curve of the learner with tuning parameters selected via cross-validation and learners with each individual value of tuning parameters are shown in Figure 2.2.

Figure 2.1: Cross-validated AUC for predicting sensitivity (n = 581 observations).

Figure 2.2 shows the cross-validated ROC curve for predicting sensitivity.

Figure 2.2: Cross-validated ROC curve for the super learner, discrete super learner, and single best performing algorithm for predicting sensitivity (n = 581 observations).

Figure 2.3: Cross-validated predicted probabilities of resistance made by super learner, discrete super learner, and single best performing algorithm colored by cross-validation fold (n = 581 observations).

2.4 Variable importance

2.4.1 Biological importance

We show the biological variable importance of groups of features (defined in Table 3.1) in predicting sensitivity in Figure 2.4. Importance is defined using the difference in AUCs. The plot shows the marginal biological importance of the group relative to the null model with geographic confounders only.

Figure 2.4: Group biological variable importance for predicting sensitivity. 95% confidence intervals and stars denoting p-values less than 0.05 are displayed in blue. (n = 581; for estimating the prediction function based on geographic confounders only, n = 291; for estimating the prediction function based on the feature group of interest plus geographic confounders, n = 290)

2.4.2 Predictive importance

Table 2.2 shows the top 20 features in terms of their predictive importance. Specifically, the algorithm with the largest weight in the super learner ensemble was selected, and the variable importance metrics associated with this algorithm are shown. In this case, the highest weight was assigned to an xgboost algorithm, so xgboost gain importance measures were computed and features are shown by their rank. Gain measures the improvement in accuracy brought by a given feature to the tree branches on which it appears. The essential idea is that before a split on a given feature is added to a branch, some observations may be poorly predicted, whereas after the split is added, each resultant branch is more accurate. Gain measures this change in accuracy.
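
For reference, the gain of a single candidate split can be written as in Chen and Guestrin (2016). The notation here is not used elsewhere in this report: \(G_L\) and \(G_R\) denote the sums of first-order gradients of the loss over the observations in the left and right child nodes, \(H_L\) and \(H_R\) the corresponding sums of second-order gradients, and \(\lambda\) and \(\gamma\) are xgboost regularization parameters (this \(\lambda\) is distinct from the elastic net \(\lambda\) in Table 1.1).

```latex
\[
\text{Gain} = \frac{1}{2}\left[
  \frac{G_L^2}{H_L + \lambda}
  + \frac{G_R^2}{H_R + \lambda}
  - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda}
\right] - \gamma
\]
```

A feature's gain importance aggregates this quantity over the splits that use the feature; the exact aggregation used to produce the ranking in Table 2.2 is not stated in this report.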

Table 2.2: The top 20 important features for predicting sensitivity as measured by their algorithm-specific importance.
Feature
hxb2.334.S.1mer
hxb2.492.E.1mer
hxb2.332.N.1mer
hxb2.325.D.1mer
hxb2.800.L.1mer
hxb2.816.N.1mer
hxb2.30.T.1mer
length.v2
hxb2.778.A.1mer
hxb2.440.A.1mer
hxb2.704.I.1mer
hxb2.293.E.1mer
hxb2.507.E.1mer
geographic.region.of.origin.is.S.Africa
hxb2.300.N.1mer
hxb2.316.A.1mer
hxb2.837.C.1mer
hxb2.818.T.1mer
hxb2.26.I.1mer
hxb2.178.K.1mer

3 Variable group definitions

Table 3.1 provides the individual HXB2 coordinates and variable names of the variables that make up each of the variable groups considered for biological importance.

Table 3.1: Individual variables within each variable group. Numeric codes followed by a single letter denote the presence of an amino acid (AA) residue at a given site (relative to HXB2). Other suffixes are: ‘sequon_actual’, referring to the site having a leading AA for the canonical N-linked glycosylation motif (N[!P][S/T]; in other words, this AA will be an N, and the following two AAs will conform to the motif); ‘gap’, referring to an observed gap at this site after alignment to maintain site-specific relevance; and ‘frameshift’, referring to a gap at this site that resulted in a frameshift. The prefix ‘num’ denotes the number (e.g., ‘num.cysteines’ refers to the number of cysteines), while the prefix ‘length’ denotes the length of the specified region (excluding gaps and frameshifts).
Variable group Variables
gp120_cd4bs 61.F, 61.H, 61.I, 61.L, 61.Q, 61.T, 61.V, 61.Y, 62.A, 62.D, 62.E, 62.G, 62.H, 62.I, 62.K, 62.M, 62.N, 62.R, 62.S, 62.T, 62.V, 62.Y, 120.I, 120.T, 120.V, 124.F, 124.H, 124.I, 124.P, 124.Y, 125.F, 125.I, 125.L, 125.M, 127.I, 127.V, 182.A, 182.E, 182.H, 182.I, 182.K, 182.L, 182.M, 182.N, 182.Q, 182.S, 182.T, 182.V, 197.D, 197.I, 197.K, 197.N, 197.R, 197.T, 198.A, 198.I, 198.S, 198.T, 198.V, 204.A, 204.E, 204.S, 204.T, 204.V, 206.P, 206.S, 206.T, 209.N, 209.S, 209.T, 274.A, 274.C, 274.F, 274.G, 274.S, 274.T, 274.V, 274.gap, 276.D, 276.E, 276.H, 276.K, 276.N, 276.S, 276.gap, 279.A, 279.C, 279.D, 279.E, 279.K, 279.N, 279.Q, 279.S, 280.D, 280.N, 280.S, 280.T, 281.A, 281.E, 281.G, 281.H, 281.I, 281.R, 281.S, 281.T, 281.V, 282.E, 282.G, 282.H, 282.K, 282.N, 282.P, 282.Q, 282.R, 282.S, 282.Y, 283.A, 283.I, 283.N, 283.P, 283.S, 283.T, 283.V, 304.G, 304.I, 304.K, 304.L, 304.R, 304.S, 304.V, 318.F, 318.H, 318.N, 318.Q, 318.R, 318.S, 318.V, 318.W, 318.Y, 326.A, 326.I, 326.M, 326.P, 326.S, 326.T, 362.A, 362.C, 362.D, 362.E, 362.F, 362.G, 362.K, 362.M, 362.N, 362.Q, 362.R, 362.S, 362.T, 362.V, 362.gap, 363.A, 363.E, 363.G, 363.H, 363.I, 363.K, 363.M, 363.N, 363.P, 363.Q, 363.R, 363.S, 363.T, 363.V, 365.A, 365.I, 365.L, 365.N, 365.P, 365.R, 365.S, 365.T, 365.V, 366.E, 366.G, 367.G, 367.S, 369.A, 369.E, 369.I, 369.L, 369.P, 369.Q, 369.S, 369.T, 369.V, 374.F, 374.H, 374.L, 386.D, 386.K, 386.N, 386.S, 386.T, 386.Y, 392.D, 392.E, 392.F, 392.H, 392.I, 392.K, 392.L, 392.N, 392.P, 392.S, 392.T, 392.Y, 392.gap, 425.N, 425.R, 426.A, 426.I, 426.K, 426.L, 426.M, 426.R, 426.S, 426.T, 426.V, 427.L, 427.W, 428.I, 428.M, 428.Q, 428.T, 429.A, 429.D, 429.E, 429.G, 429.K, 429.Q, 429.R, 429.S, 429.T, 430.A, 430.G, 430.I, 430.Q, 430.S, 430.T, 430.V, 431.A, 431.E, 431.G, 431.R, 431.V, 432.I, 432.K, 432.L, 432.Q, 432.R, 432.S, 455.A, 455.D, 455.E, 455.I, 455.L, 455.Q, 455.S, 455.T, 455.V, 456.H, 456.L, 456.M, 456.N, 456.R, 456.S, 456.V, 456.W, 456.Y, 457.A, 457.D, 457.N, 457.S, 458.A, 458.D, 
458.E, 458.G, 458.K, 458.N, 458.Q, 458.S, 458.T, 458.Y, 459.D, 459.E, 459.G, 459.I, 459.N, 459.P, 459.S, 459.T, 459.V, 459.gap, 460.A, 460.C, 460.D, 460.E, 460.G, 460.I, 460.K, 460.L, 460.N, 460.P, 460.Q, 460.R, 460.S, 460.T, 460.V, 460.W, 460.gap, 461.A, 461.D, 461.E, 461.F, 461.G, 461.H, 461.I, 461.K, 461.L, 461.M, 461.N, 461.P, 461.Q, 461.R, 461.S, 461.T, 461.V, 461.Y, 461.gap, 462.A, 462.D, 462.E, 462.G, 462.H, 462.I, 462.K, 462.L, 462.M, 462.N, 462.P, 462.Q, 462.R, 462.S, 462.T, 462.V, 462.Y, 462.gap, 463.A, 463.C, 463.D, 463.E, 463.G, 463.H, 463.I, 463.K, 463.M, 463.N, 463.P, 463.R, 463.S, 463.T, 463.V, 463.Y, 463.gap, 469.K, 469.R, 469.S, 469.Y, 469.gap, 471.A, 471.E, 471.G, 471.I, 471.L, 471.Q, 471.S, 471.T, 471.V, 474.D, 474.E, 474.N, 474.Y, 475.I, 475.M, 475.V, 476.G, 476.K, 476.M, 476.Q, 476.R, 476.T, 476.V, 477.D, 477.N, 197.sequon_actual, 276.sequon_actual, 363.sequon_actual, 386.sequon_actual, 392.sequon_actual, 460.sequon_actual, 461.sequon_actual, 462.sequon_actual, 463.sequon_actual
gp120_v2 121.E, 121.K, 121.M, 121.Q, 121.R, 124.F, 124.H, 124.I, 124.P, 124.Y, 127.I, 127.V, 158.D, 158.E, 158.S, 158.T, 159.D, 159.F, 159.L, 159.Y, 160.D, 160.H, 160.I, 160.K, 160.N, 160.R, 160.S, 160.V, 160.Y, 160.gap, 161.A, 161.I, 161.L, 161.M, 161.S, 161.T, 161.V, 161.gap, 162.A, 162.H, 162.I, 162.N, 162.P, 162.Q, 162.S, 162.T, 162.gap, 163.A, 163.G, 163.K, 163.P, 163.R, 163.S, 163.T, 163.gap, 164.A, 164.D, 164.E, 164.F, 164.G, 164.I, 164.L, 164.M, 164.N, 164.P, 164.Q, 164.R, 164.S, 164.T, 164.V, 164.gap, 165.G, 165.I, 165.L, 165.M, 165.P, 165.Q, 165.R, 165.S, 165.T, 165.V, 166.A, 166.D, 166.G, 166.H, 166.I, 166.K, 166.M, 166.N, 166.Q, 166.R, 166.S, 166.T, 166.W, 167.D, 167.G, 167.K, 167.N, 167.P, 167.Q, 167.R, 167.T, 168.E, 168.I, 168.K, 168.L, 168.R, 168.S, 168.gap, 169.A, 169.E, 169.G, 169.H, 169.I, 169.K, 169.M, 169.N, 169.P, 169.Q, 169.R, 169.S, 169.T, 169.V, 169.W, 169.Y, 169.gap, 170.C, 170.E, 170.H, 170.K, 170.N, 170.Q, 170.R, 170.S, 170.T, 170.gap, 171.D, 171.E, 171.G, 171.H, 171.K, 171.L, 171.M, 171.N, 171.P, 171.Q, 171.R, 171.S, 171.T, 171.V, 171.gap, 172.A, 172.D, 172.E, 172.G, 172.I, 172.K, 172.M, 172.Q, 172.R, 172.T, 172.V, 172.Y, 173.A, 173.D, 173.E, 173.F, 173.G, 173.H, 173.N, 173.Q, 173.R, 173.S, 173.T, 173.Y, 174.A, 174.D, 174.S, 174.T, 174.V, 175.F, 175.H, 175.I, 175.L, 175.N, 175.Q, 175.S, 175.T, 175.V, 175.Y, 176.F, 176.L, 177.D, 177.F, 177.H, 177.N, 177.Y, 178.A, 178.D, 178.E, 178.I, 178.K, 178.L, 178.N, 178.R, 178.S, 178.T, 178.V, 179.A, 179.E, 179.I, 179.L, 179.M, 179.P, 179.Q, 179.R, 179.S, 179.T, 179.V, 179.Y, 181.I, 181.L, 181.T, 181.V, 182.A, 182.E, 182.H, 182.I, 182.K, 182.L, 182.M, 182.N, 182.Q, 182.S, 182.T, 182.V, 183.A, 183.E, 183.H, 183.K, 183.L, 183.N, 183.P, 183.Q, 183.S, 184.A, 184.F, 184.I, 184.L, 184.M, 184.N, 184.S, 184.T, 184.V, 184.gap, 185.A, 185.D, 185.E, 185.F, 185.G, 185.H, 185.I, 185.K, 185.L, 185.N, 185.Q, 185.R, 185.S, 185.T, 185.V, 185.Y, 185.gap, 186.A, 186.D, 186.E, 186.G, 186.H, 186.I, 186.K, 186.L, 186.N, 
186.P, 186.Q, 186.R, 186.S, 186.T, 186.V, 186.gap, 187.A, 187.C, 187.D, 187.E, 187.G, 187.H, 187.K, 187.N, 187.Q, 187.R, 187.S, 187.T, 187.Y, 187.gap, 188.A, 188.D, 188.E, 188.F, 188.G, 188.H, 188.I, 188.K, 188.N, 188.P, 188.Q, 188.R, 188.S, 188.T, 188.V, 188.W, 188.Y, 188.gap, 189.A, 189.D, 189.E, 189.G, 189.H, 189.I, 189.K, 189.L, 189.M, 189.N, 189.P, 189.Q, 189.R, 189.S, 189.T, 189.Y, 189.gap, 190.A, 190.D, 190.E, 190.F, 190.G, 190.I, 190.K, 190.L, 190.M, 190.N, 190.P, 190.Q, 190.R, 190.S, 190.T, 190.V, 190.Y, 191.F, 191.H, 191.S, 191.Y, 192.G, 192.I, 192.K, 192.M, 192.R, 192.S, 192.T, 192.V, 193.F, 193.L, 193.M, 193.P, 194.I, 194.K, 194.L, 194.M, 194.R, 194.T, 194.V, 195.D, 195.H, 195.K, 195.N, 195.Q, 195.S, 195.T, 197.D, 197.I, 197.K, 197.N, 197.R, 197.T, 202.A, 202.K, 202.R, 202.S, 202.T, 203.K, 203.Q, 203.R, 312.A, 312.G, 315.A, 315.G, 315.K, 315.M, 315.Q, 315.R, 315.S, 315.T, 315.V, 160.sequon_actual, 171.sequon_actual, 173.sequon_actual, 185.sequon_actual, 186.sequon_actual, 187.sequon_actual, 188.sequon_actual, 189.sequon_actual, 197.sequon_actual
gp120_v3 296.C, 296.R, 297.A, 297.E, 297.I, 297.K, 297.L, 297.M, 297.N, 297.Q, 297.R, 297.S, 297.T, 297.V, 299.E, 299.F, 299.H, 299.L, 299.N, 299.P, 299.T, 299.V, 300.A, 300.D, 300.F, 300.G, 300.H, 300.N, 300.Q, 300.S, 300.T, 300.W, 300.Y, 301.D, 301.E, 301.H, 301.K, 301.N, 301.R, 301.T, 301.V, 301.Y, 301.gap, 302.G, 302.H, 302.K, 302.L, 302.N, 302.Q, 303.E, 303.I, 303.K, 303.Q, 303.R, 303.S, 303.T, 303.V, 304.G, 304.I, 304.K, 304.L, 304.R, 304.S, 304.V, 305.D, 305.E, 305.G, 305.H, 305.I, 305.K, 305.N, 305.Q, 305.R, 305.T, 305.Y, 306.D, 306.E, 306.G, 306.K, 306.Q, 306.R, 306.S, 306.gap, 307.A, 307.E, 307.F, 307.H, 307.I, 307.L, 307.M, 307.T, 307.V, 307.Y, 308.A, 308.G, 308.H, 308.K, 308.N, 308.P, 308.Q, 308.R, 308.S, 308.T, 308.W, 309.F, 309.I, 309.L, 309.M, 309.R, 309.T, 309.V, 310.Q, 310.gap, 311.R, 311.gap, 312.A, 312.G, 313.A, 313.L, 313.P, 313.Q, 313.S, 313.T, 313.V, 314.A, 314.G, 314.M, 314.P, 315.A, 315.G, 315.K, 315.M, 315.Q, 315.R, 315.S, 315.T, 315.V, 316.A, 316.E, 316.G, 316.I, 316.M, 316.R, 316.S, 316.T, 316.V, 316.W, 317.F, 317.I, 317.L, 317.M, 317.R, 317.S, 317.V, 317.W, 317.X, 317.Y, 318.F, 318.H, 318.N, 318.Q, 318.R, 318.S, 318.V, 318.W, 318.Y, 319.A, 319.G, 319.I, 319.K, 319.L, 319.M, 319.N, 319.R, 319.S, 319.T, 319.V, 319.Y, 319.gap, 320.A, 320.E, 320.G, 320.H, 320.I, 320.K, 320.M, 320.N, 320.P, 320.Q, 320.R, 320.S, 320.T, 320.W, 320.Y, 320.gap, 321.A, 321.D, 321.E, 321.F, 321.G, 321.H, 321.I, 321.K, 321.L, 321.N, 321.R, 321.S, 321.T, 321.V, 321.Y, 321.gap, 322.E, 322.G, 322.I, 322.K, 322.N, 322.Q, 322.T, 322.V, 322.Y, 322.gap, 323.D, 323.G, 323.I, 323.K, 323.M, 323.N, 323.Q, 323.S, 323.T, 323.V, 323.gap, 324.E, 324.G, 324.L, 324.N, 324.P, 324.S, 324.T, 325.D, 325.E, 325.G, 325.I, 325.K, 325.N, 325.Q, 325.R, 325.S, 325.T, 326.A, 326.I, 326.M, 326.P, 326.S, 326.T, 327.G, 327.K, 327.R, 328.A, 328.D, 328.E, 328.G, 328.I, 328.K, 328.L, 328.M, 328.N, 328.Q, 328.R, 328.S, 328.V, 329.A, 329.V, 330.F, 330.H, 330.N, 330.Q, 330.R, 330.S, 330.Y, 332.D, 
332.E, 332.H, 332.I, 332.K, 332.N, 332.Q, 332.R, 332.S, 332.T, 332.V, 333.I, 333.L, 333.V, 333.Y, 334.A, 334.D, 334.E, 334.F, 334.G, 334.I, 334.K, 334.N, 334.R, 334.S, 334.T, 334.Y, 334.gap, 301.sequon_actual, 302.sequon_actual, 322.sequon_actual, 323.sequon_actual, 324.sequon_actual, 330.sequon_actual, 334.sequon_actual
gp41_mper 609.A, 609.F, 609.H, 609.K, 609.L, 609.P, 609.Q, 609.R, 609.S, 609.Y, 657.E, 657.K, 657.V, 658.E, 658.H, 658.K, 658.L, 658.N, 658.Q, 658.R, 659.A, 659.D, 659.E, 659.K, 659.N, 659.R, 659.S, 661.F, 661.L, 661.S, 662.A, 662.E, 662.K, 662.Q, 662.S, 662.T, 663.L, 663.M, 663.W, 664.D, 664.E, 664.G, 664.N, 664.S, 665.E, 665.K, 665.N, 665.Q, 665.R, 665.S, 665.T, 667.A, 667.D, 667.E, 667.G, 667.K, 667.N, 667.Q, 667.S, 667.T, 668.D, 668.F, 668.G, 668.H, 668.N, 668.Q, 668.S, 668.T, 669.I, 669.L, 671.D, 671.G, 671.K, 671.N, 671.S, 671.T, 672.L, 672.W, 673.F, 673.L, 673.S, 674.A, 674.D, 674.E, 674.G, 674.K, 674.N, 674.S, 674.T, 674.Y, 675.I, 675.L, 675.M, 676.A, 676.S, 676.T, 676.V, 677.E, 677.H, 677.K, 677.N, 677.Q, 677.R, 677.S, 680.G, 680.W, 681.D, 681.H, 681.S, 681.Y, 682.I, 682.T, 682.V, 683.K, 683.Q, 683.R, 684.I, 684.L, 684.M, 684.T, 684.V, 674.sequon_actual
glyco num.sequons.env, num.sequons.gp120, num.sequons.v2, num.sequons.v3, num.sequons.v5
cysteines num.cysteine.env, num.cysteine.gp120, num.cysteine.v2, num.cysteine.v3, num.cysteine.v5
geometry length.env, length.gp120, length.v2, length.v3, length.v5
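
The N-linked glycosylation motif N[!P][S/T] described in the caption of Table 3.1 can be sketched as a regular expression. This is an illustrative sketch rather than the code used to build the ‘sequon_actual’ features: it assumes an ungapped amino acid string, reports 0-based positions within that string rather than HXB2 coordinates, and the function name and example sequence are hypothetical.

```python
import re

# N, then any residue except P, then S or T. The lookahead allows
# overlapping motifs (e.g. "NNTS" contains sequons starting at both N's).
SEQUON_PATTERN = re.compile(r"(?=N[^P][ST])")


def sequon_positions(sequence):
    """Return 0-based start positions of N[!P][S/T] motifs in an ungapped sequence."""
    return [match.start() for match in SEQUON_PATTERN.finditer(sequence)]


# Hypothetical amino acid fragment, for illustration only.
example = "NNTSNPTANGS"
print(sequon_positions(example))
print(len(sequon_positions(example)))
```

In the example, the motif starting at position 4 (NPT) is excluded because the middle residue is a proline.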

References

Breiman, Leo. 2001. “Random Forests.” Machine Learning 45 (1). Springer: 5–32. doi:10.1023/A:1010933404324.

Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost: A Scalable Tree Boosting System.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–94. doi:10.1145/2939672.2939785.

van der Laan, Mark J, Eric C Polley, and Alan E Hubbard. 2007. “Super Learner.” Statistical Applications in Genetics and Molecular Biology 6 (1). De Gruyter. doi:10.2202/1544-6115.1309.

Williamson, Brian D, Peter B Gilbert, Noah R Simon, and Marco Carone. 2020. “A Unified Approach for Inference on Algorithm-Agnostic Variable Importance.” arXiv Preprint. https://arxiv.org/abs/2004.03683.

Yoon, Hyejin, Jennifer Macke, Anthony P West Jr, Brian Foley, Pamela J Bjorkman, Bette Korber, and Karina Yusim. 2015. “CATNAP: A Tool to Compile, Analyze and Tally Neutralizing Antibody Panels.” Nucleic Acids Research 43 (W1). Oxford University Press: W213–W219. doi:10.1093/nar/gkv404.

Zou, Hui, and Trevor Hastie. 2005. “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2). Wiley Online Library: 301–20. doi:10.1111/j.1467-9868.2005.00503.x.