7 Data
The analysis dataset includes neutralization outcomes for the requested bnAb(s) and AA sequence features for the gp160 protein. The possible outcomes are described in Section 5.1.
The additional groups of variables in the data include:
- geographic information: binary indicator variables describing the region of origin of each pseudovirus;
- subtype: binary indicator variables denoting the HIV-1 subtype for the given pseudovirus;
- AA sequence features: binary indicator variables denoting, for each HXB2-referenced site in gp160: presence or absence of a specific AA at the site, the site holding the leading AA of the canonical N-linked glycosylation motif (N[!P][S/T]), an observed gap at the site after alignment to maintain site-specific relevance, or a gap at the site that resulted in a frameshift;
- viral geometry features: lengths of Env, gp120, V2, V3, and V5;
- numbers of sequons: number of sequons in each of Env, gp120, V2, V3, and V5; and
- numbers of cysteines: number of cysteines in each of Env, gp120, V2, V3, and V5.
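
As an illustration of the last three groups of variables, the following is a minimal Python sketch of how the length, the number of sequons, and the number of cysteines could be derived for one region (e.g., V2) from its aligned AA sequence. It is not the pipeline used to build the analysis dataset. The function names `sequon_starts` and `region_features` are hypothetical, and the choices to strip alignment gaps (`-`) before counting and to count overlapping sequons separately are assumptions made for this example.

```python
import re

# N-linked glycosylation sequon: N, then any residue except P, then S or T.
# The lookahead keeps the match one character wide, so overlapping sequons
# (e.g. the two in "NNTS") are each found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_starts(seq: str) -> list[int]:
    """Return 0-based positions of the leading N of each sequon.

    Assumes `seq` is an ungapped AA sequence in single-letter code.
    """
    return [m.start() for m in SEQUON.finditer(seq)]


def region_features(aligned_seq: str) -> dict[str, int]:
    """Length, sequon count, and cysteine count for one region.

    Assumption: alignment gaps are written as '-' and are removed first,
    so that a gap cannot stand in for the middle residue of a sequon.
    """
    ungapped = aligned_seq.replace("-", "")
    return {
        "length": len(ungapped),
        "num_sequons": len(sequon_starts(ungapped)),
        "num_cysteines": ungapped.count("C"),
    }


# Toy sequence, not real Env data: "NGT", "NNT", and "NTS" are sequons,
# while "NPS" is excluded because of the proline.
features = region_features("CNGTNPSC-NNTS")
```

The positions returned by `sequon_starts` correspond to the "leading AA" of the motif, which is the quantity the site-level sequon indicator describes. Mapping those positions to HXB2 coordinates would require the alignment and is not shown here.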