1 About

The Bead-Based Epitope Assay (BBEA) can be used to quantify the amount of epitope- or peptide-specific antibodies (e.g., IgE) in plasma or serum samples. A detailed assay description is outlined in this publication

In this tutorial, we will analyze a dataset of the sequential epitope-specific (ses-) IgE profiles in patients allergic to milk that were treated with milk oral immunotherapy (OIT) (Suárez-Fariñas et al (2018)). Subjects (7-35 years of age) with IgE-mediated cow’s milk allergy were randomized 1:1 to receive omalizumab or placebo in a double-blind, placebo-controlled trial to evaluate whether the addition of omalizumab to milk OIT would reduce treatment-related adverse reactions and increase the frequency of patients achieving “sustained unresponsiveness” to milk. Detailed description and primary outcomes of this trail were published in Wood et al (2015).

The levels of 66 ses-IgE antibodies were measured using BBEA in plasma of 47 enrolled subjects at baseline and at 32-months of treatment with milk OIT plus placebo or omalizumab.

We will start by installing bbeaR and reading in the BBEA’s raw data (alternatively, one can load an R dataset that comes with this package). We will look at several quality control (QC) measures and normalize the data. Then we will demonstrate an approach to identify ses-IgE antibodies that is different between children treated with Placebo or Omalizumab as adjuvants for milk Oral Immunotherapy (mOIT), using limma modeling framework.

2 Installation

bbeaR is an R package and is installed using a standard github command:

Loading additional packages that will be used in the analyses.

3 Raw Data Import

47 patients were assayed before and after treatment, for a total of 94 samples, each ran in triplicate on four 96-well plates. The runs additionally included three background wells, without any sample (aka “Buffer”, for background signal quantification).

The original .csv files from the Luminex-200 assay, generated with the xPONENT® software, can be downloaded here. The .csv sheet consists of several sections, of which “Median”, “Count”, and “NetMFI” hold the assay’s readout values for each epitope-specific antibody (j) in columns and each sample (i) in rows: as shown below

First, we need to create a plate layout using the create.plate.db() function. This layout will be used in the import and some of the plotting functions. The only input to this function is the direction of the plate read (horizontal or vertical).

##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
## [1,] "A1" "A2" "A3" "A4" "A5" "A6" "A7" "A8" "A9" "A10" "A11" "A12"
## [2,] "B1" "B2" "B3" "B4" "B5" "B6" "B7" "B8" "B9" "B10" "B11" "B12"
## [3,] "C1" "C2" "C3" "C4" "C5" "C6" "C7" "C8" "C9" "C10" "C11" "C12"
## [4,] "D1" "D2" "D3" "D4" "D5" "D6" "D7" "D8" "D9" "D10" "D11" "D12"
## [5,] "E1" "E2" "E3" "E4" "E5" "E6" "E7" "E8" "E9" "E10" "E11" "E12"
## [6,] "F1" "F2" "F3" "F4" "F5" "F6" "F7" "F8" "F9" "F10" "F11" "F12"
## [7,] "G1" "G2" "G3" "G4" "G5" "G6" "G7" "G8" "G9" "G10" "G11" "G12"
## [8,] "H1" "H2" "H3" "H4" "H5" "H6" "H7" "H8" "H9" "H10" "H11" "H12"

Then read all four .csv files.
Note: a separate tutorial (Egg Allergy Example) has an example of importing and processing a single plate.

## [1] "Reading file: MOIT_IgEplate_02.csv"
## [1] "Reading file: MOIT_IgEplate_03.csv"
## [1] "Reading file: MOIT_IgEplate_04.csv"
## [1] "Reading file: MOIT_IgEplate01.csv"

The bbea list object contains several elements, extracted from the assay’s output.

## [1] "Median"    "NetMFI"    "Count"     "pData"     "AssayInfo"

The Median, NetMFI, and Count are matrices with rows as Analytes (epitopes) and columns as Samples. Note: this is changed from the original data layout, where samples were in rows and epitopes in columns, to make the dataset usable with common analytical tools in R.

The Median are the Median Fluorescence Intensities (MFIs) and the NetMFI are the Medians normalized to background. In our example, Median and NetMFI have exactly same values, since normalization was not selected during the assay run. This post has more details about the Luminex outputs.

##            Plate2.1.1.A1. Plate2.9.1.A2.
## Analyte 10              5              4
## Analyte 12              1              1
## Analyte 14              2              2
## Analyte 15              5              4
## Analyte 16              1              0

The Count are the numbers of beads counted per analyte, and is important quality control measure.

##            Plate2.1.1.A1. Plate2.9.1.A2.
## Analyte 10            161            149
## Analyte 12            147            140
## Analyte 14            191            165
## Analyte 15            175            143
## Analyte 16            194            144

The AssayInfo saves parameters of the assay.

##                           V1
## 1                    Program
## 2                      Build
## 3                       Date
## 4                           
## 5                         SN
## 6                      Batch
## 7                    Version
## 8                   Operator
## 9               ComputerName
## 10              Country Code
## 11              ProtocolName
## 12           ProtocolVersion
## 13       ProtocolDescription
## 14 ProtocolDevelopingCompany
## 15              SampleVolume
##                                                                    V2
## 1                                                             xPONENT
## 2                                                           3.1.971.0
## 3                                                             7/22/15
## 4                                                                    
## 5                                                       LX10010124401
## 6                                                              Plate2
## 7                                                                   1
## 8                                                                    
## 9                                                        XPONENT31-PC
## 10                                                                409
## 11                                       Milk OIT IgE Plate2 07.22.15
## 12                                                                  1
## 13 Run by Gustavo Gimenez, Peiatric Allergy and Immunology Department
## 14                                                                   
## 15                                                              90 uL

Finally, bbea.read.csv() creates a phenotype (p)Data that has some basic information about the samples and the assay run.

##  [1] "Location"          "Sample"            "filename"         
##  [4] "File"              "Plate"             "SampleNumber"     
##  [7] "Well.Number"       "Well.Letter"       "Well_coord"       
## [10] "print.plate.order" "letters_numeric"   "Plate.Date"       
## [13] "Plate.Time"        "Plate.TimeHR"      "CountSum"         
## [16] "CountMean"         "CountMin"          "CountMax"
##                Location   Sample
## Plate2.1.1.A1.  1(1,A1) Unknown1
## Plate2.9.1.A2.  9(1,A2) Unknown1

We can now add external information about the samples and analytes.
Load already imported data that includes phenotype (clinical) data PD and an annotation file Annot. Note: this will also load a bbea list object (generated in the previous steps).

The annotation Annot dataset contains the mapping of Luminex beads (Analytes) to the peptides/epitopes.

## [1] 66  6
##                Analyte     Peptide Protein lableName Bead PeptideName
## alphas1-p01  Analyte 5 alphas1-p01 alphas1    a1.p01    5 alphas1-p01
## alphas1-p02 Analyte 58 alphas1-p02 alphas1    a1.p02   58 alphas1-p02
## alphas1-p03  Analyte 7 alphas1-p03 alphas1    a1.p03    7 alphas1-p03
## alphas1-p04  Analyte 8 alphas1-p04 alphas1    a1.p04    8 alphas1-p04
## alphas1-p05 Analyte 60 alphas1-p05 alphas1    a1.p05   60 alphas1-p05
## alphas1-p06 Analyte 10 alphas1-p06 alphas1    a1.p06   10 alphas1-p06

Changing the bbea object to have milk epitope names instead of Luminex analyte numbers.

##             Plate1.1.1.A1. Plate1.2.1.A2.
## alphas1-p06              4              4
## alphas1-p08             18             19
## alphas1-p09             35             33
## alphas1-p13             22             21
## alphas1-p17              2              2

The PD dataset has clinical information about our samples.

## [1] 285   5
##   SUBJECT.ID    Visit Treatment  Plate  Location
## 1     MILK01 Baseline   Placebo Plate1 34(1,C10)
## 2     MILK01 Baseline   Placebo Plate1 35(1,C11)
## 3     MILK01 Baseline   Placebo Plate1 36(1,C12)
## 4     MILK01  Month32   Placebo Plate1 46(1,D10)
## 5     MILK01  Month32   Placebo Plate1 47(1,D11)
## 6     MILK01  Month32   Placebo Plate1 48(1,D12)

We will clean up the pData and then merge it with PD. This step is not necessary, but will be useful when we do statistical modelling.

4 Quality Control of the Raw Data

We want to make sure that there are no missing samples or analytes. This would be reflected by the very low counts (<25).
The heatmap shows that all samples and analytes are were included.

All counts seem to be high and we don’t see any specific sample or epitope to be missing.
However, when having multiple plates, it might be more useful to have a counts heatmap for each plate separately. This can be achieved by running bbea.QC.heatmap.counts.byPlate() function.

Now we look at the overall distribution of counts: the minimum count is ~75, which is pretty good.

Our counts look good, so we don’t need to exclude any samples. However, if this were not the case, samples with low counts can be removed using the bbea.subset() function, only keeping samples with average counts > 25.

5 Data Normalization

We convert the Median to normalized nMFI by taking the log2 of values and subtracting the average of the background wells. Note: the Sample column of the pData should include a “Buffer” string in the wells dedicated for the background.

## [1] "Plate1"
## [1] "Plate2"
## [1] "Plate3"
## [1] "Plate4"
## [1] "nMFI"      "NetMFI"    "Count"     "pData"     "AssayInfo"
##             Plate1.1.1.A1. Plate1.2.1.A2.
## alphas1-p06       2.641604       2.641604
## alphas1-p08       4.681133       4.757081
## alphas1-p09       5.621426       5.537768
## alphas1-p13       4.435211       4.369623
## alphas1-p17       1.793607       1.793607

R object of class ExpressionSet eset is convenient for a high-throughput data analysis.
bbea object can be converted to eset:

## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 66 features, 285 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: Plate1.1.1.A1. Plate1.2.1.A2. ... Plate4.24.1.B12. (285
##     total)
##   varLabels: Location Plate ... Treatment (22 total)
##   varMetadata: labelDescription
## featureData: none
## experimentData: use 'experimentData(object)'
## Annotation:

6 QC of Normalized Data

We can print the layouts of all experimental plates, to quicky do a visual inspection of the samples.

In this figure, we can clearly see triplicates and several wells (grey) that were used for the background calculation. We can also notice that last plate (plate #4) had only two samples.

6.1 Batch Effect

Batch effects are a well-known phenomenon in the high-throughput experiments, i.e. Microarrays, RNAseq, Luminex. For the BBEA, batch effects are individual microplate runs. Those are the effects that capture experimental rather than biological variability. Batch effects are easy to detect and eliminate, if experimental conditions are randomized across plate runs.

Principal Component Analysis (PCA) is a broadly used dimensionality reduction technique that allows visualization of high-dimensional data (such as epitopes, genes, proteins, etc.). Data is projected onto a few principal components (PCs), designed to maximize the explained variance of the data, and as such, the 1st PC explains most of the variance, followed by the 2nd PC, 3rd PC, and etc. A researcher can then use a 2D or 3D plot of a small number of PCs to evaluate ‘overall’ differences in the samples. If the samples are visually clustered together by plate, this indicates that variability in the data can be explained by the plate, which is oftentimes undesirable, since it means that technical experimental factors (instrument calibration, room light and temperature, etc.) are affecting the data.

##  PC1  PC2  PC3  PC4  PC5  PC6  PC7  PC8  PC9 PC10 
##   64    6    5    3    2    2    2    2    1    1

While PCA is a quick way to visualize the data, it does not offer a succinct way to quantify how much of the observed variation is explained by the experimental factors, especially when there are many of such factors and/or there are suspected interaction effects. Principal Variance Component Analysis (PVCA) can be used to quantify the amount of variability attributed to different experimental conditions or sample phenotypes. It models the multivariate distribution of the PCs computed for the PCA as a function of experimental factors (such as plate, visit, diagnosis, etc.) and estimates the total variance explained by each factor via mixed-effect models. As a result, the researcher receives as output a proportion of variance that is attributed to each factor. This can be used to quantify what percentage of the total variance is explained by each factor and thus evaluate the most relevant experimental factors, including plate.
pvcaBatchAssess.bbea() function is based on the pvca R package available through Bioconductor.

The PVCA analysis shows that 5% of the variability is attributed to the “plate” effect. There are many ways to remove this variability. We will show the example of fitting a limma model for each epitope with Plate as a covariate and then subtracting the plate coefficient. A more detailed description of this method is provided in this publication.

PCA after the adjustment looks almost identical to the unadjusted one, as oftentimes the batch effect is not easily detectable “by eye”.

##  PC1  PC2  PC3  PC4  PC5  PC6  PC7  PC8  PC9 PC10 
##   66    6    4    3    2    2    2    2    1    1

However, the PVCA analysis shows that after the adjustment, “plate” accounts for 0.3% of the overall variability.

6.2 Distributions

Cullen-Frey plots can be used to evaluate the distribution of the data. It shows how the skewness and kurtosis of our data compare to the theoretical distributions.
CullenFreyPlot() function is a wrapper of the fitdistrplus::descdist(). Since there are different levels of the antibody to each peptide, the data will be scaled before plotting. Y-axis of the boxplot represents a mean MFI or nMFI of 50 scaled peptides.

We can see that while both MFI and nMFI data are skewed, nMFI data is closer to log-normal rather than exponential distributions.

6.3 Technical Replicates

Intraclass Correlation Coefficient (ICC) is used to evaluate agreement among technical replicates for each peptide, where 0 means no agreement, and 1 is a perfect agreement.
The function returns the ICC and a 95% confidence interval.

##                 Peptide       ICC       LCI       UCI
## alphas1-p06 alphas1-p06 0.9903578 0.9864010 0.9933039
## alphas1-p08 alphas1-p08 0.9815516 0.9741442 0.9871304
## alphas1-p09 alphas1-p09 0.9938350 0.9913312 0.9957099
## alphas1-p13 alphas1-p13 0.9858467 0.9801419 0.9901349
## alphas1-p17 alphas1-p17 0.9847886 0.9786641 0.9893949
## alphas1-p18 alphas1-p18 0.9375927 0.9135439 0.9560780

Coefficient of Variation (CV) is used to estimate variability of the replicates. Generally, for biological assays the desired CV is below 20%.

##       Peptide   mean.cv    sd.cv    se.cv median.cv
## 1 alphas1-p01  9.694102 12.05228 1.236538  5.555556
## 2 alphas1-p02  8.842593 10.22483 1.049046  5.808490
## 3 alphas1-p03 10.605762 14.01331 1.437736  5.412659
## 4 alphas1-p04 11.050888 12.66247 1.299142  5.972589
## 5 alphas1-p05 10.385327 10.84504 1.112677  7.530656
## 6 alphas1-p06 10.724139 13.39413 1.374209  5.555556

7 Averaging Technical Replicates

If there are no samples to exclude based on ICC, CV and QC of counts, technical triplicates can be averaged for the downstream analyses.

## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 66 features, 94 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   rowNames: MILK01_Baseline MILK01_Month32 ... MILK77_Month32 (94
##     total)
##   varLabels: SUBJECT.ID Visit ... uf (23 total)
##   varMetadata: labelDescription
## featureData: none
## experimentData: use 'experimentData(object)'
## Annotation:
##             MILK01_Baseline MILK01_Month32 MILK02_Baseline
## alphas1-p06        6.747561       4.447867        6.317169
## alphas1-p08        5.685712       2.983762        4.935668
## alphas1-p09        3.559142       1.955416        8.414214
## alphas1-p13        3.379424       1.427095        8.261503
## alphas1-p17        2.839442       1.302297        7.474694

The heatmap shows that maybe ses-IgE levels somewhat decreased post treatment. To test this statistically, we will do some modelling.

8 Differential Analysis - Limma Modeling

Generally, milk OIT has been shown to induce decreases in antigen-specific IgE (Mantyla et al (2018)); however, ses-IgEs have not yet been evaluated. Additionally, we are interested whether ses-IgE antibodies decreased after patients were treated with an adjuvant Omalizumab or Placebo in addition to milk OIT.

For this, we will use a limma framework where a linear model is fit to every peptide to compare omalizumab and placebo arms.

##                           Down NotSig Up
## PostvsBaseline.Placebo      63      3  0
## PostvsBaseline.Omalizumab   45     21  0
## TreatmentEffect              0     66  0

We can see that ses-IgE decreased in both Omalizumab and Placebo groups, with greater changes in more (63/66) ses-IgEs in the Placebo arm.
As described in the primary publication of the trial outcomes by (Wood et al (2015)), the majority of safety measures were in fact improved while the withdrawal rate and time needed to achieve maintenance dosing were decreased in patients taking omalizumab as compared to placebo. However, the overall clinical efficacy was similar among both groups. Further mechanistic study of this cohort (Frischmeyer-Guerrerio et al (2017)) showed that milk OIT plus omalizumab had distinct changes in basophil reactivity (but not T-cells) compared to the placebo arm. Wood et al (2015) and Suarez-Farinas et al (2018) found that decreases in IgE binding to milk proteins and epitopes appeared less pronounced in subjects after milk OIT plus omalizumab compared with those receiving milk OIT plus placebo. This observation (at least in terms of IgE specific to the whole protein extract) can be explained by omalizumab 1) forming complexes with soluble IgE, which results in the increase of total IgE (since these complexes take longer to clear) and 2) reducing free IgE that will in turn diminish the amount of IgE that could bind to FceRI receptors on the surface of mast cells and basophils, thus reducing the allergic symptoms (Lowe et al (2009)).

For the circle plots, user needs to make sure that the Annotation file has a column named “lableName”, as it will be used to print names of the epitopes.