# Counting and Normalization

## Counting

Protein counting is performed using DRAGEN BCL Convert. Sequencing produces barcoded reads for each sample, and the number of reads per barcode corresponds to protein abundance. DRAGEN BCL Convert demultiplexes and counts the barcoded reads in a single step. The resulting barcode counts are stored as the Raw Counts ADAT.
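As a conceptual illustration only, the following Python sketch shows what a per-sample barcode tally looks like. It is not how DRAGEN BCL Convert is implemented and does not produce the ADAT format; the `(sample_id, barcode)` input shape, the barcode strings, and the `count_barcodes` helper are all hypothetical.

```python
from collections import Counter


def count_barcodes(reads):
    """Tally reads per (sample_id, barcode) pair.

    `reads` is an iterable of (sample_id, barcode) tuples, a hypothetical
    stand-in for reads that have already been assigned to a sample.
    """
    return Counter(reads)


# Hypothetical reads for two samples and two barcodes.
reads = [
    ("S1", "ACGT"), ("S1", "ACGT"), ("S1", "TTGA"),
    ("S2", "ACGT"), ("S2", "TTGA"), ("S2", "TTGA"),
]

raw_counts = count_barcodes(reads)
for (sample_id, barcode), n in sorted(raw_counts.items()):
    print(sample_id, barcode, n)
```

Each count is the raw, unnormalized value for one barcode in one sample, which is the kind of quantity the normalization steps below operate on.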

## Normalization Summary

<figure><img src="https://958780164-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FnyQb4WG1K4VQZKovLv5v%2Fuploads%2Fn671DZPKYam9oWnC5HqC%2Fnorm_flowchat.png?alt=media&#x26;token=7a438778-2406-4c48-8353-e056e76f8af0" alt=""><figcaption></figcaption></figure>

Normalization corrects for sources of confounding variation, such as overall protein concentration differences, minor deviations in volume transfer during the assay, or the efficiency of library preparation steps. It is performed sequentially and produces a separate ADAT file with counts for each of the following three steps:

* <mark style="color:blue;">**Readout Normalization:**</mark> This step uses **SOMAmer controls** to reduce technical variability introduced in the **NGS library prep**.
* <mark style="color:green;">**Plate Normalization**</mark><mark style="color:blue;">**:**</mark> This step uses positive controls (calibrators) to correct for biases between plates. An external reference provided by Illumina ensures all Illumina Protein Prep plates are comparable. This step can be broken down into five sub-steps; see the diagram above and the section below for more information.
* <mark style="color:$warning;">**Sample Normalization**</mark><mark style="color:blue;">**:**</mark> This step uses **protein abundance** in each sample to reduce technical variability introduced during **protein quantification**, and to **correct for differences in overall protein concentration**.

See [metrics-appendix](https://help.multiomics.illumina.com/dragen-protein-quantification/counting-and-normalization/metrics-appendix "mention") for a summary of all metrics.

## Normalization Steps in Detail

* <mark style="color:blue;">**Hybridization Normalization (also known as Readout Normalization):**</mark> This step corrects for biases that can occur during the hybridization and sequencing preparation stages of the assay. During hybridization, controls are spiked into each sample; during normalization, the counts for these controls are compared against an internal reference based on the non-blank controls on the plate. A scale factor is calculated for each sample, and if the scale factor is outside of the specification range (**0.4**–**2.5**), the sample will receive a **FLAG**.
* <mark style="color:blue;">**Internal Reference Median Normalization:**</mark> This step corrects for differences in the total protein abundance measurement of a sample. It is performed for each dilution group and runs separately for blank and calibrator control samples. A scale factor is calculated for each dilution group by comparing the observed protein measurements to a reference of expected values for each protein.

  This reference is based on median protein counts across all samples of the same sample type on the same plate. If any of the scale factors for a sample are outside of the specification range (**0.4**–**2.5**), the sample will receive a **FLAG**.
* <mark style="color:blue;">**Plate Scaling:**</mark> This step corrects for possible changes in measured total protein counts between plates, when calibrator samples are present. The median of each SOMAmer measurement across the five calibrators is compared to an external calibration reference to calculate a plate scale factor.

  The first plate scaling step compares the calibrator medians to a reference derived from the sequencing instrument (NovaSeq 6000 or NovaSeq X Series) and adjusts the entire plate accordingly.

  The second step compares the scaled calibrator medians to a reference derived from the NovaSeq X Series (10B flow cell).\
  There is no specification range for plate scaling.

  * The references used in this step can be found in the SOMAmer metadata, under Ref.Bridging.\<CalibratorId>.\<Instrument>.\<Flowcell>.\<MasterMixLot#>. The reference used by cross-instrument plate scaling is Ref.Bridging.\<PlateBarcode>.NovaSeqX.10B.AA.
* <mark style="color:blue;">**Calibration:**</mark> This step corrects for batch effects that impact individual SOMAmers.

  The first calibration step (Platform Specific Calibration) compares the median of each SOMAmer measurement across the five calibrator sample replicates to an external calibration reference. This reference is derived from runs using the same instrument type, flow cell type, calibrator lot, and sample input type as the run being analyzed. The step then calculates a SOMAmer-specific scale factor and a plate-wide calibrator metric (PlatformSpecificCalibrationTailPercent). PlatformSpecificCalibrationTailPercent corresponds to the percentage of SOMAmers with scale factors outside the specification range (**0.6**–**1.4**). If **15%** of SOMAmers fall outside of this specification, the plate receives a **WARNING** for the PlatformSpecificTailPercent\_PassFlag metric.

  The second calibration step (Cross Platform Calibration) compares the updated calibrator medians to a reference derived from the NovaSeq X Series (10B flow cell), using the same calibrator lot and sample input type as the run being analyzed. The scale factors used to align the median of the calibrators to the reference value are applied to all samples on the same plate.

  * The references used in this step can be found in the SOMAmer metadata, under Ref.Bridging.\<CalibratorId>.\<Instrument>.\<Flowcell>.\<MasterMixLot#>. The reference used by cross-instrument calibration is Ref.Bridging.\<CalibratorId>.NovaSeqX.10B.AA.
* <mark style="color:blue;">**External Reference Median Normalization (also known as Sample Normalization):**</mark> This step corrects for differences in the total protein signal for samples on each dilution plate. A scale factor is calculated by comparing the observed protein measurements to a reference of expected values for each protein. It is performed on plasma/serum samples and QC samples.

  If any of the scale factors for a sample are outside of the specification range (**0.4**–**2.5**), the sample will receive a **FLAG**.

  * The reference used in this step can be found in the SOMAmer metadata, under Ref.MedNormExt.Plasma or Ref.MedNormExt.Serum (dependent on the input type).
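The median normalization steps above share the same pattern: derive a scale factor by comparing observed measurements to a reference, apply it, and flag the sample if the factor falls outside **0.4**–**2.5**. The Python sketch below illustrates that pattern under stated assumptions. The scale factor is taken here as the median of per-SOMAmer reference-to-observed ratios, and the range bounds are treated as inclusive; neither detail is specified in this document, and the SOMAmer names and function names are hypothetical.

```python
from statistics import median

SPEC_RANGE = (0.4, 2.5)  # specification range quoted in the text


def median_scale_factor(observed, reference):
    """Median of reference/observed ratios (assumed formula).

    `observed` and `reference` map SOMAmer names to counts for one
    sample and one dilution group.
    """
    ratios = [
        reference[name] / count
        for name, count in observed.items()
        if count > 0 and name in reference
    ]
    return median(ratios)


def normalize(observed, reference, spec_range=SPEC_RANGE):
    """Scale the counts and report whether the sample would be flagged."""
    scale = median_scale_factor(observed, reference)
    low, high = spec_range
    flagged = not (low <= scale <= high)  # inclusive bounds are an assumption
    normalized = {name: count * scale for name, count in observed.items()}
    return normalized, scale, flagged


# Hypothetical counts: this sample reads at half the reference level.
reference = {"seq.1": 200, "seq.2": 400, "seq.3": 800}
observed = {"seq.1": 100, "seq.2": 200, "seq.3": 400}
normalized, scale, flagged = normalize(observed, reference)

# A sample at one tenth of the reference needs a factor outside the range.
low_sample = {"seq.1": 20, "seq.2": 40, "seq.3": 80}
_, low_scale, low_flagged = normalize(low_sample, reference)
```

In this example the first sample gets a scale factor of 2.0 and is not flagged, while the second gets 10.0 and is flagged. Using the median rather than the mean keeps a few outlying SOMAmers from dominating the factor.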

<figure><img src="https://958780164-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FnyQb4WG1K4VQZKovLv5v%2Fuploads%2Fgit-blob-d63b6193e28aff179923d1b29fd7cef72ad40b44%2Fimage%20(23)%20(1).png?alt=media" alt=""><figcaption><p>Outline of the protein capture assay and what parts of the process each normalization step impacts</p></figcaption></figure>
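The Platform Specific Calibration step described above can be sketched in Python as follows. This is a minimal illustration, not the DRAGEN implementation: the per-SOMAmer scale factor is assumed to be the reference divided by the calibrator median, the warning is assumed to trigger at 15% or more, and all names and values are hypothetical.

```python
from statistics import median

CAL_RANGE = (0.6, 1.4)        # specification range quoted in the text
TAIL_WARNING_PERCENT = 15.0   # threshold quoted in the text


def calibration_scale_factors(calibrator_counts, reference):
    """One scale factor per SOMAmer: reference / calibrator median (assumed).

    `calibrator_counts` maps each SOMAmer name to its counts across the
    calibrator replicates on the plate.
    """
    return {
        name: reference[name] / median(counts)
        for name, counts in calibrator_counts.items()
    }


def tail_percent(scale_factors, spec_range=CAL_RANGE):
    """Percentage of SOMAmers whose scale factor is outside the range."""
    low, high = spec_range
    outside = sum(1 for sf in scale_factors.values() if not (low <= sf <= high))
    return 100.0 * outside / len(scale_factors)


# Hypothetical counts for four SOMAmers across five calibrator replicates.
calibrator_counts = {
    "seq.1": [90, 100, 100, 110, 120],
    "seq.2": [200, 200, 200, 200, 200],
    "seq.3": [50, 50, 50, 50, 50],
    "seq.4": [80, 80, 80, 80, 80],
}
reference = {"seq.1": 100, "seq.2": 100, "seq.3": 60, "seq.4": 100}

scale_factors = calibration_scale_factors(calibrator_counts, reference)
tail = tail_percent(scale_factors)
# Treating the threshold as "15% or more" is an assumption.
plate_warning = tail >= TAIL_WARNING_PERCENT
```

Here only `seq.2` has a scale factor outside the range (0.5), so one of four SOMAmers is in the tail, giving 25% and a plate warning under the assumed threshold rule.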
