# QA/QC, data processing, and dimension reduction

### QA/QC & Data Processing <a href="#qa-qc-dataprocessing-anddimensionreduction-qa-qc-and-dataprocessing" id="qa-qc-dataprocessing-anddimensionreduction-qa-qc-and-dataprocessing"></a>

<figure><img src="/files/mJGNiOTRkQYdrnoCxDjE" alt=""><figcaption></figcaption></figure>

Once the data has been imported into the project, we can start pre-processing it.

First, we will remove all non-expression features (e.g. NegProbes) from the data.

* Click *Filtering > Filter features* in the menu on the right
* Select *Metadata* and configure the task settings as shown in the figure below
* Click **Finish**

<figure><img src="/files/sMMZ4CrkXe0LpdKn4oXq" alt=""><figcaption></figcaption></figure>
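For readers who want to see the logic of this step outside the interface, the sketch below drops control features by name. It is an illustration only and not how Partek Flow implements the filter. The `NegPrb` prefix, the example feature names, and the function name `drop_control_features` are assumptions, so check how control features are labeled in your own dataset.

```python
def drop_control_features(feature_names, control_prefixes=("NegPrb",)):
    """Return the feature names that do not start with any control prefix."""
    # str.startswith accepts a tuple, so several prefixes can be checked at once.
    prefixes = tuple(control_prefixes)
    return [name for name in feature_names if not name.startswith(prefixes)]


# Made-up feature names, used only to show the behavior.
features = ["CD3E", "NegPrb1", "MS4A1", "NegPrb2", "EPCAM"]
kept = drop_control_features(features)
print(kept)
```

Passing the prefixes as a parameter keeps the sketch usable if your panel labels its control features differently.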

In the Analyses tab:

* Click on the resulting filtered counts node
* Select *QA/QC > Single cell QA/QC* from the toolbox. Once the task has completed, open the report by double-clicking the node:

<figure><img src="/files/bY02LQezMW8rn1VlBeGb" alt=""><figcaption></figcaption></figure>

We will remove cells that have low counts or a low number of detected features.

* Click on *Select & Filter* and set the lower threshold to 50 for both metrics (remember that this is data-dependent and will change based on your dataset)
* Click *Filter include* ![](https://documentation.partek.com/download/thumbnails/98206053/Screenshot%202024-07-04%20at%2017.18.31.png?version=1\&modificationDate=1720109917705\&api=v2)
* Click *Apply observation filter* and apply it to the filtered counts node:

<figure><img src="/files/aefwZw0iGDRYUS6mPUWH" alt=""><figcaption></figcaption></figure>
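The thresholding above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Partek Flow implementation: the function name `filter_cells`, the dictionary layout, and the choice to treat the lower threshold as inclusive are all hypothetical.

```python
def filter_cells(counts, min_counts=50, min_features=50):
    """Keep cells whose total counts and detected features both meet the lower thresholds.

    counts maps a cell ID to a list of per-feature counts for that cell.
    """
    kept = {}
    for cell_id, values in counts.items():
        total = sum(values)                         # total counts in the cell
        detected = sum(1 for v in values if v > 0)  # features with a non-zero count
        # Assumption: thresholds are inclusive (a cell exactly at the threshold is kept).
        if total >= min_counts and detected >= min_features:
            kept[cell_id] = values
    return kept


# Toy data with small thresholds so the effect is easy to follow.
toy = {
    "cell_a": [3, 0, 4, 1],  # total 8, 3 detected features: kept
    "cell_b": [1, 0, 0, 0],  # total 1: removed (low counts)
    "cell_c": [9, 0, 0, 0],  # 1 detected feature: removed (few features)
}
passing = filter_cells(toy, min_counts=5, min_features=2)
print(sorted(passing))
```

A cell must pass both criteria to be kept, which matches setting a lower threshold on both metrics before filtering.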

Click on the node generated by the filtering task in the Analyses tab.

* Click *Filtering > Filter features.* Apply a noise reduction filter:

<figure><img src="/files/ey8682ULGfx5VNG0bN1P" alt=""><figcaption></figcaption></figure>

We can now normalize our filtered data.

* Click *Normalization and scaling > Normalization.* Use the recommended settings by clicking ![](https://documentation.partek.com/download/thumbnails/98206053/Screenshot%202024-07-04%20at%2017.29.01.png?version=1\&modificationDate=1720110545187\&api=v2):

<figure><img src="/files/UKFzkHGg7FU8m3csst4o" alt=""><figcaption></figcaption></figure>

### Data Exploration <a href="#qa-qc-dataprocessing-anddimensionreduction-dataexploration" id="qa-qc-dataprocessing-anddimensionreduction-dataexploration"></a>

Now that we have filtered out low-quality cells and normalized our data, we can start clustering to identify cell populations.

* Click on the normalized data node
* From the menu on the right select *Exploratory analysis > PCA.* We are going to use the top 2000 features by variance and calculate the first 50 principal components (PCs):

<figure><img src="/files/Z6z3dW1oXPlPlaDnxY1p" alt=""><figcaption></figcaption></figure>
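To make "top features by variance" concrete, here is a small sketch of the feature-selection idea that precedes PCA. It assumes plain variance across cells is the ranking criterion; the exact statistic Partek Flow uses is not stated here. The function name `top_variable_features` and the toy matrix are hypothetical.

```python
from statistics import pvariance


def top_variable_features(matrix, feature_names, n_top=2000):
    """Return the names of the n_top features with the highest variance across cells.

    matrix is a list of rows: one row per cell, one column per feature.
    """
    columns = list(zip(*matrix))  # transpose to one tuple of values per feature
    variances = [pvariance(col) for col in columns]
    # Rank feature indices from highest to lowest variance.
    ranked = sorted(range(len(feature_names)), key=lambda i: variances[i], reverse=True)
    return [feature_names[i] for i in ranked[:n_top]]


# Toy matrix: 3 cells x 3 features. "A" is constant, "B" varies the most.
matrix = [
    [1.0, 5.0, 2.0],
    [1.0, 0.0, 3.0],
    [1.0, 9.0, 2.0],
]
selected = top_variable_features(matrix, ["A", "B", "C"], n_top=2)
print(selected)
```

Population variance is used here, but sample variance would give the same ranking because both differ only by a constant factor for a fixed number of cells.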

Once the PCA has run, click on the PCA result node in the Analyses tab.

* Select *Exploratory analysis > UMAP* from the toolbox. Set the UMAP parameters as follows:
  * Top **20** PCs
  * Local neighborhood size **60**
  * Minimal distance **0.20**

<figure><img src="/files/EYRNEz0c0UJA4mKvReO2" alt=""><figcaption></figcaption></figure>

While the UMAP is running, we can also queue a clustering task. Click on the PCA result node in the Analyses tab and select *Exploratory analysis > Graph-based clustering.*

* We are going to use the Leiden algorithm to cluster our data (make sure to select the radio button for it)
* Set the number of PCs to **10**
* In the advanced settings, set the resolution parameter to **8e-5** and click **Apply**:

<figure><img src="/files/2IJSZSJHz3acbZNqKsys" alt=""><figcaption></figcaption></figure>


### Additional Assistance <a href="#qa-qc-dataprocessing-anddimensionreduction-additionalassistance" id="qa-qc-dataprocessing-anddimensionreduction-additionalassistance"></a>

If you need additional assistance, please visit [our support page](http://www.partek.com/support) to submit a help ticket or find phone numbers for regional support.


