WAT3R – Recovery of T Cell Receptor Variable Regions From 3′ Single-Cell RNA-Sequencing



Diversity of the T cell receptor (TCR) repertoire is central to adaptive immunity. The TCR is composed of α and β chains, encoded by the TRA and TRB genes, of which the variable regions determine antigen specificity. To generate novel biological insights into the complex functioning of immune cells, combined capture of variable regions and single-cell transcriptomes provides a compelling approach. Recent developments enable the enrichment of TRA and TRB variable regions from widely used technologies for 3′-based single-cell RNA-sequencing (scRNA-seq). However, a comprehensive computational pipeline to process TCR-enriched data from 3′ scRNA-seq is not available.

Researchers from Brigham and Women’s Hospital and the Broad Institute of MIT and Harvard have developed an analysis pipeline to process TCR variable regions enriched from 3′ scRNA-seq cDNA. The tool reports TRA and TRB nucleotide and amino acid sequences linked to cell barcodes, enabling the reconstruction of T cell clonotypes with associated transcriptomes. The researchers demonstrate the software using peripheral blood mononuclear cells (PBMCs) from a healthy donor and detect TCR sequences in a high proportion of single T cells. Detection of TCR sequences is low in non-T cell populations, demonstrating specificity. Finally, they show that TCR clones are larger in CD8 Memory T cells than in other T cell types, indicating an association between T cell clonotypes and differentiation states.
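To make the link between clonotypes and cell types concrete, here is a minimal sketch of how clone sizes could be compared once TCR sequences are attached to cell barcodes. It is not WAT3R's own code. It assumes a clonotype is defined as cells sharing identical TRA and TRB amino acid sequences, and the function name, the input layout, and the toy barcodes and sequences are all invented for illustration.

```python
from collections import Counter, defaultdict


def clone_size_by_cell_type(cells):
    """Return the mean clone size per cell type.

    cells: dict mapping cell barcode -> (cell_type, tra_aa, trb_aa).
    Assumption: two cells belong to the same clone when both their
    TRA and TRB amino acid sequences are identical.
    """
    # Count how many cells carry each (TRA, TRB) pair.
    clonotype_counts = Counter((tra, trb) for _, tra, trb in cells.values())

    # For every cell, record the size of the clone it belongs to.
    sizes = defaultdict(list)
    for cell_type, tra, trb in cells.values():
        sizes[cell_type].append(clonotype_counts[(tra, trb)])

    return {cell_type: sum(v) / len(v) for cell_type, v in sizes.items()}


# Toy input, invented for illustration only (not data from the study).
toy_cells = {
    "BC1": ("CD8 memory", "CAVA", "CASSA"),
    "BC2": ("CD8 memory", "CAVA", "CASSA"),
    "BC3": ("CD8 memory", "CAVA", "CASSA"),
    "BC4": ("CD4 naive", "CAVB", "CASSB"),
    "BC5": ("CD4 naive", "CAVC", "CASSC"),
}
mean_sizes = clone_size_by_cell_type(toy_cells)
```

In a real analysis the cell type labels would come from the paired scRNA-seq dataset, and cells with only one detected chain would need an explicit rule, which this sketch does not attempt.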

Overview of WAT3R

The workflow starts by merging two FASTQ files, correcting cell barcodes and UMIs, and filtering reads by quality (top left). TCR sequences with identical barcode and UMI are then clustered. The dot plot (top right) evaluates cluster quality by comparing the proportion of reads supporting the most abundant cluster (x-axis), the ratio of the most abundant cluster to the second most abundant cluster (y-axis), and the number of reads supporting each TCR sequence with a specific barcode and UMI (color). Bottom left: TCR consensus sequences are generated and used for V(D)J alignment. In downstream analysis, results are integrated with a paired scRNA-seq dataset. The bar plot (bottom right) shows the proportion of cells in the dataset, separated by cell type, for which WAT3R returned information on the TRA gene, the TRB gene, or both.
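The cluster quality measures described above can be sketched in a few lines. This is an illustrative approximation rather than the pipeline's implementation: it assumes each read has already been assigned a cluster ID within its barcode and UMI group, and the function name, field names, and example reads are hypothetical.

```python
from collections import Counter, defaultdict


def cluster_quality(reads):
    """Compute per-(barcode, UMI) cluster quality metrics.

    reads: iterable of (barcode, umi, cluster_id) tuples, one per read.
    Assumption: cluster_id was assigned by an upstream clustering step.
    """
    # Tally reads per cluster within each barcode/UMI group.
    counts = defaultdict(Counter)
    for barcode, umi, cluster_id in reads:
        counts[(barcode, umi)][cluster_id] += 1

    metrics = {}
    for key, counter in counts.items():
        ranked = counter.most_common()  # clusters sorted by read count
        total = sum(counter.values())
        top = ranked[0][1]
        second = ranked[1][1] if len(ranked) > 1 else 0
        metrics[key] = {
            "n_reads": total,
            # Proportion of reads in the most abundant cluster.
            "top_fraction": top / total,
            # Ratio of the most abundant cluster to the second;
            # infinite when only one cluster exists.
            "top_to_second_ratio": top / second if second else float("inf"),
        }
    return metrics


# Hypothetical reads, for illustration only.
example_reads = [
    ("AAAC", "UMI1", 0), ("AAAC", "UMI1", 0),
    ("AAAC", "UMI1", 0), ("AAAC", "UMI1", 1),
    ("AAAG", "UMI2", 0), ("AAAG", "UMI2", 0),
]
quality = cluster_quality(example_reads)
```

Groups with a high top fraction and a high top-to-second ratio would be the ones most likely to yield a reliable consensus sequence; any filtering thresholds are left out here because the text does not state them.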

Availability – The Workflow for Association of T cell receptors from 3′ single-cell RNA-seq (WAT3R), including test data, is available on GitHub (https://github.com/mainciburu/WAT3R), on Docker Hub (https://hub.docker.com/r/mainciburu/wat3r), and as a workflow on the Terra platform (https://app.terra.bio).


Ainciburu M, Morgan DM, DePasquale EAK, Love JC, Prósper F, van Galen P. (2022) WAT3R: Recovery of T Cell Receptor Variable Regions From 3′ Single-Cell RNA-Sequencing. Bioinformatics [Epub ahead of print].