Strategies for integrating single-cell RNA sequencing results with multiple species



Single-cell RNA sequencing (scRNAseq) is a robust technology for measuring gene expression in individual cells from a tissue or other complex source. One application involves experiments in which cells from multiple species are recovered from a single sample, such as when human cells are transplanted into an animal model. Rutgers University researchers transplanted microglial precursor cells into newborn mouse brain, recovered unenriched cortical tissue six months later, and assessed the dissociated cells by scRNAseq. The default method for analyzing such results begins by aligning sequencing reads to a mixture of mouse and human reference genomes. While this clearly identifies the human cells as a distinct cluster, the clustering is artificially driven by expression counted under non-comparable gene identifiers from the two species. The researchers devised a method for translating expression counts from human to mouse and evaluated four algorithms for parsing mixed-species scRNAseq data. Their optimal approach split raw sequencing reads according to the best alignment score in each genome and then re-aligned each read only to the appropriate genome. After gene symbol translation, the pooled results indicate that cell types are clustered more appropriately and that differential expression analysis identifies species-specific patterns. This method should be applicable to any mixed-species scRNAseq experiment.
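The read-splitting step described above can be illustrated with a minimal sketch. It is not the author's implementation: it assumes each read has already been aligned separately to both genomes and that the per-read alignment scores have been collected into two plain dictionaries. The function name `split_reads_by_score` and the choice to set aside tied reads are assumptions made for illustration only.

```python
from typing import Dict, Set, Tuple


def split_reads_by_score(
    human_scores: Dict[str, int],
    mouse_scores: Dict[str, int],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Assign each read to the genome where it has the higher alignment score.

    A read that aligned to only one genome goes to that genome.
    Reads with equal scores in both genomes are returned separately
    (handling of ties is an assumption of this sketch).
    """
    human_ids: Set[str] = set()
    mouse_ids: Set[str] = set()
    ambiguous: Set[str] = set()

    for read_id in human_scores.keys() | mouse_scores.keys():
        h = human_scores.get(read_id)  # None if the read did not align to human
        m = mouse_scores.get(read_id)  # None if the read did not align to mouse
        if m is None or (h is not None and h > m):
            human_ids.add(read_id)
        elif h is None or m > h:
            mouse_ids.add(read_id)
        else:
            ambiguous.add(read_id)

    return human_ids, mouse_ids, ambiguous


# Illustrative scores only; real values would come from the aligner output.
human = {"r1": 98, "r2": 40, "r3": 75}
mouse = {"r1": 60, "r2": 95, "r3": 75, "r4": 88}
human_ids, mouse_ids, ambiguous = split_reads_by_score(human, mouse)
```

Each resulting set of read identifiers would then be used to write a species-specific read file that is re-aligned to its own genome only.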

Optimized method: Split by score excluding overlaps

The BP algorithm was run as before, except that barcodes found in both species were excluded after gene translation but before the species were merged. (A) tSNE plot labeled by cell type. (B) Number of barcodes assigned to each genome, with no overlap by design. (C) Top 10 GO-BP terms from genes that differ between the human and mouse microglial clusters. (D) Top 10 GO-BP terms enriched in the human cluster. (E) Top 4 GO-BP terms enriched in the mouse cluster (only 4 terms passed the p-value threshold). Bar colors match the terms shown in panel C.
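The order of operations in this caption (translate gene symbols, exclude shared barcodes, then merge) can be sketched as follows. This is a hedged illustration under several assumptions, not the published pipeline: counts are held as nested dictionaries keyed by barcode, the ortholog table `human_to_mouse` is a hypothetical lookup, human genes without a mouse ortholog are dropped, and the function names are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Counts = Dict[str, Dict[str, int]]  # barcode -> gene symbol -> count


def translate_counts(human_counts: Counts, human_to_mouse: Dict[str, str]) -> Counts:
    """Re-key human counts by mouse gene symbol.

    Genes missing from the ortholog table are dropped (an assumption);
    counts are summed when several human genes map to one mouse symbol.
    """
    translated: Counts = {}
    for barcode, genes in human_counts.items():
        out: Dict[str, int] = defaultdict(int)
        for symbol, count in genes.items():
            mouse_symbol = human_to_mouse.get(symbol)
            if mouse_symbol is not None:
                out[mouse_symbol] += count
        translated[barcode] = dict(out)
    return translated


def merge_excluding_overlaps(
    human_counts: Counts, mouse_counts: Counts
) -> Tuple[Dict[str, dict], Set[str]]:
    """Merge both species, skipping any barcode that appears in both."""
    overlap = human_counts.keys() & mouse_counts.keys()
    merged: Dict[str, dict] = {}
    for species, counts in (("human", human_counts), ("mouse", mouse_counts)):
        for barcode, genes in counts.items():
            if barcode not in overlap:
                merged[barcode] = {"species": species, "counts": genes}
    return merged, overlap


# Illustrative data only: two microglial marker genes and made-up barcodes.
orthologs = {"CX3CR1": "Cx3cr1", "P2RY12": "P2ry12"}
human_raw = {"AAAC": {"CX3CR1": 5, "P2RY12": 2}, "GGGT": {"CX3CR1": 1}}
mouse_raw = {"GGGT": {"Cx3cr1": 7}, "TTTA": {"P2ry12": 3}}

merged, dropped = merge_excluding_overlaps(
    translate_counts(human_raw, orthologs), mouse_raw
)
```

In this toy input the barcode `GGGT` appears in both species and is excluded, so every remaining barcode belongs to exactly one genome, which matches the "no overlap by design" property noted for panel B.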


Hart RP. (2023) Strategies for Integrating Single-Cell RNA Sequencing Results With Multiple Species. bioRxiv [online preprint]. [article]