DGAN – improved downstream functional analysis of single-cell RNA-sequence data


The dramatic increase in the number of single-cell RNA-sequencing (scRNA-seq) studies reflects the capabilities of next-generation sequencing technologies, which enable accurate measurement of tens of thousands of RNA expression levels at cellular resolution. Nevertheless, missing values arising during RNA amplification persist and remain a significant computational challenge: these omissions add noise to the cellular data and ultimately impede downstream functional analysis of scRNA-seq data. It is therefore imperative to develop robust and efficient scRNA-seq imputation methods that improve downstream functional analysis outcomes.

To address this problem, researchers at the National Institute of Technology, India have designed an imputation framework called the deep generative autoencoder network (DGAN). In essence, DGAN is an evolved variational autoencoder designed to robustly impute dropouts in scRNA-seq data, which appear as a sparse gene expression matrix. DGAN accounts for the count distribution as well as data sparsity using a Gaussian model, and exploits dependencies between cells to detect and exclude outlier cells via imputation. When tested on five publicly available scRNA-seq datasets, DGAN outperformed every baseline method it was compared against in downstream functional analysis, including cell data visualization, clustering, classification and differential expression analysis.
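For readers unfamiliar with variational autoencoders, the sketch below shows the generic Gaussian VAE objective (reconstruction error plus a KL penalty toward a standard normal prior) in plain NumPy. It is not DGAN's actual loss, which may differ; the function names `reparameterize`, `kl_to_standard_normal` and `vae_loss` are hypothetical and chosen only for illustration.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (the reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ) for each cell (row), summed over latent dims."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)

def vae_loss(x, x_hat, mu, log_var):
    """Generic Gaussian VAE loss: squared reconstruction error plus KL term.

    Assumption: a squared-error reconstruction term; DGAN's published
    objective is not reproduced here.
    """
    recon = np.sum((x - x_hat) ** 2, axis=1)
    return float(np.mean(recon + kl_to_standard_normal(mu, log_var)))
```

In a trained model, `mu` and `log_var` would come from the encoder and `x_hat` from the decoder applied to the sampled latent vector; here they are simply arrays of matching shape.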

Figure 1. Schematic of the deep generative autoencoder network (DGAN) downstream functional analysis pipeline for scRNA-seq data

The real input matrix ‘m’ is filtered to remove bad genes, normalized by library size, and then log-transformed and scaled. The processed matrix is then fed into the DGAN model, which learns a representation of the gene expression data and reconstructs the imputed matrix. Finally, the imputed matrix supports extensive downstream analysis.
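The preprocessing steps named in the caption can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the code from the DGAN repository: the function name `preprocess`, the `min_cells` threshold used to define a "bad" gene, the median library size as the normalization target, and the cells-by-genes layout are all hypothetical choices.

```python
import numpy as np

def preprocess(counts, min_cells=3):
    """Filter, normalize, log-transform and scale a raw count matrix.

    counts: array of shape (cells, genes) with raw counts (assumed layout).
    Returns the processed matrix and the boolean mask of retained genes.
    """
    counts = np.asarray(counts, dtype=float)

    # 1. Gene filtering: drop genes detected in fewer than `min_cells` cells
    #    (assumed definition of a "bad" gene).
    keep = (counts > 0).sum(axis=0) >= min_cells
    filtered = counts[:, keep]

    # 2. Library-size normalization: rescale each cell to the median library size.
    lib_size = filtered.sum(axis=1, keepdims=True)
    lib_size[lib_size == 0] = 1.0  # avoid division by zero for empty cells
    normalized = filtered / lib_size * np.median(lib_size)

    # 3. Log transformation, log(1 + x), so zeros stay at zero.
    logged = np.log1p(normalized)

    # 4. Scaling: zero mean and unit variance per gene.
    mean = logged.mean(axis=0)
    std = logged.std(axis=0)
    std[std == 0] = 1.0  # leave constant genes at zero after centering
    return (logged - mean) / std, keep
```

The returned matrix would then be passed to the imputation model, and the `keep` mask records which genes survived filtering so that results can be mapped back to gene names.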

Availability – DGAN is implemented in Python and is available at https://github.com/dikshap11/DGAN.

Pandey D, Onkara PP. (2023) Improved downstream functional analysis of single-cell RNA-sequence data using DGAN. Sci Rep 13(1):1618. [article]