A deep learning model to classify neoplastic state and tissue origin from transcriptomic data



Application of deep learning methods to transcriptomic data has the potential to enhance the accuracy and efficiency of tissue classification and cell state identification. Researchers at the Krembil Research Institute have developed a multitask deep learning model for tissue classification combining publicly available whole transcriptomic (RNA-seq) datasets of non-neoplastic, neoplastic and peri-neoplastic tissue to classify disease state, tissue origin and neoplastic subclass. RNA-seq data from a total of 10,116 patient samples processed through a common pipeline were used for model training and validation. The model achieved 99% accuracy for disease state classification (ROC-AUC of 0.98) and 97% accuracy for tissue origin (ROC-AUC of 0.99). Moreover, the model achieved an accuracy of 92% (ROC-AUC 0.95) for neoplastic subclassification. This is the first multitask deep learning algorithm developed for tissue classification employing a uniform pipeline analysis of transcriptomic data with multiple tissue classifiers. This model serves as a framework for incorporating large transcriptomic datasets across conditions to facilitate clinical diagnosis and cell-based treatment strategies.
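The multitask idea described above can be illustrated with a minimal sketch: one shared hidden layer feeds two classification heads (disease state and tissue origin), and the training objective is a weighted sum of the two cross-entropy losses. This is not the authors' implementation. The layer sizes, the number of tissue classes, the loss weights and the random toy data are all assumptions made for illustration. Only the three disease states (non-neoplastic, neoplastic, peri-neoplastic) come from the text.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def init_params(n_genes, n_hidden, n_states, n_tissues, seed=0):
    """Small random weights for a shared trunk and two task heads."""
    rng = np.random.default_rng(seed)

    def w(shape):
        return rng.normal(0.0, 0.01, size=shape)

    return {
        "W_shared": w((n_genes, n_hidden)), "b_shared": np.zeros(n_hidden),
        "W_state": w((n_hidden, n_states)), "b_state": np.zeros(n_states),
        "W_tissue": w((n_hidden, n_tissues)), "b_tissue": np.zeros(n_tissues),
    }

def forward(params, x):
    """Shared ReLU layer, then one softmax head per task."""
    h = np.maximum(0.0, x @ params["W_shared"] + params["b_shared"])
    p_state = softmax(h @ params["W_state"] + params["b_state"])
    p_tissue = softmax(h @ params["W_tissue"] + params["b_tissue"])
    return p_state, p_tissue

def cross_entropy(p, y):
    """Mean negative log-likelihood of the integer labels y."""
    return float(-np.mean(np.log(p[np.arange(len(y)), y] + 1e-12)))

def multitask_loss(params, x, y_state, y_tissue, w_state=1.0, w_tissue=1.0):
    """Weighted sum of the per-task losses (weights are hypothetical)."""
    p_state, p_tissue = forward(params, x)
    return (w_state * cross_entropy(p_state, y_state)
            + w_tissue * cross_entropy(p_tissue, y_tissue))

# Toy stand-in for expression data: 8 samples, 50 "genes".
# 3 disease states follow the text; 5 tissue classes is an assumption.
rng = np.random.default_rng(1)
x = rng.normal(size=(8, 50))
y_state = rng.integers(0, 3, size=8)
y_tissue = rng.integers(0, 5, size=8)

params = init_params(n_genes=50, n_hidden=16, n_states=3, n_tissues=5)
p_state, p_tissue = forward(params, x)
loss = multitask_loss(params, x, y_state, y_tissue)
```

Because both heads read the same hidden representation, gradients from either task would update the shared weights during training, which is the usual motivation for a multitask design.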

Bayesian Hyperparameter Tuning of Deep Learning Models

Figure 1. (A) Search space of hyperparameters for Bayesian tuning; (B) Architecture of multitask classifier for disease state and tissue origin along with tuned hyperparameters; (C) Architecture of neoplastic subtype classifier along with tuned hyperparameters.
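To give a sense of how Bayesian hyperparameter tuning works in general, here is a minimal sketch that tunes a single hyperparameter with a Gaussian-process surrogate and the expected-improvement criterion. It does not reproduce the search space or tooling shown in the figure, which are not given in this text. The one-dimensional search range, the kernel settings and `toy_validation_loss` (a stand-in for training a model and measuring validation loss) are all hypothetical.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a GP with constant mean."""
    y_mean = y_obs.mean()
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_new)  # shape (n_obs, n_new)
    mu = y_mean + K_s.T @ np.linalg.solve(K, y_obs - y_mean)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

_erf = np.vectorize(math.erf)

def expected_improvement(mu, sigma, best):
    """Expected improvement for minimisation, given the best loss so far."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf

def toy_validation_loss(log10_lr):
    """Hypothetical objective; a real run would train and validate a model."""
    return (log10_lr + 3.0) ** 2

# Hypothetical search space: log10(learning rate) between -5 and -1.
candidates = np.linspace(-5.0, -1.0, 201)
x_obs = np.array([-5.0, -2.0, -1.0])  # initial trials
y_obs = np.array([toy_validation_loss(x) for x in x_obs])
initial_best = y_obs.min()

for _ in range(10):
    mu, sigma = gp_posterior(x_obs, y_obs, candidates)
    ei = expected_improvement(mu, sigma, y_obs.min())
    x_next = candidates[np.argmax(ei)]  # most promising next trial
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, toy_validation_loss(x_next))

best_log10_lr = x_obs[np.argmin(y_obs)]
best_loss = y_obs.min()
```

Each iteration fits the surrogate to all trials so far and spends the next evaluation where the surrogate predicts the largest expected gain, which is what distinguishes this approach from grid or random search.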


Hong J, Hachem LD, Fehlings MG. (2022) A deep learning model to classify neoplastic state and tissue origin from transcriptomic data. Sci Rep 12(1):9669. [article]