Single-cell RNA-sequencing (scRNA-seq) has been rapidly adopted in the biomedical research community. Its astonishing popularity has led to a "land rush" of computational methods development, with hundreds of tools already published that address various aspects of scRNA-seq analysis workflows. What we might call "data reconstruction" methods have proven particularly popular. This class loosely covers matrix factorisation, autoencoder, factor analysis and related approaches that learn latent-space representations of single-cell data and "reconstruct" the original data matrix from that restricted (perhaps "denoised") representation. These methods hold much promise for the critical tasks of data normalisation, batch effect correction, dimension reduction and other preparation for downstream analyses. But with dozens of methods already published, and with high algorithmic and software complexity, it is challenging for data analysis practitioners to know which tools are worth their while to learn and apply. This project will systematically benchmark the performance (across many dimensions) of published scRNA-seq "data reconstruction" methods to provide practical and actionable recommendations about which broad classes of methods and specific software tools are most promising and practically useful in single-cell genomics data analysis.
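As a rough illustration of what "data reconstruction" means here, the sketch below uses a truncated singular value decomposition, one of the simplest matrix factorisations, to learn a low-dimensional embedding of cells and rebuild the expression matrix from it. It is a toy example under stated assumptions and not one of the methods the project will benchmark: the simulated counts, the log1p transform, the choice of k and the function name `low_rank_reconstruct` are all hypothetical choices made for illustration.

```python
import numpy as np

def low_rank_reconstruct(X, k):
    """Return (Z, X_hat): k-dimensional cell embeddings and the rank-k
    reconstruction of X (cells x genes) via a truncated SVD."""
    mu = X.mean(axis=0, keepdims=True)           # per-gene mean
    U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    Z = U[:, :k] * s[:k]                         # latent representation, cells x k
    X_hat = Z @ Vt[:k] + mu                      # "reconstructed" matrix
    return Z, X_hat

# Toy data (an assumption for illustration, not real scRNA-seq):
# low-rank log-rate structure observed through Poisson sampling noise.
rng = np.random.default_rng(0)
n_cells, n_genes, true_rank = 200, 50, 3
W = rng.normal(size=(n_cells, true_rank))
H = rng.normal(size=(true_rank, n_genes))
counts = rng.poisson(np.exp(0.5 * W @ H))
X = np.log1p(counts)                             # simple variance-stabilising transform

Z, X_hat = low_rank_reconstruct(X, k=3)

# Relative reconstruction error, measured against the per-gene-mean baseline.
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X - X.mean(axis=0))
print(f"embedding shape: {Z.shape}, relative reconstruction error: {rel_err:.3f}")
```

In this sketch `Z` plays the role of the latent-space representation and `X_hat` the "denoised" matrix. A benchmark of the kind described would compare many such mappings on criteria beyond raw reconstruction error.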
Supervised by:
Disease Focus:
Research Unit:
For further information about this project, contact: [email protected]