My research focuses on two major areas: 1) developing statistical and machine learning methods for the analysis of high-throughput biological data and implementing them in open-source software, and 2) applying such methods to analyse and interpret complex biological data to answer experimentally-driven questions.

I have contributed to popular methods for the analysis of differential expression in RNA-seq data, and have recently developed methods for the analysis of single-cell RNA-seq data. I primarily implement methods in the R and Python languages and publish open-source software packages through the Bioconductor project.

I am also interested in studying the effects of DNA variation on gene expression measured in individual cells. We can explore single-cell genetics in two ways: by studying the effects of common DNA variation on single-cell gene expression (single-cell quantitative trait locus mapping) and by studying the effects of somatic DNA mutations on single-cell gene expression (clonal cell populations). The former provides information about the genetic regulation of natural gene expression variation, while the latter informs us about the effects of mutations accumulated in the DNA of tissues, which are relevant both to healthy ageing and to cancer.

I enjoy collaborating with biologists and other researchers to contribute computational and data analysis expertise to biologically-focused studies.

Key achievements

2021-2025   NHMRC Investigator Grant (Emerging Leadership 2)

2016-2020   NHMRC Early Career (CJ Martin) Fellowship

2011-2014   General Sir John Monash Scholarship

Selected publications

Lyu, R., Tsui, V., Crismani, W., Liu, R., Shim, H., & McCarthy, D. J. (2022). sgcocaller and comapr: personalised haplotype assembly and comparative crossover map analysis using single-gamete sequencing data. Nucleic Acids Research.

Azodi, C. B., Zappia, L., Oshlack, A., & McCarthy, D. J. (2021). splatPop: simulating population scale single-cell RNA sequencing data. Genome Biology, 22(1), 341.

McCarthy, D. J., Rostom, R., Huang, Y., Kunz, D. J., Danecek, P., Bonder, M. J., Hagai, T., Lyu, R., HipSci Consortium, Wang, W., Gaffney, D. J., Simons, B. D., Stegle, O., & Teichmann, S. A. (2020). Cardelino: computational integration of somatic clonal substructure and single-cell transcriptomes. Nature Methods, 17(4), 414–421.

McCarthy, D. J., Campbell, K. R., Lun, A. T. L., & Wills, Q. F. (2017). Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R. Bioinformatics, 33(8), 1179–1186.

McCarthy, D. J., Chen, Y., & Smyth, G. K. (2012). Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Research, 40(10), 4288–4297.

Robinson, M. D., McCarthy, D. J., & Smyth, G. K. (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26(1), 139–140.
