Bioinformatics and cellular genomics
Head, Bioinformatics & Cellular Genomics Laboratory
NHMRC Early Career Fellow
- BA/BSc, University of Melbourne
- BSc (Hons), University of Melbourne
- DPhil, Department of Statistics, University of Oxford, UK
- Post-doctoral Fellow, EMBL-EBI, UK
- NHMRC Early Career Fellow
- Head, Bioinformatics and Cellular Genomics, St Vincent’s Institute
- General Sir John Monash Scholarship
- NHMRC Early Career (CJ Martin) Fellowship
My research focuses on two major areas: 1) developing statistical and machine learning methods for the analysis of high-throughput biological data and implementing them in open-source software, and 2) applying such methods to analyse and interpret complex biological data to answer experimentally driven questions.
I have contributed to popular methods for the analysis of differential expression in RNA-seq data, and have recently developed methods for the analysis of single-cell RNA-seq data. I primarily implement methods in the R and Python languages and publish open-source software packages through the Bioconductor project.
I am also interested in studying the effects of DNA variation on gene expression measured in individual cells. We can explore single-cell genetics in two ways: by studying the effects of common DNA variation on single-cell gene expression (single-cell quantitative trait locus mapping) and by studying the effects of somatic DNA mutations on single-cell gene expression (clonal cell populations). The former provides information about the genetic regulation of natural gene expression variation, while the latter informs us about the effects of accumulated DNA mutations in tissues, which are relevant both to healthy ageing and to cancer.
I enjoy collaborating with biologists and other researchers to contribute computational and data analysis expertise to biologically focused studies.
- Buettner F, Pratanwanich N, MCCARTHY DJ, Marioni JC, Stegle O. f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq. Genome Biol. 2017;18: 212.
- Kilpinen H, Goncalves A, Leha A, Afzal V, Alasoo K, Ashford S, et al. Common genetic variation drives molecular heterogeneity in human iPSCs. Nature. 2017;546: 370–375.
- MCCARTHY DJ, Campbell KR, Lun ATL, Wills QF. Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R. Bioinformatics. 2017;33: 1179–1186.
- Lun ATL, MCCARTHY DJ, Marioni JC. A step-by-step workflow for low-level analysis of single-cell RNA-seq data. F1000Res. 2016;5. doi:10.12688/f1000research.9501.
- Fuchsberger C, Flannick J, Teslovich TM, Mahajan A, Agarwala V, Gaulton KJ, et al. The genetic architecture of type 2 diabetes. Nature. 2016;536: 41–47.
- MCCARTHY DJ, Humburg P, Kanapin A, Rivas MA, Gaulton K, Cazier J-B, et al. Choice of transcripts and software has a large effect on variant annotation. Genome Med. 2014;6: 26.
- MCCARTHY DJ*, Chen Y*, Smyth GK. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 2012;40: 4288–4297.
- Lund SP, Nettleton D, MCCARTHY DJ, Smyth GK. Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates. Stat Appl Genet Mol Biol. 2012;11. doi:10.1515/1544-6115.1826.
- Robinson MD*, MCCARTHY DJ*, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26: 139–140.
- MCCARTHY DJ, Smyth GK. Testing significance relative to a fold-change threshold is a TREAT. Bioinformatics. 2009;25: 765–771.