The challenge of single-cell RNA-seq and differential expression

One of the common analysis tasks we have at Diamond Age is to analyze single-cell RNA-seq (scRNA-seq) data. Our customers are largely therapeutics-development biotechs who use this new technology to assess the impact of their development candidates on gene expression in selected cell types. scRNA-seq is a very different beast from its apparent predecessor, bulk RNA-seq. There are gotchas in both the experimental design and the analysis of this data that simply didn’t apply to the older technology. One of them relates to appropriate experimental design for differential expression studies.

The problem

Folks often think that when designing a scRNA-seq experiment, they need only collect data from one sample per treatment group to reliably find differences in gene expression. They are then surprised when we tell them that they need multiple biological replicates, even though each replicate provides them with measurements from 1000+ cells of their cell type of interest. A recent Twitter thread started by John Hogenesch (@jbhclock) makes it clear that this misconception is widespread.

Vito Zanotelli (@ZanotelliVRT) summed up the problem rather succinctly:

Vito RT Zanotelli (@ZanotelliVRT) tweets: People tend to forget that the statistically independent entity of single cell experiments is mostly still the biological sample and not the cells. Distributions/features of cells can be used to calculate properties of that sample that need to be confirmed by replication.

He’s right; the gene expression profiles of the individual cells in a sample aren’t independent measurements. They are more accurately described as repeated measurements on the sample.

Consider a single patient; we expect that B cells collected from one patient would have expression profiles more similar to each other than to B cells collected from another patient. If we dose one patient with drug and the other with vehicle, how do we know that differences in expression between those two patients’ B cells aren’t driven by biological differences between the patients? Short answer: we don’t. We *must* have more than one patient (or animal, or dish of cells) in each treatment group, no matter how many cells we collect from each.

Experimental design

To drive home the point: imagine if we measured blood glucose from a mouse 1000 times. Exsanguination aside, those 1000 replicated measurements give us a very good idea of what is happening in that one animal, but they don’t tell us much about the rest of mouse-kind. In single-cell RNA-seq, the gene expression profiles collected from 1000 different B cells from that mouse are analogous to those glucose measurements.

If we want to figure out how a drug affects B cells generally across all mice, we must treat multiple mice with the drug, and compare the gene expression of, say, 1000 B cells from each animal in one treatment group against the profiles from the animals in the other group. We treat those 1000 cells as repeated measurements of one animal, or one biological replicate. That means that our N is still counted in animals: three animals means we have three replicates, not 3000.

The upshot of this is that a properly-powered single-cell RNA-seq experiment can get quite expensive. As of this writing, the total cost of a scRNA-seq experiment is in the thousands of dollars *per sample*. If we need a minimum of three samples per group (and we do), that’s a hefty price tag. But it’s worth it to get real data.

Analyzing the data

Once we have a well-designed experiment with biological replicates, how do we handle the analysis? Most of the differential expression methods for single-cell analysis are only suitable for within-sample analysis: they treat each cell as an independent measurement and can only reliably tell you how one group of cells compares to another group in the same sample. When these methods are used to compare cells across samples, the thousands of cells get counted as thousands of replicates, and the tests return improbably low p-values.
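To make the low p-value problem concrete, here is a minimal simulation sketch in plain Python. It is a toy, not one of the actual single-cell methods: it assumes one gene with normally distributed expression, three animals per group, 200 cells per animal, and no drug effect at all. Every name and parameter value (false_positive_rates, animal_sd, cell_sd and so on) is made up for illustration, and a plain Student's t-test stands in for a real differential expression test. Each animal gets its own random baseline, and we count how often the test crosses the usual 5% threshold when cells are treated as replicates versus when each animal is first collapsed to its mean.

```python
import random


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def t_statistic(a, b):
    """Student's two-sample t statistic using a pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * sample_variance(a) + (nb - 1) * sample_variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled * (1 / na + 1 / nb)) ** 0.5


def simulate_group(rng, n_animals, n_cells, animal_sd, cell_sd):
    """One list of cell-level values per animal. There is no treatment effect."""
    group = []
    for _ in range(n_animals):
        baseline = rng.gauss(0.0, animal_sd)  # animal-to-animal variation
        group.append([rng.gauss(baseline, cell_sd) for _ in range(n_cells)])
    return group


def false_positive_rates(n_sims=300, n_animals=3, n_cells=200,
                         animal_sd=1.0, cell_sd=1.0, seed=0):
    """Fraction of null experiments called significant by each analysis."""
    rng = random.Random(seed)
    naive_hits = 0
    animal_hits = 0
    for _ in range(n_sims):
        drug = simulate_group(rng, n_animals, n_cells, animal_sd, cell_sd)
        vehicle = simulate_group(rng, n_animals, n_cells, animal_sd, cell_sd)

        # Naive analysis: every cell is treated as an independent replicate.
        drug_cells = [x for animal in drug for x in animal]
        vehicle_cells = [x for animal in vehicle for x in animal]
        # 1.96 approximates the two-sided 5% cutoff for ~1200 degrees of freedom.
        if abs(t_statistic(drug_cells, vehicle_cells)) > 1.96:
            naive_hits += 1

        # Replicate-aware analysis: one mean per animal, so N = 3 per group.
        drug_means = [mean(animal) for animal in drug]
        vehicle_means = [mean(animal) for animal in vehicle]
        # 2.776 is the two-sided 5% cutoff for 4 degrees of freedom.
        if abs(t_statistic(drug_means, vehicle_means)) > 2.776:
            animal_hits += 1
    return naive_hits / n_sims, animal_hits / n_sims


if __name__ == "__main__":
    naive_rate, animal_rate = false_positive_rates()
    print(f"cells as replicates:   {naive_rate:.2f} of null experiments 'significant'")
    print(f"animals as replicates: {animal_rate:.2f} of null experiments 'significant'")
```

Under these assumptions, the cell-level test should flag most of the null experiments as significant, while the animal-level test should stay near the nominal 5%. How bad the inflation gets depends on how large the animal-to-animal variation is relative to the cell-to-cell variation, which is exactly the thing a single sample per group cannot tell you.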

One tool that does handle differential expression across multiple samples properly is an R package called MAST. It does this by essentially grouping expression profiles from each sample together, and comparing those groups rather than comparing individual cells. It uses what’s called a mixed-effects model to accomplish this. It’s quite computationally intensive to use, but the results are solid. I’d love to hear from folks who have found other tools that do good work on these experimental designs.

Getting the most from the data

One of the hardest things to do in this business is to tell clients that the experiment they ran – the one that cost so much – isn’t going to give them the answers they need. Single-cell RNA-seq experiments are repeat offenders in this space because they are expensive and very new; despite the somewhat familiar name, they are very different beasts from good old bulk RNA-seq.

We hate seeing good mice (or even cell lines) go to waste. Reach out if you’d like to chat about experimental design and making sure your investment pays off.

–Eleanor