Human Cell Atlas: Comparison, calibration, and benchmarking of single-cell RNA-seq techniques

The Human Cell Atlas (HCA) project is a recent, worldwide initiative that aims to identify and functionally characterize all cell types in the human body, in health and in disease. For this purpose, the HCA makes extensive use of single-cell RNA-sequencing (scRNA-seq) methods, which profile cells from different tissues and developmental stages by their unique gene expression patterns.

However, there are many different scRNA-seq techniques and technologies, and they vary in cell and RNA capture efficiency, library preparation, and throughput. Importantly, they also vary in the sensitivity and accuracy of mRNA quantification. In addition, to map cells across tissue types and under different developmental and activation conditions, the HCA project will need to collate data generated by many labs around the world. This requires evaluating experimental replicability, comparing experimental procedures, and benchmarking their performance on selected cells and tissues that are representative of the many cell and tissue types in the human body and that capture both healthy and diseased states.

To aid in this effort, we performed a systematic comparison of two high-throughput single-cell RNA sequencing techniques, Drop-seq and DroNc-seq, in collaboration with Dr. Sebastian Pott, Human Genetics, and Dr. Yoav Gilad, Genetic Medicine. We used human iPSC-derived cardiomyocytes for this comparison. Cardiomyocytes make up the heart muscle and are an extremely important group of cells, but primary adult cardiomyocytes form dense tissue that is hard to dissociate into a suspension of intact single cells and is therefore difficult to use for single-cell RNA-seq. To bypass this difficulty, we chose induced pluripotent stem cells (iPSCs) that were then differentiated into cardiomyocytes in vitro.


We sampled the differentiating cardiomyocytes at several time-points, up to Day 15 of the differentiation process, to compare the changing RNA expression profiles of the iPSCs over their differentiation and development time-course. For reference, the iPSC-derived cardiomyocytes start beating spontaneously in the Petri dish on Day 6 or Day 7 of differentiation, so Day 7 is considered a significant time-point in iPSC-cardiomyocyte maturation. As with other iPSC-derived cells, only a fraction of the differentiating cells successfully became mature cardiomyocytes, and not all of the cells started beating simultaneously on Day 7. Considerable heterogeneity among single cells and asynchronous development are expected, but we were curious about the cells that lagged behind in differentiation (had not yet started beating) or failed to become cardiomyocytes altogether. High-throughput single-cell techniques such as Drop-seq and DroNc-seq are well suited to investigating this.

We ran Drop-seq and DroNc-seq on two batches of human iPSC differentiation, sampling cells on Days 0 (induction), 1, 3, 7 (beating) and 15, and analyzed the data using clustering and lineage tracing algorithms. Broadly speaking, Drop-seq and DroNc-seq performed very similarly. Both techniques identified the major cellular subtypes (top panels; Drop-seq, left, and DroNc-seq, right), including iPSCs, cardiac progenitor cells, cardiomyocytes, and a few other cell types that appeared to have differentiated into alternative lineages with markers of mesendodermal and endodermal cells. Not only were the same cell types identified, but they also made up similar fractions of the entire population, as seen in the middle panels (Drop-seq, left, and DroNc-seq, right). Cell differentiation trajectories inferred in pseudo-time showed two major branches of differentiating cells, one for cardiomyocytes and one for the alternative lineages, consistent with the cell types identified by the clustering analyses. In addition, Drop-seq identified a small proportion of cells from a further alternative lineage that DroNc-seq failed to detect.
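
As a rough illustration of the kind of comparison described above, the following minimal Python sketch computes the fraction of cells assigned to each cell type by two techniques and summarizes how far the two compositions differ. It is not the analysis pipeline used in this project: the function names, the use of total variation distance as a summary, and the toy labels are all assumptions made for illustration, and the numbers are invented rather than taken from our data.

```python
from collections import Counter


def cell_type_fractions(labels):
    """Return the fraction of cells assigned to each cell type."""
    if not labels:
        raise ValueError("labels must contain at least one cell")
    counts = Counter(labels)
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


def compare_fractions(labels_a, labels_b):
    """Pair up per-cell-type fractions from two techniques.

    Cell types seen by only one technique get a fraction of 0.0
    for the other, so they remain visible in the comparison.
    """
    frac_a = cell_type_fractions(labels_a)
    frac_b = cell_type_fractions(labels_b)
    all_types = sorted(set(frac_a) | set(frac_b))
    return {t: (frac_a.get(t, 0.0), frac_b.get(t, 0.0)) for t in all_types}


def total_variation_distance(labels_a, labels_b):
    """Half the sum of absolute fraction differences (0 = identical, 1 = disjoint)."""
    table = compare_fractions(labels_a, labels_b)
    return 0.5 * sum(abs(a - b) for a, b in table.values())


# Toy, invented cluster labels (hypothetical; NOT the project's data).
drop_seq_labels = (
    ["iPSC"] * 40
    + ["cardiac progenitor"] * 30
    + ["cardiomyocyte"] * 25
    + ["alternative lineage"] * 5
)
dronc_seq_labels = (
    ["iPSC"] * 38
    + ["cardiac progenitor"] * 32
    + ["cardiomyocyte"] * 30
)

if __name__ == "__main__":
    for cell_type, (a, b) in compare_fractions(drop_seq_labels, dronc_seq_labels).items():
        print(f"{cell_type:20s} Drop-seq={a:.2f} DroNc-seq={b:.2f}")
    print("total variation distance:",
          round(total_variation_distance(drop_seq_labels, dronc_seq_labels), 3))
```

In this sketch, a cell type detected by only one technique appears with a fraction of 0.0 for the other, which mirrors the situation described above where one lineage was seen by Drop-seq but not by DroNc-seq.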

For comparison, we were also able to use DroNc-seq to profile the cell types in primary adult heart tissue. We used a small fragment of frozen adult male heart tissue and found that the majority of the nuclei were from cardiomyocytes and myofibroblasts, as expected.

Our initial plan was to compare Drop-seq and DroNc-seq, two high-throughput 3′ DGE techniques, with the Fluidigm C1 platform, which offers lower throughput but full-length RNA-seq. Unfortunately, we were unable to load the iPSC-derived cardiomyocytes into the C1 cartridge, so we could not proceed with this set of experiments.

We are grateful to the Chan Zuckerberg Initiative for supporting the HCA with funding and with the computational and data-sharing infrastructure that we are using for this project.