Development of a Semantically-Coherent and Statistically-Comparable Representation of Reference Cell Types for the Human Cell Atlas

The mission of the Human Cell Atlas (HCA) is to create comprehensive reference maps of all cells in the healthy human body as a basis for understanding health and disease. To achieve this objective, it is essential that project data used to characterize cell types, cell states and lineage relationships are findable, accessible, interoperable, and reusable (the FAIR Principles) by the research community.

Historically, cell types were defined by “low-resolution” experiments that identify characteristic features, including anatomic location (e.g. splenocytes), cellular structures (spiderweb neurogliaform cells), marker protein expression (CD19+ B cells), cellular function (cytotoxic T cells), or some combination of these (hippocampal interneurons). Accordingly, the Cell Ontology developed standard representations of cell types based on semantic definitions that incorporate these characteristics as necessary and sufficient conditions.

More recently, new “high-throughput/high-content/high-resolution” (HT/HC/HR) technologies, including high-dimensional fluorescence and mass cytometry and single-cell RNA sequencing (scRNA-seq), are driving the discovery of novel cell phenotypes at an unprecedented scale. This rapid pace of technology-driven cell type discovery has revealed a serious bottleneck in our ability to define and quantitatively represent cell phenotypes so that they can be compared across studies.

We propose to extend our previous work developing biomedical ontologies, minimum information standards, and statistical cell data matching approaches to develop a scalable approach for semantically-coherent and statistically-comparable cell type definitions, using data from these emerging HT/HC/HR technologies.


BMC Bioinformatics. 2017-12-21; 18(Suppl 17): 559.
Cell type discovery and representation in the era of high-content single cell phenotyping
Bakken T, Cowell L, Aevermann BD, Novotny M, Hodge R, Miller JA, Lee A, Chang I, McCorrison J, Pulendran B, Qian Y, Schork NJ, Lasken RS, Lein ES, Scheuermann RH
PMID: 29322913


This project is funded by the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation.

Principal Investigator

Key Staff

  • Brian Aevermann, MS


Trygve Bakken, Jeremy A. Miller, and Ed Lein
Allen Institute for Brain Science

Alexander D. Diehl
University at Buffalo

David Osumi-Sutherland
European Bioinformatics Institute


Related Research