anndata: Annotated data

Virshup, I.; Rybakov, S.; Theis, F. J.; Angerer, P.; Wolf, F. A.

2021-12-19 bioinformatics

10.1101/2021.12.16.473007 bioRxiv

anndata is a Python package for handling annotated data matrices in memory and on disk (github.com/theislab/anndata), positioned between pandas and xarray. anndata offers a broad range of computationally efficient features including, among others, sparse data support, lazy operations, and a PyTorch interface.

Statement of need

Generating insight from high-dimensional data matrices typically works through training models that annotate observations and variables via low-dimensional representations. In exploratory data analysis, this involves iterative training and analysis using original and learned annotations and task-associated representations. anndata offers a canonical data structure for book-keeping these, a need addressed neither by pandas (McKinney, 2010), nor by xarray (Hoyer & Hamman, 2017), nor by commonly used modeling packages like scikit-learn (Pedregosa et al., 2011).

Matching journals

●Non-profit ◐University press ○Commercial

The top 3 journals account for 50% of the predicted probability mass.

◐ 1061 papers in training set

Bioinformatics Advances

◐ 184 papers in training set

◐ 172 papers in training set

50% of probability mass above

Journal of Open Source Software

● 22 papers in training set

Nature Communications

○ 4913 papers in training set

PLOS Computational Biology

● 1633 papers in training set

● 4510 papers in training set

Scientific Reports

○ 3102 papers in training set

BMC Bioinformatics

○ 383 papers in training set

Frontiers in Neuroinformatics

○ 38 papers in training set

● 5422 papers in training set

◐ 261 papers in training set

○ 70 papers in training set

Frontiers in Genetics

○ 197 papers in training set

Methods in Ecology and Evolution

○ 160 papers in training set

○ 1063 papers in training set

Frontiers in Physiology

○ 93 papers in training set

Computational and Structural Biotechnology Journal

● 216 papers in training set

Communications Biology

○ 886 papers in training set

Proceedings of the National Academy of Sciences

● 2130 papers in training set

Peer Community Journal

● 254 papers in training set