
anndata: Annotated data

Virshup, I.; Rybakov, S.; Theis, F. J.; Angerer, P.; Wolf, F. A.

2021-12-19 bioinformatics
10.1101/2021.12.16.473007 bioRxiv

anndata is a Python package for handling annotated data matrices in memory and on disk (github.com/theislab/anndata), positioned between pandas and xarray. anndata offers a broad range of computationally efficient features including, among others, sparse data support, lazy operations, and a PyTorch interface.

Statement of need

Generating insight from high-dimensional data matrices typically works through training models that annotate observations and variables via low-dimensional representations. In exploratory data analysis, this involves iterative training and analysis using original and learned annotations and task-associated representations. anndata offers a canonical data structure for book-keeping these, a need addressed neither by pandas (McKinney, 2010), nor by xarray (Hoyer & Hamman, 2017), nor by commonly used modeling packages like scikit-learn (Pedregosa et al., 2011).
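The book-keeping idea described above can be illustrated with a small standard-library sketch. This is a toy stand-in, not the anndata API: the class name `AnnotatedMatrix`, its fields, and the method `subset_obs` are hypothetical names chosen for illustration. It only shows the core invariant, namely that per-observation annotations, per-variable annotations, and learned representations stay aligned with the data matrix when observations are subset.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AnnotatedMatrix:
    """Toy annotated matrix (hypothetical; NOT the anndata API)."""

    X: list[list[float]]   # data matrix, n_obs rows by n_var columns
    obs: dict[str, list]   # per-observation annotations, each of length n_obs
    var: dict[str, list]   # per-variable annotations, each of length n_var
    # learned per-observation representations, each with n_obs rows
    obsm: dict[str, list[list[float]]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        n_obs = len(self.X)
        n_var = len(self.X[0]) if self.X else 0
        # Every annotation must match the matrix dimension it describes.
        for name, values in {**self.obs, **self.obsm}.items():
            if len(values) != n_obs:
                raise ValueError(f"{name!r} has {len(values)} entries, expected {n_obs}")
        for name, values in self.var.items():
            if len(values) != n_var:
                raise ValueError(f"{name!r} has {len(values)} entries, expected {n_var}")

    def subset_obs(self, keep: list[int]) -> AnnotatedMatrix:
        """Return a new object restricted to the observations in `keep`."""
        return AnnotatedMatrix(
            X=[self.X[i] for i in keep],
            obs={k: [v[i] for i in keep] for k, v in self.obs.items()},
            var={k: list(v) for k, v in self.var.items()},
            obsm={k: [v[i] for i in keep] for k, v in self.obsm.items()},
        )


# Made-up example data: 3 observations, 3 variables, one learned 2-D embedding.
data = AnnotatedMatrix(
    X=[[1.0, 0.0, 3.0], [0.0, 2.0, 0.0], [4.0, 0.0, 0.0]],
    obs={"cluster": ["a", "b", "a"]},
    var={"name": ["g1", "g2", "g3"]},
    obsm={"embedding": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]},
)

# Select observations by an annotation; matrix and annotations stay in sync.
keep = [i for i, label in enumerate(data.obs["cluster"]) if label == "a"]
sub = data.subset_obs(keep)
```

The design choice the sketch highlights is that subsetting is a single operation on the container rather than separate, error-prone indexing of the matrix and each annotation table.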

Matching journals

The top 3 journals account for 50% of the predicted probability mass.