
AlphaPeptStats: an open-source Python package for automated, scalable and industrial-strength statistical analysis of mass spectrometry-based proteomics

Krismer, E.; Strauss, M. T.; Mann, M.

2023-03-11 biochemistry
10.1101/2023.03.10.532057 bioRxiv

Summary: The widespread application of mass spectrometry (MS)-based proteomics in biomedical research increasingly requires robust, transparent and streamlined solutions to extract statistically reliable insights. Existing popular tools were generally developed for specific uses in academic environments and did not fully embrace current open-source principles and best practices of software engineering. We have designed and implemented AlphaPeptStats, an inclusive Python package with broad functionality for normalization, imputation, visualization and statistical analysis of proteomics data. It builds modularly on the established stack of Python scientific libraries and is accompanied by a rigorous testing framework with 98% test coverage. It imports the output of a range of popular search engines. Data can be filtered and normalized according to user specifications. At its heart, AlphaPeptStats provides a wide range of robust statistical algorithms such as t-tests, ANOVA, PCA, hierarchical clustering and multiple covariate analysis, all in an automatable manner. Data visualization capabilities include heat maps, volcano plots and scatter plots in publication-ready format. AlphaPeptStats advances proteomic research through robust tools that enable researchers to explore complex datasets manually or automatically and to identify interesting patterns and outliers.

Availability: AlphaPeptStats is implemented in Python and is part of the AlphaPept framework. It is released under a permissive Apache license. The source code and one-click installers are freely available on GitHub at https://github.com/MannLabs/alphapeptstats.

Contact: mmann@biochem.mpg.de, maximilian.strauss@cpr.ku.dk
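To make the kind of analysis named in the summary concrete, the following minimal sketch computes the two quantities that typically underlie a volcano plot: a per-protein log2 fold change and a Welch t statistic between two sample groups. It does not use the AlphaPeptStats API. The function names, the data layout and the toy intensities are all hypothetical, and only the Python standard library is used so that the example stays self-contained.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def differential_expression(intensities, group_a, group_b):
    """Per-protein log2 fold change (group_a minus group_b) and Welch t statistic.

    intensities: dict mapping protein -> dict mapping sample name -> raw intensity (> 0).
    group_a, group_b: lists of sample names (hypothetical layout, not a real file format).
    """
    results = {}
    for protein, by_sample in intensities.items():
        # Log-transform first, as is common for MS intensity data.
        a = [math.log2(by_sample[s]) for s in group_a]
        b = [math.log2(by_sample[s]) for s in group_b]
        results[protein] = {
            "log2_fc": statistics.mean(a) - statistics.mean(b),
            "t": welch_t(a, b),
        }
    return results


# Hypothetical toy data: two proteins, three samples per group.
toy_intensities = {
    "P1": {"d1": 1024.0, "d2": 2048.0, "d3": 4096.0,
           "c1": 256.0, "c2": 512.0, "c3": 1024.0},
    "P2": {"d1": 512.0, "d2": 1024.0, "d3": 2048.0,
           "c1": 512.0, "c2": 1024.0, "c3": 2048.0},
}
toy_results = differential_expression(
    toy_intensities, group_a=["d1", "d2", "d3"], group_b=["c1", "c2", "c3"]
)
```

Turning the t statistic into a p-value requires the t distribution with Welch-Satterthwaite degrees of freedom, which a scientific library would normally supply; that step, along with multiple-testing correction, is left out of this sketch.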

Matching journals

The top 3 journals account for 50% of the predicted probability mass.