
ProCAST: A Bioinformatics Suite for Mass Spectrometry-Based Protein Corona Proteomics Analysis

Mun, H.; Leamy, M.; Kaushik, A.; Kieslich, C.; Douglas-Green, S. A.

2026-05-12 bioinformatics
10.64898/2026.05.08.723620 bioRxiv

When nanoparticles are exposed to biological fluids, they spontaneously adsorb proteins, forming a protein corona that defines their biological identity and dictates cellular uptake, biodistribution, and toxicity. Characterizing protein coronas involves proteomics approaches (e.g., LC-MS/MS) that identify adsorbed proteins and generate vast protein lists, often visualized via complex heatmaps. While heatmaps display the data, they do not offer a heuristic guide, leaving the driving mechanisms of adsorption unknown. Moreover, interpretation of protein corona proteomics data remains limited by fragmented workflows, inconsistent preprocessing, and visual outputs that are often descriptive rather than readily interpretable. These conventional methods identify adsorbed proteins but fail to explain why specific proteins are selected or how they influence the particle's biological fate. Here, we developed ProCAST (Protein Corona Analysis and Statistical Tool), an R-based framework for protein corona proteomics that integrates proteomics data, nanoparticle metadata, protein annotations, and multi-level visualization within a single analytical workflow. ProCAST facilitates abundant protein clustering based on sample conditions, sequence descriptors, property or protein correlations, and gene ontology-based functional visualization. It also distinguishes abundant proteins from frequent proteins, providing distinct layers of information from the same dataset. ProCAST was used to re-analyze previously published PAMAM G4 dendrimer-FBS datasets, demonstrating that it reproduces descriptor-level visualizations and offers new insights through clearer comparisons of functional patterns and hypothesis generation from dominant corona proteins. By organizing results as complementary views of the same dataset, ProCAST facilitates the shift of protein corona analysis from descriptive outputs toward structured, comparative, and experimentally testable interpretations.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Nature Communications (4913 papers in training set; Top 9%; 14.9%)
2. Journal of Proteome Research (215 papers in training set; Top 0.2%; 14.5%)
3. Molecular & Cellular Proteomics (158 papers in training set; Top 0.3%; 7.3%)
4. Bioinformatics (1061 papers in training set; Top 4%; 6.4%)
5. PLOS ONE (4510 papers in training set; Top 30%; 4.9%)
6. Analytical Chemistry (205 papers in training set; Top 0.8%; 3.6%)

50% of probability mass above

7. PLOS Computational Biology (1633 papers in training set; Top 9%; 3.6%)
8. PROTEOMICS (35 papers in training set; Top 0.2%; 3.6%)
9. ACS Nano (99 papers in training set; Top 1%; 3.6%)
10. Genome Biology (555 papers in training set; Top 3%; 3.1%)
11. Journal of the American Society for Mass Spectrometry (33 papers in training set; Top 0.2%; 2.1%)
12. Cell Systems (167 papers in training set; Top 7%; 1.7%)
13. Nano Letters (63 papers in training set; Top 2%; 1.7%)
14. Nature Biotechnology (147 papers in training set; Top 5%; 1.5%)
15. Computational and Structural Biotechnology Journal (216 papers in training set; Top 5%; 1.5%)
16. mSystems (361 papers in training set; Top 6%; 1.2%)
17. Nature Methods (336 papers in training set; Top 5%; 1.0%)
18. Scientific Reports (3102 papers in training set; Top 70%; 0.9%)
19. Cell Reports Methods (141 papers in training set; Top 5%; 0.8%)
20. Protein Science (221 papers in training set; Top 2%; 0.7%)
21. Advanced Science (249 papers in training set; Top 20%; 0.7%)
22. GigaScience (172 papers in training set; Top 4%; 0.7%)
23. Journal of Structural Biology (58 papers in training set; Top 2%; 0.7%)
24. Cancer Research Communications (46 papers in training set; Top 2%; 0.5%)
25. mSphere (281 papers in training set; Top 7%; 0.5%)
26. iScience (1063 papers in training set; Top 40%; 0.5%)
27. Molecular Systems Biology (142 papers in training set; Top 3%; 0.5%)