PROTEOMICS

Wiley

Preprints posted in the last 90 days, ranked by how well they match PROTEOMICS's content profile, based on 35 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.

1
FiCOPS: Hardware/Software Co-Design of FPGA Computational Framework for Mass Spectrometry-Based Peptide Database Search

Kumar, S.; Zambreno, J.; Khokhar, A.; Akram, S.; Saeed, F.

2026-02-17 bioinformatics 10.64898/2026.02.15.706012 medRxiv
Top 0.1%
12.8%

Improving the speed and efficiency of database search algorithms that deduce peptides from mass spectrometry (MS) data has been an active area of research for more than three decades. The significance of the need for faster database search methods has rapidly increased due to the growing interest in studying non-model organisms, meta-proteomics, and proteogenomic data, which are notorious for their enormous search space. Poor scalability of serial algorithms with the growing size of the database and increasing parameters of post-translational modifications is a widely recognized problem. While high-performance computing techniques can be used on supercomputing machines, the need for real-time, on-the-instrument solutions necessitates the development of an efficient system-on-chip that optimizes design constraints such as cost, performance, and power of the system. To showcase that such a system can work, we present an FPGA-based computational framework called FiCOPS to accelerate database search using a hardware/software co-design methodology. First, we theoretically analyze the database-search algorithm (closed-search) to reveal opportunities for parallelism and uncover computational bottlenecks. We then design an FPGA-based architectural template to exploit parallelism inherent in the search workload. We also formulate an analytical performance model for the architecture template to perform rapid design space exploration and find a near-optimal accelerator configuration. Finally, we implement our design on the Intel Stratix 10 FPGA platform and evaluate it using real-world datasets. Our experiments demonstrate that FiCOPS achieves 3.5x speed-up over existing CPU solutions and 3x and 5x reduction in power consumption compared to existing CPU and GPU solutions.

2
The Cell Surface Proteome of Malignant Peripheral Nerve Sheath Tumors Reveals Therapeutic Targets

Stehn, C. M.; Wang, L.; Seeman, Z.; Largaespada, D. A.

2026-03-14 cancer biology 10.64898/2026.03.11.711103 medRxiv
Top 0.1%
12.3%

Malignant peripheral nerve sheath tumors (MPNSTs) are aggressive soft tissue sarcomas and the most common cause of disease-associated death for Neurofibromatosis Type 1 (NF1) patients. In the context of NF1, MPNSTs develop from benign premalignant precursors. The transition to malignancy is usually accompanied by loss of the polycomb repressive complex 2 (PRC2), leading to aberrant upregulation of many genes. The specific mechanisms disrupted by PRC2 loss remain incompletely understood. There is a significant gap in our knowledge of which cell-surface targets become derepressed and therapeutically actionable following PRC2 loss, contributing to the current lack of effective targeted therapies for MPNSTs. This study aims to address this gap by using cell-surface capture technology with mass spectrometry to profile MPNST models. In doing so, we define PRC2-dependent effects on the cell surface proteome, including specific biological pathways that are enhanced or suppressed at the cell surface protein level. We also create an MPNST cell-surface protein compendium comprised of proteins that are highly expressed across a variety of well-defined MPNST models. We prioritized proteins that are preferentially expressed in MPNST or other cancers and for which FDA-approved therapies already exist. Specific proteins from this compendium were molecularly targeted with antibody-drug conjugates in these models to surmise their therapeutic efficacy. Results reveal PTK7 as a novel and promising target for MPNST. In total, these efforts represent a step toward addressing the knowledge gap in MPNST genesis and identifying new therapeutic targets for further testing. Additionally, this data serves as a resource for other researchers wishing to characterize specific molecular targets.
KEY POINTS: PRC2 modulates key MPNST signaling pathways through the cell surface proteome; Cell surface proteomics identifies a plethora of therapeutic targets for MPNST targeted therapy; Antibody-drug conjugates targeting PTK7 show enhanced efficacy in reducing MPNST viability.
IMPORTANCE OF THE STUDY: This study utilizes advances in biochemistry to profile the surface proteome of malignant peripheral nerve sheath tumors. In doing so, it identifies many proteins whose presence is abundant on the cell surface of MPNST cells. Pre-clinical drug testing shows that use of antibody-drug conjugates may be effective in killing MPNST cells when targeted to epitopes identified in our MPNST cell surface proteome compendium. This study is a departure from more commonly used transcriptomic methods to identify cell surface proteins by using direct surface capture and mass spectrometry, providing a more direct measurement of cell surface protein abundance. Additionally, it identifies a handful of proteins which can be directly targeted pharmaceutically and one in particular, PTK7, whose targeting is highly effective in killing MPNST cells.

3
No One-Size-Fits-All: An Evidence-Based Framework to Select Plasma EV Isolation Methods

Werle, S. J.; Nautrup Therkelsen, M. L.; Groenborg, M.; Gluud, L. L.; Daamgard, D.

2026-03-11 molecular biology 10.64898/2026.03.09.710675 medRxiv
Top 0.1%
8.8%

Extracellular vesicles (EVs) hold significant promise as biomarkers, but their clinical translation is constrained by variability in pre-analytical handling and isolation. EV isolation methods directly shape which EV populations are captured and characterized, yet systematic method comparisons across multiple analytical dimensions are limited. We comprehensively evaluated eleven EV isolation methods to define their performance and applications. EVs were quantified by NanoFCM, profiled for tetraspanins (CD9, CD63, CD81) via MSD assays, and further characterized by LC-MS/MS proteomics. We show that different EV isolation methods recover different EV populations. Our data provide guidance on method selection based on downstream application needs and serve as a look-up tool if a protein of interest is detected. EV isolation methods broadened proteome coverage but showed divergent performance and recovered different EV populations. While all methods captured EVs in the 50-150 nm range, centrifugation and ultracentrifugation identified the broadest proteomes (up to 1093 proteins), driven by higher plasma protein carryover. Conversely, ExoEasy and qEV 70 isolated larger EVs and achieved stronger depletion of abundant plasma proteins but showed lower proteome coverage. A total of 117 proteins were detected across all isolation methods. Pre-clearing samples removed contaminants but at the cost of protein identifications. We demonstrate that method selection must align with the specific analytical goal: centrifugation for comprehensive proteome profiling, affinity/size-exclusion methods for contaminant-sensitive assays, and precipitation for high-throughput applications. This systematic characterization provides an evidence-based framework and look-up resource for matching isolation strategies to downstream applications and research questions.
Graphical Abstract for Table of Contents: This study evaluated 11 extracellular vesicle (EV) isolation methods, which enriched distinct EV subpopulations with varying degrees of contaminants. No single approach optimized purity or proteome coverage; in this paper we present an Evidence-Based Framework to select plasma EV isolation methods based on downstream application needs.

4
Carafe2 enables high quality in silico spectral library generation for timsTOF data-independent acquisition proteomics

Wen, B.; Paez, J. S.; Hsu, C.; Canzani, D.; Chang, A. T.; Shulman, N.; MacLean, B. X.; Berg, M. D.; Villen, J.; Fondrie, W.; Pino, L.; MacCoss, M. J.; Noble, W. S.

2026-03-31 bioinformatics 10.64898/2026.03.27.714846 medRxiv
Top 0.1%
8.6%

Data-independent acquisition (DIA) proteomics enables reproducible and systematic peptide detection and quantification, and trapped ion mobility spectrometry (TIMS) on the timsTOF platform further improves DIA by synchronizing ion mobility separation with quadrupole precursor sampling. Analyzing the highly multiplexed spectra generated by DIA typically relies on spectral libraries, and fully leveraging the additional ion mobility dimension requires these libraries to include accurate retention time, fragment ion intensity, and ion mobility annotations. Existing in silico spectral library generation tools either lack ion mobility support entirely or rely on models trained on data-dependent acquisition (DDA) data, which can introduce a mismatch and may not capture the experiment-specific biases of each timsTOF dataset. Carafe is a software tool that uses deep learning models to generate high-quality, experiment-specific in silico libraries by training directly on DIA data. In this study, we extend Carafe to generate libraries for timsTOF DIA data, which involves fine-tuning retention time (RT), fragment ion intensity, and ion mobility prediction models using timsTOF DIA data. Carafe2 operates directly on native timsTOF raw data (Bruker .d directories) without the need for data conversion. We demonstrate the performance of Carafe2 across a wide range of DIA applications, including global proteome, phosphoproteome, and plasma proteome datasets. Comparing Carafe2 fine-tuned RT, fragment ion intensity, and ion mobility prediction models with pretrained DDA models, we find that Carafe2 models outperform pretrained models on a variety of DIA datasets. We then demonstrate the utility of in silico libraries generated by Carafe2 for peptide detection on several different types of timsTOF DIA datasets by comparing with the libraries generated with DDA-trained AlphaPeptDeep models, DIA-NN built-in models, and empirical spectral libraries generated from DDA experiments.

5
Importance of taking Single Amino Acid Variant and accessory proteome variability into account in Data Independent Acquisition Proteomics: illustrated with Legionella pneumophila analysis

Dupas, A.; Ibranosyan, M.; Ginevra, C.; Jarraud, S.; Lemoine, J.

2026-04-03 bioinformatics 10.64898/2026.04.01.715759 medRxiv
Top 0.1%
8.6%

Understanding allelic variability is crucial for elucidating intrinsic bacterial mechanisms and distinguishing phenotypic profiles. However, such variability poses a major challenge for the reliable identification of proteins in data-independent acquisition (DIA) proteomics. To address this, we developed an analytical workflow that integrates protein sequence variability to enhance proteome coverage. Fifteen Legionella pneumophila isolates were analyzed using DIA-NN, with spectral libraries generated either from a reference proteome or incorporating allelic variability. Our workflow includes protein clustering and subsequent protein inference from these clusters, allowing the accurate assignment of shared and variant-specific peptides. Integration of variability enabled the identification of a comparable number of proteins as the reference proteome while capturing between 28% and 77% of variant-specific sequences in each isolate, all while maintaining a low false positive rate. These findings demonstrate that accounting for allelic variability substantially improves proteomic coverage and identification confidence, providing a more comprehensive view of the proteome. This approach facilitates a deeper understanding of biological mechanisms and enables precise bacterial proteotyping of Legionella pneumophila isolates.

6
Direct empirical in-house assessment of peptide proteotypicity for targeted proteomics

Butenko, I. O.; Kitsilovskaya, N. A.; Vakaryuk, A. V.; Lazareva, A. A.; Gremyacheva, V. D.; Kovalenko, A. V.; Lebedeva, A. A.; Baraboshkin, N. M.; Chudinov, I. K.; Khchoian, A. G.; Kurylova, O. V.; Gorbunov, K. S.; Pavlenko, A.; Kozhemyakin, G. L.; Fedorov, O. V.; Ilina, E.; Govorun, V. M.

2026-02-23 genomics 10.64898/2026.02.22.699713 medRxiv
Top 0.1%
7.3%

In bottom-up proteomics, it was shown early on that even when a protein is present in a sample, only a subset of its proteolytic peptides will be detected by LC-MS analysis. The property of a peptide being frequently detected whenever its source protein is identified was called proteotypicity. Much effort has since been applied to predicting proteotypic peptides and summarizing evidence on peptide detection. Nevertheless, when a targeted proteomics method is being developed, prediction or inference from communal experience might be inaccurate, and prior knowledge of true peptide proteotypicity in a selected setup for a selected population is necessary. In this work we test a fully in-house approach to proteotypicity assessment, including comprehensive peptide synthesis and detection verification. Proteotypicity and the contribution of sample processing and biology-related factors are estimated in a model experiment for three plasma proteins: albumin, ceruloplasmin and C-reactive protein.

7
Standardized brain and plasma EV enrichment pipeline validated for Single sample multi-Omic and fatty acids applications in Mouse and Human

Barry-Carroll, L.; varilh, m.; Marchaland, F.; Chen, C. T.; Sadeyen, A.-L.; Dupuy, J. W.; McDade, K.; Millar, T.; Bazinet, R.; Laye, S.; Raymond, A.-A.; Favereaux, A.; Madore, C.; Delpech, J. C.

2026-01-24 neuroscience 10.64898/2026.01.22.700328 medRxiv
Top 0.1%
6.9%

Extracellular vesicles (EVs) are key mediators of intercellular communication, yet their molecular profiles across tissues and species remain poorly characterized, particularly because currently available methods require a large amount of biological material (tissue or biofluids). Here, we established a workflow allowing the deep phenotyping of EV cargos starting from single samples of human and mouse origin. We took advantage of standardised EV isolation procedures and multi-omic techniques for the isolation and analysis of EVs from brain and plasma of human and mouse, integrating flow cytometric profiling, proteomics, miRNA sequencing, and fatty acid profiling. Here we report a specific brain-derived EV proteome, enriched in neuronal and glial proteins, polyunsaturated fatty acid profiles, and distinct miRNAs. At the periphery, we also report plasma-derived EV signatures reflecting immune, metabolic, and systemic transport functions. Despite these expected material-specific differences, EVs from the same source displayed greater similarity across species than EVs from different materials, supporting the translational relevance of mouse models. Importantly, using a state-of-the-art miRNA profiling approach, we identified novel EV-specific miRNAs in human and mouse brain EVs, potentially allowing the exploration of new roles in neuronal signalling. Overall, we report here a method enabling deep multi-omic characterization from minimal starting material, offering a practical approach for studies with limited biological samples. These findings also demonstrate that the origin strongly shapes EV composition, highlighting conserved and species-specific molecular features, and provide a scalable framework for multi-omic investigations of EV biology.
Summary Statement: We present a standardised workflow allowing multi-omic profiling of brain and plasma-derived EVs from minimal human and mouse material. Our findings reveal both tissue-specific and species-specific EV molecular signatures.

8
DIA-NN EasyFilter workflow for the fast and user-friendly critical assessment and visualization of DIA-NN proteomics analysis outcome

Moagi, M. G.; Thatiana, F. F.; Kristof, E. K.; Arda, A. G.; Arianti, R.; Horvatovich, P.; Csosz, E.

2026-03-10 bioinformatics 10.64898/2026.03.07.710308 medRxiv
Top 0.1%
6.6%

Liquid chromatography-tandem mass spectrometry (LC-MS/MS) based proteomics, particularly data-independent acquisition (DIA), has become widely adopted across One Health approaches for biological and clinical research for quantitative protein characterization. Among the many computational tools available, DIA-NN has demonstrated superior performance; however, the primary output of the current versions is provided as a compact, compressed PARQUET file that can be difficult to interrogate without programming expertise. To address this limitation, we developed DIA-NN EasyFilter (DEF), a fast, user-friendly, KNIME-based workflow for comprehensive protein filtering and visualization. DEF integrates chromatographic peak-based filtering, curated contaminant libraries, and quantity-quality assessment, along with interactive modules for qualitative and quantitative data exploration. The workflow is optimized for efficient execution within the KNIME local desktop environment and is designed to support end-users in improving accuracy and interpretability without requiring coding skills. We provide a detailed description of how to run DEF and demonstrate the utility and robustness of DEF using published large-scale proteomics datasets, showing high comparability across studies regardless of instrument platform or dataset size.

9
Peptide-to-protein data aggregation using Fisher's method improves target identification in chemical proteomics

Lyu, H.; Gharibi, H.; Meng, Z.; Sokolova, B.; Zhang, X.; Zubarev, R.

2026-02-04 bioinformatics 10.64898/2026.02.02.702201 medRxiv
Top 0.1%
6.5%

Protein-level statistical tests in proteomics aimed at obtaining a p-value are conventionally performed on protein abundances aggregated from peptide data. This integral approach overlooks peptide-level heterogeneity and ignores important information coded in individual peptide data, while a protein p-value can also be obtained by Fisher's method of combining peptide p-values using chi-square statistics. Here we test this latter approach across diverse chemical proteomics datasets based on assessments of protein expression, solubility and protease accessibility. Using the top four peptides ranked by their p-values consistently outperformed protein-level analysis and avoided biases introduced by inclusion of deviant peptides or imputation of missing peptide values. Fisher's method provides a simple and robust strategy, improving identification of regulated/shifted proteins in diverse proteomics assays.
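To make the aggregation step concrete, here is a minimal sketch of Fisher's method applied to the smallest peptide p-values of one protein. It is an illustration only, not the authors' code: the function names and the `top_n` parameter are hypothetical, the reading of "top four" as the four smallest p-values is an assumption, and p-values are assumed to lie in (0, 1]. Fisher's statistic is X = -2 * sum(ln p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis for k independent p-values.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-square survival function for an even number of degrees of freedom.

    For df = 2k the closed form is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combine(p_values):
    """Combine independent p-values (each in (0, 1]) with Fisher's method."""
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_df(statistic, 2 * len(p_values))


def protein_p_value(peptide_p_values, top_n=4):
    """Hypothetical helper: combine the top_n smallest peptide p-values."""
    top = sorted(peptide_p_values)[:top_n]
    return fisher_combine(top)


# Illustrative peptide p-values for a single protein (made-up numbers).
peptides = [0.9, 0.01, 0.5, 0.02, 0.03, 0.04]
combined = protein_p_value(peptides)
```

The closed-form survival function keeps the sketch free of third-party dependencies; with SciPy installed, `scipy.stats.combine_pvalues(p, method="fisher")` computes the same statistic.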

10
High resolution, proteome-wide mapping of subcellular protein localization in plants

van Schie, M.; Roosjen, M.; Albrecht, C.; van Marsdijk, J.; Weijers, D.

2026-03-02 plant biology 10.64898/2026.02.27.708449 medRxiv
Top 0.1%
6.5%

Protein function is intimately connected to subcellular localization, and experimental determination of protein localization is a key element of understanding biological roles. However, even in the best-studied model plants, such as Arabidopsis thaliana, a minority of proteins has an experimentally defined subcellular localization. We present an experimental strategy to globally map plant subcellular proteomes by mass spectrometry. We annotated subcellular localization of 7815 proteins in Arabidopsis roots, 4672 in Arabidopsis seedlings, and 2782 in the liverwort Marchantia polymorpha. By independent validation, we find that these annotations are highly predictive and can be integrated with other proteomics datasets. Cross-species comparisons reveal substantial global conservation of subcellular localization. Furthermore, we demonstrate that the same approach can be used to identify dynamically translocating proteins upon treatment or in a mutant. This work shows the power of global spatial proteome mapping in plants and offers an extensive resource for protein subcellular localization in plants.
Highlights: Optimized approach for global mapping of protein subcellular localization by differential centrifugation in plants; Interactive resource of subcellular localization of plant proteins at unprecedented depth and resolution; Cross-species comparison reveals that the plant subcellular proteome is deeply conserved; Comparative subcellular proteomics of a Brefeldin A treatment and a gnom mutant robustly describes global shifts in protein localization.

11
Assessing extracellular vesicle proteins as predictive biomarkers for developing type 1 diabetes

Dakup, P. P.; Bramer, L.; Schepmoes, A.; Diaz Ludovico, I.; Flores, J.; Mirmira, R.; Webb-Robertson, B.-J.; Metz, T. O.; Sims, E. K.; Nakayasu, E. S.

2026-02-09 systems biology 10.64898/2026.02.06.703600 medRxiv
Top 0.1%
6.5%

Plasma extracellular vesicles (EVs) are considered excellent sources for biomarker discovery since they carry signatures of their cellular origin and disease processes. In this paper, we evaluate the potential of plasma EV proteomics analysis for identifying predictive biomarkers of developing type 1 diabetes (T1D), which results from autoimmune destruction of insulin-producing β cells in the islet. We used strong anion exchange beads (Mag-Net) to capture plasma EVs from 19 donors with islet autoimmunity (diagnosed by circulating autoantibodies against islet proteins - AAB+) vs. 17 control individuals and analyzed their protein cargo by mass spectrometry. The analysis identified and quantified 5,480 proteins, a 3.2-fold increase in proteome coverage compared to our previous T1D biomarker proteomics study that used whole plasma depleted of the 14 most abundant proteins. The Mag-Net approach also detected 1,306 out of the 1,717 proteins (76%) that we previously verified as EV proteins. Statistical tests revealed 448 proteins to be differentially abundant in AAB+ vs control volunteers, including 69 previously verified EV proteins. A functional-enrichment analysis resulted in overrepresentation of 25 pathways among the differentially abundant proteins, including pathways related to autoimmune response and lipid metabolism. The capacity of this data to predict AAB+ was tested with a machine learning analysis using a random forest model, resulting in a receiver operating characteristic-area under the curve of 0.81. Overall, our study indicates that plasma EV proteomics analysis can be an exciting approach for studying biomarkers for developing T1D.
Significance of the study: Type 1 diabetes (T1D) is a disease characterized by the body's inability to produce insulin and consequently, to control blood glucose levels. Despite the initial trigger being unclear, the disease development process involves an autoimmune response to the islets of Langerhans, resulting in the death of insulin-producing β cells. There is no cure for the disease, and treatment relies on exogenous administration of insulin. Therefore, preventive therapies that block the autoimmune process are attractive for treating T1D. In fact, anti-CD3 antibody (Teplizumab) delays the onset of T1D by 2 years by targeting T cells. Predictive biomarkers for developing T1D are needed to aid the development and implementation of new therapies and to identify the initial trigger and mechanisms of the islet autoimmune process. In this paper, we assess the potential of plasma extracellular vesicle (EV) proteomics analysis for identifying predictive biomarkers of T1D. Our results show excellent potential of the approach, opening opportunities to perform broader studies to identify biomarkers for developing T1D.

12
Is Protein Quantification and Physical Normalization Always Necessary in Proteomics?

Zelter, A.; Riffle, M.; Merrihew, G. E.; Mutawe, B.; Maurais, A.; Inman, J. L.; Celniker, S. E.; Mao, J.-H.; Wan, K. H.; Snijders, A. M.; Wu, C. C.; MacCoss, M. J.

2026-02-15 biochemistry 10.64898/2026.02.13.705808 medRxiv
Top 0.1%
6.5%

Dogma suggests protein quantification is a pre-requisite to LC-MS/MS based proteomics studies. Such quantification allows a standardized ratio of sample to digestion enzyme and enables physical normalization of protein digest loaded onto the mass spectrometer for analysis. Most proteomics studies include these steps. However, there are significant costs in time, money and experimental complexity, associated with performing protein quantification and physical normalization for every sample, especially for larger studies. Proteomics data analysis pipelines typically include computational normalization strategies to compensate for unavoidable systematic biases. These strategies also have the potential to compensate for avoidable variation such as omitting sample amount normalization. Here we investigate the effects of either physically normalizing the amount of protein for each individual sample or leaving it unnormalized. Our results show the relationship between increased protein amount variation in sample input, and the variance of quantified relative abundances of peptides and proteins output after data analysis. The experiments presented here suggest that protein quantification and physical normalization steps can be omitted from some quantitative proteomic experiments without incurring an unacceptable increase in measurement variability after computational normalization has been applied. This work will enable important time and cost saving optimizations to be made to many proteomics workflows.

13
Proteome landscape of B-cell malignancies identifies mantle cell lymphoma protein signature

Swenson, S. A.; Winship, C. B.; Dobish, K. K.; Wittorf, K. J.; Law, H. C.; Vose, J. M.; Greiner, T.; Green, M. R.; Woods, N. T. R.; Buckley, S. M.

2026-03-05 cancer biology 10.64898/2026.03.02.709116 medRxiv
Top 0.1%
6.4%

Mantle cell lymphoma (MCL) is one of the deadliest forms of non-Hodgkin B-cell lymphoma. Typically, patients present with both overexpression of CyclinD1 and secondary mutations identified by genomic sequencing. Although MCL patients may initially respond to treatment, they eventually relapse and succumb to disease, highlighting the essential need to identify new targets for treatment. Here we performed proteomic profiling of healthy B cells and three different forms of B-cell malignancies, including MCL, to define the proteomic signature of MCL. We compared the proteome of each to MCL and identified 10 proteins that are specifically upregulated in MCL. Of these 10 proteins, seven show no transcriptional changes and have been overlooked by conventional RNA expression analysis. Further analysis of the proteomic signature reveals potential avenues for dual targeting in CAR T-cell therapy and provides guidance for personalized therapeutics based on protein expression.
STATEMENT OF SIGNIFICANCE: We present a resource defining the protein landscape of MCL, CLL, and FL as compared to healthy B cells, identified using quantitative proteomics from primary patient samples. Applied to MCL, our results identify 10 proteins specifically upregulated in MCL that may prove to be therapeutic targets to treat the disease.

14
Evaluation and application of chemical decrosslinking in the context of histopathological spatial proteomics

Nwosu, A. J.; Chen, L.; Kumar, R.; Kwon, Y.; Goodyear, S. M.; Kardosh, A.; Fulcher, J. M.; Pasa-Tolic, L.

2026-02-09 cancer biology 10.64898/2026.02.06.704439 medRxiv
Top 0.1%
6.4%

Laser capture microdissection (LCM)-based spatial mass spectrometry proteomics is a rapidly emerging technique with strong potential for use in formalin-fixed, paraffin-embedded (FFPE) tissues. Several sample-preparation methods have been developed to decrosslink FFPE proteins for spatial proteomics; however, residual crosslinks often remain, and depth can remain impaired relative to fresh frozen tissue samples. To increase proteome coverage in spatially resolved LCM-FFPE samples, we investigated a panel of chemical compounds with the potential to catalyze the decrosslinking of nucleophilic functional groups on proteins. Systematic screening and optimization of temperature, incubation time, and reagent concentration led to the identification of 3,4-diaminobenzoic acid as an effective agent for improving proteome coverage in FFPE pancreatic tissue. This compound could boost precursor identifications by more than 10% at both reduced (70 °C) and high (90 °C) temperatures. Application of this chemical-decrosslinking strategy to a pancreatic ductal adenocarcinoma tissue section enabled the identification of numerous cell-type-enriched proteins with clinical and therapeutic relevance. Taken together, our findings show that chemical decrosslinking can increase proteome coverage in FFPE tissues, thereby advancing our understanding of tissue microenvironments in physiological and pathological contexts.

15
A Benchmarking Framework for Comparative Evaluation of Low-Complexity Region Detection Tools in the Human Proteome

Chatterjee, A.; Vijay, N.

2026-01-26 bioinformatics 10.64898/2026.01.24.701293 medRxiv
Top 0.1%
6.4%

Low-complexity regions (LCRs) are compositionally biased segments of proteins that play critical roles in molecular recognition, structural flexibility, and phase separation. Yet, their accurate detection remains challenging due to methodological variability among computational tools. In this study, we conducted a comprehensive benchmarking of eight widely used LCR detection methods (with different parameter settings) across the Homo sapiens proteome. A modular computational framework was developed to systematically compare LCR characteristics, including residue-centric analyses such as length distributions and coverage percentages. Protein-centric analyses consisted of compositional bias, amino acid composition, and Shannon entropy. Consensus analyses revealed that regions detected by multiple tools were typically longer, more repetitive, and compositionally purer, suggesting stronger structural or functional relevance. Jaccard similarity matrices demonstrated distinct clustering patterns among algorithms based on shared detection principles. Additionally, entropy and purity analyses highlighted fundamental differences in sequence complexity captured by each tool. Together, these results provide a unified, reproducible framework for evaluating LCR detection performance and offer practical guidelines for reliably annotating low-complexity regions in proteome-scale studies.
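Two of the metrics named in this abstract, Shannon entropy and Jaccard similarity, are simple enough to sketch. The snippet below is an assumption-laden illustration rather than the benchmark's actual code: the function names, the representation of LCR calls as inclusive `(start, end)` intervals, the residue-level definition of overlap, and the convention of returning 0.0 for two empty call sets are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def shannon_entropy(sequence):
    """Shannon entropy (bits per residue) of a protein segment."""
    n = len(sequence)
    if n == 0:
        return 0.0
    counts = Counter(sequence)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def residues(intervals):
    """Expand inclusive (start, end) intervals into a set of positions."""
    return {pos for start, end in intervals for pos in range(start, end + 1)}


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two position sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty call sets
    return len(a & b) / len(union)


# Hypothetical LCR calls from two tools on the same protein.
tool_a = residues([(1, 10)])
tool_b = residues([(6, 15)])
similarity = jaccard(tool_a, tool_b)  # 5 shared positions out of 15

low = shannon_entropy("QQQQQQQQ")   # homopolymer: minimal entropy
high = shannon_entropy("ACDE")      # four distinct residues, equal frequency
```

Lower entropy indicates a more compositionally biased segment, which is why the metric is a natural way to compare what different tools label as low complexity.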

16
PTMOverlay: A Proteomic Tool to Visualize Post-Translational Modifications Across Evolution

Krieger, C.; Everton, Z.; You, Y.; Lewis, B.; Bank, T.; Burnet, M. C.; Williams, S.; Walukiewicz, H.; Rao, C.; Wolfe, A.; Payne, S. H.; Nakayasu, E. S.

2026-02-06 systems biology 10.64898/2026.02.03.703592 medRxiv
Top 0.1%
6.3%

Evolutionary conservation has been considered a hallmark of essential basic functions in cells. Therefore, the study of evolutionarily conserved post-translational modifications (PTMs) can provide insight into their role in protein function. In this context, mass spectrometry can identify and quantify thousands of PTM sites. However, a major bottleneck lies in analyzing the large amounts of data collected by the mass spectrometer. Here we address the need for a protein sequence alignment tool for multiple PTMs across several species. We developed a tool named PTMOverlay that takes peptide identification output files and overlays PTM sites onto multiple protein sequence alignments. Examining 31 bacteria isolates, we combined their protein sequences with select PTM types, including acetylation, phosphorylation, monomethylation, dimethylation, and trimethylation. The tool revealed a variety of conserved modification sites on the bacterial central carbon metabolism. Further structural analysis revealed possible interactions between methylated arginine and lysine residues with phosphothreonine/serine sites on the homodimer interface of enolase. Overall, this tool can parse large amounts of mass spectrometry data and allows for more informed and efficient selection of sites for future studies of protein function.
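The core idea of overlaying PTM sites on a multiple sequence alignment can be sketched as a coordinate mapping from ungapped residue positions to alignment columns. This is not PTMOverlay's implementation: the function names, the dictionary-based input layout, the `-` gap character, and the toy sequences are all assumptions made for illustration.

```python
def alignment_columns(aligned_seq, gap="-"):
    """Map 1-based ungapped residue positions to 0-based alignment columns."""
    mapping = {}
    position = 0
    for column, char in enumerate(aligned_seq):
        if char != gap:
            position += 1
            mapping[position] = column
    return mapping


def overlay_ptms(alignment, ptm_sites):
    """Group PTM sites by alignment column.

    alignment: {species: aligned sequence with gaps}
    ptm_sites: {species: {1-based residue position: PTM name}}
    returns:   {alignment column: {species: PTM name}}
    """
    overlay = {}
    for species, aligned_seq in alignment.items():
        columns = alignment_columns(aligned_seq)
        for position, ptm in ptm_sites.get(species, {}).items():
            column = columns.get(position)
            if column is None:
                continue  # site lies outside the aligned sequence
            overlay.setdefault(column, {})[species] = ptm
    return overlay


# Toy alignment: species_1 has a gap where species_2 has a serine.
alignment = {"species_1": "MK-TAY", "species_2": "MKSTAY"}
ptm_sites = {
    "species_1": {3: "phospho"},  # T is residue 3 in species_1
    "species_2": {4: "phospho"},  # T is residue 4 in species_2
}
overlay = overlay_ptms(alignment, ptm_sites)
```

Columns where several species carry the same modification are candidates for conserved sites, which is the kind of pattern the abstract describes.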

17
Decades of dreams coming true: capillary zone electrophoresis-mass spectrometry for reproducible multi-level proteomics

Zhu, G.; Yue, Y.; Rosado, J. A. C.; Gao, G.; Liu, X.; Sun, L.

2026-01-31 systems biology 10.64898/2026.01.28.702308 medRxiv
Top 0.1%
5.0%

Capillary zone electrophoresis (CZE)-mass spectrometry (MS) was proposed decades ago as a powerful analytical tool for bottom-up, top-down, and native proteomics (multi-level proteomics), analyzing complex biological samples at the levels of peptides (bottom-up), proteoforms (top-down), and complexoforms (native). However, its broad adoption has been impeded by limited robustness and reproducibility. Here, we present multi-level proteomics data from nearly 170 CZE-MS runs (~170 hours of instrument time), demonstrating qualitatively (i.e., the number of identified peptides and proteoforms, the number of detected complexoforms, and their migration time) and quantitatively (i.e., peptide, proteoform, and complexoform intensity) reproducible measurement of samples with varying levels of complexity, i.e., Escherichia coli cells, HeLa cells, and human plasma. CZE-MS-based native proteomics enabled the detection of hundreds of complexoforms up to 800 kDa from these complex systems while consuming only nanograms of protein material. The results indicate that CZE-MS is sensitive and reproducible enough for broad adoption in multi-level proteomics-based biomedical research.

18
Proteomics for cultivated meat: the importance of Analytical Standardization

Palma, J.; Leblanc, C. C.; Kusters, R.; Kamgang Nzekoue, A. F.

2026-03-25 systems biology 10.64898/2026.03.23.713501 medRxiv
Top 0.1%
4.8%

Cultivated meat production requires robust and validated analytical methods for comprehensive characterization. While transcriptomics-based approaches establish the foundational profile of molecular analysis, proteomics provides additional resolution that further enhances scientific certainty in both product development and safety characterization. However, industry adoption of proteomics is currently hindered by technical complexity and a critical lack of analytical standardization, which leads to significant workflow-dependent variations in proteome coverage. To address this gap, we investigated the influence of key workflow steps (digestion, cleanup, LC-MS conditions) on the proteome profile of cultivated duck biomass. We compared five bottom-up sample preparation protocols: two traditional in-solution options (urea and SDC-based protocols), two device-based approaches (PreOmics iST and EasyPep kits), and an innovative protocol (SPEED). Device-based protocols offered the highest peptide yield and proteome coverage. However, optimization allowed cost-effective in-solution methods to achieve comparable performance. Specifically, an optimal digestion time of 3 hours at 37 °C and the use of polymer-based desalting columns significantly enhanced protein identification (~4500-5000 IDs). Moreover, data independent acquisition (DIA) provided deeper proteome coverage than data dependent acquisition (DDA) with higher precision (~6500 vs 5000 IDs). The validated Standard Operating Procedures presented here establish a standardized framework for bulk bottom-up proteomics in cultivated meat, facilitating the generation of reliable and comparable data required for robust multi-omics characterization.
Graphical abstract: Figure 1.
Highlights:
Complexity and non-standardization limit MS-proteomics use in cultivated meat (CM).
CM protein profile varies with sample prep, LC-MS, and data processing pipeline.
Device-based and optimized cost-effective protocols offer a high proteome coverage.
Proteomics can complement transcriptomics for a comprehensive CM characterization.
Proposed standardized methods ensure reliable data for future regulatory submissions.

19
From variability to consensus: rescoring harmonizes peptide identification across diverse search engines and datasets

Winkelhardt, D.; Berres, S.; Uszkoreit, J.

2026-03-06 bioinformatics 10.64898/2026.03.04.709532 medRxiv
Top 0.1%
4.6%

Peptide-spectrum match (PSM) rescoring has become standard in proteomics workflows, improving peptide identification accuracy across diverse search engines. Despite the availability of multiple rescoring strategies, systematic comparisons spanning several search engines, datasets, and database configurations remain limited. Here, we benchmarked seven publicly available search engines, evaluating standard target-decoy-based false discovery rate (FDR) estimation alongside Percolator, MS2Rescore, and Oktoberfest across four datasets acquired on different mass spectrometry platforms and searched against protein databases of varying size and composition. Rescoring substantially increased identification consensus and reduced variability between search engines, with prediction-based approaches yielding the largest gains. While database size had limited impact for human datasets, it significantly affected identification rates on a metaproteomic dataset. Entrapment-based evaluation indicated generally adequate FDR control across methods, although prediction-based rescoring exhibited a slightly higher tendency toward FDR underestimation in specific configurations. Overall, advanced rescoring strategies harmonize peptide identification outcomes across search engines, thereby enhancing robustness and comparability in proteomics analyses. However, careful feature selection and appropriate database choice remain essential to ensure reliable FDR control and optimal performance across diverse experimental settings.
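For readers unfamiliar with target-decoy FDR estimation, the baseline against which the rescoring tools are compared, a minimal sketch follows. It is not taken from the preprint: it assumes a concatenated target-decoy search, estimates FDR at each score threshold as decoys divided by targets, and converts those estimates to q-values. The function name and the input layout are hypothetical, and real pipelines may use a different estimator.

```python
def target_decoy_qvalues(psms):
    """Estimate q-values for PSMs from a concatenated target-decoy search.

    psms: list of (score, is_decoy) tuples, higher score = better match.
    Returns a list of (score, is_decoy, q_value) sorted by descending score.
    Assumption: FDR at a threshold is estimated as #decoys / #targets.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Running FDR estimate when accepting every PSM down to each rank.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # q-value: the smallest FDR at which this PSM would still be accepted,
    # i.e. the minimum FDR over this rank and all lower-scoring ranks.
    q_values = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, q_values)]


# Toy input: five PSMs with made-up scores, two of them decoys.
example = [(10, False), (9, False), (8, True), (7, False), (6, True)]
results = target_decoy_qvalues(example)
accepted = [s for s, is_decoy, q in results if not is_decoy and q <= 0.01]
```

Rescoring tools such as Percolator replace the raw search-engine score with a learned one before a thresholding step of this kind; the sketch covers only the thresholding.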

20
Targeted follicular fluid proteomics using reverse phase protein arrays (RPPA); a feasibility study

Bloom, M. S.; Sanchez, V. G.; Fujimoto, V. Y.; Tamrat, M.; Krall, J. R.; Espina, V.

2026-02-04 sexual and reproductive health 10.64898/2026.02.02.26345389 medRxiv
Top 0.1%
4.4%

This small pilot feasibility study shows that reverse phase protein array (RPPA) technology is a useful tool for targeted proteomics analysis in human ovarian follicular fluid. RPPA supplements the mass spectrometry approaches currently in use by providing functional data on the signal transduction that drives cellular biology. Herein, we present the first report of using RPPA in follicular fluid to elucidate protein signaling pathways. The results show potential associations between follicular fluid proteins measured with RPPA and reproductive outcomes from in vitro fertilization, including oocyte maturity, oocyte fertilization, embryo quality, and pregnancy. This study provides evidence that RPPA is a feasible approach for clinical studies of reproductive endpoints. However, a larger study of RPPA is needed to identify diagnostic and prognostic follicular fluid protein biomarkers of infertility.