
PROTEOMICS

Wiley

Preprints posted in the last 90 days, ranked by how well they match PROTEOMICS's content profile, based on 35 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.

1
Network-based integration of cross-dataset proteomic profiles using fold-change directionality

Nishizaki, M.; Araki, N.; Kawano, S.

2026-04-22 bioinformatics 10.64898/2026.04.19.718092 medRxiv
Top 0.1%
12.8%

Motivation: The rapid expansion of proteomic data has created new opportunities for large-scale integrative analyses. However, substantial variability across platforms, experimental designs, and processing pipelines limits direct quantitative comparisons among studies. Differential proteomic changes between conditions are often considered to be more reproducible than absolute abundances and may therefore provide a robust basis for cross-dataset integration. However, it remains unclear how well differential change-based approaches capture biologically meaningful relationships across heterogeneous datasets. Results: We developed a differential-change framework and applied it to public proteomic datasets. Pairwise contrasts were defined as differential proteomic profiles, and the concordance of up- and down-regulated proteins was quantified using odds ratios. Significant profile pairs were visualized as an integrative network. The comparison of treatment with the anti-cancer drug doxorubicin versus control in MCF-7 cells emerged as a central hub, with breast cancer proteome profiles clustering around it and associating with tumor stage (p = 0.03). Enrichment analysis revealed overrepresentation of lipid- and cholesterol-related pathways. Availability and implementation: The source code for proteome network integration is available at https://github.com/manakanishizaki/proteome-network-integration.git.
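The abstract above quantifies the concordance of up- and down-regulated proteins between two differential profiles with an odds ratio. The following is a minimal sketch of one way such a score could be computed, not the authors' implementation (their code is at the linked repository). The function name `direction_odds_ratio`, the dictionary input format, and the 0.5 pseudocount (a Haldane-Anscombe style correction against empty cells) are assumptions made for illustration.

```python
from typing import Dict


def direction_odds_ratio(
    profile_a: Dict[str, float],
    profile_b: Dict[str, float],
    pseudocount: float = 0.5,  # assumed correction to avoid division by zero
) -> float:
    """Odds ratio of fold-change direction agreement between two profiles.

    Each profile maps a protein identifier to a signed log fold change.
    Only proteins present in both profiles with a non-zero change are counted.
    """
    up_up = up_down = down_up = down_down = 0
    for protein in profile_a.keys() & profile_b.keys():
        a, b = profile_a[protein], profile_b[protein]
        if a == 0 or b == 0:
            continue  # no direction to compare
        if a > 0 and b > 0:
            up_up += 1
        elif a > 0 and b < 0:
            up_down += 1
        elif a < 0 and b > 0:
            down_up += 1
        else:
            down_down += 1
    concordant = (up_up + pseudocount) * (down_down + pseudocount)
    discordant = (up_down + pseudocount) * (down_up + pseudocount)
    return concordant / discordant


# Hypothetical toy profiles (log2 fold changes), for illustration only.
profile_a = {"P1": 1.2, "P2": 0.8, "P3": -0.5, "P4": -1.1, "P5": 0.3}
profile_b = {"P1": 0.9, "P2": 0.4, "P3": -0.7, "P4": 0.2, "P6": 1.0}
odds_ratio = direction_odds_ratio(profile_a, profile_b)
print(odds_ratio)
```

An odds ratio above 1 indicates that the two profiles tend to change in the same direction; a significance test (for example Fisher's exact test on the same 2x2 table) would be needed before drawing an edge in a network.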

2
Systems-Informed prioritization of Exosomal Protein Candidates in TNBC Identifies an ECM Invasion Module and Nominates Agrin as a High-Priority Target

Nguyen, T. M.

2026-05-19 cancer biology 10.64898/2026.05.14.725271 medRxiv
Top 0.1%
12.6%

Background: Triple-negative breast cancer (TNBC) remains the most clinically challenging breast cancer subtype, in part due to the absence of validated molecular targets and the limited availability of non-invasive early detection strategies. Tumor-derived exosomes have emerged as promising liquid biopsy analytes, yet the functional organization of their protein cargo and the identification of biologically meaningful candidates remain incompletely characterized. Methods: We present a Composite Driver Score (CDS) framework that integrates differential expression magnitude with protein-protein interaction network topology and Analytic Hierarchy Process (AHP)-based multi-criteria weighting to prioritize exosomal protein candidates in a systems-informed manner. The framework was applied to publicly available label-free quantitative proteomic datasets comparing MDA-MB-231 (TNBC) and MCF-10A (non-tumorigenic) exosomal fractions, with cross-dataset validation performed on an independent proteomic dataset. Results: CDS prioritization was robust to variations in proteome depth and parameter weighting, consistently recovering a functionally coherent set of extracellular matrix (ECM) and adhesion-associated proteins. Network and pathway analyses revealed coordinated co-enrichment of integrin receptors, cognate ECM ligands, and associated co-receptors, consistent with selective packaging of a functionally integrated invasion module. Agrin (AGRN), a heparan sulfate proteoglycan with limited prior characterization in TNBC exosome biology, emerged as a high-priority candidate through its network integration within this ECM program. Conclusions: These findings support a model in which TNBC-derived exosomes carry coordinated molecular programs capable of modulating extracellular matrix architecture. The CDS framework offers a transferable strategy for integrative exosomal biomarker prioritization and a systems-level foundation for targeted liquid biopsy panel development.

3
The Cell Surface Proteome of Malignant Peripheral Nerve Sheath Tumors Reveals Therapeutic Targets

Stehn, C. M.; Wang, L.; Seeman, Z.; Largaespada, D. A.

2026-03-14 cancer biology 10.64898/2026.03.11.711103 medRxiv
Top 0.1%
12.3%

Malignant peripheral nerve sheath tumors (MPNSTs) are aggressive soft tissue sarcomas and the most common cause of disease-associated death for Neurofibromatosis Type 1 (NF1) patients. In the context of NF1, MPNSTs develop from benign premalignant precursors. The transition to malignancy is usually accompanied by loss of the polycomb repressive complex 2 (PRC2), leading to aberrant upregulation of many genes. The specific mechanisms disrupted by PRC2 loss remain incompletely understood. There is a significant gap in our knowledge of which cell-surface targets become derepressed and therapeutically actionable following PRC2 loss, contributing to the current lack of effective targeted therapies for MPNSTs. This study aims to address this gap by using cell-surface capture technology with mass spectrometry to profile MPNST models. In doing so, we define PRC2-dependent effects on the cell surface proteome, including specific biological pathways that are enhanced or suppressed at the cell surface protein level. We also create an MPNST cell-surface protein compendium composed of proteins that are highly expressed across a variety of well-defined MPNST models. We prioritized proteins that are preferentially expressed in MPNST or other cancers and for which FDA-approved therapies already exist. Specific proteins from this compendium were targeted with antibody-drug conjugates in these models to assess their therapeutic efficacy. Results reveal PTK7 as a novel and promising target for MPNST. In total, these efforts represent a step toward addressing the knowledge gap in MPNST genesis and identifying new therapeutic targets for further testing. Additionally, these data serve as a resource for other researchers wishing to characterize specific molecular targets.
KEY POINTS: PRC2 modulates key MPNST signaling pathways through the cell surface proteome; cell surface proteomics identifies a plethora of therapeutic targets for MPNST targeted therapy; antibody-drug conjugates targeting PTK7 show enhanced efficacy in reducing MPNST viability.
IMPORTANCE OF THE STUDY: This study utilizes advances in biochemistry to profile the surface proteome of malignant peripheral nerve sheath tumors. In doing so, it identifies many proteins that are abundant on the surface of MPNST cells. Pre-clinical drug testing shows that antibody-drug conjugates may be effective in killing MPNST cells when targeted to epitopes identified in our MPNST cell surface proteome compendium. This study departs from the more commonly used transcriptomic methods for identifying cell surface proteins by using direct surface capture and mass spectrometry, providing a more direct measurement of cell surface protein abundance. Additionally, it identifies a handful of proteins that can be directly targeted pharmaceutically and one in particular, PTK7, whose targeting is highly effective in killing MPNST cells.

4
No One-Size-Fits-All: An Evidence-Based Framework to Select Plasma EV Isolation Methods

Werle, S. J.; Nautrup Therkelsen, M. L.; Groenborg, M.; Gluud, L. L.; Daamgard, D.

2026-03-11 molecular biology 10.64898/2026.03.09.710675 medRxiv
Top 0.1%
8.8%

Extracellular vesicles (EVs) hold significant promise as biomarkers, but their clinical translation is constrained by variability in pre-analytical handling and isolation. EV isolation methods directly shape which EV populations are captured and characterized, yet systematic method comparisons across multiple analytical dimensions are limited. We comprehensively evaluated eleven EV isolation methods to define their performance and applications. EVs were quantified by NanoFCM, profiled for tetraspanins (CD9, CD63, CD81) via MSD assays, and further characterized by LC-MS/MS proteomics. We show that different EV isolation methods recover different EV populations. Our data provide guidance on method selection based on downstream application needs and serve as a look-up tool if a protein of interest is detected. EV isolation methods broadened proteome coverage but showed divergent performance. While all methods captured EVs in the 50-150 nm range, centrifugation and ultracentrifugation identified the broadest proteomes (up to 1093 proteins), driven by higher plasma protein carryover. Conversely, ExoEasy and qEV 70 isolated larger EVs and achieved stronger depletion of abundant plasma proteins but showed lower proteome coverage. A total of 117 proteins were detected across all isolation methods. Pre-clearing samples removed contaminants but at the cost of protein identifications. We demonstrate that method selection must align with the specific analytical goal: centrifugation for comprehensive proteome profiling, affinity/size-exclusion methods for contaminant-sensitive assays, and precipitation for high-throughput applications. This systematic characterization provides an evidence-based framework and look-up resource for matching isolation strategies to downstream applications and research questions.
Graphical Abstract for Table of Contents: This study evaluated 11 extracellular vesicle (EV) isolation methods, which enriched distinct EV subpopulations with varying degrees of contaminants. No single approach optimized purity or proteome coverage; in this paper we present an Evidence-Based Framework to select plasma EV isolation methods based on downstream application needs.

5
Carafe2 enables high quality in silico spectral library generation for timsTOF data-independent acquisition proteomics

Wen, B.; Paez, J. S.; Hsu, C.; Canzani, D.; Chang, A. T.; Shulman, N.; MacLean, B. X.; Berg, M. D.; Villen, J.; Fondrie, W.; Pino, L.; MacCoss, M. J.; Noble, W. S.

2026-03-31 bioinformatics 10.64898/2026.03.27.714846 medRxiv
Top 0.1%
8.6%

Data-independent acquisition (DIA) proteomics enables reproducible and systematic peptide detection and quantification, and trapped ion mobility spectrometry (TIMS) on the timsTOF platform further improves DIA by synchronizing ion mobility separation with quadrupole precursor sampling. Analyzing the highly multiplexed spectra generated by DIA typically relies on spectral libraries, and fully leveraging the additional ion mobility dimension requires these libraries to include accurate retention time, fragment ion intensity, and ion mobility annotations. Existing in silico spectral library generation tools either lack ion mobility support entirely or rely on models trained on data-dependent acquisition (DDA) data, which can introduce a mismatch and may fail to capture the experiment-specific biases of a given timsTOF dataset. Carafe is a software tool that uses deep learning models to generate high-quality, experiment-specific in silico libraries by training directly on DIA data. In this study, we extend Carafe to generate libraries for timsTOF DIA data, which involves fine-tuning retention time (RT), fragment ion intensity, and ion mobility prediction models using timsTOF DIA data. Carafe2 operates directly on native timsTOF raw data (Bruker .d directories) without the need for data conversion. We demonstrate the performance of Carafe2 across a wide range of DIA applications, including global proteome, phosphoproteome, and plasma proteome datasets. Comparing Carafe2 fine-tuned RT, fragment ion intensity, and ion mobility prediction models with pretrained DDA models, we find that Carafe2 models outperform pretrained models on a variety of DIA datasets. We then demonstrate the utility of in silico libraries generated by Carafe2 for peptide detection on several different types of timsTOF DIA datasets by comparing them with libraries generated with DDA-trained AlphaPeptDeep models, DIA-NN built-in models, and empirical spectral libraries generated from DDA experiments.

6
Importance of taking Single Amino Acid Variant and accessory proteome variability into account in Data Independent Acquisition Proteomics: illustrated with Legionella pneumophila analysis

Dupas, A.; Ibranosyan, M.; Ginevra, C.; Jarraud, S.; Lemoine, J.

2026-04-03 bioinformatics 10.64898/2026.04.01.715759 medRxiv
Top 0.1%
8.6%

Understanding allelic variability is crucial for elucidating intrinsic bacterial mechanisms and distinguishing phenotypic profiles. However, such variability poses a major challenge for the reliable identification of proteins in data-independent acquisition (DIA) proteomics. To address this, we developed an analytical workflow that integrates protein sequence variability to enhance proteome coverage. Fifteen Legionella pneumophila isolates were analyzed using DIA-NN, with spectral libraries generated either from a reference proteome or incorporating allelic variability. Our workflow includes protein clustering and subsequent protein inference from these clusters, allowing the accurate assignment of shared and variant-specific peptides. Integration of variability identified a number of proteins comparable to that obtained with the reference proteome while capturing between 28% and 77% of variant-specific sequences in each isolate, all while maintaining a low false positive rate. These findings demonstrate that accounting for allelic variability substantially improves proteomic coverage and identification confidence, providing a more comprehensive view of the proteome. This approach facilitates a deeper understanding of biological mechanisms and enables precise bacterial proteotyping of Legionella pneumophila isolates.

7
BioTrendFinder - an interactive web tool for exploring functional drivers in gene- and protein-level bulk omics data

Gronning, A. G. B.; Scheele, C.

2026-04-14 bioinformatics 10.64898/2026.04.12.717932 medRxiv
Top 0.1%
7.7%

The analysis of bulk omics data, such as RNA-seq and proteomics, has enabled numerous biological discoveries. Standard analytical workflows typically comprise dimensionality reduction, group-wise statistical comparisons, functional enrichment analysis, and mapping of molecules to biological networks. Although informative, these steps are often applied independently, limiting integrative interpretation and the efficient identification of functional drivers and candidate targets. To address these limitations, we developed BioTrendFinder, an interactive web tool for exploring functional drivers in gene- and protein-level bulk omics data. BioTrendFinder employs a sample-ranking strategy to identify significant molecular trendlines that capture expression patterns across ranked sample compositions in dimensionally reduced data. These trends are integrated with statistical results, sample-group metadata and functional information from STRING and eleven bio-ontologies, enabling interactive network-based exploration and the generation of entity-ranked functional modules. BioTrendFinder's unique approach and functionalities add analytical dimensions to bulk omics data by facilitating the extraction of high-level information from alternative analytical perspectives. Using previously published proteomics and transcriptomics datasets, we demonstrate that BioTrendFinder supports both hypothesis-driven and exploratory investigations, enabling the prioritization of candidate molecular targets and effectively narrowing the search space for downstream validation steps.

8
De-N-glycosylation of in vivo and in vitro adipogenic stem cell products unmasks differential expression of CD36 glycoprotein in human adipogenesis

Wongtrakul-Kish, K.; Herbert, B. R.; Haynes, P. A.; Packer, N. H.

2026-05-05 cell biology 10.64898/2026.05.01.722121 medRxiv
Top 0.1%
7.3%

Adipogenesis is the process by which adipose-derived stem cells (ADSCs) respond to extracellular signals from the stem cell niche and differentiate into adipocytes (fat cells). It may be studied in vitro using a cocktail of chemicals that promote adipogenic differentiation to produce differentiated ADSCs (dADSCs). The global membrane N- and O-glycosylation changes of this process have been previously analysed and compared to native adipocytes as a benchmark for a true adipocyte profile, and revealed that bisecting GlcNAc type N-glycans are characteristic of adipogenesis. As stem cell differentiation has been widely reported to result in cellular protein changes, the same cells (ADSCs, dADSCs and mature adipocytes) were characterised here for their membrane proteome using label-free quantitative shotgun proteomics analysis. The membrane proteome displayed more differences in protein numbers between the cell types than the previously reported N-glycome, which had shown highly similar glycomes between stem cells and in vitro dADSCs, suggesting that the proteome is more dynamic during in vitro adipogenesis. Following the global shotgun proteomics analysis, a more targeted proteomic analysis of de-N-glycosylated peptides from gel-separated proteins unearthed new glycoproteins not detected in the shotgun proteomic analysis. This approach showed that the adipogenic marker CD36 was under-represented in the shotgun proteome analysis but was the dominant (glyco)protein in the adipocyte membrane proteome; it was also up-regulated at the mRNA transcript level in both the in vitro differentiated ADSCs (7.1-fold increase) and mature adipocytes (102.9-fold increase). A comparison of CD36 sequence coverage in the global shotgun analysis with the de-N-glycosylated CD36 revealed a 41% increase when N-glycans were removed prior to trypsin digestion, explaining its observed increased abundance and highlighting the crucial need for de-N-glycosylation of proteins in proteomics experiments for increased identification of glycoproteins. The systems glycobiology approach, integrating previously reported glycomics data with the proteomics and transcriptomics analyses in this work, extended the investigation of membrane protein glycosylation changes in adipose-derived stem cell differentiation. The work provides a framework for future glycoproteomics-based investigations into the differentiation of stem cells into adipocytes, and will allow their related pathologies and potential therapeutic applications to be discovered.

9
A comparison of four proteomics software for hair proteome analyses

Mukonyora, M.

2026-04-20 bioinformatics 10.64898/2026.04.17.719199 medRxiv
Top 0.1%
7.1%

Hair has applications in biomarker discovery and forensics, yet the influence of proteomics software tools on hair proteome characterisation remains underexplored. This study compares four bottom-up proteomics workflows (MaxQuant, FragPipe, MetaMorpheus, and SearchGUI/PeptideShaker). Publicly available hair proteomes were analysed following extraction with 1-dodecyl-3-methylimidazolium chloride (DMC), sodium dodecanoate (SDD), sodium dodecyl sulfate (SDS), and urea. Data were acquired on Orbitrap-based DDA platforms. Peptide identification, protein inference, functional annotation, physicochemical properties, and label-free quantification (LFQ) were evaluated. Peptide-level performance differed across tools. MS-GF+ and FragPipe identified the most unique peptides, while X!Tandem reported the fewest. Protein inference did not track peptide-level results. MetaMorpheus reported the highest number of protein groups despite having only the third-highest peptide counts. FragPipe and MaxQuant followed, while PeptideShaker consistently inferred the fewest proteins. Protein-level concordance was low, with only 30.3% overlap across tools and extraction methods. These differences extended to downstream analyses. Functional enrichment showed moderate concordance (38.25% overlap). Physicochemical profiles varied, with MetaMorpheus identifying more hydrophobic proteomes and PeptideShaker more hydrophilic profiles. At the quantitative level, reproducibility depended on extraction buffer. SDS and urea showed lower variability (CV ≤ 0.025), while DMC and SDD showed higher variability (up to 0.10). Absolute LFQ intensities and differential expression outputs varied across tools despite moderate to strong correlation (r = 0.77 to 0.93). Overall, software choice influences proteome coverage, physicochemical profiles, and quantitative outcomes. Relative trends were partially conserved, but magnitude and significance varied. These findings support careful method selection and multi-tool validation in hair proteomics.
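The variability figures quoted in this abstract are coefficients of variation (CV), the standard deviation divided by the mean. As a minimal sketch only: the abstract does not say whether the CVs were computed on raw or log-transformed LFQ intensities, or with the sample or population standard deviation, so the sample standard deviation and the toy replicate values below are assumptions.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Return sample standard deviation divided by the mean.

    Assumes at least two replicate values and a non-zero mean.
    """
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical replicate intensities for one protein, for illustration only.
replicates = [100.0, 102.0, 98.0]
cv = coefficient_of_variation(replicates)
print(cv)
```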

10
DIA-NN EasyFilter workflow for the fast and user-friendly critical assessment and visualization of DIA-NN proteomics analysis outcome

Moagi, M. G.; Thatiana, F. F.; Kristof, E. K.; Arda, A. G.; Arianti, R.; Horvatovich, P.; Csosz, E.

2026-03-10 bioinformatics 10.64898/2026.03.07.710308 medRxiv
Top 0.1%
6.6%

Liquid chromatography-tandem mass spectrometry (LC-MS/MS) based proteomics, particularly data-independent acquisition (DIA), has become widely adopted in One Health approaches to biological and clinical research for quantitative protein characterization. Among the many computational tools available, DIA-NN has demonstrated superior performance; however, the primary output of the current versions is provided as a compact, compressed PARQUET file that can be difficult to interrogate without programming expertise. To address this limitation, we developed DIA-NN EasyFilter (DEF), a fast, user-friendly, KNIME-based workflow for comprehensive protein filtering and visualization. DEF integrates chromatographic peak-based filtering, curated contaminant libraries, and quantity-quality assessment, along with interactive modules for qualitative and quantitative data exploration. The workflow is optimized for efficient execution within the KNIME local desktop environment and is designed to support end-users in improving accuracy and interpretability without requiring coding skills. We provide a detailed description of how to run DEF and demonstrate its utility and robustness using published large-scale proteomics datasets, showing high comparability across studies regardless of instrument platform or dataset size.

11
Proteome landscape of B-cell malignancies identifies mantle cell lymphoma protein signature

Swenson, S. A.; Winship, C. B.; Dobish, K. K.; Wittorf, K. J.; Law, H. C.; Vose, J. M.; Greiner, T.; Green, M. R.; Woods, N. T. R.; Buckley, S. M.

2026-03-05 cancer biology 10.64898/2026.03.02.709116 medRxiv
Top 0.1%
6.4%

Mantle cell lymphoma (MCL) is one of the deadliest forms of non-Hodgkin B-cell lymphoma. Typically, patients present with both overexpression of CyclinD1 and secondary mutations identified by genomic sequencing. Although MCL patients may initially respond to treatment, they eventually relapse and succumb to disease, highlighting the essential need to identify new targets for treatment. Here we performed proteomic profiling of healthy B cells and three different forms of B-cell malignancies, including MCL, to define the proteomic signature of MCL. We compared the proteome of each to MCL and identified 10 proteins that are specifically upregulated in MCL. Of these 10 proteins, seven show no transcriptional changes and have been overlooked by conventional RNA expression analysis. Further analysis of the proteomic signature reveals potential avenues for dual targeting in CAR T-cell therapy and provides guidance for personalized therapeutics based on protein expression. STATEMENT OF SIGNIFICANCE: We present a resource defining the protein landscape of MCL, CLL, and FL as compared to healthy B cells, identified using quantitative proteomics of primary patient samples. Applied to MCL, our results identify 10 proteins specifically upregulated in MCL that may prove to be therapeutic targets to treat the disease.

12
Trypsin exhibits exopeptidase-like activity toward N-terminal arginine that biases proteomic analyses

Ambrose, E. A.; Kandasamy, G.; Meulener, M. M.; Zhang, F.

2026-05-16 biochemistry 10.64898/2026.05.15.725550 medRxiv
Top 0.1%
5.0%

Many proteomics protocols rely on enzymatic digestion of complex protein mixtures to generate peptides with predictable cleavage patterns for mass spectrometry analysis. One of the most utilized enzymes, trypsin, is classically defined as a serine endopeptidase with high specificity for cleaving peptide bonds on the C-terminal side of internal lysine and arginine residues. Accordingly, trypsin is not expected to remove an N-terminal arginine, which may arise through posttranslational modification such as arginylation or by proteolysis exposing internal residues as new N-termini. N-terminal arginine plays important biological roles, including functioning as an N-degron and modulating protein interactions and signaling through its positive charge. Curiously, prior mass spectrometry-based studies utilizing trypsin to identify proteins bearing N-terminal arginine have frequently reported low and inconsistent yields, suggesting potential systematic bias in current proteomic approaches. Here, we explored whether trypsin affects the integrity of the N-terminal arginine. By using antibodies specifically recognizing N-terminal arginine of different peptides, and by using mass spectrometry peptide analysis, we show that trypsin can remove N-terminal arginine residues in an exopeptidase-like manner. This effect occurs across a range of digestion conditions consistent with standard proteomic workflows, on peptides or whole proteins, and depends on trypsin concentration, incubation time, and catalytic activity. In addition, we show that the alternative arginine-cleavage enzyme Arg-C can also affect N-terminal arginine in a sequence-dependent context. In contrast, Lys-C and LysargiNase do not exhibit such effects, providing suitable alternative digestion strategies. Together, these findings reveal an unappreciated enzymatic behavior of arginine-cleaving proteases and suggest that their widespread use may systematically compromise the detection of N-terminal arginine in proteomic studies.

13
Proteomics for cultivated meat: the importance of Analytical Standardization

Palma, J.; Leblanc, C. C.; Kusters, R.; Kamgang Nzekoue, A. F.

2026-03-25 systems biology 10.64898/2026.03.23.713501 medRxiv
Top 0.1%
4.8%

Cultivated meat production requires robust and validated analytical methods for comprehensive characterization. While transcriptomics-based approaches establish the foundational profile of molecular analysis, proteomics provides additional resolution that further enhances scientific certainty in both product development and safety characterization. However, the industry adoption of proteomics is currently hindered by technical complexity and a critical lack of analytical standardization, which leads to significant workflow-dependent variations in proteome coverage. To address this gap, we investigated the influence of key workflow steps (digestion, cleanup, LC-MS conditions) on the proteome profile of cultivated duck biomass. We compared five bottom-up sample preparation protocols: two traditional in-solution options (urea and SDC-based protocols), two device-based approaches (PreOmics iST and EasyPep kits), and an innovative protocol (SPEED). Device-based protocols offered the highest peptide yield and proteome coverage. However, optimization allowed cost-effective in-solution methods to achieve comparable performance. Specifically, an optimal digestion time of 3 hours at 37 °C and the use of polymer-based desalting columns significantly enhanced protein identification (~4500-5000 IDs). Moreover, data independent acquisition (DIA) provided deeper proteome coverage than data dependent acquisition (DDA) with higher precision (~6500 vs 5000 IDs). The validated Standard Operating Procedures presented here establish a standardized framework for bulk bottom-up proteomics in cultivated meat, facilitating the generation of reliable and comparable data required for robust multi-omics characterization.
Highlights: Complexity and non-standardization limit MS-proteomics use in cultivated meat (CM). CM protein profile varies with sample prep, LC-MS, and data processing pipeline. Device-based and optimized cost-effective protocols offer a high proteome coverage. Proteomics can complement transcriptomics for a comprehensive CM characterization. Proposed standardized methods ensure reliable data for future regulatory submissions.

14
From variability to consensus: rescoring harmonizes peptide identification across diverse search engines and datasets

Winkelhardt, D.; Berres, S.; Uszkoreit, J.

2026-03-06 bioinformatics 10.64898/2026.03.04.709532 medRxiv
Top 0.1%
4.6%

Peptide-spectrum match (PSM) rescoring has become standard in proteomics workflows, improving peptide identification accuracy across diverse search engines. Despite the availability of multiple rescoring strategies, systematic comparisons spanning several search engines, datasets, and database configurations remain limited. Here, we benchmarked seven publicly available search engines, evaluating standard target-decoy-based false discovery rate (FDR) estimation alongside Percolator, MS2Rescore, and Oktoberfest across four datasets acquired on different mass spectrometry platforms and searched against protein databases of varying size and composition. Rescoring substantially increased identification consensus and reduced variability between search engines, with prediction-based approaches yielding the largest gains. While database size had limited impact for human datasets, it significantly affected identification rates on a metaproteomic dataset. Entrapment-based evaluation indicated generally adequate FDR control across methods, although prediction-based rescoring exhibited a slightly higher tendency toward FDR underestimation in specific configurations. Overall, advanced rescoring strategies harmonize peptide identification outcomes across search engines, thereby enhancing robustness and comparability in proteomics analyses. However, careful feature selection and appropriate database choice remain essential to ensure reliable FDR control and optimal performance across diverse experimental settings.
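The baseline in this benchmark, target-decoy-based FDR estimation, can be illustrated with a short generic sketch. This is the textbook decoys-over-targets estimate with q-values taken as a running minimum, not the pipeline used in the preprint; the function name `target_decoy_qvalues`, the `(score, is_decoy)` tuple format, and the toy scores are assumptions made for illustration.

```python
from typing import List, Tuple


def target_decoy_qvalues(
    psms: List[Tuple[float, bool]],
) -> List[Tuple[float, bool, float]]:
    """Assign q-values to PSMs given as (score, is_decoy); higher score is better.

    The FDR at each rank is estimated as decoys / targets among all PSMs
    scoring at least as well. The q-value is the smallest FDR at which
    the PSM would still be accepted, capped at 1.0.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    fdrs: List[float] = []
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    # Walk from the worst score upward so q-values are monotone in rank.
    qvalues: List[float] = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()
    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qvalues)]


# Hypothetical toy PSM list, for illustration only.
toy_psms = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, True)]
scored = target_decoy_qvalues(toy_psms)
accepted = [s for s, is_decoy, q in scored if not is_decoy and q <= 0.01]
print(accepted)
```

Rescoring tools such as Percolator replace the single search-engine score with a learned combination of features before this kind of thresholding, which is why feature selection can affect how well the estimated FDR matches the true error rate.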

15
Evaluation of deep and dynamic proteomic screening strategies at sub-50Hz scan rate and without automation.

Parmar, B.; Liu, Y.; Ghezellou, P.; Muench, C.

2026-04-20 molecular biology 10.64898/2026.04.15.718630 medRxiv
Top 0.1%
4.5%
Advances in ultra-fast mass analyzer technology and procedural automation have enabled proteomics screening at the throughput of hundreds of proteomes per day. However, these approaches often require expensive instrumentation upgrades and robotic automation that remain inaccessible to many research laboratories and core facilities. In this study we address the feasibility of scaling up proteomic screening capabilities with minimal upgrade cost by focusing on (a) strategies for non-automated high-throughput sample preparation from 96-well cell culture, (b) data acquisition on sub-50Hz scan speed hybrid and tribrid Orbitrap instruments and (c) data analysis strategies for label-free and labeled proteomic screening. We find that the 96-well format STrap, in combination with C18 plates, provides the most robust throughput for a non-automated sample preparation workflow. Furthermore, we show that for static proteomes, an isobaric tandem mass tag-based (TMT) multiplexing approach provides deeper and more precise proteome coverage whereas label-free data-independent acquisition (DIA) is more accurate, albeit with a reduced dynamic range and more missing values. Finally, we extend the optimized workflow to proteome turnover studies using pulsed stable isotope labeling by amino acids in cell culture (pSILAC), highlighting the key advantages and trade-offs of DIA and TMT data-dependent acquisition strategies for capturing protein translation. Together, these results provide a practical framework for designing high-throughput proteomics experiments that balance throughput, depth, and quantitative accuracy using existing instrumentation, without requiring major hardware upgrades or automation.

16
LAMPrEY: a Python-based automated quality control tool for large-scale proteomics datasets

Valdes-Tresanco, M. E.; Wacker, S.; Valdes-Tresanco, M. S.; Plakhotnyk, A.; Brodie, N. I.; Hepburn, M.; Ulke-Lemee, A.; Huttlin, E. L.; Lewis, I. A.

2026-05-11 bioinformatics 10.64898/2026.05.06.722826 medRxiv
Top 0.1%
4.4%
Over the past years, proteomics has moved increasingly towards the analysis of large cohorts of biological specimens. This has been made possible by significant improvements in mass spectrometry technology, chromatographic separation methods, and improved data acquisition strategies. These technological advances now routinely enable experiments that yield vast datasets that substantially outstrip the capacity of existing proteomics data analysis approaches. Processing such large datasets requires purpose-built quality-control tools designed to organize and analyze the data while recording all processing parameters for reproducibility. To address this need, we developed an open-source, Python-based software platform, Large-scale Automated Multi-level Proteomics Evaluation by Python (LAMPrEY), a comprehensive quality-control pipeline for quantitative proteomics analyses of large cohorts of samples. LAMPrEY features GUI-based file submission, automated processing with MaxQuant and RawTools, an interactive analytics dashboard, and an application programming interface (API) for programmatic usage that collectively enable rapid, reproducible analysis and interpretation of proteomics data. We demonstrate the longitudinal monitoring and analytical capabilities of LAMPrEY using TMT11 quantitative proteomics data generated from 910 Enterococcus faecium isolates collected from bloodstream infection patients. LAMPrEY is open-source software that can be accessed at www.lewisresearchgroup.org/software.

17
Comparative proteomics reveals a conserved core of tegumental proteins in parasitic flatworms.

Guarnaschelli, I.; Lima, A.; Velazco, R.; Bergmann, M.; Preza, M.; Calvelo, J.; Cucher, M.; Rosenzvit, M. C.; Brehm, K.; Iriarte, A.; Koziol, U.

2026-04-24 cell biology 10.64898/2026.04.22.720116 medRxiv
Top 0.1%
4.4%
Parasitic flatworms, including cestodes and trematodes, are covered by a specialized syncytial tegument that mediates nutrient uptake and host-parasite interactions. While the tegument of trematodes has been extensively characterized, its molecular composition in cestodes remains largely unknown. In this work, we performed a comparative proteomic analysis of the tegument of three cestode species, including larval and adult stages: Hymenolepis microstoma, Mesocestoides corti (syn. M. vogae) and Echinococcus multilocularis. Using stringent enrichment criteria relative to whole-worm extracts, we identified hundreds of tegument-enriched proteins in each species. Comparative analyses revealed a conserved core of tegumental proteins shared among all three species, including members of the Tegument Allergen-Like (TAL) family, vesicular trafficking components and calcium-sensing proteins, and identified candidates for nutrient uptake activities such as glucose and nucleoside transporters. Further comparative analyses revealed a set of shared tegumental proteins with the trematode Schistosoma mansoni, including conserved proteins that are specific to parasitic flatworms, supporting the existence of a conserved ancestral tegumental proteome. Finally, we confirmed tegumental expression of several candidate genes in H. microstoma and E. multilocularis, and demonstrated regionally restricted gene expression among tegumental cytons, suggesting functional specialization within the syncytial tegument. Altogether, these results reveal an evolutionarily conserved composition of the tegument of parasitic flatworms, providing a foundation for future work targeting this critical host-parasite interface.

18
Integrated Analysis of HeberFERON-Driven Comparative Proteomic regulation in Glioblastoma Cells U-87MG

Vazquez-Blomquist, D.; Besada, V.; Miranda, J.; Ramos, Y.; Palomares, C. S.; Guirola, O.; Bringas, R.; Vonasek, E.; Gil, Y.; Perez, W.; Diaz, T.; Quinones-Vega, M.; Gonzalez, L. J.; Bello-Rivero, I.

2026-04-24 cancer biology 10.64898/2026.04.22.720155 medRxiv
Top 0.1%
4.3%
Glioblastoma is a very aggressive brain tumor with few therapeutic options. HeberFERON, a co-formulation of type I and type II interferons (IFNs), has been used in cancer treatment, with promising results in high-grade brain tumors. High-throughput techniques in easy-to-handle models have been important for interrogating biomolecular changes, describing mechanisms, and finding pharmacodynamic biomarkers. This study aims to elucidate the effect of HeberFERON on the cell proteome in comparison with its individual IFN components. Proteomic changes induced by HeberFERON in the glioblastoma-derived cell line U-87MG, in comparison with individual IFN-α2b and IFN-γ, were studied using an EasyLC nanoLC instrument coupled to a Velos Pro mass spectrometer; MaxQuant and Perseus were also used. Several enrichment tools, network analysis, and canSAR for drug targets were employed. Translation, RNA processing, mitotic cell cycle, cytoskeleton and chromosome organization, apoptosis, autophagy, and DNA repair are enriched to limit cellular growth, together with changes in immune response components, supporting HeberFERON as a multitarget treatment. The co-formulation is distinguished by its modulation of RNA splicing with the SMN complex, cytoskeleton organization and microtubule-based movement, nuclear envelope breakdown, DNA conformational changes, and oxidative phosphorylation, giving a clearer picture of its effects across a variety of systems inside the tumor cell. Together with a previous microarray experiment, informative genes and proteins emerged as pharmacodynamic biomarkers for antiproliferative effects (e.g., STAT1/2, CENPE, ATRIP, MAP1B, LIMA1, VCP, and several ribosomal, spliceosome, and proteasomal complex proteins). This study complements previous transcriptomic and phosphoproteomic experiments in this model and underscores HeberFERON as a glioblastoma therapeutic.

19
Statistical Principles Define an Open-Source Differential Analysis Workflow for Mass Spectrometry Imaging Experiments with Complex Designs

Rogers, E. B. T.; Lakkimsetty, S. S.; Bemis, K. A.; Schurman, C. A.; Angel, P. A.; Schilling, B.; Vitek, O.

2026-04-10 bioinformatics 10.64898/2026.04.08.717212 medRxiv
Top 0.1%
4.2%
Mass spectrometry imaging (MSI) characterizes the spatial heterogeneity of molecular abundances in biological samples. Experiments with complex designs, involving multiple conditions and multiple samples, provide particularly useful insight into differential abundance of analytes. However, analyses of these experiments require attention to details such as signal processing, selection of regions of interest, and statistical methodology. This manuscript contributes a statistical analysis workflow for detecting differentially abundant analytes in MSI experiments with complex designs. Using a case study of histologic samples of human tibial plateaus from knees of osteoarthritis patients and cadaveric controls, as well as simulated datasets, we illustrate the impact of the analysis decisions. We illustrate the importance of signal processing and feature aggregation for preserving biological relevance and alleviating the stringency of multiple testing. We further demonstrate the importance of selecting regions of interest in ways that are compatible with differential analysis. Finally, we contrast several common statistical models for differential analysis, showcase the appropriate use of replication, and demonstrate model-based calculation of sample size for follow-up investigations. The discussion is accompanied by detailed recommendations and an open-source R-based implementation that can be followed by other investigations.

20
Reference-Based Library Construction Improves Performance in low-input diaPASEF Workflows

Charkow, J.; Ghaznavi, M.; Seale, B.; Peng, J.; Gingras, A.-C.; Rost, H.

2026-05-04 bioinformatics 10.64898/2026.04.29.721088 medRxiv
Top 0.1%
4.0%
In low-input mass spectrometry-based proteomics, Data Independent Acquisition (DIA), including diaPASEF, is quickly becoming the method of choice for label-free quantification. Whether using empirical or in silico spectral libraries, performance is dependent on the library; however, the optimal library construction strategy for low-input proteomics remains an open question. To address this, we examine and develop library construction approaches that are compatible with both spectrum-centric and peptide-centric analysis workflows. These approaches leverage a closely related, high-quality sample to improve library quality. First, we validated our approach in bulk sample amounts, where we observed that the effects of gas-phase fractionation-based library construction are dependent on the software framework, with improvements more pronounced in OpenSWATH compared to DIA-NN. In OpenSWATH, our peptide-centric library reconstruction workflow consistently outperforms a transfer learning strategy, an emerging alternative approach. In DIA-NN, trends are dependent on library source, highlighting OpenSWATH's stronger dependence on the search space. In low-input applications, such as single-cell-equivalent injection amounts (100 pg) of HeLa cell digest on a timsTOF SCP, our library construction approach provided more pronounced improvements across both software tools compared to bulk samples. Using a peptide-centric reconstruction approach with the OpenSWATH analysis framework, we detected over 15,000 peptide precursors (2480 protein groups), a 90% improvement over the original library. Furthermore, using a spectrum-centric construction approach, peptide precursor identification rates improved over 6-fold (~1000 to ~6000). Our strategy provides a practical solution for generating high-quality libraries in low-input applications.