PROTEOMICS
○ Wiley
Preprints posted in the last 30 days, ranked by how well they match PROTEOMICS's content profile, based on 35 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.
Wen, B.; Paez, J. S.; Hsu, C.; Canzani, D.; Chang, A. T.; Shulman, N.; MacLean, B. X.; Berg, M. D.; Villen, J.; Fondrie, W.; Pino, L.; MacCoss, M. J.; Noble, W. S.
Data-independent acquisition (DIA) proteomics enables reproducible and systematic peptide detection and quantification, and trapped ion mobility spectrometry (TIMS) on the timsTOF platform further improves DIA by synchronizing ion mobility separation with quadrupole precursor sampling. Analyzing the highly multiplexed spectra generated by DIA typically relies on spectral libraries, and fully leveraging the additional ion mobility dimension requires these libraries to include accurate retention time, fragment ion intensity, and ion mobility annotations. Existing in silico spectral library generation tools either lack ion mobility support entirely or rely on models trained on data-dependent acquisition (DDA) data, which can introduce a mismatch and may fail to capture experiment-specific biases when applied to a given timsTOF dataset. Carafe is a software tool that uses deep learning models to generate high-quality, experiment-specific in silico libraries by training directly on DIA data. In this study, we extend Carafe to generate libraries for timsTOF DIA data, which involves fine-tuning retention time (RT), fragment ion intensity, and ion mobility prediction models using timsTOF DIA data. Carafe2 operates directly on native timsTOF raw data (Bruker .d directories) without the need for data conversion. We demonstrate the performance of Carafe2 across a wide range of DIA applications, including global proteome, phosphoproteome, and plasma proteome datasets. Comparing Carafe2 fine-tuned RT, fragment ion intensity, and ion mobility prediction models with pretrained DDA models, we find that Carafe2 models outperform pretrained models on a variety of DIA datasets.
We then demonstrate the utility of in silico libraries generated by Carafe2 for peptide detection on several different types of timsTOF DIA datasets by comparing with the libraries generated with DDA-trained AlphaPeptDeep models, DIA-NN built-in models, and empirical spectral libraries generated from DDA experiments.
Dupas, A.; Ibranosyan, M.; Ginevra, C.; Jarraud, S.; Lemoine, J.
Understanding allelic variability is crucial for elucidating intrinsic bacterial mechanisms and distinguishing phenotypic profiles. However, such variability poses a major challenge for the reliable identification of proteins in data-independent acquisition (DIA) proteomics. To address this, we developed an analytical workflow that integrates protein sequence variability to enhance proteome coverage. Fifteen Legionella pneumophila isolates were analyzed using DIA-NN, with spectral libraries generated either from a reference proteome or incorporating allelic variability. Our workflow includes protein clustering and subsequent protein inference from these clusters, allowing the accurate assignment of shared and variant-specific peptides. Integrating variability identified a number of proteins comparable to that obtained with the reference proteome while capturing between 28% and 77% of variant-specific sequences in each isolate, all while maintaining a low false positive rate. These findings demonstrate that accounting for allelic variability substantially improves proteomic coverage and identification confidence, providing a more comprehensive view of the proteome. This approach facilitates a deeper understanding of biological mechanisms and enables precise bacterial proteotyping of Legionella pneumophila isolates.
Palma, J.; Leblanc, C. C.; Kusters, R.; Kamgang Nzekoue, A. F.
Cultivated meat production requires robust and validated analytical methods for comprehensive characterization. While transcriptomics-based approaches establish the foundational profile of molecular analysis, proteomics provides additional resolution that further enhances scientific certainty in both product development and safety characterization. However, the industry adoption of proteomics is currently hindered by technical complexity and a critical lack of analytical standardization, which leads to significant workflow-dependent variations in proteome coverage. To address this gap, we investigated the influence of key workflow steps (digestion, cleanup, LC-MS conditions) on the proteome profile of cultivated duck biomass. We compared five bottom-up sample preparation protocols: two traditional in-solution options (urea and SDC-based protocols), two device-based approaches (PreOmics iST and EasyPep kits), and an innovative protocol (SPEED). Device-based protocols offered the highest peptide yield and proteome coverage. However, optimization allowed cost-effective in-solution methods to achieve comparable performance. Specifically, an optimal digestion time of 3 hours at 37 °C and the use of polymer-based desalting columns significantly enhanced protein identification (~4500-5000 IDs). Moreover, data-independent acquisition (DIA) provided deeper proteome coverage than data-dependent acquisition (DDA) with higher precision (~6500 vs 5000 IDs). The validated Standard Operating Procedures presented here establish a standardized framework for bulk bottom-up proteomics in cultivated meat, facilitating the generation of reliable and comparable data required for robust multi-omics characterization.
Highlights: Complexity and non-standardization limit MS-proteomics use in cultivated meat (CM). CM protein profile varies with sample prep, LC-MS, and data processing pipeline. Device-based and optimized cost-effective protocols offer a high proteome coverage. Proteomics can complement transcriptomics for a comprehensive CM characterization. Proposed standardized methods ensure reliable data for future regulatory submissions.
Buur, L. M.; Winkler, S.; Dorfer, V.
Open modification search (OMS) strategies have gained popularity in mass spectrometry-based proteomics for identification of peptides carrying unknown or unexpected post-translational modifications. However, most OMS search engines report only the overall mass difference between the precursor and the matched peptide and do not explicitly identify or score combinations of multiple modifications at the peptide-spectrum match (PSM) level, leaving the interpretation of mass shifts to the end user and to downstream analysis tools. Here, we introduce MS Andrea, a novel OMS search engine developed to directly identify and score combinations of up to four variable modifications per peptide without having to predefine them. MS Andrea uses a sequence tag-based strategy to efficiently filter candidate peptides prior to scoring. Remaining candidates are evaluated using the MS Amanda scoring function, first considering fixed modifications only, followed by a second scoring stage in which combinations of modifications from the Unimod database are considered based on the observed mass difference and matched to the spectrum. We evaluated MS Andrea using phosphopeptide datasets from HeLa cells and Arabidopsis thaliana and compared its performance with the widely used OMS engines MSFragger and Sage. Across datasets, MS Andrea identified the highest number of PSMs at 1% false discovery rate while achieving comparable peptide-level identifications. Importantly, MS Andrea directly reports modification identities and sites at the PSM level and enables the identification of peptides having up to four variable modifications. Together, these results demonstrate that MS Andrea facilitates more detailed and interpretable characterization of peptide modifications while maintaining competitive identification performance in OMS-based proteomic analyses.
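The idea of explaining an observed mass difference by a combination of up to four modifications can be illustrated with a small sketch. This is not the MS Andrea implementation: the function name `explain_mass_shift`, the four-entry modification table, and the 0.01 Da tolerance are assumptions chosen for illustration, whereas a real search would draw candidates from the full Unimod database and score them against the spectrum.

```python
from itertools import combinations_with_replacement

# Hypothetical mini-table of monoisotopic mass shifts in Da.
# A real open search would use the Unimod database instead.
MODS = {
    "Phospho": 79.966331,
    "Oxidation": 15.994915,
    "Acetyl": 42.010565,
    "Methyl": 14.015650,
}


def explain_mass_shift(delta, mods=MODS, max_mods=4, tol=0.01):
    """Return every multiset of 1..max_mods modifications whose summed
    mass lies within tol Da of the observed mass difference delta."""
    hits = []
    for k in range(1, max_mods + 1):
        # Combinations with replacement allow the same modification twice.
        for combo in combinations_with_replacement(sorted(mods), k):
            total = sum(mods[name] for name in combo)
            if abs(total - delta) <= tol:
                hits.append(combo)
    return hits


if __name__ == "__main__":
    # A shift of about 95.961 Da is explained by one phosphorylation plus one oxidation.
    print(explain_mass_shift(95.961246))
```

Enumerating combinations by mass alone only yields candidates; deciding between them, and placing each modification on a residue, still requires matching against the fragment spectrum as the abstract describes.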
Schramm, T.; Gillet, L.; Reber, V.; de Souza, N.; Gstaiger, M.; Picotti, P.
Peptide-level analyses are becoming increasingly popular in mass spectrometry-based proteomics and are being applied, for example, in immunopeptidomics, structural proteomics, and analyses of post-translational modifications. In such analyses, peptides that are not biologically meaningful but instead arise as artifacts prior to mass spectrometry analysis pose the risk of data misinterpretation. Here, we describe an approach based on retention time analysis and precise chromatographic peak matching to identify peptides generated by in-source fragmentation (ISF), which occurs between chromatographic separation of peptide mixtures and the first mass filter of a tandem mass spectrometer (MS). To understand the prevalence and properties of ISF, we generated 13 proteomics datasets and analyzed them along with 25 additional previously published datasets spanning a broad range of sample types, MS, and proteomics approaches including classical bottom-up proteomics, immunopeptidomics, structural proteomics, and phosphoproteomics. We found that, in typical trypsin-digested samples, on average 1% of fully tryptic peptides and 22% of semi-tryptic peptides originated from ISF. However, we observed large variations between datasets, and in-source fragments exceeded, in some cases, a third of the total peptide identifications. The extent of ISF was dependent on the peptide sequence, the instrument, method parameters, and sample complexity. Although ISF did not impair relative quantification across samples, it generated peptides that could be misinterpreted qualitatively, inflated peptide identifications, and comprised up to 37% of peptides shorter than 9 amino acids in immunopeptidomics datasets. We propose that, for peptide-centric applications, our open-source ISF detection approach be used to re-annotate peptides generated by ISF and remove them to avoid misinterpretation of data.
ISF is an increasing concern with improving mass spectrometers, as they enable detection of an ever-increasing number of m/z features, including low abundance features like ISF products. Our work thus addresses a growing issue in proteomics and presents solutions to mitigate the impact of in-source fragment peptides. In the future, improved feature detection algorithms may enable elucidation of new ISF patterns affecting side chains that have been missed so far, which could contribute to explaining the vast space of as-yet unannotated proteomics data.
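Because ISF happens after chromatographic separation, an in-source fragment should co-elute with its intact parent peptide, whereas a truncated peptide already present in the sample would elute at its own retention time. The sketch below illustrates that reasoning only; it is not the authors' open-source tool. The `Peptide` class, the `flag_isf_candidates` function, the restriction to terminal truncations, and the 1-second apex tolerance are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peptide:
    sequence: str
    apex_rt: float  # chromatographic peak apex, in seconds


def flag_isf_candidates(peptides, rt_tol=1.0):
    """Return (fragment, parent) pairs in which the fragment is a terminal
    truncation of the parent and both peak apexes agree within rt_tol."""
    pairs = []
    for frag in peptides:
        for parent in peptides:
            # A fragment must be strictly shorter than its parent.
            if len(frag.sequence) >= len(parent.sequence):
                continue
            shares_terminus = (
                parent.sequence.startswith(frag.sequence)
                or parent.sequence.endswith(frag.sequence)
            )
            if shares_terminus and abs(frag.apex_rt - parent.apex_rt) <= rt_tol:
                pairs.append((frag, parent))
    return pairs


if __name__ == "__main__":
    ids = [
        Peptide("LVNELTEFAK", 1200.0),
        Peptide("NELTEFAK", 1200.4),  # co-elutes with the longer peptide
        Peptide("ELTEFAK", 950.0),    # elutes on its own
    ]
    for frag, parent in flag_isf_candidates(ids):
        print(frag.sequence, "<-", parent.sequence)
```

Apex matching alone is a coarse criterion; the abstract refers to precise chromatographic peak matching, which would compare full elution profiles rather than a single time point.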
Van Leene, C.; Araftpoor, E.; Gevaert, K.
Limited proteolysis coupled to mass spectrometry (LiP-MS) is a peptide-centric conformational proteomics approach in which a brief incubation with a non-specific protease (e.g., proteinase K) under native conditions generates structural fingerprints that report on treatment-induced conformational changes; a subsequent tryptic digest under denaturing conditions allows these fingerprints to be read out [1]. In contrast, the recently introduced peptide-centric local stability assay (PELSA) uses a high trypsin-to-substrate ratio under native conditions to release fully tryptic peptides that reflect structural stability upon ligand binding [2]. In their paper, Li et al. compared PELSA and LiP-MS across several benchmarks and reported that PELSA exhibited quantitative sensitivity comparable to or exceeding LiP-MS. Notably, PELSA quantified a 21-fold greater rapamycin-induced change for FKBP1A compared to LiP-MS. Because such claims influence method selection for conformational proteomics, we reanalyzed the publicly deposited datasets underlying these comparisons and assessed the experimental and analytical choices that contributed to the reported effect sizes. Our evaluation indicates that the reported 21-fold difference arises from non-matched experimental conditions and undisclosed data imputation, and that conclusions regarding quantitative superiority or biological interpretability should therefore be treated with caution.
Dahlberg, C. L.; Zinkgraf, M.; Laugesen, S. H.; Soltoft, C. L.; Ginebra, Q.; Bennett, E. P.; Hartmann-Petersen, R.; Ellgaard, L.
The unfolded protein response (UPR) helps reinstate cellular proteostasis upon an accumulation of misfolded proteins in the endoplasmic reticulum (ER), in part through ER-associated degradation (ERAD). Ube2j2 is an ER-localized E2 ubiquitin-conjugating enzyme that participates in ERAD. We used mass spectrometry analysis of cultured U2OS cells to investigate how the loss of Ube2j2 affects the cellular proteome in response to tunicamycin-induced ER stress. We constructed a network of twelve statistically distinct modules of protein abundance profiles across conditions. We describe the Gene Ontology annotations for each module along with the "hub gene" proteins whose abundance levels most closely adhere to each module's protein abundance profile. Our analysis identifies known Ube2j2-associated pathways (e.g., the UPR and ERAD) and cellular functions that were previously unassociated with Ube2j2 (e.g., RNA metabolism, ER-Golgi transport, and cell-cycle progression). These data are available via ProteomeXchange with identifier PXD076153 and provide avenues for further investigation into the cellular functions of Ube2j2 under basal and ER-stressed conditions.
Juarez Guzman, C. A.; Yao, L.; Broeckling, C. D.; Argueso, C. T.
Accurate, simultaneous, and efficient quantification of chemically diverse phytohormone species is a critical task towards understanding the complex system of phytohormone signaling pathways. Quantification of phytohormones with the commonly used technique of liquid chromatography coupled to tandem mass spectrometry is susceptible to the influence of non-phytohormone components present in the sample, a phenomenon referred to as the matrix effect. To reduce the matrix effect, some phytohormone quantification methods include additional cleanup steps for crude extracts. However, the extent to which additional purification steps provide increased accuracy compared to simpler, less laborious methods is seldom evaluated. We evaluated three previously described phytohormone extraction methods, two of which include solid-phase extraction and one that does not, in their ability to minimize the matrix effect and generate accurate estimates of phytohormone species spanning six classifications, from fruit and leaf tissue of Solanum lycopersicum cv. Micro-Tom (tomato). Our results show that, while the methods that included solid-phase extraction occasionally outperformed each other regarding matrix effect and/or recovery efficiency for a broad range of phytohormones, they rarely outperformed the simpler single-phase extraction method.
Short Abstract: Accurate, simultaneous quantification of chemically diverse phytohormones by LC-MS/MS is frequently confounded by matrix effects, leading to the incorporation of additional purification steps. We systematically compared three published extraction protocols with or without solid-phase extraction in tomato tissues across six hormone classes. Solid-phase methods occasionally improved matrix suppression or recovery, but did not consistently outperform the single-phase approach, questioning the added value of extra cleanup steps, particularly when high throughput is desired, as in the case of systems biology interrogations.
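Matrix effect and recovery efficiency are often computed from three peak areas: a standard in neat solvent, a standard spiked into the extract after extraction, and a standard spiked before extraction. The sketch below uses one common convention for these quantities, which may differ from the definitions used in the preprint; the function names and the example peak areas are hypothetical.

```python
def matrix_effect_pct(area_post_spike, area_solvent):
    """Signal change caused by the matrix, in percent.
    Negative values indicate ion suppression, positive values enhancement."""
    return (area_post_spike / area_solvent - 1.0) * 100.0


def recovery_pct(area_pre_spike, area_post_spike):
    """Fraction of analyte surviving the extraction itself, in percent."""
    return area_pre_spike / area_post_spike * 100.0


if __name__ == "__main__":
    # Hypothetical peak areas for one phytohormone standard.
    solvent, post_spike, pre_spike = 100.0, 80.0, 60.0
    print(f"matrix effect: {matrix_effect_pct(post_spike, solvent):.1f}%")
    print(f"recovery:      {recovery_pct(pre_spike, post_spike):.1f}%")
```

Separating the two quantities matters for the comparison described above: a cleanup step can reduce suppression while also losing analyte, so an extra purification step may improve one number and worsen the other.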
Schmollinger, S.; Strenkert, D.; Purvine, S. O.; Nicora, C. D.; Soubeyrand, E.; Basset, G. J.; Merchant, S.
An unbiased, quantitative view of biomolecules in a living cell is a prerequisite for accurate modeling approaches and informs our understanding of cellular metabolism at scale. In this work, we used the total protein approach (TPA), in which the total protein mass of a given proteomics sample is used as a calibrator for absolute protein quantification, to determine protein abundances during the Chlamydomonas reinhardtii diurnal cycle. We use external, independently measured quantitative markers (metals, pigments) to assess the absolute protein abundances in unlabeled whole cell extracts. We calculate protein abundances in fg/cell of 7322 Chlamydomonas proteins, 2266 of which were captured in every time point, including the major proteins involved in the light reactions, photoprotection, proteostasis and fatty acid metabolism during a cell cycle. As expected, Rubisco large and small subunits are present in a 1:1 stoichiometry, with the large subunit being the most abundant protein in our data set, averaging 5.05 × 10^6 molecules per cell, reflecting 2.7% of the total protein mass. We noticed that PSII is the most abundant complex involved in the light reactions with 2.08 × 10^6 complexes per cell. PSI averages 1.75 × 10^6 complexes per cell and cytochrome b6f averages 0.77 × 10^6 complexes per cell. The TPA is a robust tool to study proteome dynamics quantitatively, while avoiding artefacts due to biochemical fractionation. Our proteome data set with an unprecedented temporal resolution is a valuable resource to assess protein abundances during the cell cycle in the reference alga Chlamydomonas.
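The arithmetic behind a total-protein-style calculation can be sketched as follows: a protein's share of the summed MS signal is taken as its share of total protein mass, which is then converted to copies per cell. This is a minimal illustration, not the authors' pipeline; the function name `tpa_copies_per_cell` and all input values in the example (signal fraction, protein mass per cell, molecular weight) are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def tpa_copies_per_cell(intensity, total_intensity, protein_per_cell_g, mw_da):
    """Estimate copies per cell, treating a protein's share of the total
    MS signal as its share of the total protein mass."""
    mass_fraction = intensity / total_intensity
    grams_per_cell = mass_fraction * protein_per_cell_g
    moles_per_cell = grams_per_cell / mw_da  # Da is numerically g/mol
    return moles_per_cell * AVOGADRO


if __name__ == "__main__":
    # Hypothetical inputs: 1% of total signal, 20 pg protein per cell, 50 kDa protein.
    copies = tpa_copies_per_cell(
        intensity=1.0,
        total_intensity=100.0,
        protein_per_cell_g=20e-12,
        mw_da=50_000.0,
    )
    print(f"{copies:.3e} copies per cell")
```

The total protein mass per cell is the calibrator here, which is why the abstract stresses independent external markers: any error in that single value scales every copy number by the same factor.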
Merle, L.; Martin-Jaular, L.; Thery, C.; Joliot, A.
Extracellular vesicles (EVs) are key intercellular messengers that modulate the function of target cells by carrying effectors, either at their surface or in their lumen. In the latter case, their action depends on the ability to deliver their content into the cytosol of target cells. How efficiently EVs deliver their content upon interaction with their target cell is thus a central question for understanding the functional impact of this mode of action. To address this question, signal-driven bimolecular interactions between two partners located respectively in the EV lumen and the target cell cytosol have become a widely used strategy to detect the cytosolic delivery of EV content. However, the detection of cytosolic delivery with these assays has often depended on artificially enhancing fusion between EV and cell membranes, for instance through expression of the fusogenic protein VSV-G. Here we provide a robust and quantitative LUCiferase-based complementation assay (HiBiT/LgBiT) to quantify the Internalization and cytosolic Delivery of EV content: LUCID-EV. By optimizing the signal-to-noise ratio of the assay, the method for loading the HiBiT fragment into EVs (fusion to a lipid-binding domain rather than to tetraspanins), and the intracellular position of LgBiT (associated with membranes), we could quantify cytosolic delivery from various non-VSV-G-expressing EVs into target immune dendritic cells. Importantly, this delivery did not involve the acidic late-endosome environment required for VSV-G-dependent EV cytosolic delivery. The limited efficacy of the process highlights the need for highly sensitive assays like the one described here. Further development of the LUCID-EV assay could help identify EV/target cell pairs with enhanced cytosolic delivery properties and characterize the cellular route for delivery.
Singh, P. D.; Nayak, R.; Dittrich, Y.; Guzinski, R.; Pant, Y.; Masakapalli, S. K.
Smart irrigation management is essential for improving crop resilience under increasing drought frequency driven by climate change. Although satellite-based remote sensing provides valuable tools for monitoring crop water status at large spatial scales, its accuracy is often limited in mountainous and heterogeneous agricultural landscapes. In this study, we investigated drought-induced metabolic responses in potato (Solanum tuberosum L.) to identify biochemical biomarkers that could complement satellite-based irrigation advisories in the mid-Himalayan region of India. A field experiment was conducted using a gradient of soil moisture regimes corresponding to moderate (50% field capacity), critical (25% field capacity), and extreme drought stress (5-8% field capacity). Satellite-derived evapotranspiration-based irrigation advisories were validated against in situ soil moisture measurements, revealing discrepancies attributed to the inability of satellite estimates to capture actual water loss under drought stress conditions, highlighting the need for additional ground-truth biomarkers across heterogeneous field conditions. To capture plant-level physiological responses, untargeted metabolite profiling of potato leaves was performed using gas chromatography-mass spectrometry (GC-MS). Approximately fifty metabolites belonging to amino acids, organic acids, sugars, and sugar alcohols were detected. Multivariate statistical analyses revealed distinct metabolic signatures associated with progressive drought stress. Notably, accumulation of proline, serine, isoleucine, sucrose, fructose, glucose, and polyols such as mannitol and myo-inositol reflected key metabolic reprogramming associated with osmoprotection, redox homeostasis, and energy metabolism under drought conditions. Collectively, this ensemble of stress-responsive metabolites represents a robust panel of drought stress biomarkers. 
As a proof of concept, proline was validated as a qualitative biomarker of plant water status through a rapid and cost-effective colorimetric biochemical assay, demonstrating its practical applicability for field-level irrigation management. These findings demonstrate that metabolomics-derived biomarkers can provide sensitive plant-level indicators of drought stress that complement satellite-based monitoring systems. The integration of biochemical diagnostics with remote sensing platforms offers a promising approach for improving drought detection and developing low-cost, field-deployable tools for smart irrigation advisories in heterogeneous agricultural landscapes.
Thang, N. X.; Martiensen, E. L. B.; Abdelhalim, M.; Tran, T. T.; Ledsaak, M.; Rogne, M.; Thiede, B.; Eskeland, R.
Osteosarcoma (OS) is an aggressive bone cancer that most commonly affects children and young adults. OS exhibits a high degree of genomic complexity, as well as cellular plasticity, and dynamic transcriptional regulation is suggested to contribute to treatment resistance and metastasis. Cell lines are well characterized as models to advance our knowledge on OS biology. HOS and U2OS cells have increased invasiveness and higher migratory ability compared with MG63. In this study, we employed a tandem array of consensus transcription factor response elements (catTFREs) proteomic approach to characterize transcription factor (TF) regulatory networks related to OS aggressiveness. We mapped 7,594 proteins and enriched 352 transcription factors and coregulators. When we integrated proteomics with cell line specific gene expression and chromatin accessibility we classified the proteins into different OS cell line dependent sub-clusters and identified TFs and coregulators common for all cell lines and specific for individual cell lines. We demonstrate that RUNX2, MYBL2 and HMGA2 are specifically enriched in HOS and U2OS and may be linked to the cell aggressiveness. ETV5, JUNB, NFIX and ZEB1 were among TFs specific to MG63. Our analysis provides a more comprehensive understanding of the transcriptional drivers that shape OS regulatory landscapes and may have future therapeutic implications.
Torrejon, E.; Sleegers, J.; Matthiesen, R.; Macedo, M. P.; Baudot, A.; Machado de Oliveira, R.
Summary: Extracellular vesicles (EVs) are bilayer vesicles that carry a diverse cargo of molecules, such as nucleic acids, proteins and metabolites. These EVs can be transported throughout the organism to specific recipient tissues. For this reason, EVs have been recognized as pivotal mediators of cell-to-cell communication (CCC). Importantly, alterations in EV-mediated communication have been linked to pathological processes, further highlighting their biological relevance. However, the in silico exploration of the functional effects of EV cargo in recipient tissues remains limited due to the lack of dedicated tools that can be applied to EV omics datasets. Most current bioinformatics tools for assessing CCC rely on ligand-mediated communication and therefore cannot be used to explore EV-mediated communication. To address this gap, we developed EV-Net, a bioinformatics tool designed to explore the effects of EV cargo on recipient tissues. EV-Net was built by adapting NicheNet, a CCC bioinformatics tool that relies on ligand-receptor mediated communication, for the analysis of EV proteomics and RNA-seq data. The EV-Net framework enables the identification and prioritization of EV cargo molecules with high regulatory potential in a recipient tissue of interest. This prioritization facilitates the systematic translation of EV cargo profiles into testable biological hypotheses.
Availability and documentation: The source code of EV-Net is hosted on GitHub at https://github.com/torrejoNia/EV-Net alongside installation instructions. Comprehensive tutorials and additional documentation are available at https://torrejonia.github.io/EV-Net/. The datasets used in the use cases were deposited in Zenodo. The corresponding Zenodo links are provided in the tutorials for each use case. This software is distributed under a GPL-3 licence.
Awan, A.; Blakeley-Ruiz, A.; Kleiner, M.; Hinzke, T.
Metaproteomics enables the functional characterization of microbiomes and host-microbe interactions by detecting and quantifying thousands of proteins. In data-dependent acquisition metaproteomics, protein quantification is commonly performed using either MS1-based area under the curve (AUC) or MS2-based peptide spectral counts (SpC). In AUC quantification, match between runs (MBR) is frequently employed to minimize data sparsity, yet its impact on metaproteomic data remains unclear. Understanding MBR's impact on metaproteomics data is especially important due to the high peak density in the MS1 mass spectra and the potential presence of not only proteins, but even entire organisms, in one sample and their absence in the other, which would complicate accurate feature mapping and transfer. While accurate quantification is essential for deriving meaningful biological inferences from metaproteomic analyses, systematic evaluations of AUC and SpC quantification in metaproteomics remain scarce. In this study, we used defined complex metaproteomic samples to perform a ground truth-based evaluation of AUC and SpC quantification and to determine the impact of MBR on AUC quantification. We found that MBR led to a substantial number of falsely identified proteins in complex samples. Protein identifications from an organism not present in the sample were wrongly transferred from other samples when MBR was used. We found that MBR-free AUC data had a wider dynamic range, higher quantitative accuracy, and more sensitive detection of abundance differences.
Significance of the Study: Although metaproteomics is increasingly used to advance microbiome research, quantification strategies in metaproteomics are mostly selected based on convention rather than evidence, due to a lack of ground truth-based evaluation of quantification strategies in metaproteomics.
Accurate protein quantification is key to deriving meaningful biological inferences from metaproteomic samples, yet it remains challenging due to their high complexity and uneven protein abundances. Here, we used defined metaproteomic samples to evaluate widely used quantification strategies in metaproteomics and to determine the effects of match between runs (MBR) on quantitative accuracy. Based on our findings, MBR adds falsely identified proteins to metaproteomic data. While MBR-free AUC offers a broader dynamic range and higher quantitative accuracy, SpC offers better proteome coverage. With this study, we provide an evidence-based framework for the informed selection of quantification strategies in metaproteomics, and highlight the strengths and limitations of these approaches with respect to proteome coverage, dynamic range, quantitative accuracy, and error propagation. Our findings also have important implications for the biological interpretation of data derived from these strategies and lay the groundwork for future studies validating quantitative approaches in data-independent acquisition workflows.
Kartashov, A. V.; Zlobin, I. E.; Ivanov, Y. V.; Ivanova, A. I.; Orlova, A.; Frolova, N.; Soboleva, A.; Silinskaya, S.; Bilova, T.; Frolov, A.; Kuznetsov, V. V.
During drought, numerous compounds accumulate in plant tissues, but their physiological roles remain unclear: they may function as osmolytes, osmoprotectants, or merely arise as by-products of stress-induced metabolic shifts. We developed an experimental approach to link accumulation patterns with specific functions, using Scots pine (Pinus sylvestris L.) saplings subjected to water deprivation and subsequent rewatering as a model system. We monitored changes in relative water content (RWC) and osmotic adjustment dynamics, employed untargeted primary metabolite profiling for preliminary screening of compounds correlated with water status, and performed quantitative GC-MS and LC-MS analyses of selected metabolites. Major inorganic cations (K+, Ca2+, Mg2+) were also quantified to assess their potential roles. Our results revealed that tryptophan, valine, and lysine, though generally present in low abundance, exhibited selective accumulation under severely reduced RWC (≤ 70%), suggesting their involvement as osmoprotectants. Major organic acids, particularly shikimic acid, showed trends consistent with osmotic adjustment. Notably, neither sucrose nor inorganic cations appeared to function as primary osmolytes in this context. The proposed approach offers a viable strategy for identifying compounds involved in plant adaptation to water deficit, with potential applications in breeding programs aimed at improving drought tolerance.
Highlights: An approach to identify osmolytes and osmoprotectants was implemented. Accumulation of Trp, Val and Lys was consistent with their role in osmoprotection. Osmotic adjustment relied predominantly on organic acids rather than on inorganic ions. Monosaccharides, but not sucrose, correlate with changes in needle water status.
Franziscus, C. A.; Ferrand, A.; Biehlmaier, O.; Schmidt, A.; Spang, A.
Cells contain different organelles and compartments that are essential for cellular function and life. These organelles and compartments need to communicate to assess cellular state in a changing environment, adapt to the new situation, and also to ensure functionality and homeostasis. Moreover, organization and communication differ between cell types. However, our knowledge about these changes is still rather scarce. Subcellular spatial proteomics aims to fill this knowledge gap. While proximity labeling techniques represent a great advance, they do not provide precise spatial resolution. To overcome this limitation, we developed SPEx (Subcellular spatial Proteomics coupled to Expansion), in which we first expand cells about 10-fold, laser micro-dissect regions of interest and then perform mass spectrometry-based proteomics on these samples. We demonstrate the effectiveness of SPEx by determining the proteome of the Golgi, the nucleus and nucleoli. Satisfyingly, we also identify novel components of these organelles. Combining inexpensive, already existing technologies makes SPEx readily usable by the wider scientific community.
Shamorkina, T. M.; Kalaidopoulou Nteak, S.; Lay, S.; Kallor, A. A.; Ly, S.; Duong, V.; Heck, A. J. R.; Cantaert, T.; Snijder, J.
Dengue virus (DENV) is a major burden to global public health, affecting hundreds of millions annually. Children account for the majority of global dengue cases, which range from asymptomatic or subclinical presentation to dengue fever (DF) and severe dengue hemorrhagic fever or shock syndrome (DHF/DSS). The factors that distinguish this range of disease severity are still poorly understood. To identify biomarkers of severity, we analyzed the plasma proteome of children with acute DENV infection, including both subclinical and hospitalized cases. Proteins associated with the acute-phase response, innate immune and lysosomal activation, and components of the coagulation cascade showed marked differences between hospitalized and subclinical cases during early infection. Longitudinal profiling demonstrated that endothelial dysfunction emerges early, with PTX3 showing the strongest and most rapid upregulation in hospitalized patients, supporting its potential role as a marker of imminent vascular involvement. When comparing severe (DHF/DSS) and classical DF hospitalized cases, CLEC11A displayed the highest fold change at hospital admission. We used machine-learning analysis to predict disease severity at the acute phase of infection, distinguishing subclinical from hospitalized cases, and patients who develop classical dengue fever from those who develop severe disease, based on the identified complement regulators and inflammatory markers. The panel of identified plasma proteins sheds light on the mechanisms of dengue-related disease progression and may provide a handle to predict disease severity based on blood markers present during the acute phase of infection.
Salomo Coll, C.; Makar, A. N.; Brenes, A. J.; Inns, J.; Trost, M.; Rajan, N.; Wilkinson, S.; von Kriegsheim, A.
Single-cell proteomics (SCP) by mass spectrometry can now quantify hundreds to thousands of proteins per cell, but the field still lacks standardised analytical pipelines that accommodate the diversity of instruments, sample preparation workflows and biological contexts encountered in practice. Existing workflows, largely adapted from single-cell transcriptomics, do not account for the informative missingness, pervasive ambient protein contamination and limited feature space that distinguish proteomic from transcriptomic data. In addition, cell type annotation remains a manual bottleneck that is subjective, difficult to reproduce and hard to scale. Here we present an end-to-end pipeline that integrates adaptive quality control, entropy-guided iterative batch correction, multi-modal marker discovery that exploits detection patterns unique to proteomics, and context-aware annotation by large language models (LLMs) coupled to structured contradiction reasoning and orthogonal data-driven validation. Benchmarking on published single-cell proteomic datasets from developing human brain and glioblastoma-associated neutrophils revealed systematic LLM failure modes, including context-insensitive marker vocabulary and misinterpretation of phagocytic or lytic cell states. We addressed these errors using a three-round prompt architecture that combines general biological principles with auto-generated dataset-specific constraints. In held-out validation on a skin tumour dataset acquired, the pipeline showed high concordance with FACS-sorted ground truth. In the caerulein-injured pancreas, orthogonal immunohistochemistry further supported annotations of macrophage, stellate and immune populations. The pipeline is fully automated under fixed settings and is available as Context-Aware Single-Cell Proteomics Analysis (CASPA), providing SCP laboratories and facilities with a reproducible workflow that delivers interpretable, confidence-quantified annotations suitable for downstream expert review.
Reznikov, G.; Kusters, F.; Mohammadi, M.; van den Toorn, H. W. P.; Sinitcyn, P.
Large-scale proteomics relies heavily on target-decoy competition for false discovery rate estimation in peptide identification, and the performance of this strategy depends strongly on the design of the decoy database. Classical generators such as reversal and shuffling remain widely used. Here, we introduce protein language model-based (PLM) decoy generation for peptide identification and benchmark it against classical strategies. We evaluate these approaches using three complementary quality-control layers: sequence-based separability, search-engine-agnostic spectral-space diagnostics, and end-to-end mass spectrometry benchmarks, including pipelines with rescoring. Across these analyses, PLM-based decoys are harder for sequence-only neural networks to distinguish than most classical generators, suggesting fewer obvious sequence-level artifacts. However, this signal is only weakly informative for search performance. Spectral diagnostics further show that short peptides occupy a particularly crowded target-decoy space and are therefore especially prone to local collisions across all generators. In full search pipelines, reverse decoys remain a strong baseline, and current PLM-based generators do not yet provide a clear overall advantage. We therefore view PLM-based decoys not as universal replacements for reverse decoys, but as tunable tools for benchmarking, diagnostics, stress testing, and future adaptive decoy optimization, with increasing value as search models become more expressive.
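For readers unfamiliar with the baseline discussed in this abstract, the following is a minimal sketch of reverse-decoy generation and a simple target-decoy FDR estimate. It is not the authors' code: the function names, the convention of keeping the C-terminal residue fixed, and the plain decoys/targets estimator are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def reverse_decoy(peptide: str) -> str:
    """Build a reverse decoy (assumed convention: keep the C-terminal residue fixed).

    Keeping the last residue in place preserves the tryptic cleavage site
    (K/R) so decoys resemble targets in basic composition.
    """
    if len(peptide) < 2:
        return peptide
    return peptide[:-1][::-1] + peptide[-1]


def tdc_fdr(psms: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR at a score threshold as (#decoys / #targets).

    Each PSM is a (score, is_decoy) pair; higher scores are better.
    This is the simplest estimator; variants such as (decoys + 1) / targets
    are also in use.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return min(1.0, decoys / targets)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example_psms = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, True)]
    print(reverse_decoy("PEPTIDEK"))
    print(tdc_fdr(example_psms, threshold=7.0))
```

A PLM-based generator, as described in the abstract, would replace `reverse_decoy` with a model-sampled sequence while the FDR estimate stays the same, which is one reason the two decoy strategies can be benchmarked in the same pipeline.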
Zangene, E.; Gholizadeh, E.; Vadadokhau, U.; Ritz, D.; Saei, A.; Jafari, M.
Combination therapies are widely used in acute myeloid leukemia (AML), but systematic datasets capturing proteome-wide responses to multi-drug perturbations remain limited. Here we present CoPISA (Combinatorial Proteome Integral Solubility/Stability Alteration), a quantitative proteomics assay designed to profile protein solubility and stability responses to single and combined drug treatments. The dataset includes two AML drug pairs (LY3009120-sapanisertib and ruxolitinib-ulixertinib) applied to four AML cell lines (MOLM-13, MOLM-16, SKM-1, and NOMO-1) under control, single-agent, and combination conditions in both lysate and intact-cell formats. Thermal solubility profiling coupled with TMT-based multiplexed LC-MS/MS generated 16 TMT16-plex experiments comprising 192 LC-MS/MS raw files, providing deep proteome coverage across treatments and biological contexts. The resource includes raw and processed proteomics data, detailed experimental metadata in Sample and Data Relationship Format (SDRF), and reproducible analysis scripts for reporter normalization, protein-level aggregation, statistical modeling, and classification of combinatorial response patterns. The experimental design enables identification of proteins responding uniquely to combination treatments as well as overlapping single-agent effects. Technical validation demonstrates reproducible quantification across multiplex experiments and assay formats. All data are publicly available through the PRIDE repository (PXD066812) together with analysis code, enabling independent reanalysis and method development. This dataset provides a benchmark resource for studying proteome responses to drug combinations, comparing lysate and intact-cell perturbation profiles, developing computational approaches for combinatorial target inference, and supporting training in computational proteomics.