SoftwareX
Elsevier BV
Preprints posted in the last 90 days, ranked by how well they match SoftwareX's content profile, based on 15 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Kern, N. R.; Park, S.; Cao, Y.; Im, W.
As high-performance computing provides the ability to generate and analyze ever larger simulation trajectories, the challenges in learning, applying, and sharing the best analytical practices become more salient. Extracting reproducible scientific insights from simulation requires a thorough understanding of many computing topics unrelated to the molecular systems being modeled and simulated. While the rapid development of analysis technologies turns previously impossible studies into routine work, the growing repertoire of software, combined with the specificity of the ecosystems it relies on, can easily break the programs used in older studies. In this work, we present ST-Analyzer, a simulation trajectory analysis suite with command-line (CLI) and graphical (GUI) user interfaces. ST-Analyzer is distributed freely as an open-source conda-forge package with support for macOS, Linux, and Windows (via WSL2). Besides facilitating several common analysis tasks, the GUI shows users the exact commands necessary to repeat the same tasks on the command line. We demonstrate ST-Analyzer's capabilities by reproducing several results from previously published simulation studies on the lipid parameters of heterogeneous biomembranes and the behavior of a SARS-CoV-2 spike protein-antibody complex. We expect ST-Analyzer to be useful to experts for quickly setting up common analysis tasks and to nonexperts as a guided introduction to simulation analysis using both GUI and CLI. ST-Analyzer is freely available at https://github.com/nk53/stanalyzer.
Bright, M.; Mi, X.; Duarte, D.; Carey, E.; Lyu, B.; Wang, Y.; Nimmerjahn, A.; Yu, G.
Background: Advanced biological imaging analysis platforms such as Activity Quantification and Analysis (AQuA2) enable accurate spatiotemporal activity analysis across diverse cell populations within many species. These tools are increasingly important for investigating cellular signaling dynamics and behavior. However, despite advances in the accuracy and species coverage of AQuA2, it remains computationally demanding for analysis of long time-series datasets and requires all users to maintain a MATLAB license, which may limit accessibility and large-scale deployment. Results: To address these limitations, we have designed and made available AQuA2-Cloud, a portable software stack and web platform developed as an improvement and further evolution of AQuA2. This container-deployable system permits multi-user, cloud-based, high-accuracy activity quantification with intuitive workflows, export of analysis data and project files, and comparable processing times. The platform offers integrated features such as in-browser analysis control interfaces, asynchronous program state control, multi-user support and user management, support for unreliable connections, file uploading and downloading via web browsers and File Transfer Protocol, and centralized organization of analysis output. Conclusion: AQuA2-Cloud constitutes a cloud-native solution for laboratories or research groups seeking to centralize analysis of spatiotemporal biological imaging datasets while reducing software installation and licensing barriers for end users. The platform enables researchers with minimal technical expertise to perform advanced bioimaging analysis through standard web browsers while maintaining the analytical capabilities of AQuA2. AQuA2-Cloud source code, deployment procedures, and documentation are freely available at https://github.com/yu-lab-vt/AQuA2-Cloud.
Minasandra, P.; Sridhar, V. H.; Roche, D. G.; Planas-Sitja, I.
Real-time tracking and automated response systems are essential for standardising experiments, reducing observer bias, and improving reproducibility in studies of movement and behaviour. However, existing solutions face significant challenges: AI-based tracking systems require expensive hardware and impose computational delays, creating challenges for closed-loop experiments; existing real-time tracking tools lack standardised implementations for response delivery; and steep learning curves limit accessibility for users without programming or computer vision expertise. Here, we introduce TracktorLive, an open-source Python package designed to overcome these challenges through concurrency and a modular, cassette-based architecture. TracktorLive leverages traditional computer vision techniques to perform image-based object detection without the need for expensive hardware or deep learning. By parallelizing object tracking and response delivery into separate, concurrent server and client processes, the software minimizes frame processing time, enabling rapid, real-time analysis and response delivery. User-friendly cassettes, portable code snippets that can be copy-pasted into scripts, enable users with minimal programming experience to implement complex workflows for use in experiments and practical applications. We demonstrate TracktorLive's utility through several applications, including microcontroller-based stimulus delivery for location-dependent manipulations; conditional video recording that activates only during events of interest; kinematic-based response triggering using real-time velocity computations; and multi-cassette experimental designs combining multiple functionalities. Detailed tutorials are provided to familiarize users with TracktorLive's operation and functionality, and a growing library of cassettes supports diverse applications out of the box.
We validated the software by comparing its response timing to that of human experimenters in a stimulus delivery task involving two fish species; TracktorLive demonstrated consistently higher accuracy and lower variability, particularly for fast-moving subjects. Beyond experimental biology, TracktorLive's unique architecture and versatility could support many different applications in fields ranging from neuroscience to wildlife management. As open-source software combining accessibility, modularity, and computational efficiency, TracktorLive can help democratize real-time tracking and automated response systems across disciplines.
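The server/client split described above can be sketched in a few lines. The sketch below uses Python threads and a queue for brevity (TracktorLive itself uses separate processes, and all names here are illustrative, not the package's API): a tracking loop publishes positions, and a response loop fires whenever the subject crosses a spatial threshold.

```python
import queue
import threading

def tracking_server(frames, q):
    # Detect the subject in each frame and publish (time, x, y).
    # Detection here is simply the brightest blob in a toy frame
    # representation: a list of (x, y, intensity) tuples.
    for t, frame in enumerate(frames):
        x, y, _ = max(frame, key=lambda blob: blob[2])
        q.put((t, x, y))
    q.put(None)  # sentinel: no more frames

def response_client(q, x_threshold):
    # Consume positions and record response times whenever the subject
    # crosses a spatial threshold (a location-dependent "cassette").
    triggered = []
    while (msg := q.get()) is not None:
        t, x, _ = msg
        if x > x_threshold:
            triggered.append(t)  # real code would pulse a microcontroller pin
    return triggered

def run_demo():
    frames = [[(i, 0.0, 1.0)] for i in range(5)]  # one blob drifting along x
    q = queue.Queue()
    server = threading.Thread(target=tracking_server, args=(frames, q))
    server.start()
    triggered = response_client(q, x_threshold=2)
    server.join()
    return triggered

if __name__ == "__main__":
    print(run_demo())  # frames with x > 2 trigger the response
```

Because tracking and response run concurrently, the response side reacts as soon as a position is published rather than waiting for the whole recording to finish.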
Mesbah, I.; Klaus, C.; Sotomayor, M.; Sumbul, F.; Rico, F.
Molecular dynamics simulation is a powerful computational technique for predicting and understanding the dynamic behavior of biomolecular systems. Steered molecular dynamics (SMD) simulations enable the study of force-induced processes in biomolecules, effectively mimicking single-molecule force spectroscopy experiments probing protein unfolding and receptor-ligand unbinding. Given the stochastic nature of these mechanical events, accurately exploring the dynamic behavior of biomolecules and extracting reliable physical information requires several in-silico experiments, including many pulling simulations at different velocities or force loading rates. The large amount of data obtained from these simulation sets requires efficient automated data processing tools. We present PySteMoDA, a novel Python package with a user-friendly graphical interface specifically designed for constant-velocity SMD data analysis. Its automated force peak detection methods reduce user bias, improve accuracy, and accelerate data analysis. The package also allows identification of residues involved in mechanical events through computation of the time-dependent mechanical work and correlation factors between residue pairs. This package not only addresses automated data processing in SMD simulations and accurate parameter extraction, but also significantly enhances accessibility and usability. Through PySteMoDA, users can efficiently analyze simulation data without the barrier of coding, facilitating a wider range of investigations and insights in computational biochemistry and biophysics.
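As a toy illustration of what automated force-peak detection involves (a generic local-maximum heuristic, not PySteMoDA's actual algorithm), a rupture event in a constant-velocity pulling trace can be flagged as a local maximum followed by a sharp force drop:

```python
def detect_rupture_peaks(force, min_drop):
    # A rupture event shows up as a local maximum followed by a sharp
    # drop in force. Flag index i when force[i] exceeds both neighbours
    # and the subsequent drop is at least `min_drop` (illustrative only).
    peaks = []
    for i in range(1, len(force) - 1):
        if (force[i] > force[i - 1]
                and force[i] > force[i + 1]
                and force[i] - force[i + 1] >= min_drop):
            peaks.append(i)
    return peaks

def demo_trace():
    # Synthetic sawtooth: two loading ramps, each ending in a rupture.
    ramp = [i / 10 for i in range(10)]          # force ramps 0.0 .. 0.9
    return ramp + [0.0] + ramp + [0.0]          # drop to zero after each ramp

if __name__ == "__main__":
    print(detect_rupture_peaks(demo_trace(), min_drop=0.5))
```

Real analyses would additionally smooth the trace and tune the drop criterion per pulling velocity; the point here is only the shape of the detection logic.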
Soares, G. C. d. F.; Varella, A. L. N.; Facundo, H. T.
Oxidative stress results from excessive accumulation of reactive oxygen species (ROS) and plays a central role in numerous physiological and pathological processes. Accurate quantification of antioxidant enzyme activities is therefore essential in redox biology research. However, data analysis for commonly used assays, such as superoxide dismutase (SOD), catalase (CAT), and glutathione peroxidase (GPx), is frequently performed using spreadsheets or manual calculations, which are time-consuming and prone to error. Here, we present Redoxyme, a free, open-source, Python-based graphical user interface designed to standardize and automate the calculation of antioxidant enzyme activities. The software integrates protein normalization, enzyme-specific calculation routines, data visualization, and Excel export within an intuitive interface that does not require programming expertise. Redoxyme was validated using experimental data obtained from animal tissues (rats and mice), demonstrating excellent agreement with manual calculations and established analytical methods. Redoxyme provides a practical solution for improving reproducibility and efficiency in antioxidant enzyme activity analysis. The software is currently distributed as a standalone executable for Windows (locally installed) and as an interactive web-based calculator implemented in Streamlit, enabling direct use without local installation. The source code and version-controlled development history are openly accessible via GitHub, promoting transparency, reproducibility, and community-driven improvements; the code can, in principle, be adapted for other operating systems.
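In its simplest form, the protein-normalization step such tools automate reduces to a Beer-Lambert conversion of an absorbance slope into enzyme units, divided by protein content. The sketch below is a generic illustrative formula (assuming a 1 cm path length), not Redoxyme's enzyme-specific routines:

```python
def specific_activity(delta_abs_per_min, epsilon_mM, volume_ml, protein_mg):
    # Beer-Lambert: an absorbance slope (per minute) divided by the molar
    # extinction coefficient (mM^-1 cm^-1, 1 cm path assumed) gives a
    # concentration change in mM/min, i.e. umol per mL per minute.
    umol_per_min = delta_abs_per_min / epsilon_mM * volume_ml
    # Normalise enzyme units (umol substrate/min) to protein content -> U/mg.
    return umol_per_min / protein_mg
```

Each assay (SOD, CAT, GPx) uses its own coefficient and correction terms, which is exactly the kind of per-enzyme detail that is error-prone in spreadsheets.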
Terra, R.; Carvalho, D.; Machado, D. J.; Osthoff, C.; Ocana, K.
Advances in High-Performance Computing (HPC) have enabled increasingly complex genomic analyses, including those in phylogenomics. These analyses contribute to understanding the evolution of viruses and pathogens, improving our knowledge of disease transmission, and supporting targeted public health strategies. However, due to the increasing number of tools and processing steps involved, executing these analyses manually, step by step, becomes error-prone and inefficient. To address this challenge, we present HP2NET, a robust framework for reproducible, efficient, and scalable phylogenetic network analysis. HP2NET integrates five workflows based on state-of-the-art tools such as PhyloNetworks and PhyloNet, allowing the analysis of multiple datasets and workflows in a single execution. The framework includes features such as task packaging and data reuse to improve performance and resource utilization in HPC environments. We perform a comprehensive performance evaluation of the software used within HP2NET, identifying bottlenecks and analyzing gains from parallel processing. Data reuse provided up to a 15.35% time reduction for a small dataset in our experimental environment, while parallel execution of the five pipelines reduced total runtime by up to 90.96% compared to sequential runs. Finally, we validate HP2NET in a real-world case study by analyzing Dengue virus genomes, demonstrating its value for large-scale phylogenetic analyses.
Banerjee, T.; Abubaker-Sharif, B.; Devreotes, P. N.; Iglesias, P. A.
Summary: The plasma membrane and accompanying cortex serve as one of the major hubs of the signal transduction and cytoskeletal activities that collectively regulate numerous cell physiological processes such as migration, polarity, macropinocytosis, phagocytosis, and cytokinesis. Yet, dynamically tracking membrane-cortex-associated protein or lipid kinetics over time from live-cell image series remains a challenging task, primarily due to the difficulty of accurately extracting and aligning the cell boundary between consecutive frames as the cell continuously deforms and moves. Here, we present Membrane Kymograph Generator, a cross-platform software that accepts multichannel time-lapse live-cell fluorescent imaging datasets as input and automates the cumbersome, heuristic process of boundary tracking, inter-frame alignment, and intensity sampling along the boundary. The software implements a rotational offset minimization algorithm that circularly aligns boundaries across consecutive frames by exhaustively searching for the optimal angular shift that minimizes point-to-point distances, while handling variations in boundary point counts due to cell shape changes. The software outputs kymographs that represent the spatiotemporal dynamics of different membrane-associated proteins or biosensors, allows users to fine-tune visualization parameters through an interactive interface, and provides built-in correlation analysis tools for multi-channel datasets. Furthermore, the software allows advanced programmatic usage for batch processing and further analysis via a native API. Our validation tests demonstrated that Membrane Kymograph Generator can be used to accurately track, visualize, and quantitate the spatial kinetics of a wide array of membrane proteins and lipid biosensors over extended time periods in a variety of cell types, including Dictyostelium amoebae, human neutrophils, mouse macrophages, and different mammalian cancer cells.
The GUI-based software is user-friendly, does not require any technical expertise from users, and significantly reduces the manual effort and time required for kymograph generation and downstream analysis, while ensuring high accuracy and reproducibility. Availability and Implementation: Membrane Kymograph Generator is free and open-source software, licensed under the GNU General Public License 3.0 or later. The software is cross-platform: it can be graphically installed on both x86-64 and AArch64/ARM64 computers running Windows, macOS, or any standard Linux distribution. It is distributed as single installer files (and portable executables) targeting specific hardware architectures and operating systems, and hence can be installed natively without any dependency resolution. The source code, detailed documentation, specific installers, portable binaries, and test data are freely available at https://github.com/tatsatb/membrane-kymograph-generator. Additionally, since the software is written in Python, it can also be installed inside any Python environment using the pip package manager (package: https://pypi.org/project/membrane-kymograph) and interacted with via a built-in Python API.
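The exhaustive angular-shift search described in the abstract can be sketched in a few lines. This is a minimal illustration assuming both boundaries have already been resampled to the same number of points (the actual software also handles unequal point counts):

```python
import math

def best_rotational_offset(prev, curr):
    # Exhaustively try every circular shift of the current boundary and
    # keep the one minimising the summed point-to-point distance to the
    # previous frame's boundary. Boundaries are lists of (x, y) points
    # sampled in order around the cell perimeter.
    n = len(prev)

    def cost(k):
        return sum(math.dist(prev[i], curr[(i + k) % n]) for i in range(n))

    return min(range(n), key=cost)

if __name__ == "__main__":
    # Toy example: the "current" boundary is the previous one rotated
    # by one index position, so the best offset undoes that rotation.
    prev = [(1, 0), (0, 1), (-1, 0), (0, -1)]
    curr = [(0, 1), (-1, 0), (0, -1), (1, 0)]
    print(best_rotational_offset(prev, curr))
```

The brute-force search is O(n²) per frame pair, which is cheap at typical boundary resolutions and avoids the local minima a gradient-based alignment could fall into.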
Ochi, S.; Azuma, M.; Hara, I.; Inada, H.; Takabayashi, K.; Osumi, N.
Background: Long-term home-cage monitoring is essential to quantify spontaneous locomotor and social behaviors in group-housed mice, but analysis of high-density RFID tracking data remains a barrier to reproducibility. New methods: We developed IntelliProfiler 2.0, a fully R-based pipeline tailored to the eeeHive 2D floor-mounted RFID array. The workflow performs data import from text logs, preprocessing, coordinate reconstruction, missing-value handling, feature extraction, statistical testing, and visualization in a single environment. Behavioral metrics include travel distance, close contact ratio (CCR), and a newly implemented inter-individual distance metric. Results: In four-day recordings of group-housed C57BL/6J mice (8 males and 8 females), IntelliProfiler 2.0 captured circadian phase-dependent locomotion and proximity patterns and reproduced sex-dependent differences consistent with prior analyses while incorporating updated hardware specifications. Radar-chart summaries enabled intuitive comparison of multidimensional behavioral profiles and inter-individual variability across light/dark phases. Comparison with existing methods: Compared with IntelliProfiler 1.0 and multi-tool workflows, IntelliProfiler 2.0 consolidates analysis into a single, script-based R pipeline, reducing operational complexity and improving reproducibility. The updated implementation supports recent manufacturer-driven changes, including antenna renumbering and multi-USB data export. Conclusions: IntelliProfiler 2.0 provides a reproducible, extensible framework for high-throughput behavioral phenotyping of group-housed mice and is scalable across hardware configurations, including simplified single-board recordings.
Highlights:
- End-to-end R pipeline for eeeHive 2D floor-based RFID tracking analysis
- Standardized setup with comprehensive manuals and protocols
- Inter-individual distance metric to quantify group spatial structure
- Circadian- and sex-dependent behavioral profiling in group-housed mice
- Radar charts summarize multidimensional behavioral profiles and variability
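One natural reading of an inter-individual distance metric is the mean pairwise distance among animals at a given time point; the IntelliProfiler 2.0 definition may differ, and the pipeline itself is written in R, so the sketch below is purely illustrative:

```python
import itertools
import math

def mean_interindividual_distance(positions):
    # Mean of all pairwise Euclidean distances among animals at one
    # time point. `positions` is a list of (x, y) coordinates, one per
    # animal, e.g. reconstructed from an RFID antenna grid.
    pairs = list(itertools.combinations(positions, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

if __name__ == "__main__":
    # Three animals at the corners of a 3-4-5 right triangle.
    print(mean_interindividual_distance([(0, 0), (3, 0), (0, 4)]))
```

Tracked over time, such a group-level scalar makes it easy to compare spatial cohesion across light/dark phases or between sexes.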
Diaz Ochoa, J. G.; Puskaric, M.; Layer, N.; Jensch, A.; Knott, M.; Krohn, A.
Graph-based methods for data representation and analysis are well suited for encoding both data points and their interrelationships. This approach integrates data and topology, enabling the representation of interrelated information. In this study, we represent patient cohorts as cohort graphs (CGs) and discuss their application to real-world patient data. We particularly focus on developing methods to cluster patients with similar symptoms and examine how bias parameters (such as sex and age group) influence interlinking within CGs, thereby improving results for accurate patient stratification and personalized decision-making in a clinical context. In particular, we illustrate how considering sex and age groups improves the symptom clustering of a patient population with lung and gastrointestinal cancer. Finally, we discuss the essential role of high-performance computing (HPC) in upscaling analytical methods for CGs.
Pohar, C.; Rekik, Y.; Phan, M. S.; Gallet, B.; Desroches-Castane, A.; Chevallet, M.; Tinevez, J.-Y.; Tillet, E.; Vigano, N.; Jouneau, P.-H.; Deniaud, A.
The liver has a complex architecture composed of millions of lobules. Within these lobules, hepatocytes, the main hepatic cells, are organized in rows separated by blood capillaries known as sinusoids. These capillaries are lined by liver sinusoidal endothelial cells (LSEC) that form a very specific fenestrated endothelium essential for the exchange of metabolites and proteins between the blood and hepatocytes. Alterations in the size and number of LSEC fenestrations are associated with the onset and progression of various liver diseases. The analysis of liver architecture is thus of utmost importance for advancing our knowledge of liver ultrastructure and its alterations. Liver architecture has been studied for decades, mainly using 2D electron microscopy and, more recently, advanced super-resolution fluorescence microscopy. In recent years, volume electron microscopy techniques, including focused ion beam-scanning electron microscopy (FIB-SEM), have progressed and nowadays enable the 3D reconstruction of biological ultrastructures down to nanometer resolution. However, the analysis of large volumes (e.g., several tens of µm³) remains challenging due to various constraints in the segmentation of large datasets. In the current study, we developed a workflow to semi-automatically segment hepatic sinusoids from FIB-SEM mouse liver datasets using the convolutional neural network (CNN)-based tool "nnU-Net", after fine-tuning a ground-truth model. We also implemented tools for semi-automatic quantification of LSEC fenestrae diameters and sinusoid porosity from segmented datasets. This workflow enabled us to compare the distribution of LSEC fenestrae diameters in wild-type versus Bmp9-deleted mice; BMP9 is a hepatic factor known to be involved in fenestration maintenance. Our results confirm the importance of BMP9 for LSEC differentiation.
Therefore, the developed methodology represents a valuable tool for characterizing the fenestrated endothelium under various physiological and pathological conditions.
Virag, D.; Virag, A.-M.; Homolak, J.; Kahnau, P.; Babic Perhoc, A.; Krsnik, A.; Mihalic, L.; Knezovic, A.; Osmanović Barilar, J.; Cifrek, M.; Trkulja, V.; Salkovic-Petrisic, M.
Home cage monitoring (HCM) captures longitudinal animal behavioural data without human intervention. However, these systems' complexity is rarely addressed in their design, increasing the risk of data loss, which wastes workhours, resources, and animal lives. To assess the feasibility of implementing modern, robust architectures in complex operant HCM paradigms, the VersatiLe Autonomous DevIce for Scheduled Learning Assessment Via Wi-Fi (VLADISLAV) was developed and employed to test cognitive deficits in the intracerebroventricular streptozotocin-induced rat model of sporadic Alzheimer's disease (sAD). Reliability was compared against a system architecture common in commercial HCM systems by modelling the failure rate of the device's critical components across typical durations of animal experiments. VLADISLAV assessed multiple cognitive dimensions of a rat model of sAD with automated, scheduled testing. Its design enabled simultaneous, redundant recording to multiple devices in real time, as well as batch remote control and supervision of tens of VLADISLAVs. VLADISLAV is estimated to reduce the component failure rate ~200-fold at €40/device. Data loss due to system failure should not be accepted as a normal occurrence, and robust system design is an ethical imperative. VLADISLAV's robustness and utility demonstrate the potential of embedded networked systems, used in other industries and consumer electronics for over a decade. Today, the open-source ecosystem enables cost-effective implementation of such architectures in HCM by biomedical researchers with no electronic engineering education, preventing data loss and facilitating researchers' and technicians' day-to-day work. Considering these findings, it is apparent that the implementation of modern architectures in HCM is long overdue.
Sharma, S.; Kumar, S.; Brull, J. B.; Deepika, D.; Kumar, V.
Transcriptomic analysis is a powerful approach for biomarker discovery; however, exploring large-scale omics datasets to extract meaningful biological insights remains a challenge for biologists. To address this gap, we present ARACRA, a fully automated RNA-seq analysis pipeline covering the entire transcriptomics workflow from raw FASTQ files to the transcriptomic Point of Departure (tPoD), with a human-in-the-loop review process. The analysis is performed in two phases: Phase 1 carries out the acquisition of raw reads, pre-alignment quality control, alignment to a reference genome, and quantification of gene expression, whereas Phase 2 performs statistical analysis, including differential gene expression analysis and dose-response modelling. The two phases are separated by an extensive quality control step that allows the user to visually inspect the quality of the processed data and helps in filtering noise and outlier samples. ARACRA facilitates end-to-end analysis of RNA-seq data through an interactive web-based application built on Nextflow and Streamlit, minimizing computational complexities while ensuring correct downstream processing. Availability and implementation: ARACRA is freely available on GitHub under the MIT License, together with a Streamlit-based web application. Researchers can use the demo data or upload their own data for analysis. Fig 1: Overall architecture of ARACRA.
Lu, Y.; Pan, M.; Jamwal, V.; Locop, J.; Ruparelia, A. A.; Currie, P. D.
Quantitative histological analysis of skeletal muscle morphometry provides critical insights into muscle physiology but remains labor-intensive and technically demanding. While recent developments in machine-learning-based image segmentation techniques have facilitated large-scale tissue analysis, existing tools that automate muscle morphometry analysis are largely tailored to mammalian models, with limited applicability to teleosts. Moreover, there is a lack of effective tools for visualizing the spatial organization and morphometric variability of teleost muscle fibers, a feature that is important for understanding hyperplastic muscle growth dynamics in teleosts. In this study, we show that cytoplasmic staining combined with deep learning-based cell segmentation offers a robust and accurate approach for automated muscle morphometry analysis in developing zebrafish. We also introduce a FIJI plugin, implemented in Jython, that streamlines both morphometric analysis and visualization. This tool accommodates shallow and deep learning-based segmentation techniques and incorporates novel quantification and visualization methods suited to teleost-specific muscle features, including mosaic hyperplasia dynamics. The plugin features an intuitive graphical user interface and is designed for flexibility, with minimal constraints regarding species, image quality, or staining protocol. Its modular architecture allows it to be used as a baseline for automated muscle morphometry analysis, while permitting integration with other tools and workflows.
Melykuti, B.; Bustos-Quevedo, G.; Prinz, T.; Nazarenko, I.
Accurate and transparent characterization of extracellular vesicle (EV) preparations is essential to ensure reproducibility, comparability, and adherence to MISEV reporting standards. However, data outputs from commonly used instruments for assessing EV size, concentration, and surface charge (zeta potential) vary widely in format and structure, complicating standardized analysis and integration across platforms. We present PHoNUPS (Plotting the Histogram of Non-Uniform Particles Sizes), free and open-source software (FOSS) developed in R that enables unified processing, analysis, and visualization of EV characterization data. PHoNUPS computes statistics and generates standardized histograms and contour plots (for size against zeta potential) suitable for transparent reporting and cross-study comparison. The software produces high-quality, publication-ready figures, and third-party graphical editing tools allow users to refine and annotate visualizations for presentation or manuscript preparation. PHoNUPS supports multiple measurement file formats, thereby facilitating dataset integration from different instruments. PHoNUPS was developed with extensibility at its core, providing a basis for user-driven growth. We invite the EV community, including researchers, analysts, and tool developers, to use PHoNUPS, share feedback on their experience and needs, and contribute to the platform by integrating additional input data formats, analytical routines, and visualization functionalities. Graphical abstract: The free software PHoNUPS processes the outputs of several different EV characterization instruments and is extensible with further ones. It computes statistics of particle size and zeta potential distributions and plots the corresponding histograms or contour plots.
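When instruments export size distributions over bins of unequal width, counts must be normalised by bin width before histograms from different devices become comparable. A generic sketch of that step (not PHoNUPS's exact routine, which is implemented in R):

```python
def density_histogram(sizes, bin_edges):
    # Bin particle sizes into instrument-defined, possibly non-uniform
    # bins and normalise each count by its bin width, so that wide and
    # narrow bins plot on the same density scale.
    counts = [0] * (len(bin_edges) - 1)
    for s in sizes:
        for i in range(len(counts)):
            if bin_edges[i] <= s < bin_edges[i + 1]:
                counts[i] += 1
                break
    widths = [bin_edges[i + 1] - bin_edges[i] for i in range(len(counts))]
    return [c / w for c, w in zip(counts, widths)]

if __name__ == "__main__":
    # Two particles below 100 nm, one between 100 and 200 nm.
    print(density_histogram([50, 75, 150], [0, 100, 200]))
```

Width normalisation is what makes size distributions from instruments with different binning schemes overlay correctly on one plot.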
Cummings, C. E.; Bastien, B. L.; Martinez, J. A.; Luo, J.; Thyme, S. B.
Quantitative phenotyping is essential to studies of animal behavior, enabling systematic analysis of variation arising from natural diversity or experimental manipulation. High-throughput behavioral assays that can simultaneously test multiple animals support sufficiently powered studies of behavioral variation, but accurate tracking of each animal is critical. Furthermore, behavioral tasks and experimental arenas span a wide range of complexity, from the reaction of a single larval zebrafish to an acoustic stimulus to associative conditioning in cue-rich environments. Here, we developed and validated StrIPETrack (Structural similarity-based Image Processing for Estimation and Tracking), a Python-based, modular animal tracking software designed for flexible region-of-interest (ROI) definitions and extensibility across assays. We show that StrIPETrack measures activity comparably to our previous LabVIEW-based zebrafish tracking software and detects similar behavioral differences between wild-type clutches. In addition, StrIPETrack accurately captures behavior in a complex arena: the Y-maze. Our approach to analyzing Y-maze navigation yields an expanded set of metrics beyond turn count and direction, revealing more subtle behavioral variation. Overall, this versatile software can be applied to monitor the activity of multiple animals in parallel in both simple high-throughput and more complex assays, and can be readily adapted to new paradigms. Summary: Our open-source tracking software provides rich behavioral phenotyping of animals in many behavioral tasks. The flexible ROI design and live tracking make the software adaptable to diverse paradigms.
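Flexible ROI definitions of the kind described above ultimately reduce to membership tests: given a tracked position, is the animal inside a user-drawn region? A standard ray-casting test for polygonal ROIs, shown here as an illustration rather than StrIPETrack's implementation:

```python
def point_in_roi(point, polygon):
    # Ray-casting point-in-polygon test: cast a horizontal ray from the
    # point and count edge crossings; an odd count means "inside".
    # Suited to arbitrary polygonal ROIs such as the arms of a Y-maze.
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

if __name__ == "__main__":
    square = [(0, 0), (4, 0), (4, 4), (0, 4)]
    print(point_in_roi((2, 2), square), point_in_roi((5, 2), square))
```

Per-frame ROI membership, logged alongside position, is what turns raw trajectories into assay-level metrics such as arm entries and turn sequences.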
Kang, X.; Yu, T.; Xu, K.; Liu, C.; Wu, R.
With the rapid development of Large Language Models (LLMs) and Agent technologies, AI can assist in solving a variety of real-world problems across multiple domains, such as autonomous driving, drug discovery, and materials design. In this work, we present EnzySeek, an enzyme catalysis AI agent designed to assist researchers in enzyme catalysis simulations. First, we constructed a domain-specific knowledge base by curating thousands of papers related to enzyme catalysis. Second, we customized Model Context Protocol (MCP) interfaces for each step of the enzyme catalysis simulation workflow, enabling these functions to be invoked by LLMs. Finally, we configured an agent capable of simultaneously referencing past empirical studies on enzyme catalysis, autonomously executing tool calls, and analyzing as well as presenting the results. EnzySeek's capabilities cover multiple aspects, including protein structure prediction, molecular docking, system preparation and parameterization, molecular dynamics (MD) simulations, and QM/MM calculations. The conclusions drawn by EnzySeek are primarily based on the results of QM/MM calculations. We employed the semi-empirical quantum mechanical method GFN2-xTB to calculate the QM region of the system. Benchmark results indicate that the GFN2-xTB method can achieve high efficiency while maintaining accuracy. The EnzySeek agent is designed to continuously learn from newly published literature and past computational tasks. During its operation, every AI decision is manually verified and scored by human experts. This human-in-the-loop validation provides the AI with sufficient case-based support, ultimately contributing to the full automation of enzyme catalysis computations. All data generated during the simulations are compiled into a dataset, which is used to establish evaluation criteria specific to enzyme catalysis computational results.
Zougman, A.
Protein sample preparation methods for shotgun proteomics are nowadays well established, unlike those for whole-protein analysis. The goal of my work has been to create a simple methodology that provides a single, uncomplicated sample preparation tool for these two fields. The bulk of proteomics work is currently done using detergents for protein solubilization. The presented concept, which is based on unspecific adsorption of protein molecules on wide-pore materials, allows for protein capture and clean-up from solutions of the most commonly used detergent, sodium dodecyl sulfate. It could also be applied to proteins in detergent-free solutions. After capture and clean-up, proteins could be either cleaved for downstream peptide analysis or eluted for whole-protein analysis. If required, the eluted whole proteins could be recaptured and cleaved into peptides. Depending on the experimental goals, the sample preparation device could be fitted with embedded proteolytic enzymes to simplify routine sample processing and/or reversed-phase media for downstream peptide or protein separation.
Roberge, H.; Woller, T.; Pavie, B.; Hennies, J.; de Heus, C.; Edakkandiyil, L.; Liv, N.; Munck, S.
Correlative Light and Electron Microscopy (CLEM) integrates the molecular specificity of light microscopy (LM) with the ultrastructural detail of electron microscopy (EM), enabling comprehensive spatial analysis of biological samples. Despite growing demand, processing 3D CLEM datasets remains challenging, specifically for service provision in facilities, due to their multimodal nature and the lack of unified approaches. Typical steps include EM slice alignment, LM-EM registration, segmentation, and 3D visualization. We present a modular, end-to-end pipeline that consolidates existing and newly developed tools into a coherent workflow for 3D CLEM analysis and allows railroading the approach. Designed as interoperable modules accessible through a user-friendly interface, the pipeline is fully open-source and scales from standard workstations to high-performance computing environments to address the need for analysis of growing datasets. While some steps still require manual input, individual components can be automated to increase throughput and reproducibility. Together, this integrated solution lowers technical barriers and supports broader adoption of 3D CLEM methodologies.
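The modular, interoperable design described above can be sketched as a simple pipeline in which each CLEM step (EM slice alignment, LM-EM registration, segmentation, visualization) is a module that consumes and produces a shared state. The class and function names are hypothetical, not the pipeline's actual API:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Illustrative sketch of a modular pipeline: each step is an
# interchangeable function over a shared state dict, so modules can be
# swapped, reordered, or run manually. Names are assumptions, not the
# published tool's interface.
Step = Callable[[Dict], Dict]

@dataclass
class Pipeline:
    steps: List[Step] = field(default_factory=list)

    def add(self, step: Step) -> "Pipeline":
        self.steps.append(step)
        return self  # enables chaining

    def run(self, state: Dict) -> Dict:
        for step in self.steps:
            state = step(state)
        return state

def align_em_slices(state: Dict) -> Dict:
    # Placeholder for EM slice alignment.
    state["aligned"] = True
    return state

def register_lm_em(state: Dict) -> Dict:
    # Placeholder for LM-EM registration; depends on prior alignment.
    state["registered"] = state.get("aligned", False)
    return state

result = Pipeline().add(align_em_slices).add(register_lm_em).run({})
```

Because each module only touches the shared state, a manual step can be slotted in wherever automation is not yet reliable.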
Walker, a.; Guberman-Pfeffer, M. J.
Millions of experimental and AI-predicted protein structures are now available, and the biosynthetic promise of bespoke proteins is increasingly within reach. The functional characterization challenge thus posed cannot be addressed by experimental techniques alone. Molecular dynamics (MD) simulations offer functional screening with atomic resolution, yet accessibility remains limited. Existing computational chemistry software presents stark trade-offs: powerful tools require extensive expertise and manual effort, while user-friendly programs function as black boxes that obscure critical preparation decisions. Herein, we present ProPrep, an interactive workflow manager that guides users through expert-quality MD preparation by showing the what, why, and how of each step while automating tedious manual operations. Within a single workspace, ProPrep integrates (1) downloading structures from multiple sources (PDB, AlphaFold, AlphaFill), (2) performing homology searches, (3) aligning structures, (4) curating and repairing structural issues, (5) applying mutations, (6) parameterizing specialized residues, (7) converting redox-active sites to forcefield-compatible forms, (8) generating topology and coordinate files, and (9) configuring, executing, and analyzing simulations with active monitoring of key quantities via ASCII visualizations. A key innovation is ProPrep's extensible transformer framework for detecting, defining, and transforming redox-active sites--including mono- and polynuclear metal centers, organic cofactors, and redox-active amino acids--for forcefield compatibility. We demonstrate the full workflow on a 64-heme cytochrome nanowire bundle (PDB: 9YUQ), proceeding from a PDB file to energy minimization of the solvated system (467,635 atoms) for constant pH molecular dynamics--a process demanding 4,819 PDB record modifications and 610 bond definitions--in 18 minutes of user interaction.
The entire process is recorded in an interactive session log that can be shared and replayed for reproducibility, making simulation setup a fully transparent process that relies on what was done instead of what was remembered and reported.
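A shareable, replayable session log of the kind described above can be sketched as a record of (command, arguments) entries that is serialized and later re-run through handlers. The record format and all identifiers here are assumptions for illustration, not ProPrep's actual log schema:

```python
import json
from typing import Callable, Dict, List

# Minimal record/replay sketch: every user action is appended as a
# (command, args) record; replaying the serialized log re-executes the
# same actions in order. Format and names are hypothetical.
class SessionLog:
    def __init__(self) -> None:
        self.records: List[dict] = []

    def record(self, command: str, **kwargs) -> None:
        self.records.append({"command": command, "args": kwargs})

    def dump(self) -> str:
        """Serialize the log so it can be shared."""
        return json.dumps(self.records)

    @staticmethod
    def replay(serialized: str, handlers: Dict[str, Callable]) -> list:
        """Re-run every logged action through the given handlers."""
        return [
            handlers[rec["command"]](**rec["args"])
            for rec in json.loads(serialized)
        ]

log = SessionLog()
log.record("fetch_structure", pdb_id="9YUQ")
log.record("minimize", max_steps=500)

handlers = {
    "fetch_structure": lambda pdb_id: f"fetched {pdb_id}",
    "minimize": lambda max_steps: f"minimized for {max_steps} steps",
}
replayed = SessionLog.replay(log.dump(), handlers)
```

Logging arguments rather than outcomes is what makes the setup depend on "what was done" instead of what was remembered: the same log replayed against the same inputs reproduces the same preparation.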
Antony, F.; Bhattacharya, A.; Duong van Hoa, F.
Peptergent is a novel class of amphipathic peptides that enable detergent-free extraction and purification of membrane proteins (MPs). These designed peptides self-assemble around hydrophobic transmembrane regions of proteins, forming stable, water-soluble assemblies that can be isolated directly from biological membranes. By doing so, Peptergent bypasses the limitations imposed by traditional detergents, which often destabilize proteins and restrict downstream analyses. Since detergents are completely avoided, Peptergent-isolated MPs are directly amenable to structural and mass spectrometry (MS) analysis, thereby addressing their persistent underrepresentation in proteomic datasets and improving their accessibility for drug-screening strategies. Here, we describe a streamlined protocol for isolating MPs with the Peptergent PDET-1, followed by exchange into His-tagged Peptidiscs for Ni-NTA-based affinity purification. The method comprises membrane isolation, peptide preparation, protein extraction, clarification, and exchange of MPs from Peptergent to Peptidiscs. Application of this workflow yields enriched membrane proteomes compatible with downstream LC-MS/MS analysis, with improved recovery of hydrophobic and multi-pass membrane proteins.
Key features:
- Direct extraction and solubilization of membrane proteins in Peptergents
- Exchange into His-tagged Peptidiscs enabling affinity purification of MPs
- 100% detergent-free workflow compatible with LC-MS/MS analysis
- Applicable to cultured cells and tissue-derived membrane fractions
In Brief: We describe a Peptergent-based workflow for isolating membrane proteins directly from membrane preparations. Proteins are extracted with the Peptergent peptide scaffold (PDET-1) and transferred into His-tagged Peptidisc (HD-43).
The water-soluble membrane proteins are enriched by Ni-NTA affinity purification and prepared for bottom-up mass spectrometry, yielding enriched membrane proteomes and dried peptide samples ready for LC-MS analysis.
Graphical Overview: Figure 1.