
BatchVaria: a variance-aware framework for evaluating batch correction in high-dimensional omics data

Moir, N.; Sherwood, K.; Simpson, I.

2026-05-12 bioinformatics
10.64898/2026.05.07.721996 bioRxiv

Summary: Batch effects and other unwanted technical sources of variation remain a persistent challenge in the integrative analysis of high-dimensional omics data. Although established methods such as ComBat effectively mitigate batch-associated signal, their impact on biologically meaningful variation is frequently evaluated in an ad hoc and non-quantitative manner. This is particularly problematic in heterogeneous disease contexts, such as breast cancer transcriptomics, where technical and biological sources of variation may be partially confounded. We present BatchVaria, an R package that implements a variance-aware framework for batch correction and post-adjustment evaluation. BatchVaria integrates variance component modelling, batch adjustment, and systematic re-profiling within a unified analysis container, enabling iterative quantification and reassessment of technical and biological variance contributions while preserving analytical provenance. By supporting multiple variance profiling engines and structured storage of intermediate results, BatchVaria facilitates transparent and reproducible evaluation of batch correction strategies. We demonstrate the utility of BatchVaria using a publicly available breast cancer transcriptomic dataset with known covariate-driven structure, illustrating how iterative variance profiling can guide responsible batch correction without erosion of subtype-associated biological signal.
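The profile, adjust, re-profile loop described in the summary can be illustrated with a minimal Python sketch. It is not BatchVaria's API (the package is written in R and its functions are not shown here). The helper names `variance_fraction` and `center_by_batch` are hypothetical, a one-way ANOVA R² stands in for the variance component modelling, per-batch mean-centering stands in for ComBat, and the data are simulated with a balanced batch-by-subtype design.

```python
import numpy as np


def variance_fraction(X, labels):
    """Per-feature fraction of variance explained by a categorical label.

    One-way ANOVA R^2 (between-group SS / total SS); a simple stand-in for
    a variance component model. X has shape (samples, features).
    """
    labels = np.asarray(labels)
    grand = X.mean(axis=0)
    ss_total = ((X - grand) ** 2).sum(axis=0)
    ss_between = np.zeros(X.shape[1])
    for g in np.unique(labels):
        rows = X[labels == g]
        ss_between += rows.shape[0] * (rows.mean(axis=0) - grand) ** 2
    return ss_between / ss_total


def center_by_batch(X, batch):
    """Stand-in batch adjustment (NOT ComBat): shift every batch so that its
    per-feature mean equals the grand per-feature mean."""
    batch = np.asarray(batch)
    out = X.astype(float).copy()
    grand = X.mean(axis=0)
    for b in np.unique(batch):
        mask = batch == b
        out[mask] -= X[mask].mean(axis=0) - grand
    return out


def profile(X, batch, subtype):
    """Mean variance fraction across features for each covariate."""
    return {
        "batch": float(variance_fraction(X, batch).mean()),
        "subtype": float(variance_fraction(X, subtype).mean()),
    }


# Simulated data: 2 batches x 2 subtypes, 20 samples per cell, 50 features.
rng = np.random.default_rng(0)
n_per_cell, n_features = 20, 50
batch = np.repeat([0, 1], 2 * n_per_cell)
subtype = np.tile(np.repeat([0, 1], n_per_cell), 2)
X = rng.normal(size=(batch.size, n_features))
X += 2.0 * batch[:, None]      # technical (batch) shift
X += 1.5 * subtype[:, None]    # biological (subtype) shift

before = profile(X, batch, subtype)
X_adj = center_by_batch(X, batch)
after = profile(X_adj, batch, subtype)

print("before:", before)
print("after: ", after)
```

In this balanced simulation the batch fraction drops to essentially zero after adjustment, while the subtype fraction does not shrink. When batch and subtype are partially confounded, as the summary warns, the same centering step would also remove part of the subtype signal, which is the situation that re-profiling after adjustment is meant to expose.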
