Reproducible Tools and Enhanced Computational Workflows for Batch Effect Evaluation of High-Throughput Data Using BatchQC
Anderson, J. K.; Zhang, J.; Ge, X.; Fan, H.; Leng, Y.; Silverstein, M.; Conrad, R.; Li, Z.; Holmes, E.; Joseph, S. S.; Lu, S.; Shinohara, R.; Li, T.; Johnson, W. E.; Alzheimer's Disease Neuroimaging Initiative
Batch effect correction is a common and often necessary step in data analysis to reduce bias due to technical and experimental factors when combining multiple batches of data. The severity of the batch effects dictates the correction strategy; therefore, a careful assessment of each dataset's batch effects is necessary. BatchQC is an R package that provides reproducible tools and visualizations for quantitatively and qualitatively addressing batch effects across a broad range of data types. BatchQC integrates with standardized Bioconductor data structures and features an object-oriented design, enabling the application of workflows that can freely evaluate and process data within and outside the package tools. Common batch evaluation methods, along with novel quantitative metrics, help determine the benefits of batch correction for each dataset and enable direct comparisons between methods. Here, we present BatchQC as the first comprehensive batch-correction R package, with independent tools, reproducible workflows, visualization, and novel statistics.