Integrating the ENCODE blocklist for machine learning quality control of ChIP-seq with seqQscorer
Albrecht, S.; Krämer, C.; Röchner, P.; Mayer, J. U.; Rothlauf, F.; Andrade-Navarro, M. A.; Sprang, M.
Motivation: Quality assessment of next-generation sequencing data is a complex but important task for ensuring that experiments in molecular biology, biomedicine, and biotechnology lead to correct conclusions. We previously introduced seqQscorer, a quality assessment tool that uses machine learning to support this process. To improve the accuracy and processing time of seqQscorer, we integrated the ENCODE blocklist* to derive a new type of quality-related feature, expected to be more informative and faster to generate than the features conventionally used by seqQscorer.

Results: The new seqQscorer extension, called seqBLQ, improves quality assessment for ChIP-seq data derived from human tissues and cell lines. Furthermore, seqBLQ makes the tool easier to use by simplifying the installation procedure and reducing the computational resources required for feature generation.

Availability and implementation: https://github.com/salbrec/seqQscorer
Matching journals
The top 3 journals account for 50% of the predicted probability mass.