PanACRpred: Predicting Accessible Chromatin Regions in Pangenomes using Motif Chaining
Warr, M. J.; Dinh, T.; Root, B.; Onstott, E.; Yu, K.; Mudge, J.; Ramaraj, T.; Kahanda, I.; Mumey, B.
In this work, we investigate using motif subsequence features to predict whether a genomic region is accessible to regulatory proteins, i.e., whether it is an accessible chromatin region (ACR), which enables transcription of associated genes. We focus on plants, whose agricultural and ecological importance makes them valuable organisms to study, and whose complex genomes provide demanding stress tests for our algorithm. We show that motif sequence similarity, as found by co-linear chaining, can be combined with machine learning models to effectively predict ACRs in genome assemblies.
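To make the idea of co-linear chaining concrete, here is a minimal sketch rather than the authors' implementation. It assumes each anchor is a hypothetical `(pos_a, pos_b)` pair marking a shared motif occurrence at position `pos_a` in one region and `pos_b` in another, and it scores similarity as the length of the longest chain of anchors that increases strictly in both coordinates. The function name `chain_score` and the unweighted, gap-free scoring are illustrative assumptions; the abstract does not specify how the method weights anchors or penalizes gaps.

```python
from bisect import bisect_left


def chain_score(anchors):
    """Return the length of the longest co-linear chain of anchors.

    anchors: iterable of (pos_a, pos_b) integer pairs (hypothetical format).
    A chain is a subset of anchors strictly increasing in both coordinates.
    """
    # Sort by pos_a ascending. Ties are ordered by pos_b descending so that
    # two anchors sharing the same pos_a can never both join one chain.
    ordered = sorted(anchors, key=lambda a: (a[0], -a[1]))

    # Longest strictly increasing subsequence over pos_b (patience sorting).
    # tails[k] holds the smallest pos_b that ends a chain of length k + 1.
    tails = []
    for _, pos_b in ordered:
        i = bisect_left(tails, pos_b)
        if i == len(tails):
            tails.append(pos_b)
        else:
            tails[i] = pos_b
    return len(tails)


# Hypothetical motif anchors between a query region and a reference region.
example_anchors = [(1, 5), (2, 1), (3, 2), (4, 6), (5, 3)]
example_score = chain_score(example_anchors)
```

A score of this kind, computed between a candidate region and known ACRs, could serve as one input feature to a machine learning classifier; the exact feature construction here is an assumption, not something the abstract states.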