Using combined RNA/DNA short read sequencing to investigate allele-specific expression from the inactive X chromosome in human cells

Thomas, R.; Blower, M.

2026-05-24 bioinformatics
10.64898/2026.05.21.726886 bioRxiv

Many genomic regions exhibit allele-specific expression. This effect is most pronounced in imprinted genes, where one copy of a gene is epigenetically silenced, and on the inactive X chromosome of female cells, where almost the entire chromosome is silenced. Allele-specific gene expression can have significant effects on human health and is implicated in a wide array of diseases. Research into allele-specific expression is most often carried out in mouse models, where cross-breeding of mouse strains can yield progeny with well-characterised haplotypes in which the parent of origin is known for a very large number of SNPs. The same approach cannot be taken with human data, so haplotypes must be assembled using expensive and labour-intensive long-read sequencing and Hi-C-based approaches. Although resolved haplotypes are available for a number of cell lines, allowing accurate measurement of allele-specific gene expression, this type of analysis is inaccessible to non-specialist labs. We demonstrate how to use previously published haplotypes to investigate X-linked gene silencing and epigenetic changes. Additionally, we present a method that exploits the profound difference in expression levels between the two human X chromosomes to assign SNPs in expressed RNA to the active or inactive X chromosome using only short-read DNA and RNA sequencing. We demonstrate this technique using sequencing libraries generated in house and sequencing data from publicly available databases, including data for a cell line with a complex karyotype. In each case we identified the genes that were silenced in that cell line, opening them up to further research. This X chromosome haplotyping technique can be applied to any clonally derived human cell line with two or more X chromosomes, allowing researchers to investigate X-linked gene silencing in cell lines already present in their lab rather than in the limited number of cell lines for which a haplotype is available.
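The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea, not the authors' pipeline. It assumes that heterozygous X-linked SNPs have already been called from DNA reads and that per-allele RNA read counts are available at the same positions. The function name `assign_x_alleles`, the input layout, and the thresholds `min_depth` and `min_xa_fraction` are all hypothetical choices made for this example. At each heterozygous site, the allele that dominates the RNA reads is taken to lie on the active X (Xa) and the other allele on the inactive X (Xi).

```python
from dataclasses import dataclass


@dataclass
class SnpCall:
    position: int
    xa_allele: str      # allele assigned to the active X
    xi_allele: str      # allele assigned to the inactive X
    xi_fraction: float  # share of RNA reads carrying the Xi allele


def assign_x_alleles(dna_genotypes, rna_counts, min_depth=10, min_xa_fraction=0.8):
    """Assign alleles at heterozygous SNPs to Xa or Xi from RNA allele counts.

    dna_genotypes: {position: (allele1, allele2)} called from DNA reads.
    rna_counts:    {position: {allele: read_count}} from RNA reads.
    Both thresholds are illustrative assumptions, not published values.
    """
    calls = []
    for position, (a, b) in sorted(dna_genotypes.items()):
        if a == b:
            continue  # homozygous in DNA: cannot distinguish the two X copies
        counts = rna_counts.get(position, {})
        n_a, n_b = counts.get(a, 0), counts.get(b, 0)
        depth = n_a + n_b
        if depth < min_depth:
            continue  # too few RNA reads to judge allelic imbalance
        major, minor = (a, b) if n_a >= n_b else (b, a)
        if max(n_a, n_b) / depth < min_xa_fraction:
            continue  # near-balanced expression: leave the SNP unassigned
        calls.append(SnpCall(position, major, minor, min(n_a, n_b) / depth))
    return calls


if __name__ == "__main__":
    dna = {100: ("A", "G"), 200: ("C", "C"), 300: ("T", "C")}
    rna = {100: {"A": 48, "G": 2}, 200: {"C": 30}, 300: {"T": 11, "C": 9}}
    for call in assign_x_alleles(dna, rna):
        print(call)
```

In this sketch, SNPs with near-balanced RNA counts are left unassigned rather than forced onto one chromosome, since balanced expression could reflect a gene that is not silenced. Whether and how the published method handles such sites is not stated in the abstract.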

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Bioinformatics: 1061 papers in training set, Top 3%, 9.9%
2. Epigenetics & Chromatin: 42 papers in training set, Top 0.1%, 8.3%
3. PLOS ONE: 4510 papers in training set, Top 25%, 6.7%
4. BMC Bioinformatics: 383 papers in training set, Top 1%, 6.7%
5. Scientific Reports: 3102 papers in training set, Top 19%, 6.3%
6. Nucleic Acids Research: 1128 papers in training set, Top 3%, 6.3%
7. Frontiers in Genetics: 197 papers in training set, Top 0.9%, 6.2%
50% of probability mass above
8. Bioinformatics Advances: 184 papers in training set, Top 1%, 3.5%
9. Methods: 29 papers in training set, Top 0.1%, 3.5%
10. Genes: 126 papers in training set, Top 0.3%, 3.5%
11. iScience: 1063 papers in training set, Top 7%, 2.8%
12. NAR Genomics and Bioinformatics: 214 papers in training set, Top 1.0%, 2.8%
13. Epigenetics: 43 papers in training set, Top 0.2%, 2.6%
14. BMC Genomics: 328 papers in training set, Top 2%, 1.9%
15. PLOS Genetics: 756 papers in training set, Top 9%, 1.7%
16. Genome Research: 409 papers in training set, Top 3%, 1.3%
17. International Journal of Molecular Sciences: 453 papers in training set, Top 10%, 1.3%
18. Journal of Bioinformatics and Systems Biology: 14 papers in training set, Top 0.3%, 1.2%
19. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 6%, 1.2%
20. PLOS Computational Biology: 1633 papers in training set, Top 21%, 0.9%
21. F1000Research: 79 papers in training set, Top 3%, 0.9%
22. Cell Reports Methods: 141 papers in training set, Top 6%, 0.6%
23. Genome Biology: 555 papers in training set, Top 9%, 0.6%
24. European Journal of Human Genetics: 49 papers in training set, Top 2%, 0.6%