
Generation and analysis of a mouse multi-tissue genome annotation atlas

Adams, M. S.; Vollmers, C.

2024-02-01 genomics
10.1101/2024.01.31.578267 bioRxiv

Generating an accurate and complete genome annotation for an organism is complex because the cells within each tissue can express a unique set of transcript isoforms from a unique set of genes. A comprehensive genome annotation should contain information on which tissues express which transcript isoforms at what level. This tissue-level isoform information can then inform a wide range of research questions as well as experiment designs. Long-read sequencing technology combined with advanced full-length cDNA library preparation methods has now reached the throughput and accuracy needed to make generating these types of annotations feasible. Here, we show this by generating a genome annotation of the mouse (Mus musculus). We used the nanopore-based R2C2 long-read sequencing method to generate 64 million highly accurate full-length cDNA consensus reads, averaging 5.4 million reads per tissue for a dozen tissues. Using the Mandalorion tool, we processed these reads to generate the Tissue-level Atlas of Mouse Isoforms (TAMI, available at https://genome.ucsc.edu/s/vollmers/TAMI), which we believe will be a valuable complement to conventional, manually curated reference genome annotations.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

Rank  Journal                                          Papers in training set  Percentile  Predicted probability
1     Genome Research                                  409                     Top 0.1%    22.0%
2     Nature Communications                            4913                    Top 21%     8.9%
3     Science                                          429                     Top 4%      8.2%
4     Nature Methods                                   336                     Top 1%      7.0%
5     Genome Biology                                   555                     Top 1%      6.2%
50% of probability mass above
6     Nucleic Acids Research                           1128                    Top 3%      6.2%
7     Nature Biotechnology                             147                     Top 1%      6.2%
8     Cell Genomics                                    162                     Top 1%      3.9%
9     Nature Genetics                                  240                     Top 3%      2.8%
10    eLife                                            5422                    Top 39%     1.8%
11    Bioinformatics                                   1061                    Top 7%      1.8%
12    G3: Genes, Genomes, Genetics                     222                     Top 0.4%    1.7%
13    Cell                                             370                     Top 12%     1.7%
14    Scientific Reports                               3102                    Top 61%     1.6%
15    Genetics                                         225                     Top 3%      1.3%
16    BMC Genomics                                     328                     Top 4%      1.2%
17    NAR Genomics and Bioinformatics                  214                     Top 3%      1.1%
18    Cell Reports Methods                             141                     Top 4%      1.1%
19    Cell Systems                                     167                     Top 10%     0.9%
20    iScience                                         1063                    Top 28%     0.9%
21    Proceedings of the National Academy of Sciences  2130                    Top 43%     0.8%
22    Nature                                           575                     Top 15%     0.8%
23    Life Science Alliance                            263                     Top 1%      0.8%
24    GigaScience                                      172                     Top 4%      0.6%
25    Communications Biology                           886                     Top 30%     0.6%
26    Bioinformatics Advances                          184                     Top 5%      0.6%
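
The 50% cutoff marked after the fifth journal can be checked by accumulating the listed probabilities in rank order. The sketch below is only an illustration of that arithmetic: the function name `journals_to_reach` is hypothetical, and the values are copied from the first five entries of the listing, not taken from the site's own code.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for the five highest-ranked journals,
# copied from the listing above.
top_probs = [
    ("Genome Research", 22.0),
    ("Nature Communications", 8.9),
    ("Science", 8.2),
    ("Nature Methods", 7.0),
    ("Genome Biology", 6.2),
]


def journals_to_reach(ranked, threshold):
    """Return (count, cumulative total) for the first rank at which the
    running sum of probabilities reaches the threshold.

    Hypothetical helper for illustration only.
    """
    running_totals = accumulate(prob for _, prob in ranked)
    for count, total in enumerate(running_totals, start=1):
        if total >= threshold:
            return count, total
    raise ValueError("threshold not reached by the listed entries")


count, total = journals_to_reach(top_probs, 50.0)
print(count, round(total, 1))
```

The first four journals sum to about 46.1%, so the fifth is the one that carries the cumulative total past 50%, which matches the position of the marker in the listing.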