MolluscaGenes: A Transcriptomic Database for the Mollusca

Perez-Moreno, J. L.; Katz, P. S.

2026-05-08 genomics
10.64898/2026.05.05.723003 bioRxiv
The phylum Mollusca constitutes one of the most taxonomically and morphologically diverse animal clades; however, the genomic exploration of this group has been hampered by fragmented and taxonomically incomplete transcriptomic resources. To address this limitation, we present MolluscaGenes, a centralized database that unifies transcriptomes from 299 molluscan species spanning all eight recognized classes, encompassing a broad array of tissues and developmental stages. MolluscaGenes provides searchable databases via BLAST and DIAMOND alongside a suite of 196 molluscan-optimized Hidden Markov Models (HMMs) for sensitive protein family identification. To demonstrate the utility of this resource, we performed a comprehensive phylum-wide characterization of the nicotinic acetylcholine receptor (nAChR) superfamily, recovering 3,586 sequences from over 190 species and resolving 15 distinct phylogenetic clades. This analysis revealed substantial lineage-specific expansions across multiple molluscan classes, identified novel clades with substitutions in canonical ligand-binding residues, and placed chemotactile receptors (CRs) and CR-like sequences as predominantly cephalopod clades within the broader nAChR phylogeny. MolluscaGenes constitutes a foundational resource that will accelerate the elucidation of the unique biology and evolutionary history of Mollusca.

Matching journals

The 10 highest-ranked journals together account for about 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Science | 429 | Top 3% | 9.1% |
| 2 | Nature Communications | 4913 | Top 25% | 7.1% |
| 3 | Molecular Biology and Evolution | 488 | Top 0.6% | 7.1% |
| 4 | Cell | 370 | Top 4% | 4.8% |
| 5 | Scientific Data | 174 | Top 0.3% | 4.8% |
| 6 | Genome Medicine | 154 | Top 2% | 4.3% |
| 7 | Genomics, Proteomics & Bioinformatics | 171 | Top 1% | 4.3% |
| 8 | Nature Ecology & Evolution | 113 | Top 1% | 4.3% |
| 9 | Nucleic Acids Research | 1128 | Top 5% | 4.1% |
| 10 | Communications Biology | 886 | Top 2% | 3.6% |
50% of probability mass above
| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 11 | Nature | 575 | Top 7% | 3.6% |
| 12 | BMC Biology | 248 | Top 0.3% | 3.6% |
| 13 | PLOS Biology | 408 | Top 4% | 3.0% |
| 14 | Scientific Reports | 3102 | Top 45% | 2.6% |
| 15 | BMC Genomics | 328 | Top 1% | 2.6% |
| 16 | eLife | 5422 | Top 34% | 2.3% |
| 17 | Proceedings of the National Academy of Sciences | 2130 | Top 27% | 2.3% |
| 18 | Cell Reports | 1338 | Top 23% | 1.8% |
| 19 | PLOS ONE | 4510 | Top 54% | 1.7% |
| 20 | Nature Biotechnology | 147 | Top 5% | 1.7% |
| 21 | Bioinformatics Advances | 184 | Top 3% | 1.5% |
| 22 | Cell Genomics | 162 | Top 4% | 1.5% |
| 23 | GigaScience | 172 | Top 2% | 1.5% |
| 24 | Nature Methods | 336 | Top 5% | 1.5% |
| 25 | Science Advances | 1098 | Top 22% | 1.3% |
| 26 | Genome Research | 409 | Top 3% | 1.3% |
| 27 | Genome Biology | 555 | Top 6% | 0.9% |
| 28 | PLOS Genetics | 756 | Top 15% | 0.7% |
| 29 | iScience | 1063 | Top 37% | 0.6% |
| 30 | Nature Neuroscience | 216 | Top 7% | 0.6% |
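
The 50% cutoff can be checked by accumulating the predicted probabilities listed above in rank order. The following is a minimal sketch, not the site's own method: the helper name `rank_reaching` is hypothetical, and the values are the rounded percentages shown in the list, so the running totals are approximate.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-30, copied from the list above.
# These are rounded display values, so sums are only approximate.
probs = [9.1, 7.1, 7.1, 4.8, 4.8, 4.3, 4.3, 4.3, 4.1, 3.6,
         3.6, 3.6, 3.0, 2.6, 2.6, 2.3, 2.3, 1.8, 1.7, 1.7,
         1.5, 1.5, 1.5, 1.5, 1.3, 1.3, 0.9, 0.7, 0.6, 0.6]


def rank_reaching(values, threshold):
    """Return the 1-based rank at which the running total first
    reaches `threshold`, or None if it never does."""
    for rank, total in enumerate(accumulate(values), start=1):
        if total >= threshold:
            return rank
    return None


# Rank at which the cumulative probability first reaches 50%.
cutoff = rank_reaching(probs, 50.0)
print(cutoff)
```

With these rounded values the running total first reaches 50% at rank 10, which matches the position of the "50% of probability mass above" marker.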