Genomic sampling and population structure of farmer-maintained varieties reveal previously uncharacterized diversity of Theobroma cacao L. in Costa Rica

Herrighty, E. M.; Specht, C. D.; Gore, M. A.; Solano, L.; Estrada-Gamboa, J.; Hernandez, C. E.; Tufan, H. A.; Landis, J. B.

2026-04-01 genomics
10.64898/2026.03.30.715340 bioRxiv
Understanding crop genetic diversity is essential for conservation and breeding, yet farmer-maintained germplasm remains largely underrepresented in genomic studies. Theobroma cacao L. has a complex domestication history and extensive global diversity, and cacao currently cultivated in Central America, particularly in Costa Rica, has been understudied compared to South American and Mexican cultivars despite its cultural and historical importance. In this study, we investigate the genetic diversity of cacao from farmer-managed systems across Costa Rica to search for Criollo germplasm and to identify and characterize any unique local genetic groups. Ninety-four trees were sampled from 17 farms across four regions of the country and sequenced using whole-genome resequencing. Farmer materials were analyzed alongside 166 previously characterized reference accessions representing major cacao genetic groups. Population structure analyses, phylogenetic reconstruction, and network approaches revealed that Costa Rican cacao encompasses multiple known genetic groups, including Criollo-derived lineages, while also harboring locally distinct diversity not fully represented in current global reference collections. Analyses revealed close kinship among many accessions, with no clear geographic patterns corresponding to the observed population differentiation, reflecting the role of farmers in creating the dominant patterns of gene flow through seed-saving, clonal propagation, and sharing of genotypes among farms. Heterozygosity levels varied substantially among individuals, consistent with a mixture of highly inbred Criollo trees and more heterozygous, admixed genotypes. We find that farmer-managed cacao systems are reservoirs of genetic diversity, including possibly rare or historically important lineages, underscoring the value of these farming systems for effective conservation and management of genomic resources for cacao resilience and improvement.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. PLANTS, PEOPLE, PLANET; 21 papers in training set; Top 0.1%; 22.8%
2. Frontiers in Plant Science; 240 papers in training set; Top 0.6%; 12.8%
3. New Phytologist; 309 papers in training set; Top 1.0%; 6.4%
4. Scientific Reports; 3102 papers in training set; Top 23%; 4.9%
5. The Plant Genome; 53 papers in training set; Top 0.1%; 4.4%
50% of probability mass above
6. PLOS ONE; 4510 papers in training set; Top 36%; 4.0%
7. Horticulture Research; 43 papers in training set; Top 0.5%; 3.7%
8. Applications in Plant Sciences; 21 papers in training set; Top 0.1%; 3.6%
9. The Plant Journal; 197 papers in training set; Top 2%; 2.8%
10. Plant Direct; 81 papers in training set; Top 0.9%; 2.4%
11. BMC Plant Biology; 47 papers in training set; Top 0.3%; 1.9%
12. Journal of Experimental Botany; 195 papers in training set; Top 2%; 1.7%
13. G3 Genes|Genomes|Genetics; 351 papers in training set; Top 1%; 1.7%
14. Plants; 39 papers in training set; Top 1%; 1.7%
15. Molecular Ecology; 304 papers in training set; Top 3%; 1.7%
16. Plant Biotechnology Journal; 56 papers in training set; Top 0.7%; 1.5%
17. The Plant Phenome Journal; 14 papers in training set; Top 0.2%; 1.5%
18. Evolutionary Applications; 91 papers in training set; Top 0.9%; 1.1%
19. G3; 33 papers in training set; Top 0.3%; 1.0%
20. Frontiers in Genetics; 197 papers in training set; Top 8%; 0.9%
21. Plant Physiology; 217 papers in training set; Top 2%; 0.9%
22. G3: Genes, Genomes, Genetics; 222 papers in training set; Top 0.7%; 0.9%
23. BMC Genomics; 328 papers in training set; Top 5%; 0.8%
24. Theoretical and Applied Genetics; 46 papers in training set; Top 0.5%; 0.7%
25. Genetics; 225 papers in training set; Top 4%; 0.7%
26. Current Biology; 596 papers in training set; Top 14%; 0.7%
27. Nature Communications; 4913 papers in training set; Top 64%; 0.7%
28. Journal of Applied Ecology; 35 papers in training set; Top 0.7%; 0.7%
29. The Plant Cell; 141 papers in training set; Top 2%; 0.7%
30. Genes; 126 papers in training set; Top 4%; 0.7%