The Common Fund Data Ecosystem (CFDE)

Jurgens, J. A.; Bueckle, A.; Vora, J.; Maurya, M. R.; Mohseni Ahooyi, T.; Zheng, E.; Stear, B.; Wang, D.; Ree, C.; Ramachandran, S.; Nekrutenko, A.; Brandes, M.; Thaker, S.; Katz, D. H.; Munoz-Torres, M. C.; Diamant, I.; Chun, H.-J. E.; Simmons, J. A.; Tasian, S. K.; Jenkins, S. L.; Evangelista, J. E.; Dodia, H.; Saha, S.; Lindquist, M. A.; Gajjala, V.; Nemarich, C.; Zhen, J.; Ross, K. E.; Byrd, A. I.; Shilin, A.; Metzger, V. T.; Bologa, C. G.; Srinivasan, S.; Jang, D.; Kumar, P.; Taub, L. D.; Levanto, M. P.; Petrosyan, V.; Anandakrishnan, M.; Kim, M.; Clarke, D. J. B.; Ivich, A.; Crichton, D.

2026-04-12 scientific communication and education
10.64898/2026.04.10.717672 bioRxiv

The NIH Common Fund Data Ecosystem (CFDE) integrates data resources from 18 NIH Common Fund programs for discovery and integrative analysis. These programs generate valuable but heterogeneous datasets that can be difficult to discover, access, and reuse. CFDE aims to provide a collaborative, community-built infrastructure that links and enriches Common Fund programs. We describe the evolution, structure, and core technologies of CFDE, including practical approaches that support submission, integration, visualization, and public release of multimodal data. Training programs and workforce initiatives lower barriers to adoption. CFDE has devised solutions to critical issues facing cross-program initiatives, including data scale and heterogeneity, dataset integration, and long-term sustainability. We demonstrate the utility of linking Common Fund resources through integrative tools and cross-dataset queries to yield insights that would otherwise be infeasible. Collectively, CFDE shows that a standards-driven, federated approach enhances and unifies cross-disciplinary resources, fostering collaboration and data-driven discovery.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Nature Biotechnology | 147 papers in training set | Top 0.1% | 28.4%
2. PLOS ONE | 4510 papers in training set | Top 20% | 9.4%
3. eLife | 5422 papers in training set | Top 10% | 7.4%
4. GigaScience | 172 papers in training set | Top 0.2% | 6.5%
(50% of probability mass above)
5. Nature Neuroscience | 216 papers in training set | Top 1% | 6.5%
6. Patterns | 70 papers in training set | Top 0.2% | 3.7%
7. Nature Methods | 336 papers in training set | Top 3% | 3.7%
8. PLOS Computational Biology | 1633 papers in training set | Top 11% | 3.3%
9. Journal of the American Medical Informatics Association | 61 papers in training set | Top 0.9% | 2.8%
10. Journal of Cell Biology | 333 papers in training set | Top 1% | 2.8%
11. Scientific Data | 174 papers in training set | Top 0.7% | 2.4%
12. PLOS Biology | 408 papers in training set | Top 8% | 1.9%
13. eNeuro | 389 papers in training set | Top 5% | 1.9%
14. Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 4% | 1.3%
15. Cell Systems | 167 papers in training set | Top 9% | 1.1%
16. Communications Biology | 886 papers in training set | Top 16% | 1.0%
17. Nature | 575 papers in training set | Top 13% | 1.0%
18. npj Digital Medicine | 97 papers in training set | Top 3% | 1.0%
19. Nature Computational Science | 50 papers in training set | Top 1% | 0.8%
20. Scientific Reports | 3102 papers in training set | Top 74% | 0.8%
21. Nature Human Behaviour | 85 papers in training set | Top 4% | 0.8%
22. Bioinformatics | 1061 papers in training set | Top 10% | 0.7%
23. Nature Communications | 4913 papers in training set | Top 65% | 0.7%
24. Acta Crystallographica Section D Structural Biology | 54 papers in training set | Top 0.5% | 0.5%
25. Neuron | 282 papers in training set | Top 10% | 0.5%
26. Genome Biology | 555 papers in training set | Top 9% | 0.5%
27. Nucleic Acids Research | 1128 papers in training set | Top 21% | 0.5%
28. Bioinformatics Advances | 184 papers in training set | Top 5% | 0.5%