
Maggot: An ecosystem for sharing metadata within the web of FAIR Data

Jacob, D.; Ehrenmann, F.; David, R.; Tran, J.; Mirande-Ney, C.; Chaumeil, P.

2024-05-29 bioinformatics
10.1101/2024.05.24.595703 bioRxiv

Background: Descriptive metadata are crucial for the discovery, reporting and mobilisation of research datasets. Addressing all metadata issues within the Data Management Plan often poses challenges for data producers. Organising and documenting data within data storage entails creating various descriptive metadata. Subsequently, data sharing involves ensuring metadata interoperability in alignment with FAIR principles. Given the tangible nature of these challenges, there is a real need for management tools that assist data managers to the fullest extent. Moreover, these tools have to meet data producers' requirements and be user-friendly, requiring only minimal training.

Results: We developed Maggot, which stands for Metadata Aggregation on Data Storage, specifically designed to annotate datasets by generating metadata files to be linked into storage spaces. Maggot enables users to generate and attach comprehensible metadata to datasets within a collaborative environment. This approach integrates into a data management plan, tackling challenges related to data organisation, documentation, storage, and frictionless FAIR metadata sharing within the collaborative group and beyond. Furthermore, to enable metadata crosswalk, metadata generated with Maggot can be converted for a specific data repository or configured to be exported into a suitable format for data harvesting by third-party applications.

Conclusion: The primary feature of Maggot is to ease metadata capture based on a carefully selected schema and standards. It also greatly eases access to data through metadata, as is now requested in projects funded by public institutions and entities such as the European Commission. Thus, Maggot can be used on the one hand to promote good local versus global data management with open data sharing in mind while respecting FAIR principles, and on the other hand to prepare the future FAIR Web of Data within the framework of the European Open Science Cloud (EOSC).

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. GigaScience; 172 papers in training set; Top 0.1%; 23.7%
2. PLOS ONE; 4510 papers in training set; Top 8%; 19.6%
3. Bioinformatics; 1061 papers in training set; Top 3%; 8.8%
(50% of probability mass above)
4. BMC Bioinformatics; 383 papers in training set; Top 2%; 4.5%
5. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 1%; 3.8%
6. Gigabyte; 60 papers in training set; Top 0.3%; 3.8%
7. Scientific Reports; 3102 papers in training set; Top 48%; 2.2%
8. PeerJ; 261 papers in training set; Top 5%; 2.0%
9. Database; 51 papers in training set; Top 0.3%; 1.9%
10. Scientific Data; 174 papers in training set; Top 0.9%; 1.9%
11. SoftwareX; 15 papers in training set; Top 0.1%; 1.9%
12. F1000Research; 79 papers in training set; Top 2%; 1.3%
13. Frontiers in Plant Science; 240 papers in training set; Top 4%; 1.3%
14. Limnology and Oceanography: Methods; 11 papers in training set; Top 0.2%; 1.3%
15. PLOS Computational Biology; 1633 papers in training set; Top 20%; 1.2%
16. Frontiers in Marine Science; 55 papers in training set; Top 1%; 0.7%
17. International Journal of Environmental Research and Public Health; 124 papers in training set; Top 7%; 0.7%
18. Briefings in Bioinformatics; 326 papers in training set; Top 7%; 0.7%
19. International Journal of Medical Informatics; 25 papers in training set; Top 2%; 0.7%
20. BMC Medical Informatics and Decision Making; 39 papers in training set; Top 3%; 0.5%
21. JMIR mHealth and uHealth; 10 papers in training set; Top 0.5%; 0.5%
22. Royal Society Open Science; 193 papers in training set; Top 6%; 0.5%
23. Frontiers in Physiology; 93 papers in training set; Top 7%; 0.5%
24. Computers in Biology and Medicine; 120 papers in training set; Top 6%; 0.5%
25. Bioinformatics Advances; 184 papers in training set; Top 5%; 0.5%
26. Bioengineering; 24 papers in training set; Top 2%; 0.5%
27. Journal of Translational Medicine; 46 papers in training set; Top 4%; 0.5%