A Python module for programmatic access to TrypTag genome-wide subcellular protein localisation data in Trypanosoma brucei

Dobramysl, U.; Wheeler, R. J.

2026-02-17 microbiology
10.64898/2026.02.17.706163 bioRxiv
Protein subcellular localisation is informative for understanding potential protein function, particularly in highly structured unicellular eukaryotes. Microscopy is especially powerful for interrogating localisation, providing high-resolution single-cell data about where a protein resides. We previously generated the TrypTag dataset, a genome-wide protein localisation resource for the human unicellular parasite Trypanosoma brucei built using fluorescent protein tagging. The dataset is powerful because of its scale: originally captured with high-content image analysis in mind, it is a formidable resource for developing and testing machine learning or artificial intelligence tools. Here, we describe a Python module for programmatic access to this data-rich resource. Images of each tagged cell line, together with segmented cell masks, can be accessed arbitrarily by gene ID and tagging terminus; the database can be searched by protein localisation; and tools are provided to assist foundational image analysis of individual T. brucei cell cycle stage and morphology. We stress-tested this tool by using it to examine a key feature of T. brucei morphogenesis during division: the old and newly formed flagellum and associated organelles tend to have different protein compositions. Using the TrypTag toolkit, we show that there is extensive age-based differential content of these organelles, while the daughter nuclei completely lack such asymmetry.
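The two access patterns named in the abstract (lookup by gene ID and tagging terminus, and search by localisation) can be illustrated with a small in-memory index. This is a minimal sketch and not the TrypTag module's API: the `TaggedLine` and `LocalisationIndex` names, the `"n"`/`"c"` terminus codes, and the gene IDs and localisation terms below are all hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedLine:
    """One tagged cell line: a gene tagged at one terminus (hypothetical record)."""
    gene_id: str
    terminus: str              # assumed codes: "n" or "c"
    localisations: frozenset   # free-text localisation terms


class LocalisationIndex:
    """Toy index supporting lookup by (gene ID, terminus) and search by term."""

    def __init__(self, lines):
        self._by_key = {}
        self._by_term = defaultdict(set)
        for line in lines:
            key = (line.gene_id, line.terminus)
            self._by_key[key] = line
            # Inverted index: localisation term -> set of (gene ID, terminus)
            for term in line.localisations:
                self._by_term[term].add(key)

    def get(self, gene_id, terminus):
        """Return the record for one gene ID and tagging terminus."""
        return self._by_key[(gene_id, terminus)]

    def search(self, term):
        """Return sorted (gene ID, terminus) pairs annotated with the term."""
        return sorted(self._by_term.get(term, set()))


# Placeholder records, not real TrypTag annotations.
records = [
    TaggedLine("GENE_A", "n", frozenset({"flagellum"})),
    TaggedLine("GENE_A", "c", frozenset({"flagellum", "basal body"})),
    TaggedLine("GENE_B", "c", frozenset({"nucleus"})),
]
index = LocalisationIndex(records)
flagellar = index.search("flagellum")
```

Keying records on the (gene ID, terminus) pair reflects that N- and C-terminal tags of the same protein are separate cell lines and may localise differently.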

Matching journals

The top 11 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Molecular Biology of the Cell | 272 | Top 0.2% | 8.2% |
| 2 | PLOS Computational Biology | 1633 | Top 6% | 6.2% |
| 3 | BMC Biology | 248 | Top 0.1% | 6.2% |
| 4 | Open Biology | 95 | Top 0.1% | 4.2% |
| 5 | Journal of Microscopy | 18 | Top 0.1% | 4.1% |
| 6 | Journal of Cell Biology | 333 | Top 0.9% | 3.9% |
| 7 | Scientific Reports | 3102 | Top 32% | 3.9% |
| 8 | Disease Models & Mechanisms | 119 | Top 0.5% | 3.5% |
| 9 | PLOS ONE | 4510 | Top 41% | 3.5% |
| 10 | Nature Communications | 4913 | Top 41% | 3.5% |
| 11 | iScience | 1063 | Top 6% | 3.5% |
| | 50% of probability mass above | | | |
| 12 | Bioinformatics | 1061 | Top 6% | 3.2% |
| 13 | eLife | 5422 | Top 31% | 2.7% |
| 14 | Communications Biology | 886 | Top 4% | 2.5% |
| 15 | Nucleic Acids Research | 1128 | Top 8% | 2.3% |
| 16 | Journal of Cell Science | 353 | Top 0.8% | 2.0% |
| 17 | Biology Open | 130 | Top 0.8% | 1.8% |
| 18 | Life Science Alliance | 263 | Top 0.2% | 1.7% |
| 19 | PLOS Pathogens | 721 | Top 6% | 1.7% |
| 20 | Scientific Data | 174 | Top 1% | 1.7% |
| 21 | Frontiers in Cellular and Infection Microbiology | 98 | Top 3% | 1.5% |
| 22 | Frontiers in Physiology | 93 | Top 4% | 1.3% |
| 23 | PLOS Biology | 408 | Top 13% | 1.3% |
| 24 | Acta Crystallographica Section D Structural Biology | 54 | Top 0.3% | 1.2% |
| 25 | Proceedings of the National Academy of Sciences | 2130 | Top 39% | 1.1% |
| 26 | GigaScience | 172 | Top 3% | 0.9% |
| 27 | Journal of Molecular Biology | 217 | Top 4% | 0.7% |
| 28 | PLOS Genetics | 756 | Top 15% | 0.7% |
| 29 | Computational and Structural Biotechnology Journal | 216 | Top 10% | 0.7% |
| 30 | Biological Imaging | 15 | Top 0.3% | 0.7% |
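
The 50% cut-off in the list above follows from a running sum of the per-journal probabilities. The short sketch below reproduces that arithmetic using the first thirteen listed values; the function name `journals_to_reach` is a hypothetical helper, not part of any tool described on this page.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the top-ranked journals, in rank order,
# copied from the list above (ranks 1 to 13).
probabilities = [8.2, 6.2, 6.2, 4.2, 4.1, 3.9, 3.9, 3.5, 3.5, 3.5, 3.5, 3.2, 2.7]


def journals_to_reach(probabilities, threshold):
    """Return how many top-ranked entries are needed for the running sum
    to reach the threshold, or None if it is never reached."""
    for count, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return count
    return None


top_n = journals_to_reach(probabilities, 50.0)
```

The first ten values sum to about 47.2% and the eleventh brings the total to about 50.7%, which is why the cut-off falls after rank 11.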