TomoSwin3D: a Swin3D Transformer for the Identification and Classification of Macromolecules in 3D Cryo-ET Tomograms

Dhakal, A.; Gyawali, R.; Cheng, J.

2026-04-21 biochemistry
10.64898/2026.04.17.719219 bioRxiv
Cryo-electron tomography (cryo-ET) enables in situ three-dimensional (3D) visualization of protein complexes and other macromolecular assemblies, such as ribosomes, inside cells. Automated particle identification in 3D cryo-ET tomograms nevertheless remains a major bottleneck because of dose-limited low signal-to-noise ratios, missing-wedge artifacts, and densely crowded cellular backgrounds. We present TomoSwin3D, an end-to-end 3D pipeline for macromolecule particle identification and classification, centered on a Swin Transformer-based U-Net, that outputs particle centroid coordinates. TomoSwin3D uses a multi-channel input representation that augments raw tomogram densities with complementary 3D feature maps capturing edge strength (Sobel gradients), local contrast enhancement (morphological top-hat), and multiscale blob responses (Difference-of-Gaussians), improving the detectability of small and low-contrast targets. To better preserve particle geometry and avoid hand-crafted shape assumptions, it adopts occupancy-preserving supervision that uses the available 3D instance masks directly rather than heuristic Gaussian or spherical labels. Inference is applied patch-wise for scalability and is followed by lightweight post-processing (connected-component analysis, size filtering, centroid extraction) for robust extraction of centroid coordinates. Across diverse simulated and experimental cryo-ET tomogram benchmarks, including the SHREC 2020 and 2021 test datasets, an EMPIAR dataset, and a Cryo-ET Data Portal dataset, TomoSwin3D achieves strong and consistent performance in detecting proteins and other particles, outperforming existing methods, with a pronounced advantage in picking hard, small protein particles. These results establish TomoSwin3D as a scalable and accurate solution for high-throughput cryo-ET macromolecule particle picking and downstream subtomogram averaging.
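
The two non-learned stages named in the abstract, the multi-channel input and the centroid post-processing, can be illustrated with a minimal Python sketch using NumPy and SciPy. This is not the authors' implementation: the function names `build_channels` and `extract_centroids`, the filter sizes, the single pair of Gaussian sigmas, and the voxel-count thresholds are all assumptions chosen for illustration, and the Swin Transformer U-Net that would sit between the two stages is not shown.

```python
import numpy as np
from scipy import ndimage as ndi


def build_channels(volume, tophat_size=5, dog_sigmas=(1.0, 2.0)):
    """Stack raw density with Sobel, top-hat, and DoG maps (hypothetical helper).

    Returns an array of shape (4, Z, Y, X). All parameter values are
    illustrative assumptions, not values taken from the paper.
    """
    vol = np.asarray(volume, dtype=np.float32)
    # Edge strength: magnitude of the Sobel gradient over the three axes.
    grads = [ndi.sobel(vol, axis=a) for a in range(3)]
    sobel_mag = np.sqrt(sum(g ** 2 for g in grads))
    # Local contrast: white top-hat (volume minus its grey-scale opening).
    tophat = ndi.white_tophat(vol, size=(tophat_size,) * 3)
    # Blob response: difference of two Gaussian-smoothed copies (one scale pair).
    dog = ndi.gaussian_filter(vol, dog_sigmas[0]) - ndi.gaussian_filter(vol, dog_sigmas[1])
    return np.stack([vol, sobel_mag, tophat, dog], axis=0)


def extract_centroids(mask, min_voxels=8, max_voxels=None):
    """Connected components -> size filter -> centroids (hypothetical helper).

    `mask` is a binary 3D array for one particle class. Returns an (N, 3)
    array of centroid coordinates in (z, y, x) voxel units.
    """
    mask = np.asarray(mask, dtype=bool)
    # Default structuring element: face-connected (6-neighbour) components.
    labels, n = ndi.label(mask)
    if n == 0:
        return np.empty((0, 3))
    # Voxel count per component; index 0 is background and is dropped.
    sizes = np.bincount(labels.ravel())[1:]
    keep = sizes >= min_voxels
    if max_voxels is not None:
        keep &= sizes <= max_voxels
    kept = np.arange(1, n + 1)[keep]
    if kept.size == 0:
        return np.empty((0, 3))
    # Unweighted centre of mass of each surviving component.
    return np.array(ndi.center_of_mass(mask.astype(np.float32), labels, kept))
```

In such a setup, `build_channels` would be applied to each tomogram patch before it is passed to the network, and `extract_centroids` would be applied once per class to the stitched, thresholded prediction volume.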

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1 | Nature Methods | 336 papers in training set | Top 0.3% | 19.7%
2 | Journal of Structural Biology | 58 papers in training set | Top 0.1% | 12.9%
3 | Nature Communications | 4913 papers in training set | Top 13% | 12.7%
4 | IUCrJ | 29 papers in training set | Top 0.1% | 4.9%
50% of probability mass above
5 | Structure | 175 papers in training set | Top 0.7% | 4.0%
6 | Communications Biology | 886 papers in training set | Top 2% | 3.6%
7 | Journal of Structural Biology: X | 15 papers in training set | Top 0.1% | 3.6%
8 | Cell Reports Methods | 141 papers in training set | Top 1% | 2.9%
9 | Nature Biotechnology | 147 papers in training set | Top 3% | 2.8%
10 | Science | 429 papers in training set | Top 11% | 2.6%
11 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 27% | 2.1%
12 | Nature | 575 papers in training set | Top 9% | 2.1%
13 | PLOS ONE | 4510 papers in training set | Top 51% | 1.8%
14 | Advanced Science | 249 papers in training set | Top 11% | 1.7%
15 | eLife | 5422 papers in training set | Top 41% | 1.7%
16 | Acta Crystallographica Section D Structural Biology | 54 papers in training set | Top 0.2% | 1.7%
17 | Scientific Reports | 3102 papers in training set | Top 66% | 1.2%
18 | Nucleic Acids Research | 1128 papers in training set | Top 13% | 1.2%
19 | Journal of Cell Biology | 333 papers in training set | Top 4% | 0.9%
20 | Cell Systems | 167 papers in training set | Top 11% | 0.8%
21 | Bioinformatics | 1061 papers in training set | Top 9% | 0.8%
22 | Microscopy and Microanalysis | 12 papers in training set | Top 0.2% | 0.5%
23 | iScience | 1063 papers in training set | Top 40% | 0.5%