
SynLS: A novel diffusion-transformer framework for generating high-quality wearable sensor time series data to enhance health monitoring

Lin, D.; Ji, Y.; McArt, J.; Li, J.

2025-05-15 bioinformatics
10.1101/2025.05.11.653212 bioRxiv

While global medical research is poised to benefit from the rapid advance of artificial intelligence (AI) technologies, veterinary medicine research often faces significant limitations due to data scarcity and limited data availability. To address this issue, we propose a generative modeling framework, SynLS, for generating highly realistic synthetic wearable sensor data. Leveraging a diffusion architecture and a transformer encoder mechanism, SynLS addresses the intricate challenges posed by real-world wearable sensor data, including varied length, multiple dimensions, high diversity, high noise, periodicity, and trend. We validated SynLS on four publicly available livestock wearable databases with records for three health events (calving, estrus, and diseases) and demonstrated its ability to produce high-fidelity wearable sensor data, which improved downstream health event prediction tasks by 18.5% and 26.8% under two evaluation scenarios based on instance and timestamp, respectively. Additionally, introducing raw tri-axial accelerometer databases collected from animals and humans further demonstrated the extensibility of our framework, significantly enhancing downstream behavior classification tasks by 38.8% and 83.8%, respectively. The technical framework proposed in this work offers a potential generalized solution for data supplementation in wearable sensor databases, with potential applicability across veterinary medicine and other medical domains facing resource constraints.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1 | IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 0.1% | 32.8%
2 | Advanced Science | 249 papers in training set | Top 4% | 4.8%
3 | PLOS Computational Biology | 1633 papers in training set | Top 7% | 4.8%
4 | PLOS ONE | 4510 papers in training set | Top 34% | 4.3%
5 | IEEE Access | 31 papers in training set | Top 0.1% | 3.9%
50% of probability mass above
6 | Scientific Reports | 3102 papers in training set | Top 31% | 3.9%
7 | Nature Communications | 4913 papers in training set | Top 40% | 3.6%
8 | Briefings in Bioinformatics | 326 papers in training set | Top 2% | 3.1%
9 | npj Digital Medicine | 97 papers in training set | Top 1% | 3.1%
10 | Nature Machine Intelligence | 61 papers in training set | Top 2% | 1.9%
11 | GigaScience | 172 papers in training set | Top 1% | 1.8%
12 | IEEE Transactions on Computational Biology and Bioinformatics | 17 papers in training set | Top 0.2% | 1.8%
13 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 4% | 1.8%
14 | iScience | 1063 papers in training set | Top 15% | 1.7%
15 | Frontiers in Genetics | 197 papers in training set | Top 5% | 1.7%
16 | Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 4% | 1.2%
17 | Bioinformatics | 1061 papers in training set | Top 8% | 1.1%
18 | Heliyon | 146 papers in training set | Top 5% | 0.9%
19 | Science Advances | 1098 papers in training set | Top 28% | 0.8%
20 | Bioinformatics Advances | 184 papers in training set | Top 4% | 0.8%
21 | Journal of The Royal Society Interface | 189 papers in training set | Top 5% | 0.7%
22 | Patterns | 70 papers in training set | Top 2% | 0.7%
23 | BMC Medical Informatics and Decision Making | 39 papers in training set | Top 3% | 0.7%
24 | Sensors | 39 papers in training set | Top 2% | 0.7%
25 | Database | 51 papers in training set | Top 1% | 0.7%
26 | eLife | 5422 papers in training set | Top 60% | 0.7%
27 | Communications Biology | 886 papers in training set | Top 29% | 0.6%
28 | IEEE Transactions on Biomedical Engineering | 38 papers in training set | Top 1% | 0.6%
29 | Nature Methods | 336 papers in training set | Top 7% | 0.6%
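
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. The sketch below is only an illustration, not the site's actual method: the function name `journals_to_reach` is hypothetical, and the input is the five leading probabilities copied from the list.

```python
from itertools import accumulate

# Predicted probabilities (in percent) of the five top-ranked journals,
# copied from the list; everything else here is an illustrative assumption.
top_probs = [32.8, 4.8, 4.8, 4.3, 3.9]


def journals_to_reach(probs, threshold):
    """Return the smallest number of leading entries whose running sum
    reaches the threshold, or None if the threshold is never reached."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


cutoff = journals_to_reach(top_probs, 50.0)
print(cutoff)
```

The running sums are roughly 32.8, 37.6, 42.4, 46.7, and 50.6, so the threshold is first crossed at the fifth journal, which matches where the list places its 50% marker.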