Individuality and information content of infrared molecular profiles: insights from a large longitudinal health-profiling study

Zarandy, Z. I.; Nemeth, F. B.; Eissa, T.; Lakatos, C.; Nagy, D.; Debreceni, D.; Fleischmann, F.; Kovacs, Z.; Gero, D.; Zigman, M.; Krausz, F.; Kepesidis, K. V.

2026-04-13 biophysics
10.64898/2026.04.09.717448 bioRxiv
In this study, we investigate the individuality and information content of infrared molecular profiles derived from blood samples in a large, longitudinal health-profiling cohort and compare them to a standard clinical laboratory panel. Using Fourier-transform infrared spectroscopy, we obtained comprehensive molecular fingerprints from 4,704 self-reported healthy individuals over five visits spanning 1.5 years, alongside routine clinical laboratory measurements. We show that infrared profiles are highly individual-specific and remarkably stable over time: intra-individual variability is significantly lower than inter-individual differences, paralleling the characteristics observed in clinical laboratory data. To quantify and compare the information content of these molecular datasets, we employ individual identification as a proxy for Shannon entropy. In this framework, higher identification accuracy reflects higher information content. Infrared profiles outperform the clinical laboratory panel in identifying individuals at scale, suggesting higher intrinsic information content. Furthermore, combining infrared and clinical laboratory data substantially improves identification performance: the clinical laboratory panel alone identifies fewer than 3,000 individuals, and adding the infrared spectroscopic markers raises this to more than 4,000, highlighting the value of integrating complementary data modalities. These findings suggest a practical framework, rooted in information theory, for comparing molecular profiling approaches and emphasize the potential of infrared spectroscopy as a complementary tool in personalized medicine.
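The identification-as-information idea in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' pipeline: it assumes synthetic Gaussian profiles, a nearest-neighbour matcher with Euclidean distance, and hypothetical names and settings (`simulate_profiles`, `identification_accuracy`, 200 people, 3 versus 30 features). Each simulated person has a stable baseline (inter-individual spread) plus visit-to-visit noise (intra-individual spread), and a profile from one visit is matched against all profiles from another visit.

```python
import math
import random


def simulate_profiles(n_people, n_features, between_sd, within_sd, rng):
    """Return two visits per person: a stable personal baseline plus visit noise."""
    visit_a, visit_b = [], []
    for _ in range(n_people):
        # Inter-individual variability: each person gets their own baseline.
        baseline = [rng.gauss(0.0, between_sd) for _ in range(n_features)]
        # Intra-individual variability: each visit adds independent noise.
        visit_a.append([b + rng.gauss(0.0, within_sd) for b in baseline])
        visit_b.append([b + rng.gauss(0.0, within_sd) for b in baseline])
    return visit_a, visit_b


def identification_accuracy(reference, query):
    """Fraction of query profiles whose nearest reference profile is the same person."""
    correct = 0
    for i, q in enumerate(query):
        nearest = min(range(len(reference)),
                      key=lambda j: math.dist(q, reference[j]))
        correct += nearest == i
    return correct / len(query)


rng = random.Random(0)   # fixed seed so the toy run is repeatable
n_people = 200           # hypothetical cohort size, far below the study's 4,704

# Hypothetical "low-information" panel (3 features) vs. "high-information"
# profile (30 features), with the same between/within spread per feature.
low_ref, low_query = simulate_profiles(n_people, 3, 1.0, 0.3, rng)
high_ref, high_query = simulate_profiles(n_people, 30, 1.0, 0.3, rng)

acc_low = identification_accuracy(low_ref, low_query)
acc_high = identification_accuracy(high_ref, high_query)

# Singling out one of N equally likely individuals requires log2(N) bits.
bits_needed = math.log2(n_people)

print(f"accuracy with 3 features:  {acc_low:.2f}")
print(f"accuracy with 30 features: {acc_high:.2f}")
print(f"bits needed to identify 1 of {n_people}: {bits_needed:.2f}")
```

Under these assumptions, the profile with more independent, stable features identifies more people correctly, which is the sense in which identification accuracy can stand in for information content. Real spectra have correlated features and non-Gaussian noise, so the numbers here are illustrative only.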

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Scientific Reports; 3102 papers in training set; Top 0.1%; 38.3%
2. Advanced Science; 249 papers in training set; Top 4%; 4.4%
3. eLife; 5422 papers in training set; Top 20%; 4.2%
4. Nature Communications; 4913 papers in training set; Top 37%; 4.0%
50% of probability mass above
5. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 19%; 3.6%
6. The Journal of Physical Chemistry Letters; 58 papers in training set; Top 0.4%; 3.1%
7. PLOS ONE; 4510 papers in training set; Top 45%; 2.6%
8. Frontiers in Molecular Biosciences; 100 papers in training set; Top 1%; 2.1%
9. Communications Biology; 886 papers in training set; Top 7%; 1.8%
10. Nucleic Acids Research; 1128 papers in training set; Top 10%; 1.8%
11. iScience; 1063 papers in training set; Top 15%; 1.7%
12. Journal of The Royal Society Interface; 189 papers in training set; Top 3%; 1.7%
13. PNAS Nexus; 147 papers in training set; Top 0.4%; 1.5%
14. Analytical Chemistry; 205 papers in training set; Top 2%; 1.1%
15. Journal of Biomedical Optics; 25 papers in training set; Top 0.5%; 1.0%
16. Journal of the American Chemical Society; 199 papers in training set; Top 4%; 1.0%
17. Science Advances; 1098 papers in training set; Top 26%; 0.9%
18. Frontiers in Physics; 20 papers in training set; Top 0.8%; 0.8%
19. Light: Science & Applications; 16 papers in training set; Top 0.6%; 0.8%
20. Biophysical Reports; 36 papers in training set; Top 0.5%; 0.8%
21. PLOS Computational Biology; 1633 papers in training set; Top 24%; 0.8%
22. The European Physical Journal Plus; 13 papers in training set; Top 0.8%; 0.8%
23. Biosensors and Bioelectronics; 52 papers in training set; Top 1%; 0.8%
24. Optica; 25 papers in training set; Top 0.7%; 0.8%
25. Cancers; 200 papers in training set; Top 5%; 0.7%
26. Journal of Neurotrauma; 27 papers in training set; Top 0.6%; 0.7%
27. EMBO Molecular Medicine; 85 papers in training set; Top 5%; 0.7%
28. Journal of Genetics and Genomics; 36 papers in training set; Top 2%; 0.7%
29. Microbiome; 139 papers in training set; Top 3%; 0.7%
30. Lab on a Chip; 88 papers in training set; Top 1%; 0.7%