
Capturing India's phenotypic diversity: Health insights from the GenomeIndia project

Mondal, D.; Bhattacharyya, C.; Shekhawat, D. S.; Tada, N. G.; Rajial, T.; Parameswaran, A. S.; Jena, D.; Datta, S.; Swain, M.; Jena, S.; Mishra, A.; Mahapatra, S.; Sathi, S. N.; Alam, M.; Ali, A.; Choudhury, P.; Ghosh, P.; Tripathi, D.; Anilkumar, S.; Ashwath, D.; Chithimmaiah, M.; Hameed, S. K. S.; Gunasegaran, R.; Singh, N.; Mala, G.; De, T.; Reza, S.; Mukherjee, A.; Prajapati, B.; Dave, B.; Yumnam, S.; Vimi, K.; Sharma, G. N.; Malik, A.; Sarma, R. J.; Vanlallawma, A.; Samartha, D. K.; G, T. S.; Kavya, P. V.; Deshpande, S.; GenomeIndia Consortium; Singh, K.; Sharma, P.; Raghav, S. K.; Pra

2026-04-02 public and global health
10.64898/2026.04.01.26349926 medRxiv

Background: India represents 18% of the global population yet remains underrepresented in health research. Moreover, existing national surveys miss critical variation across its 4,600 ethnolinguistic groups. We present a comprehensive phenotypic characterisation of 81 populations from the GenomeIndia project.

Methods: We analysed 67 sociodemographic, anthropometric, and blood biochemistry variables from 17,777 individuals sampled across 81 ethnolinguistic populations in India, examining population-level variation, disease reporting fractions, and age- and sex-specific life-course trends.

Findings: Ethnolinguistic identity predicted health outcomes independently of administrative state, improving phenotypic variance explained by an average of 7.4%. Overall, 95% of participants had at least one abnormal biochemical or anthropometric marker, driven by low HDL (52.2%) and elevated triglycerides (43.6%). Metabolic risk, however, was highly stratified: adjusted prevalence of low HDL ranged four-fold across ancestry groups, from 17.2% to 67.7%. We also identified an "awareness gap": only 17.6% of people with hypertension and 2.2% of people with dyslipidemia were aware of their condition. This awareness gap was wider in tribal populations, in which women also did not show the higher HDL levels, relative to men, that are typically seen, pointing to distinct metabolic profiles and healthcare access barriers across India.

Interpretation: The Indian phenotypic landscape is highly structured along ethnolinguistic lines, with ancestry and environment both influencing risk. The high systemic burden of abnormalities necessitates population-specific reference intervals. GenomeIndia provides a foundational map for precision public health, shifting the focus from state-level averages to population-specific risk profiles.

Funding: This work was funded by the Department of Biotechnology, Ministry of Science and Technology, Government of India.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1 | PLOS Medicine | 98 papers in training set | Top 0.1% | 14.5%
2 | International Journal of Epidemiology | 74 papers in training set | Top 0.1% | 10.5%
3 | BMJ Open | 554 papers in training set | Top 2% | 8.5%
4 | PLOS Global Public Health | 293 papers in training set | Top 1% | 6.4%
5 | PLOS ONE | 4510 papers in training set | Top 27% | 6.4%
6 | Wellcome Open Research | 57 papers in training set | Top 0.2% | 4.4%
50% of probability mass above
7 | Scientific Reports | 3102 papers in training set | Top 44% | 2.8%
8 | Nature Communications | 4913 papers in training set | Top 44% | 2.6%
9 | eBioMedicine | 130 papers in training set | Top 0.6% | 2.4%
10 | PLOS Digital Health | 91 papers in training set | Top 1% | 1.9%
11 | British Journal of General Practice | 22 papers in training set | Top 0.2% | 1.9%
12 | JMIR Public Health and Surveillance | 45 papers in training set | Top 2% | 1.7%
13 | BMC Medicine | 163 papers in training set | Top 4% | 1.7%
14 | Journal of the American Heart Association | 119 papers in training set | Top 3% | 1.5%
15 | Frontiers in Public Health | 140 papers in training set | Top 5% | 1.5%
16 | BMJ Global Health | 98 papers in training set | Top 2% | 1.3%
17 | Frontiers in Cardiovascular Medicine | 49 papers in training set | Top 2% | 1.3%
18 | eClinicalMedicine | 55 papers in training set | Top 0.9% | 1.3%
19 | eLife | 5422 papers in training set | Top 49% | 1.2%
20 | F1000Research | 79 papers in training set | Top 3% | 0.9%
21 | Trials | 25 papers in training set | Top 1% | 0.8%
22 | SSM - Population Health | 17 papers in training set | Top 0.4% | 0.8%
23 | JMIRx Med | 31 papers in training set | Top 2% | 0.8%
24 | Frontiers in Medicine | 113 papers in training set | Top 7% | 0.8%
25 | BMJ Open Diabetes Research & Care | 15 papers in training set | Top 1% | 0.8%
26 | Obesity | 19 papers in training set | Top 0.6% | 0.8%
27 | Journal of Racial and Ethnic Health Disparities | 11 papers in training set | Top 0.4% | 0.8%
28 | The Lancet | 16 papers in training set | Top 0.9% | 0.6%
29 | Current Developments in Nutrition | 15 papers in training set | Top 1.0% | 0.6%
30 | The Lancet Global Health | 24 papers in training set | Top 1% | 0.6%
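
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked with a running total. The page does not say how the cutoff is computed, so the sketch below assumes a plain cumulative sum over the rounded percentages shown in the list; the function name `journals_to_reach` is hypothetical and used only for illustration.

```python
from itertools import accumulate

# Predicted probabilities (%) of the eight top-ranked journals,
# copied from the list above (rounded values as displayed).
probabilities = [14.5, 10.5, 8.5, 6.4, 6.4, 4.4, 2.8, 2.6]


def journals_to_reach(threshold, probs):
    """Return how many top-ranked entries are needed for the running
    total to reach `threshold`, or None if it is never reached.

    Assumption: the cutoff is a simple cumulative sum, which the
    source page does not confirm.
    """
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


print(journals_to_reach(50.0, probabilities))
```

Under this assumption the running total stays below 50% after five journals and passes it at the sixth, which agrees with the cutoff marker placed after Wellcome Open Research.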