
A Web Application for Exploring Distribution in Academic Publications Across Geography and Institutions in India

Hou, Y.; Cohen, E.; Higginbottom, J.; Rountree, L.; Ren, Y.; Wahl, B.; Nyhan, K.; Mukherjee, B.

2026-03-20 health informatics
10.64898/2026.03.18.26348755 medRxiv

India's national research capacity and infrastructure are unevenly distributed across states and union territories (UTs), contributing to geographic variation in academic publication output. We developed Indiapub, an open-access web application that quantitatively enumerates and visually displays geographic and temporal publication patterns for research products with at least one author affiliated with an Indian institution, using OpenAlex data. The app is designed for ease of use, with automated data retrieval, cleaning, and aggregation. Indiapub allows users to filter publications by topic, publication year range, author position, publication type, minimum citation count, state/UT, and population size of the state/UT where the author institution is located. The app also provides downloadable tables and ranked institution lists by publication count. Its interactive dashboard includes five modules: (i) a map of publication distribution, (ii) time-trend plots for the nation and for each state/UT, (iii) publication-share versus population-share plots highlighting over- and underrepresentation, (iv) stacked bar charts of state/UT contributions over time with population benchmarks, and (v) bubble plots relating the Human Development Index to publication volume over time. This tool may support resource prioritization and identification of institutional strengths for trainees, researchers, higher education administrators, and policymakers. To illustrate its utility, we present sample findings derived from the app. For publications across all topics from 2014 to 2025, the largest research participation footprints were observed in Tamil Nadu, Maharashtra, Delhi, Uttar Pradesh, and Karnataka. Tamil Nadu and Delhi were home to three of the highest-publishing institutions nationally: Vellore Institute of Technology, All India Institute of Medical Sciences, and Indian Institute of Technology Delhi.
We also examined six curated case studies of broad scientific interest: electronic health records (EHR), genome-wide association studies (GWAS), artificial intelligence (AI), development economics, environmental science, and COVID-19. Findings from these case studies revealed over- and underrepresentation in publication output across states and UTs. For example, in EHR publications among high-population states, Tamil Nadu's publication share exceeded its population share by 31.3 percentage points (pp), whereas Bihar's publication share fell 12.8 pp below its population share. Our tool offers insights into India's research landscape across states and UTs with easy-to-digest visuals. Such interactive tools have the potential to serve as a starting point for fostering a more inclusive research ecosystem supporting targeted research policy and planning.
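The over- and underrepresentation measure described in the abstract (publication share minus population share, in percentage points) can be illustrated with a minimal Python sketch. This is not the app's code: the function name `representation_gap_pp`, the region labels, and all counts below are hypothetical, and the sketch assumes shares are computed only among the regions being compared.

```python
def representation_gap_pp(pub_counts, populations):
    """Return publication share minus population share, in percentage points.

    Shares are computed among the regions present in pub_counts only
    (an assumption; the app may use a different denominator).
    """
    total_pubs = sum(pub_counts.values())
    total_pop = sum(populations[region] for region in pub_counts)
    gaps = {}
    for region, pubs in pub_counts.items():
        pub_share = 100.0 * pubs / total_pubs
        pop_share = 100.0 * populations[region] / total_pop
        gaps[region] = pub_share - pop_share
    return gaps


# Made-up counts for three hypothetical regions (not data from the app).
toy_pubs = {"Region A": 600, "Region B": 300, "Region C": 100}
toy_pop = {"Region A": 20_000_000, "Region B": 30_000_000, "Region C": 50_000_000}

gaps = representation_gap_pp(toy_pubs, toy_pop)
# List regions from most overrepresented to most underrepresented.
for region, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{region}: {gap:+.1f} pp")
```

A positive value means a region contributes a larger share of publications than of population; a negative value means the reverse. Because both sets of shares sum to 100, the gaps across the compared regions sum to zero.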

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1. PLOS Digital Health | 91 papers in training set | Top 0.2% | 10.0%
2. PLOS ONE | 4510 papers in training set | Top 26% | 6.7%
3. Scientific Reports | 3102 papers in training set | Top 19% | 6.3%
4. DIGITAL HEALTH | 12 papers in training set | Top 0.1% | 6.2%
5. JMIR Public Health and Surveillance | 45 papers in training set | Top 0.2% | 6.2%
6. JAMIA Open | 37 papers in training set | Top 0.3% | 4.8%
7. BMJ Health & Care Informatics | 13 papers in training set | Top 0.1% | 3.9%
8. Journal of the American Medical Informatics Association | 61 papers in training set | Top 0.8% | 3.5%
9. GigaScience | 172 papers in training set | Top 0.5% | 3.5%
50% of probability mass above
10. Royal Society Open Science | 193 papers in training set | Top 0.8% | 3.2%
11. Scientific Data | 174 papers in training set | Top 0.6% | 3.0%
12. Wellcome Open Research | 57 papers in training set | Top 0.4% | 3.0%
13. PeerJ | 261 papers in training set | Top 4% | 2.3%
14. International Journal of Medical Informatics | 25 papers in training set | Top 0.7% | 2.0%
15. Philosophical Transactions of the Royal Society B | 51 papers in training set | Top 3% | 1.7%
16. BMC Medical Informatics and Decision Making | 39 papers in training set | Top 2% | 1.7%
17. International Journal of Environmental Research and Public Health | 124 papers in training set | Top 4% | 1.7%
18. SoftwareX | 15 papers in training set | Top 0.2% | 1.6%
19. BMC Research Notes | 29 papers in training set | Top 0.2% | 1.5%
20. Frontiers in Public Health | 140 papers in training set | Top 6% | 1.3%
21. The Lancet Digital Health | 25 papers in training set | Top 0.7% | 1.1%
22. FACETS | 11 papers in training set | Top 0.2% | 1.1%
23. F1000Research | 79 papers in training set | Top 3% | 0.9%
24. Journal of Medical Internet Research | 85 papers in training set | Top 4% | 0.9%
25. Medicine | 30 papers in training set | Top 2% | 0.9%
26. BMC Bioinformatics | 383 papers in training set | Top 6% | 0.9%
27. Journal of Clinical and Translational Science | 11 papers in training set | Top 0.3% | 0.9%
28. Computer Methods and Programs in Biomedicine | 27 papers in training set | Top 1% | 0.7%
29. JMIRx Med | 31 papers in training set | Top 2% | 0.7%
30. Data in Brief | 13 papers in training set | Top 0.6% | 0.7%
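
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum. The sketch below is an illustration only: the helper name `journals_to_reach` is hypothetical, and the inputs are the rounded percentages listed for the ten highest-ranked journals, so the running total is approximate.

```python
from itertools import accumulate


def journals_to_reach(probabilities_pct, threshold_pct=50.0):
    """Return (rank, cumulative %) at which the running total first
    reaches threshold_pct, or None if it never does."""
    for rank, cumulative in enumerate(accumulate(probabilities_pct), start=1):
        if cumulative >= threshold_pct:
            return rank, cumulative
    return None


# Rounded predicted probabilities (%) for ranks 1-10, copied from the list above.
top_probs = [10.0, 6.7, 6.3, 6.2, 6.2, 4.8, 3.9, 3.5, 3.5, 3.2]

result = journals_to_reach(top_probs)
print(result)
```

With these rounded values the running total stays below 50% through rank 8 and passes it at rank 9, which matches the position of the "50% of probability mass above" marker.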