The complete genome of the KOLF2.1J reference iPSC line

Alvarez Jerez, P.; Rhie, A.; Kim, J.; Hebbar, P.; Nag, S.; Antipov, D.; Koren, S.; Lara, E.; Beilina, A.; Hansen, N. F.; Arber, C. F.; Zulueta, J.; Wild-Crea, P.; Patel, D.; Hickey, G.; Waltz, B.; Malik, L.; Skarnes, W. C.; Reed, X.; Genner, R.; Daida, K.; Pantazis, C. B.; Grenn, F.; Nalls, M. A.; Billingsley, K.; Fossati, V.; Wray, S.; Ward, M.; Ryten, M.; Cookson, M. R.; Jain, M.; Paten, B.; Phillippy, A. M.; Blauwendraat, C.

2026-03-09 genomics
10.64898/2026.03.06.710144 bioRxiv

While induced pluripotent stem cells (iPSCs) have gained popularity in studying neurodegenerative diseases, the heterogeneity of stem cells used across studies impedes cross-study comparison. The iPSC Neurodegenerative Disease Initiative (iNDI) selected the KOLF2.1J cell line and prioritized its use as a reference standard for studying the effects of pathogenic variants on cell biology because of its stability and neutral genetic risk for neurodegenerative disease. This cell line, and its derivatives expressing over 100 variants related to Alzheimer's disease, Parkinson's disease, and other neurological diseases, are available for academic and industry access. Current genomic data analyses are limited by the use of a human reference genome that does not capture the complete genetic background of a given iPSC line. While in the future this issue may be partially mitigated by the creation of a comprehensive human pangenome, previous work has shown that generating custom genomes is of value both to characterize the variation present and to serve as a more appropriate genomic reference. Here, we generated and characterized a custom complete genome assembly from KOLF2.1J. Mapping of sequencing reads to a personalized diploid assembly results in more comprehensive mapping compared to traditional linear references (i.e., GRCh38). In addition, we provide a comprehensive custom gene annotation along with isoform expression and differential methylation analyses across multiple cell types. The assembly and all additional data are browsable and publicly available. This resource will enable more accurate investigation of the KOLF2.1J cell line, and of any genomics data generated from it, than traditional generalized references allow, while also serving as a foundational approach for establishing custom reference assemblies for other high-value iPSC lines.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | BMC Genomics | 328 | Top 0.1% | 23.5% |
| 2 | Frontiers in Genetics | 197 | Top 0.2% | 12.9% |
| 3 | PLOS ONE | 4510 | Top 37% | 3.7% |
| 4 | Scientific Reports | 3102 | Top 40% | 3.4% |
| 5 | Biology Methods and Protocols | 53 | Top 0.3% | 3.2% |
| 6 | Computational and Structural Biotechnology Journal | 216 | Top 3% | 2.2% |
| 7 | Frontiers in Cell and Developmental Biology | 218 | Top 4% | 1.8% |
50% of probability mass above
| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 8 | International Journal of Molecular Sciences | 453 | Top 7% | 1.8% |
| 9 | Methods | 29 | Top 0.2% | 1.8% |
| 10 | Human Molecular Genetics | 130 | Top 2% | 1.7% |
| 11 | Scientific Data | 174 | Top 1% | 1.4% |
| 12 | DNA Research | 23 | Top 0.3% | 1.3% |
| 13 | NAR Genomics and Bioinformatics | 214 | Top 3% | 1.3% |
| 14 | Stem Cell Research & Therapy | 30 | Top 0.5% | 1.3% |
| 15 | Cell Genomics | 162 | Top 5% | 1.2% |
| 16 | Genome Medicine | 154 | Top 6% | 1.0% |
| 17 | Cells | 232 | Top 4% | 1.0% |
| 18 | Genome Biology | 555 | Top 6% | 1.0% |
| 19 | Bioinformatics Advances | 184 | Top 4% | 0.9% |
| 20 | npj Genomic Medicine | 33 | Top 0.7% | 0.9% |
| 21 | Database | 51 | Top 0.7% | 0.9% |
| 22 | Human Genomics | 21 | Top 0.2% | 0.9% |
| 23 | Briefings in Bioinformatics | 326 | Top 6% | 0.9% |
| 24 | Stem Cell Reports | 118 | Top 0.8% | 0.8% |
| 25 | Molecular Omics | 21 | Top 0.3% | 0.8% |
| 26 | Neurology Genetics | 14 | Top 0.3% | 0.8% |
| 27 | Gigabyte | 60 | Top 1% | 0.8% |
| 28 | Molecular Biology Reports | 19 | Top 0.5% | 0.8% |
| 29 | Advanced Science | 249 | Top 19% | 0.7% |
| 30 | Alzheimer's & Dementia | 143 | Top 3% | 0.7% |