The First 1,000 Days (1kD) Project - Collecting and Analyzing an Ultra-Dense Naturalistic Dataset of Human Baby Development

Raviv, H.; Hasenfratz, L.; Gousios, K.; Faryna, M.; Beaty, R.; Johnson, D.; Chen, B.; Altenhof, A.; Ryan, B.; Greenberg, C. A.; Hong, Z.; Assayag, G.; Tsyhanov, A.; Malakhov, V.; Rosenwein, T.; Raviv, O.; Lew-Williams, C.; Hasson, U.

2026-03-23 neuroscience
10.64898/2026.03.19.712982 bioRxiv
Human development unfolds in continuous, multimodal environments across seconds, days, and years, yet most developmental datasets capture sparse, context-limited samples of everyday life. We introduce the First 1,000 Days (1kD) Project, an initiative designed to collect ultra-dense, longitudinal, child-centered data that capture developmental trajectories within their full ecological context. Fifteen U.S. homes with 17 infants were recorded 12-14 hours per day over a median of 944 days, yielding ~1.18 million hours of raw audiovisual data. We present an end-to-end framework for large-scale longitudinal naturalistic measurement and a scalable analysis pipeline of the collected data. In a case study, we describe how we utilized our pipeline to isolate child-centered speech, resulting in the collection of 2,000 to 6,000 hours of transcribed speech for each infant. We demonstrate that dense sampling within the home environment reveals a stable, household-specific lexical structure, which sparse sampling methods consistently fail to capture. The 1kD project offers a blueprint for teams aiming to collect and analyze natural behavior at scale in real-world settings.

Matching journals

The top 5 journals together account for just over 50% of the predicted probability mass (56.1%).

1. Nature | 575 papers in training set | Top 0.6% | 26.0%
2. Nature Human Behaviour | 85 papers in training set | Top 0.1% | 10.2%
3. Nature Methods | 336 papers in training set | Top 1% | 6.9%
4. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 10% | 6.5%
5. Scientific Data | 174 papers in training set | Top 0.2% | 6.5%
50% of probability mass above
6. Nature Neuroscience | 216 papers in training set | Top 2% | 4.9%
7. Science | 429 papers in training set | Top 7% | 4.4%
8. Nature Communications | 4913 papers in training set | Top 36% | 4.0%
9. Nature Biotechnology | 147 papers in training set | Top 2% | 3.6%
10. Scientific Reports | 3102 papers in training set | Top 40% | 3.3%
11. PLOS Computational Biology | 1633 papers in training set | Top 12% | 2.8%
12. PLOS ONE | 4510 papers in training set | Top 48% | 2.1%
13. Science Advances | 1098 papers in training set | Top 20% | 1.5%
14. Philosophical Transactions of the Royal Society B | 51 papers in training set | Top 4% | 1.2%
15. Neuron | 282 papers in training set | Top 7% | 1.0%
16. eLife | 5422 papers in training set | Top 53% | 0.9%
17. Imaging Neuroscience | 242 papers in training set | Top 3% | 0.8%
18. Nature Medicine | 117 papers in training set | Top 4% | 0.8%
19. iScience | 1063 papers in training set | Top 29% | 0.8%
20. Communications Psychology | 20 papers in training set | Top 0.3% | 0.8%
21. eNeuro | 389 papers in training set | Top 9% | 0.8%
22. npj Digital Medicine | 97 papers in training set | Top 4% | 0.7%
23. Advanced Science | 249 papers in training set | Top 21% | 0.7%
24. Nature Computational Science | 50 papers in training set | Top 2% | 0.7%
25. Cell Reports Methods | 141 papers in training set | Top 7% | 0.5%
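
The position of the "50% of probability mass above" marker can be checked with a short cumulative sum over the listed probabilities. The sketch below is illustrative only: the page does not state how the marker is placed, so it assumes the marker follows the smallest set of top-ranked journals whose summed probability reaches 50%. The function name `journals_to_reach` is hypothetical.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the eight top-ranked journals,
# in rank order, copied from the list above.
probs = [26.0, 10.2, 6.9, 6.5, 6.5, 4.9, 4.4, 4.0]


def journals_to_reach(probabilities, threshold):
    """Return (count, cumulative_sum) for the smallest prefix whose sum
    reaches `threshold`, or None if the threshold is never reached.

    Assumption: the marker is placed after the first rank at which the
    running total meets or exceeds the threshold.
    """
    for count, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold - 1e-9:  # small tolerance for float rounding
            return count, total
    return None


count, total = journals_to_reach(probs, 50.0)
print(count, round(total, 1))
```

Under that assumption, the first four journals sum to 49.6% and the fifth brings the total to 56.1%, which agrees with the marker appearing after rank 5.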