Vanderbilt researcher shares more than 3,000 brain scans to support the study of reading and language development

Vanderbilt University neuroscientist James R. Booth is publicly releasing two large-scale neuroimaging datasets on reading and language development to support researchers around the world who are working to understand how academic skills develop in childhood.

“We have been able to follow our curiosity and answer some really interesting questions with these datasets,” said Booth, the Patricia and Rhodes Hart Professor of Educational Neuroscience at Vanderbilt Peabody College of education and human development. “My hope is that others will be able to reproduce some of our core findings and extend them in interesting new directions.”

Available in the digital repository OpenNeuro, together the datasets include more than 3,000 magnetic resonance imaging scans that explore brain structure and function in school-age children.

Using the data

Booth and his colleagues have used the “Cross-Sectional Multidomain Lexical Processing” dataset, which uses rhyming, spelling and meaning tasks to examine how children process features of both written and spoken language. Their work provides a deeper understanding of domain-specific and domain-general processes in the brain and how these processes relate to academic skill.

Making these data publicly available allows other researchers to extend the body of foundational research stemming from this dataset, which includes several tasks in both the visual and auditory modalities. For example, researchers could use network approaches to understand whether brain dynamics differ depending on task demands.

The second dataset, “Longitudinal Brain Correlates of Multisensory Lexical Processing in Children,” expands upon the research conducted with the first dataset by exploring rhyming in audio-visual contexts (see Figure 1). This project focused on one of the core skills related to reading ability and dyslexia: the ability to map between auditory and visual modalities. This dataset also has a longitudinal component, which allows researchers to explore how an individual’s reading develops across childhood.

Figure 1. Overview of study design. Children performed rhyming judgments while in the MRI scanner. The researchers collected structural, functional and diffusion neuroimaging data.

Although several papers have been published on the data, none of these studies have examined changes in brain activation over time, so the release of this data provides an exciting opportunity for future investigation. Future studies could also examine whether these developmental trajectories can be predicted in advance, which would be useful for early identification and intervention.

Comparing brain function to academic skills

In addition to the more than 3,000 brain scans, both datasets include scores from an extensive number of standardized tests to allow researchers to compare brain function to other academically relevant skills. Testing scores, behavioral performance on the imaging tasks, demographics and brain data can be integrated to holistically explore how children develop. For example, it is not known how the neural basis of reading skill varies as a function of cognitive skills indexed by intelligence measures. In addition, it is not known how socio-economic status relates to functional brain changes over time in the reading network.

Publications on the data

Detailed descriptors of these datasets were recently published in Data in Brief (Multidomain) and Scientific Data (Multisensory) to facilitate future reuse of the data. Both the datasets and their descriptors are open access, meaning that anyone with internet access can read and use these extensive datasets.

Booth’s lab has also released additional resources on how to share neuroimaging data.

“We hope to continue to do our part in giving back to the scientific community and making research practices more open and transparent,” said Marisa Lytle, research assistant in Booth’s Brain Development Lab and coordinator for the data sharing project. “By providing the knowledge of how to share data in addition to the datasets themselves, we hope that other researchers will feel empowered to share their own data with the research community and the public.”

Previous dataset release

In March 2019, Booth released the largest known developmental neuroimaging dataset on arithmetic processing, comprising several hundred brain scans his lab conducted on school-age children performing math problems. This data is also freely available.


The National Institute of Child Health and Human Development funded the initial research for these two datasets (R01-HD042049) as well as the sharing of the data (R03-HD093547).

Learn More

Watch a video of James R. Booth discussing his research

Visit the website for the Brain Development Lab