The architecture of the brain is too complex to survey intuitively without compressed representations that project its variation into a compact, navigable space. The task is especially challenging with high-dimensional data, such as gene expression, where the joint complexity of anatomical and transcriptional patterns demands maximum compression. The established practice is to use standard principal component analysis (PCA), whose computational convenience is offset by limited expressivity, especially at high compression ratios. Here, using whole-brain, voxel-wise Allen Brain Atlas transcription data, we systematically compare compressed representations based on the most widely supported linear and non-linear methods: PCA, kernel PCA, non-negative matrix factorisation (NMF), t-distributed stochastic neighbour embedding (t-SNE), uniform manifold approximation and projection (UMAP), and deep auto-encoding. We quantify reconstruction fidelity, anatomical coherence, and predictive utility across signalling, microstructural, and metabolic targets drawn from large-scale, open-source MRI and PET data. We show that deep auto-encoders yield superior representations across all performance metrics and target domains, supporting their use as the reference standard for representing transcription patterns in the human brain.
Keywords: Allen Brain Atlas; UMAP; brain imaging; brain transcription; deep autoencoding; deep learning; dimensionality reduction; non-negative matrix factorisation; principal component analysis; representation learning; t-SNE.
© 2024 The Author(s). Human Brain Mapping published by Wiley Periodicals LLC.
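The comparison described above centres on reconstruction fidelity at fixed compression ratios. As a minimal sketch of that kind of evaluation, the Python snippet below contrasts PCA with a simple bottlenecked neural network on synthetic data; the synthetic matrix, layer sizes, and latent dimensionality are illustrative assumptions, not the study's actual Allen Brain Atlas data, pipeline, or autoencoder architecture.

    # Illustrative sketch only, not the authors' pipeline. It compares the
    # reconstruction fidelity of PCA against a crude autoencoder-style network
    # at one hypothetical compression ratio, using random synthetic data as a
    # stand-in for the Allen Brain Atlas voxel-by-gene expression matrix.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neural_network import MLPRegressor
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X = StandardScaler().fit_transform(rng.standard_normal((1000, 200)))  # synthetic voxels x genes
    n_components = 16  # hypothetical latent dimensionality

    # Linear baseline: reconstruct the data from the leading principal components.
    pca = PCA(n_components=n_components).fit(X)
    err_pca = np.mean((X - pca.inverse_transform(pca.transform(X))) ** 2)

    # Non-linear comparator: a multilayer perceptron trained to reproduce its
    # input through a narrow bottleneck (a simple proxy for a deep autoencoder).
    ae = MLPRegressor(hidden_layer_sizes=(64, n_components, 64),
                      activation="relu", max_iter=300, random_state=0)
    ae.fit(X, X)
    err_ae = np.mean((X - ae.predict(X)) ** 2)

    print(f"PCA reconstruction MSE:         {err_pca:.4f}")
    print(f"Autoencoder reconstruction MSE: {err_ae:.4f}")

In the study itself, reconstruction fidelity is only one of three criteria; anatomical coherence and predictive utility for signalling, microstructural, and metabolic imaging targets are assessed separately and are not captured by this sketch.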