Improved metabolomic data-based prediction of depressive symptoms using nonlinear machine learning with feature selection

Yuta Takahashi; Masao Ueki; Makoto Yamada; Gen Tamiya; Ikuko N Motoike; Daisuke Saigusa; Miyuki Sakurai; Fuji Nagami; Soichi Ogishima; Seizo Koshiba; Kengo Kinoshita; Masayuki Yamamoto; Hiroaki Tomita

doi:10.1038/s41398-020-0831-9

Transl Psychiatry. 2020 May 19;10(1):157. doi: 10.1038/s41398-020-0831-9.

Authors

Yuta Takahashi¹ ² ³, Masao Ueki⁴ ⁵, Makoto Yamada⁵, Gen Tamiya⁶ ⁴ ⁵, Ikuko N Motoike⁴ ⁷, Daisuke Saigusa⁶ ⁴, Miyuki Sakurai⁴, Fuji Nagami⁴, Soichi Ogishima⁶ ⁴, Seizo Koshiba⁶ ⁴, Kengo Kinoshita⁴ ⁷ ⁸, Masayuki Yamamoto⁶ ⁴, Hiroaki Tomita⁹ ¹⁰ ¹¹

Affiliations

¹ Graduate School of Medicine, Tohoku University, Sendai, Japan.
² Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan.
³ International Research Institute of Disaster Science, Tohoku University, Sendai, Japan.
⁴ Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan.
⁵ RIKEN Center for Advanced Intelligence Project, Tokyo, Japan.
⁶ Graduate School of Medicine, Tohoku University, Sendai, Japan.
⁷ Graduate School of Information Sciences, Tohoku University, Sendai, Japan.
⁸ Institute for Development Aging and Cancer, Tohoku University, Sendai, Japan.
⁹ Graduate School of Medicine, Tohoku University, Sendai, Japan.
¹⁰ Tohoku Medical Megabank Organization, Tohoku University, Sendai, Japan.
¹¹ International Research Institute of Disaster Science, Tohoku University, Sendai, Japan.

Abstract

To address major limitations of algorithms for metabolite-based prediction of psychiatric phenotypes, a novel prediction model for depressive symptoms was developed using a nonlinear feature-selection machine learning algorithm, the Hilbert-Schmidt independence criterion least absolute shrinkage and selection operator (HSIC Lasso), and applied to a metabolomic dataset with the largest sample size to date. In total, 897 population-based subjects were recruited from communities affected by the Great East Japan Earthquake; 306 metabolite features (37 metabolites identified by nuclear magnetic resonance measurements and 269 metabolites characterized by their mass spectrometry intensities) were used to build prediction models for depressive symptoms as evaluated by the Center for Epidemiologic Studies-Depression Scale (CES-D). Nested fivefold cross-validation was used to develop and evaluate the prediction models. The HSIC Lasso-based prediction model showed better predictive power than the other prediction models, which included Lasso, support vector machine, partial least squares, random forest, and neural network models. L-leucine, 3-hydroxyisobutyrate, and gamma-linolenyl carnitine frequently contributed to the prediction. These results demonstrate that the HSIC Lasso-based prediction model, which integrates nonlinear feature selection, improves the prediction of depressive symptoms from metabolome data and identifies risk metabolites on the basis of nonlinear statistics in the Japanese population. Further studies should apply HSIC Lasso-based prediction models to populations of different ethnicities to investigate how well each risk metabolite generalizes for predicting depressive symptoms.
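
The abstract names the Hilbert-Schmidt independence criterion (HSIC) without showing how it scores dependence. As a rough illustration only, the sketch below computes the standard biased empirical HSIC estimate with Gaussian kernels in plain Python and compares it with the Pearson correlation on toy data. The toy data, the bandwidth `sigma`, and all function names are assumptions made for this example and do not come from the study. HSIC Lasso additionally solves a Lasso-type regression over per-feature kernel matrices, which is not shown here.

```python
import math

def gaussian_gram(values, sigma):
    """Gram matrix of a Gaussian kernel over a list of scalars."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in values]
            for a in values]

def center(gram):
    """Double-center a symmetric Gram matrix: H K H, with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: (1/n^2) * trace(Kc Lc) for centered Gram matrices."""
    n = len(x)
    kc = center(gaussian_gram(x, sigma))
    lc = center(gaussian_gram(y, sigma))
    # Both matrices are symmetric, so the trace equals the elementwise sum.
    return sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n)) / n ** 2

def pearson(x, y):
    """Pearson correlation coefficient (linear dependence only)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical toy data: a symmetric grid and a purely quadratic response.
x = [i / 10.0 for i in range(-10, 11)]
y_quadratic = [v ** 2 for v in x]   # depends on x, but not linearly
y_linear = [2.0 * v for v in x]     # depends on x linearly
y_constant = [1.0 for _ in x]       # carries no information about x

print("Pearson, quadratic:", pearson(x, y_quadratic))
print("HSIC, quadratic:   ", hsic(x, y_quadratic))
print("HSIC, constant:    ", hsic(x, y_constant))
```

On this toy grid the Pearson correlation between `x` and `y_quadratic` vanishes by symmetry, while the HSIC estimate stays positive. This is the kind of nonlinear association that a linear feature selector can miss.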

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Databases, Factual
Depression*
Japan
Machine Learning*

Grants and funding

JP19dm0107099/Japan Agency for Medical Research and Development (AMED)/International