GenoML: Automated Machine Learning for Genomics
Authors:
Mary B. Makarious,
Hampton L. Leonard,
Dan Vitale,
Hirotaka Iwaki,
David Saffo,
Lana Sargent,
Anant Dadu,
Eduardo Salmerón Castaño,
John F. Carter,
Melina Maleknia,
Juan A. Botia,
Cornelis Blauwendraat,
Roy H. Campbell,
Sayed Hadi Hashemi,
Andrew B. Singleton,
Mike A. Nalls,
Faraz Faghri
Abstract:
GenoML is a Python package automating machine learning workflows for genomics (genetics and multi-omics) with an open science philosophy. Genomics data require significant domain expertise to clean, pre-process, harmonize and perform quality control of the data. Furthermore, tuning, validation, and interpretation involve taking into account the biology and possibly the limitations of the underlyin…
▽ More
GenoML is a Python package automating machine learning workflows for genomics (genetics and multi-omics) with an open science philosophy. Genomics data require significant domain expertise to clean, pre-process, harmonize and perform quality control of the data. Furthermore, tuning, validation, and interpretation involve taking into account the biology and possibly the limitations of the underlying data collection, protocols, and technology. GenoML's mission is to bring machine learning for genomics and clinical data to non-experts by developing an easy-to-use tool that automates the full development, evaluation, and deployment process. Emphasis is put on open science to make workflows easily accessible, replicable, and transferable within the scientific community. Source code and documentation is available at https://genoml.com.
△ Less
Submitted 4 March, 2021;
originally announced March 2021.