The Male Fertility Gene Atlas: a web tool for collecting and integrating OMICS data in the context of male infertility

Hum Reprod. 2020 Sep 1;35(9):1983-1990. doi: 10.1093/humrep/deaa155.

Abstract

Study question: How can one design and implement a system that provides a comprehensive overview of research results on the genetics and epigenetics of male infertility and germ cells?

Summary answer: Working at the interface of literature search engines and raw data repositories, the newly developed Male Fertility Gene Atlas (MFGA) provides a system that can represent aggregated results from scientific publications in a standardized way and perform advanced searches, for example based on the conditions (phenotypes) and genes related to male infertility.

What is known already: PubMed and Google Scholar are established search engines for research literature. Additionally, repositories like Gene Expression Omnibus and Sequence Read Archive provide access to raw data. Selected processed data can be accessed by visualization tools like the ReproGenomics Viewer.

Study design, size, duration: The MFGA was developed over 18 months using a rapid prototyping approach.

Participants/materials, setting, methods: In the context of the Clinical Research Unit 'Male Germ Cells' (CRU326), a group of around 50 domain experts in the fields of male infertility and germ cells contributed to requirements engineering and feedback loops. They provided a set of 39 representative and heterogeneous publications as a basis for the system requirements.

Main results and the role of chance: The MFGA is freely available online at https://mfga.uni-muenster.de. To date, it contains 115 data sets corresponding to 54 manually curated publications and provides an advanced search function based on study conditions, meta-information and genes. The search returns the exact tables and figures from the publications that match the request, as well as a list of the most frequently investigated genes in the result set. Currently, study data for 31 different tissue types, 32 different cell types and 20 conditions are available. Also, ∼8000 and ∼1000 distinct genes are mentioned in at least 10 and 15 of the publications, respectively.
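The search behaviour described above (filtering data sets by condition, meta-information and gene, then ranking genes by how many publications mention them) can be illustrated with a minimal sketch. This is not the MFGA implementation: the `DataSet` record, its field names, the function names and the toy records are all hypothetical and chosen only to show the idea.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    """One curated data set. All field names are hypothetical."""
    publication: str
    condition: str
    tissue: str
    genes: frozenset


def search(data_sets, condition=None, tissue=None, gene=None):
    """Return the data sets that match every criterion that was supplied."""
    return [
        d for d in data_sets
        if (condition is None or d.condition == condition)
        and (tissue is None or d.tissue == tissue)
        and (gene is None or gene in d.genes)
    ]


def gene_publication_counts(data_sets):
    """Count, for each gene, the distinct publications that mention it."""
    pubs_by_gene = {}
    for d in data_sets:
        for g in d.genes:
            pubs_by_gene.setdefault(g, set()).add(d.publication)
    return Counter({g: len(pubs) for g, pubs in pubs_by_gene.items()})


def genes_in_at_least(data_sets, min_publications):
    """List genes mentioned in at least `min_publications` publications."""
    counts = gene_publication_counts(data_sets)
    return sorted(g for g, n in counts.items() if n >= min_publications)


# Invented toy records for illustration only; not MFGA content.
TOY = [
    DataSet("pub-A", "condition-1", "testis", frozenset({"GENE1", "GENE2"})),
    DataSet("pub-A", "condition-1", "testis", frozenset({"GENE1"})),
    DataSet("pub-B", "condition-2", "testis", frozenset({"GENE1", "GENE3"})),
    DataSet("pub-C", "condition-1", "blood", frozenset({"GENE2"})),
]
```

For example, `search(TOY, condition="condition-1", gene="GENE2")` selects the matching data sets, and `gene_publication_counts(...).most_common()` on that result gives a ranked gene list. Counting distinct publications rather than data sets keeps a gene that appears in several data sets of one publication from being counted more than once.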

Large scale data: Not applicable because no novel data were produced.

Limitations, reasons for caution: The content of the system currently consists mostly of the publications selected during the development process. However, a structured process for prospective literature search and inclusion into the MFGA has been defined and is currently being implemented.

Wider implications of the findings: The technical implementation of the MFGA allows for accommodating a wide range of heterogeneous data from aggregated research results. This implementation can be transferred to other diseases to establish comparable systems and generally support research in the medical field.

Study funding/competing interest(s): This work was carried out within the framework of the German Research Foundation (DFG) Clinical Research Unit 'Male Germ Cells: from Genes to Function' (CRU326). The authors declare no conflicts of interest.

Keywords: database; epigenetics; genetics; male infertility; omics.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Fertility
  • Humans
  • Infertility, Male* / genetics
  • Male
  • Phenotype
  • Prospective Studies