The likely genetic architecture of complex diseases is that subgroups of patients share variants in genes within specific networks, and that these variants are sufficient to express a shared phenotype. We combined high-throughput sequencing with advanced bioinformatic approaches to identify such subgroups of patients with variants in shared networks. We performed targeted sequencing of patients with 2 or 3 generations of preterm birth, covering genes, gene sets and haplotype blocks that were highly associated with preterm birth. We analyzed the data using a multi-sample protein-protein interaction (PPI) tool to identify significant clusters of patients associated with preterm birth. We identified shared protein interaction networks among preterm cases in two statistically significant clusters (p < 0.001). We also found two small control-dominated clusters. We replicated these findings in an independent, large birth cohort. Separation testing showed significant similarity scores between the clusters from the two independent cohorts of patients. Canonical pathway analysis of the unique genes defining these clusters demonstrated enrichment in inflammatory signaling pathways, the glucocorticoid receptor, the insulin receptor, and EGF and B-cell signaling. These results support a genetic architecture defined by subgroups of patients that share variants in genes in specific networks and pathways that are sufficient to give rise to the disease phenotype.
© 2022. The Author(s).