Homophily, the tendency of people to be attracted to one another when they share similar features, traits, or opinions, has been identified as one of the main driving forces behind the formation of structured societies. Here we ask to what extent homophily can explain the formation of social groups, and in particular their size distribution. We propose a spin-glass-inspired framework of self-assembly, in which opinions are represented as multidimensional spins that dynamically self-assemble into groups: individuals within a group tend to share similar opinions (intragroup homophily), whereas individuals belonging to different groups tend to hold different opinions (intergroup heterophily). We compute the associated nontrivial phase diagram by solving a self-consistency equation for the "magnetization" (the combined average opinion). Below a critical temperature, there exist two stable phases: one ordered, with nonzero magnetization and large clusters, and the other disordered, with zero magnetization and no clusters. The system exhibits a first-order transition to the disordered phase. We analytically derive the group-size distribution and find that it matches empirical group-size distributions from online communities.