Codon usage bias, or the unequal use of synonymous codons, is observed across genes, genomes, and between species. It has been implicated in many cellular functions, such as translation dynamics and transcript stability, but can also be shaped by neutral forces. We characterized codon usage across 1,154 strains from 1,051 species from the fungal subphylum Saccharomycotina to gain insight into the biases, molecular mechanisms, evolution, and genomic features contributing to codon usage patterns. We found a general preference for A/T-ending codons and correlations between codon usage bias, GC content, and tRNA-ome size. Codon usage bias is distinct between the 12 orders to such a degree that yeasts can be classified with an accuracy >90% using a machine learning algorithm. We also characterized the degree to which codon usage bias is impacted by translational selection. We found it was influenced by a combination of features, including the number of coding sequences, BUSCO count, and genome length. Our analysis also revealed an extreme bias in codon usage in the Saccharomycodales associated with a lack of predicted arginine tRNAs that decode CGN codons, leaving only the AGN codons to encode arginine. Analysis of Saccharomycodales gene expression, tRNA sequences, and codon evolution suggests that avoidance of the CGN codons is associated with a decline in arginine tRNA function. Consistent with previous findings, codon usage bias within the Saccharomycotina is shaped by genomic features and GC bias. However, we find cases of extreme codon usage preference and avoidance along yeast lineages, suggesting additional forces may be shaping the evolution of specific codons.
Keywords: Saccharomycotina; codon; codon usage bias; machine learning; tRNA.
© The Author(s) 2024. Published by Oxford University Press on behalf of The Genetics Society of America.