Over the last decade, there have been huge increases in the numbers of protein sequences and structures determined. In parallel, many methods have been developed for recognising similarities between these proteins, arising from their common evolutionary background, and for clustering such relatives into protein families. Here we review some of the protein family resources available to the biologist and describe how these can be used to provide structural and functional annotations for newly determined sequences. In particular we describe recent developments to the CATH domain database of protein structural families which have facilitated genome annotation and which have also revealed important caveats that must be considered when transferring functional data between homologous proteins.