To systematically identify large proteins containing EGF-like motifs, we developed a computer-assisted method called motif-trap screening. The method exploits 5'-end single-pass sequence data obtained from a pool of cDNAs whose sizes exceed 5 kb. Using this screening procedure, we identified five known and nine new genes for proteins with multiple EGF-like motifs from 8000 redundant human brain cDNA clones. The new genes encode a novel mammalian homologue of Drosophila fat protein, two seven-transmembrane proteins containing multiple cadherin and EGF-like motifs, two mammalian homologues of Drosophila slit protein, an unidentified LDL receptor-like protein, and three totally uncharacterized proteins. The domain organization of these proteins, together with their expression profiles and fine chromosomal locations, indicates their biological significance, demonstrating that motif-trap screening is a powerful tool for discovering new genes that have been difficult to identify by conventional methods.
Copyright 1998 Academic Press.
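
The screening idea summarized in the abstract (size-select clones, translate each 5'-end read, and keep those whose translation contains the motif) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `translate` and `screen`, the input layout of `(clone name, insert size in kb, read sequence)`, the restriction to the three forward reading frames, and above all the regular expression, which is a simplified cysteine-spacing pattern standing in for whatever EGF-like motif definition the authors actually used.

```python
import re

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# ASSUMPTION: simplified, hypothetical EGF-like cysteine-spacing pattern,
# used only to make the sketch concrete; not the motif definition of the paper.
EGF_LIKE = re.compile(r"C.C.{2}[GP][FYW].{4,8}C")


def translate(dna, frame):
    """Translate a DNA read in one forward frame (0, 1 or 2).

    Codons containing ambiguous bases (e.g. N) are rendered as 'X'.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def screen(reads, min_insert_kb=5.0):
    """Return (clone, frame, matched peptide) for every motif hit.

    `reads` is an iterable of (clone_name, insert_kb, sequence) tuples.
    Clones whose insert does not exceed `min_insert_kb` are skipped,
    mirroring the size cutoff mentioned in the abstract.
    """
    hits = []
    for name, insert_kb, seq in reads:
        if insert_kb <= min_insert_kb:
            continue
        for frame in range(3):
            protein = translate(seq, frame)
            for match in EGF_LIKE.finditer(protein):
                hits.append((name, frame, match.group()))
    return hits
```

A call such as `screen([("clone1", 6.2, read_sequence)])` returns one tuple per motif occurrence, so clones with several hits (candidate multi-motif proteins) can be ranked by counting their entries. Single-pass reads contain sequencing errors and frameshifts, so a real screen would likely need a more tolerant search than an exact regular expression.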