The two most consistent features of the diseases caused by trinucleotide repeat expansion (neuropsychiatric symptoms and the phenomenon of genetic anticipation) may be present in forms of dementia, hereditary ataxia, Parkinsonism, bipolar affective disorder, schizophrenia and autism. To identify candidate genes for these disorders, we have screened human brain cDNA libraries for the presence of gene fragments containing polymorphic trinucleotide repeats. Here we report the cDNA cloning of CAGR1, originally detected in a retinal cDNA library. The 2743 bp cDNA contains a 1077 bp open reading frame encoding 359 amino acids. This amino acid sequence is homologous (56% amino acid identity and 81% amino acid conservation) to the Caenorhabditis elegans cell fate-determining protein mab-21. CAGR1 is expressed in several human tissues, most prominently in the cerebellum, as a message of approximately 3.0 kb. The gene was mapped to 13q13, just telomeric to D13S220. A CAG trinucleotide repeat in the 5'-untranslated region is highly polymorphic, with repeat length ranging from 6 to 31 triplets and a heterozygosity of 87-88% in 684 chromosomes from several human populations. One allele from an individual with an atypical movement disorder and bipolar affective disorder type II contains 46 triplets, 15 triplets longer than any other allele detected. Though insufficient data are available to link the long repeat to this clinical phenotype, an expansion mutation of the CAGR1 repeat can be considered a candidate for the etiology of disorders with anticipation or developmental abnormalities, and particularly any such disorders linked to chromosome 13.