The cenC gene of Cellulomonas fimi, encoding endoglucanase CenC, has an open reading frame of 1101 codons closely followed by a 9-bp inverted repeat. The predicted amino acid sequence of mature CenC, which is 1069 amino acids long, is very unusual in that it has a 150-amino-acid tandem repeat at the N-terminus and an unrelated 100-amino-acid tandem repeat at the C-terminus. CenC belongs to subfamily E1 of the beta-1,4-glycanases. High-level expression of cenC in Escherichia coli from a 3.6-kbp fragment of C. fimi DNA yields CenC at levels exceeding 10% of total cell protein. Most of this CenC is in the cytoplasm in an inactive form; about 60% of the active fraction is in the periplasm. The catalytic properties of the active CenC are indistinguishable from those of native CenC from C. fimi. CenC from both E. coli and C. fimi has an Mr of approximately 130 kDa. Both organisms also produce an endoglucanase, CenC', with an Mr of approximately 120 kDa and the same N-terminal amino acid sequence and catalytic properties as CenC. CenC' appears to be a proteolytic product of CenC. Both CenC and CenC' can bind to cellulose and to Sephadex. CenC is the most active component of the C. fimi cellulase system isolated to date.