Analyzing functionally related sequences for conserved patterns is an important step toward further study of different functional regions. This paper analyzes gene and intergenic sequences from the standpoint of linguistic analysis: gene and intergenic regions are regarded as two different subjects written in the four-letter alphabet [A, C, G, T], and high-frequency simple sequences are taken as keywords. A measure, alpha[l(tau)], was introduced to describe the relative repeat ratio of simple sequences, and cutoff values were determined for keyword selection. After eliminating "noise," 87 short sequences were selected as keywords for intergenic regions and 76 for gene regions.
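The general idea of treating high-frequency short sequences as keywords can be illustrated with a minimal sketch. It does not reproduce the measure alpha[l(tau)], whose definition is not given here. As a stand-in, it scores each word by the ratio of its observed count to the count expected under uniform, independent letters, and keeps the words at or above a cutoff. The function names `count_kmers` and `select_keywords`, the uniform baseline, the toy sequence, and the cutoff value are all assumptions made for illustration.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count overlapping k-letter words over the alphabet A, C, G, T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        # Skip windows containing ambiguous symbols such as N.
        if set(word) <= set("ACGT"):
            counts[word] += 1
    return counts


def select_keywords(seq, k, cutoff):
    """Return {word: ratio} for words whose observed/expected ratio >= cutoff.

    The expected count assumes uniform, independent letters. This is an
    illustrative baseline only, not the paper's alpha[l(tau)] measure.
    """
    counts = count_kmers(seq, k)
    total = sum(counts.values())
    if total == 0:
        return {}
    expected = total / 4 ** k
    return {
        word: count / expected
        for word, count in counts.items()
        if count / expected >= cutoff
    }


# Toy sequence and cutoff, both hypothetical.
toy = "ATATATATGCGT"
keywords = select_keywords(toy, k=2, cutoff=2.0)
print(sorted(keywords))
```

In this toy case only the repeated dinucleotides pass the cutoff, while words seen once are discarded as "noise." Applying the same selection separately to gene and intergenic sequences would yield one keyword list per region type, which is the kind of comparison the paper describes.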