We recently proposed a novel approach to categorizing the information carried by symbolic sequences according to their usage of repetitive patterns. From this approach follows a simple quantitative index that measures the dissimilarity between two symbolic sequences. This information dissimilarity index is closely related to the Shannon entropy and the rank order of the repetitive patterns in the sequences. Here we discuss the statistical physics assumptions underlying the index. We use human cardiac interbeat interval time series and DNA sequences as examples to illustrate the applicability of this generic approach to real-world problems.
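
As a rough illustration of the kind of index described above, the following sketch compares two symbolic sequences by the rank order of their fixed-length patterns ("words"), weighting each rank difference by that word's Shannon-entropy contribution. The exact formula is not given in this text, so everything here is an assumption made for illustration: the function names `word_stats` and `dissimilarity`, the word length `m`, the restriction to words shared by both sequences, the tie-breaking rule, and the normalization. It should not be read as the authors' definition.

```python
from collections import Counter
from math import log


def word_stats(symbols, m):
    """Return (probability, rank) dicts for every length-m word in a sequence."""
    words = [tuple(symbols[i:i + m]) for i in range(len(symbols) - m + 1)]
    counts = Counter(words)
    total = sum(counts.values())
    probs = {w: c / total for w, c in counts.items()}
    # Rank 1 is the most frequent word; ties are broken by word order
    # (an arbitrary but deterministic choice made for this sketch).
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    ranks = {w: r for r, w in enumerate(ordered, start=1)}
    return probs, ranks


def dissimilarity(seq_a, seq_b, m=2):
    """Entropy-weighted rank distance between two symbolic sequences.

    Hypothetical form: only words present in both sequences are compared,
    and the result is scaled to lie in [0, 1].
    """
    p_a, r_a = word_stats(seq_a, m)
    p_b, r_b = word_stats(seq_b, m)
    shared = set(p_a) & set(p_b)
    max_rank_gap = max(len(p_a), len(p_b)) - 1
    if not shared or max_rank_gap <= 0:
        return 0.0
    # Shannon-entropy contribution of each shared word, summed over both sequences.
    weights = {w: -p_a[w] * log(p_a[w]) - p_b[w] * log(p_b[w]) for w in shared}
    z = sum(weights.values())
    if z == 0:
        return 0.0
    weighted_gap = sum(abs(r_a[w] - r_b[w]) * weights[w] for w in shared)
    return weighted_gap / (z * max_rank_gap)


if __name__ == "__main__":
    a = "0110100110010110"
    b = "0000111100001111"
    print(dissimilarity(a, a))  # identical sequences have identical rank orders
    print(dissimilarity(a, b))
```

To apply such a sketch to real data, the series must first be mapped to symbols. For interbeat intervals, one hypothetical choice is to code each interval as 1 if it is longer than the previous one and 0 otherwise; for DNA, the four nucleotide letters can serve as symbols directly.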