Purpose: This study builds a discrete-time Markov chain (DTMC) model that fits a dataset of professional tennis match records well. Methods: The point-by-point dataset of men's singles matches played on the Association of Tennis Professionals (ATP) tour from 2011 to 2015 is analyzed. The long-debated assumption that the server's point-winning probabilities are independent and identically distributed (iid) is statistically tested. A DTMC model is then developed to analyze the dataset further. Results: The statistical test results indicate that identical distribution of the point-winning probabilities is not a valid assumption. For example, the server's point-winning probabilities at scores 40:0, 30:15, 15:30, and 0:40 differ significantly. Independence, on the other hand, is a generally valid assumption, except at 40:15, where the winner of the previous point influences the point-winning probability. Game-winning probabilities and the importance of each point in winning a game are analyzed with the DTMC model by court surface and by player groups with different levels of serve effectiveness. Conclusion: Extensive empirical validation settles unresolved debates over stochastic models for tennis. The presented results reveal interesting properties of professional tennis matches.
Keywords: ATP tour; sports analytics; big data; probability model.
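To make the modelling idea concrete, the following is a minimal sketch of how a game-winning probability can be computed from a DTMC over game scores when the server's point-winning probability is allowed to depend on the score. It is an illustration under stated assumptions, not the model fitted in this study: the function names `game_win_probability` and `point_prob` are hypothetical, the state encoding is one possible choice, points are assumed independent given the score, and the probabilities in the usage lines are made-up values rather than estimates from the ATP dataset.

```python
from functools import lru_cache


def game_win_probability(point_prob):
    """Probability that the server wins a game starting from 0:0.

    point_prob(a, b) is a hypothetical callable returning the server's
    probability of winning the next point when the server has won `a`
    points and the receiver `b` points. Counts 0..3 stand for 0, 15, 30,
    40; (3, 3) is deuce, (4, 3) advantage server, (3, 4) advantage
    receiver. All probabilities are assumed to lie strictly in (0, 1).
    """
    # The deuce/advantage states form a loop, so solve it in closed form:
    #   D = p_d * A + (1 - p_d) * B,  A = p_a + (1 - p_a) * D,  B = p_b * D
    p_d = point_prob(3, 3)  # at deuce
    p_a = point_prob(4, 3)  # at advantage server
    p_b = point_prob(3, 4)  # at advantage receiver
    deuce = p_d * p_a / (1.0 - p_d * (1.0 - p_a) - (1.0 - p_d) * p_b)

    @lru_cache(maxsize=None)
    def win(a, b):
        if (a, b) == (3, 3):
            return deuce
        if a == 4:  # server took the game before any deuce
            return 1.0
        if b == 4:  # receiver took the game before any deuce
            return 0.0
        p = point_prob(a, b)
        # One-step transition of the chain: server wins or loses the point.
        return p * win(a + 1, b) + (1.0 - p) * win(a, b + 1)

    return win(0, 0)


# Illustrative inputs only (not estimates from the dataset).
def constant_p(a, b):
    return 0.6  # identical point-winning probability at every score


def score_dependent_p(a, b):
    return 0.65 if a > b else 0.6  # hypothetical boost when ahead


iid_result = game_win_probability(constant_p)
dependent_result = game_win_probability(score_dependent_p)
```

With a constant `point_prob`, the sketch reduces to the classical iid game model; supplying a score-dependent function shows how relaxing the identical-distribution assumption changes the game-winning probability within the same chain.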