Clinical assessment scales, where subitem ratings are summed to give a total score, are convenient tools for monitoring disease progression and are often used to measure the effect of drug treatment in clinical trials. Statistical evaluation of any beneficial treatment effects tends to focus on single-valued summary measures, for example, the difference between the score at the end of treatment and the score at baseline. Such analyses ignore potentially important features of the data, such as whether recovery occurs early or late. It is therefore of interest to develop longitudinal models that make more efficient use of the information present in non-monotonic clinical assessment scale data. We propose a two-part modeling approach for this type of data. Non-monotonicity is managed by regarding score changes as Markovian transition events. A set of probabilistic models is used to describe the occurrences of the transitions. Continuous models are used to describe the magnitude of the scale score change, given the observed transition. In this manner, a non-monotonic disease progression is handled more efficiently than with other available methods. We illustrate this approach using data from a recent phase II study of a drug used in the treatment of stroke, where stroke severity was measured on the Scandinavian Stroke Scale (SSS). This scale consists of nine subitems: consciousness, eye movements, hand, arm and leg motor performance, orientation, speech, facial palsy, and gait. The data were non-monotonic, since there was at any time a risk of a score decline, despite a general tendency towards healing. The two-part probabilistic/continuous model fit the data well and proved to be robust in model-checking procedures such as posterior predictive checks and bootstrapping.
The models derived using this approach could potentially accommodate drug effects, not only on score improvement at the end of the study, but also on the onset of recovery, on dropout and on the probability of unfavorable progression patterns. In addition, it is possible to use the resulting models to simulate the prospective outcome of future studies. We conclude that this approach has considerable potential for more efficient use of information in longitudinal modeling of non-monotonic clinical assessment scale data.
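
As a purely illustrative aid, the two-part structure described above can be sketched as a small simulation: a probabilistic part draws the type of transition from the current state, and a continuous part draws the size of the score change given that transition. This is a minimal sketch under stated assumptions, not the model fitted in the study. The function names, the transition probabilities, the Beta-distributed magnitude and the score ceiling of 58 are all hypothetical choices made for the example, and the transition probabilities are held constant for simplicity.

```python
import random

# All values below are hypothetical and chosen for illustration only;
# none of them are estimates from the study.
MAX_SCORE = 58  # assumed upper bound of the total score
TRANSITION_PROBS = {"decline": 0.10, "stay": 0.50, "improve": 0.40}


def draw_transition(rng):
    """Part 1 (probabilistic): draw the transition type for one interval."""
    events = list(TRANSITION_PROBS)
    weights = [TRANSITION_PROBS[e] for e in events]
    return rng.choices(events, weights=weights, k=1)[0]


def draw_magnitude(rng, score, event):
    """Part 2 (continuous): draw the score change, given the transition."""
    if event == "stay":
        return 0.0
    # The change is a fraction of the room left towards the scale boundary,
    # so simulated scores always remain within [0, MAX_SCORE].
    frac = rng.betavariate(2.0, 5.0)  # assumed magnitude distribution
    if event == "improve":
        return frac * (MAX_SCORE - score)
    return -frac * score


def simulate_patient(baseline, n_visits, seed=None):
    """Simulate one score trajectory; each step depends only on the current score."""
    rng = random.Random(seed)
    scores = [float(baseline)]
    for _ in range(n_visits):
        event = draw_transition(rng)
        scores.append(scores[-1] + draw_magnitude(rng, scores[-1], event))
    return scores


if __name__ == "__main__":
    print(simulate_patient(baseline=30, n_visits=10, seed=1))
```

Because declines can occur at any visit, the simulated trajectories are generally non-monotonic even when improvement is the more likely transition. In a sketch of this kind, a drug effect could enter through either part, for example by shifting the transition probabilities or the magnitude distribution.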