Patients undergoing limb salvage surgery for bone and soft tissue sarcoma of the extremities experience significant physical disability as a result of life-preserving treatment. Existing health status measures do not adequately evaluate physical function from the patient's perspective. This paper presents the developmental studies (item selection, reduction, reliability, validity and responsiveness) of a new measure, the Toronto Extremity Salvage Score (TESS). Patients with bone and soft tissue sarcoma (76 upper and 83 lower extremity) were randomly selected and mailed the TESS. Patients rated the severity and importance of physical disabilities; the response options included a 'not applicable' category, and open-ended questions allowed patients to suggest additional items for inclusion in the questionnaire. Patient perceptions were thus used to determine item content. Difficulty and importance frequencies were calculated, and items rated 'totally unimportant' or 'not applicable' by 30% of the sample were eliminated. Extra items identified 30% of the time were added to the questionnaire. Internal consistency was evaluated by Cronbach's alpha. Test-retest reliability and validity were evaluated on subsequent patient samples. The intraclass correlation coefficient (ICC) was calculated for test-retest reliability, and correlations with the Musculoskeletal Tumour Society Rating Scale (MSTS) were calculated for construct validity. Standardized effect sizes were calculated as a measure of responsiveness. Fifty upper extremity and sixty-six lower extremity patients responded to the mailed questionnaire. No items were eliminated based on importance or 'not applicable' ratings. Sporting activities were identified as additional items in both the upper and lower extremity questionnaires. High internal consistency was demonstrated: Cronbach's alpha was 0.94 for the lower extremity questionnaire and 0.92 for the upper extremity questionnaire.
Test-retest reliability was evaluated at multiple time-points, and the intraclass correlation coefficient was greater than 0.87 in all instances. Construct validity was shown by a moderate correlation with the MSTS. The effect sizes were large, demonstrating responsiveness. The use of patients' perceptions in determining the content of the TESS has resulted in a reliable and valid measure that is able to detect change over time.
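For readers who want to see how the statistics named above are typically computed, the following is a minimal Python sketch, not the authors' analysis code. It rests on several assumptions that the abstract does not state: the item-reduction rule is read as "at least 30% of respondents", the standardized effect size is taken as mean change divided by the standard deviation of baseline scores, and all function names, response labels and data layouts are hypothetical.

```python
from statistics import mean, stdev, variance


def items_to_eliminate(ratings, flagged=("totally unimportant", "not applicable"),
                       threshold=0.30):
    """Return items whose flagged responses reach the threshold.

    ratings: dict mapping item name -> list of response labels.
    Assumption: "by 30% of the sample" is read as "at least 30%".
    """
    dropped = []
    for item, responses in ratings.items():
        share = sum(r in flagged for r in responses) / len(responses)
        if share >= threshold:
            dropped.append(item)
    return dropped


def cronbach_alpha(scores):
    """Cronbach's alpha for complete data.

    scores: list of respondents, each a list of k numeric item scores.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def standardized_effect_size(baseline, follow_up):
    """Mean change divided by the SD of baseline scores (assumed definition)."""
    mean_change = mean(f - b for b, f in zip(baseline, follow_up))
    return mean_change / stdev(baseline)
```

With paired scores for the same patients at two time-points, `standardized_effect_size(baseline, follow_up)` gives a unitless index of change; values around 0.8 or above are conventionally described as large.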