Previously, we showed that case-specific non-linear finite element (FE) models predict the load to failure of metastatic femora better than experienced clinicians do. In this study we improved our FE modelling and increased both the number of femora and the variety of lesion characteristics. We retested the robustness of the FE predictions and assessed why clinicians have difficulty in estimating the load to failure of metastatic femora. A total of 20 femora with and without artificial metastases were mechanically loaded until failure. These experiments were simulated using case-specific FE models. Six clinicians ranked the femora on load to failure and reported their ranking strategies. The FE models predicted the experimental load to failure well for both intact and metastatic femora (R² = 0.90 and R² = 0.93, respectively). The FE models also ranked the metastatic femora on load to failure well (τ = 0.87), whereas the clinicians did not (0.11 < τ < 0.42). Both the FE models and the clinicians accounted for the characteristics of the lesions, but only the FE models incorporated the initial bone strength, which is essential for accurately predicting the risk of fracture. Further development of FE models should make accurate prediction of the risk of fracture available to clinicians.
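
The ranking agreement reported above can be illustrated with a minimal sketch, assuming that τ denotes Kendall's rank correlation coefficient (the abstract does not name the statistic). The function `kendall_tau` and the load values below are hypothetical and chosen only for illustration; they are not data from this study. The sketch computes the tie-free form of the coefficient (tau-a) by counting concordant and discordant pairs.

```python
from itertools import combinations


def kendall_tau(measured, predicted):
    """Kendall's tau-a for two equal-length sequences, assuming no ties.

    A pair of specimens is concordant when both sequences order it the
    same way, and discordant when they order it in opposite ways.
    """
    if len(measured) != len(predicted):
        raise ValueError("sequences must have the same length")
    n = len(measured)
    if n < 2:
        raise ValueError("at least two specimens are required")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product tells whether the pair is ordered
        # the same way (positive) or oppositely (negative).
        sign = (measured[i] - measured[j]) * (predicted[i] - predicted[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical failure loads in newtons (illustrative only, not study data).
measured = [4200.0, 3100.0, 5600.0, 2500.0, 4800.0]
predicted = [4000.0, 3300.0, 5900.0, 2700.0, 3200.0]

tau = kendall_tau(measured, predicted)
print(f"tau = {tau:.2f}")  # tau = 0.60 (8 concordant, 2 discordant pairs)
```

A value of 1 means the two rankings agree on every pair, 0 means agreement no better than chance, and -1 means the rankings are fully reversed. When ties are present, a tie-corrected variant such as the one provided by `scipy.stats.kendalltau` would be the more appropriate choice.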