The molecular mechanics Poisson-Boltzmann surface area (MM-PB/SA) method has become a popular approach for computing protein-ligand binding free energies in recent years. Previous evaluations of the MM-PB/SA method have all relied on computer-generated conformational ensembles, so their conclusions may be affected by deficiencies in the computational methods used to prepare those ensembles. In an attempt to reach more convincing conclusions, we have evaluated the MM-PB/SA method on a set of 24 diverse protein-ligand complexes, each of which has a set of conformations derived from NMR spectroscopy. Our results indicate that both MM-PB/SA and the molecular mechanics generalized Born surface area (MM-GB/SA) method yield a modest correlation with the experimentally measured binding free energies on our test set. In particular, both MM-PB/SA and MM-GB/SA produced better results when applied to a single representative structure (R = 0.72-0.79) than when averaged over the conformational ensemble of each complex (R = 0.61-0.74). A head-to-head comparison with four selected scoring functions (X-Score, PLP, ChemScore, and DrugScore) on the same test set reveals that the MM-PB/SA and MM-GB/SA results are marginally better than those produced by the scoring functions, supporting the value of the MM-PB/SA method. Nevertheless, scoring functions remain the more cost-effective option, especially for high-throughput tasks.
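The two evaluation protocols compared above (one representative structure per complex versus an average over its conformational ensemble) can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the function names are hypothetical, the toy numbers are invented purely to make the example run, and taking the first conformer as the "representative" structure is an assumption made here for simplicity.

```python
import math
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient R between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ensemble_average(scores_per_complex):
    """One predicted value per complex: the mean over all of its conformers."""
    return [mean(scores) for scores in scores_per_complex]


def representative(scores_per_complex, index=0):
    """One predicted value per complex: the score of a single chosen conformer.

    Assumption: the conformer at `index` plays the role of the
    representative structure; the source does not say how it is selected.
    """
    return [scores[index] for scores in scores_per_complex]


# Invented toy data (NOT from the study): computed binding energies for the
# conformers of four hypothetical complexes, and matching experimental values.
toy_scores = [
    [-31.0, -28.5, -33.2],
    [-22.4, -25.1, -20.9],
    [-40.3, -37.8, -41.0],
    [-18.7, -21.2, -17.5],
]
toy_experimental = [-8.1, -6.0, -10.4, -5.2]

r_ensemble = pearson_r(ensemble_average(toy_scores), toy_experimental)
r_single = pearson_r(representative(toy_scores), toy_experimental)
print(f"R (ensemble average):         {r_ensemble:.2f}")
print(f"R (representative structure): {r_single:.2f}")
```

In this sketch both protocols reduce each complex to one predicted value before computing R against experiment, so the only difference between them is how the per-conformer scores are collapsed.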