This study aimed to develop a novel framework to quickly personalize electromyography (EMG)-driven musculoskeletal models (MMs) as efferent neural interfaces for upper-limb prostheses. Our framework adopts a generic upper-limb MM as a baseline and uses an artificial neural network-based policy to fine-tune the model parameters for MM personalization. The policy was trained by reinforcement learning (RL) to heuristically adjust the MM parameters so as to maximize the accuracy of hand and wrist motions estimated from EMG inputs. The proposed framework was compared with the baseline MM and with simulated annealing (SA), a widely used MM parameter optimization method. An offline evaluation first quantified the time required for MM personalization and the kinematics estimation accuracy of the personalized MMs, using data collected from able-bodied subjects. Then, in an online evaluation, additional human subjects, including an individual with a transradial amputation, performed a virtual hand posture matching task using generic and personalized MMs. Results showed that, compared to the baseline generic MM, personalized MMs estimated joint motion with lower error in both offline (p<0.05) and online (p=0.014) tests, demonstrating the benefit of MM personalization. The RL-based framework completed model optimization in under one second on average in cases where SA took over 13 minutes, while yielding comparable kinematics estimates both offline and online. Hence, the proposed personalization framework can be a practical solution for the daily use of EMG-driven MMs in prostheses or other assistive devices.