Rehabilitation after neurological injury can be provided by robots that help patients perform different exercises. Several such robots can be combined in a rehabilitation robot gym so that multiple patients can perform a diverse range of exercises simultaneously. In pursuit of better multipatient supervision, we aim to develop an automated system that assigns patients to different robots during a training session to maximize their skill development. Our previous work assumed a simplified, deterministic simulated environment in which each patient's skill development was known beforehand. The current work extends it to a stochastic environment in which part of the skill development is random, so the assignment system must predict each patient's skill development with a neural network, based on the patient's previous training success rate with each robot. These estimates are used to create patient-robot assignments on a timestep-by-timestep basis that maximize the skill development of the patient group. Results from simplified simulation trials show that the schedules produced by our assignment system outperform multiple baseline schedules (e.g., schedules where patients never switch robots and schedules where patients switch robots only once, halfway through the session). Additionally, we discuss how some of our simplifications could be addressed in the future.
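To make the per-timestep assignment step concrete, the following minimal sketch picks, for a single timestep, the patient-robot pairing with the largest summed estimated skill gain. It is an illustration under stated assumptions, not the method used in this work: the estimates are assumed to be already available as a matrix (standing in for the neural network's output), the function name `best_assignment` and all numbers are hypothetical, and the exhaustive search is a stand-in for whatever optimizer the assignment system actually uses.

```python
from itertools import permutations


def best_assignment(gains):
    """Return (robots, total) maximizing the summed estimated gain.

    gains[p][r] is the estimated skill gain of patient p on robot r.
    Each robot hosts at most one patient, so this sketch requires
    that the number of patients not exceed the number of robots.
    """
    n_patients = len(gains)
    n_robots = len(gains[0])
    best_robots, best_total = None, float("-inf")
    # Exhaustive search over all one-to-one pairings: acceptable for a
    # handful of robots, but the cost grows factorially with group size.
    for robots in permutations(range(n_robots), n_patients):
        total = sum(gains[p][r] for p, r in enumerate(robots))
        if total > best_total:
            best_robots, best_total = robots, total
    return best_robots, best_total


# Hypothetical estimates for 3 patients on 3 robots (made-up numbers,
# not results from the simulation trials).
example_gains = [
    [0.30, 0.10, 0.20],
    [0.25, 0.40, 0.05],
    [0.35, 0.15, 0.10],
]
robots, total = best_assignment(example_gains)
```

With these made-up numbers, the search assigns patient 0 to robot 2, patient 1 to robot 1, and patient 2 to robot 0. In a stochastic setting, such a step would be repeated at every timestep with refreshed estimates.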