Health question answering systems often depend on an initial question type classification step, and practitioners face several modeling choices for this component alone. We evaluate the effectiveness of different choices of embeddings and of the classifier's architectural hyper-parameters, and in the process improve on previous methods, achieving a new best 5-fold accuracy of 85.3% on the GARD dataset. This work contributes an evaluation of sentence classification methods on the task of consumer health question type classification, together with a dataset of 2,882 medical questions annotated for question type.