Exact analysis of glottal vibration patterns is indispensable for the assessment of laryngeal pathologies. The increasing demand for voice-related examinations and the large amounts of data produced by high-speed laryngoscopy and stroboscopy call for automatic assistance in research and patient care. Automatic glottis segmentation is a prerequisite for assisted analysis of glottal vibration patterns, but it has proven very challenging. Previous glottis segmentation approaches rarely account for characteristic glottal features or for the inhomogeneity of glottal regions, and they show serious drawbacks when applied for diagnostic purposes. We developed a fully automated glottis segmentation framework that extracts a set of glottal regions from endoscopic videos using a flexible thresholding technique combined with a refining level set method that incorporates prior knowledge of glottal shape. A novel descriptor for glottal regions is presented to remove spurious nonglottal regions that show glottis-like shape properties. Knowledge of local color distributions is incorporated into the generation of Bayesian probability images. Glottal regions are then tracked frame by frame in the probability images with a region-based level set segmentation strategy. Principal component analysis of pixel coordinates is applied to determine the glottal orientation in each frame and to remove any erroneously included nonglottal regions. The framework shows very promising results in terms of segmentation accuracy and processing time and is applicable to both stroboscopic and high-speed videos.
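The orientation step can be illustrated with a minimal sketch. It assumes the segmented glottal region is available as a binary mask and estimates the main axis as the first principal component of the foreground pixel coordinates, using the closed-form solution for a 2x2 covariance matrix. The function name `glottal_orientation` and the mask representation are hypothetical, and the criterion the framework uses to reject nonglottal regions is not stated here, so the sketch covers only the orientation estimate.

```python
import math


def glottal_orientation(mask):
    """Estimate the main axis of a binary region from its pixel coordinates.

    mask: 2D sequence (rows of pixels) of truthy/falsy values; this
    representation is an assumption made for illustration.
    Returns (centroid_x, centroid_y, angle_deg), where angle_deg is the
    direction of the first principal axis measured from the image x-axis,
    in the range [-90, 90].
    """
    # Collect the (x, y) coordinates of all foreground pixels.
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two foreground pixels")

    # Centroid of the region.
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n

    # Entries of the 2x2 covariance matrix of the coordinates.
    sxx = sum((x - mx) ** 2 for x, _ in pts) / n
    syy = sum((y - my) ** 2 for _, y in pts) / n
    sxy = sum((x - mx) * (y - my) for x, y in pts) / n

    # Closed-form direction of the leading eigenvector of a symmetric
    # 2x2 matrix: theta = 0.5 * atan2(2 * sxy, sxx - syy).
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return mx, my, math.degrees(angle)
```

For a narrow vertical slit, which roughly resembles an open glottis seen from above, the returned angle has a magnitude of 90 degrees. For a region lying along the image diagonal it is 45 degrees.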