Seizure detection: correlation of human experts

Scott B Wilson; Mark L Scheuer; Cheryl Plummer; Bryan Young; Steve Pacia

doi:10.1016/s1388-2457(03)00212-8

Clin Neurophysiol. 2003 Nov;114(11):2156-64. doi: 10.1016/s1388-2457(03)00212-8.

Authors

Scott B Wilson¹, Mark L Scheuer, Cheryl Plummer, Bryan Young, Steve Pacia

Affiliation

¹ Persyst Development Corporation, 1060 Sandretto Drive, Suite E2, Prescott, AZ 86305, USA.

PMID: 14580614

Abstract

Objective: To describe and apply a new overlap-integral comparison method, and to quantify human vs. human accuracies that can be used as goals for algorithms.

Methods: Four human experts marked ten 8 h electroencephalography (EEG) records from seizure patients. The seizures varied in origin and type, including complex partial, generalized absence, secondarily generalized and primary generalized tonic-clonic. The traditional any-overlap comparison method was used in addition to the overlap-integral method, which is sensitive to the correct placement of the seizure endpoints.
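The two comparison methods can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: events are (start, end) intervals in seconds, one reader is treated as the reference, any-overlap counts a reference event as detected if any test event touches it, and the overlap-integral measures are taken as ratios of shared marked time. The function names and the example intervals are hypothetical, and the overlap-integral correlation is not reproduced.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_s, end_s), in seconds from record start


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share any time at all."""
    return a[0] < b[1] and b[0] < a[1]


def any_overlap(reference: List[Interval], test: List[Interval],
                record_hours: float) -> Tuple[float, float]:
    """Return (sensitivity, false positives per hour) by event counting."""
    hits = sum(any(overlaps(r, t) for t in test) for r in reference)
    false_pos = sum(not any(overlaps(t, r) for r in reference) for t in test)
    return hits / len(reference), false_pos / record_hours


def total_duration(marks: List[Interval]) -> float:
    # Assumes a reader's own marks do not overlap one another.
    return sum(end - start for start, end in marks)


def shared_duration(reference: List[Interval], test: List[Interval]) -> float:
    """Total time marked by both readers."""
    return sum(max(0.0, min(r[1], t[1]) - max(r[0], t[0]))
               for r in reference for t in test)


def overlap_integral(reference: List[Interval], test: List[Interval],
                     record_seconds: float) -> Tuple[float, float]:
    """Return (sensitivity, specificity) by marked time (assumed definition)."""
    both = shared_duration(reference, test)
    ref_time = total_duration(reference)
    test_time = total_duration(test)
    neither = record_seconds - ref_time - test_time + both
    sensitivity = both / ref_time
    specificity = neither / (record_seconds - ref_time)
    return sensitivity, specificity


# Hypothetical marks on one 8 h record (not data from the study).
reader_a = [(100.0, 160.0), (1000.0, 1030.0)]
reader_b = [(110.0, 170.0), (5000.0, 5010.0)]

ao_sens, ao_fp_per_hour = any_overlap(reader_a, reader_b, record_hours=8.0)
oi_sens, oi_spec = overlap_integral(reader_a, reader_b, record_seconds=8 * 3600.0)
```

In this made-up example the first event counts as a full detection under any-overlap even though the two readers disagree on its endpoints by 10 s, whereas the time-based sensitivity is reduced by that disagreement. This is the sense in which the overlap-integral method is sensitive to endpoint placement.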

Results: The number of events marked by each reader ranged from 57 to 77. The average any-overlap sensitivity and false-positive rate per hour were 0.92 and 0.117, respectively. The average overlap-integral correlation, sensitivity and specificity were 0.80, 0.82 and 0.9926, respectively. As expected, the correspondence between readers was high, but confounding issues resulted in overlap-integral sensitivities less than 0.5 for 10% of the records. Seven percent of the any-overlap sensitivities were less than 0.5. A comparison of the methods by record showed that the overlap-integral specificity and the any-overlap false positive rate measure different features.

Conclusions: There was little variation between readers and they were essentially interchangeable. High seizure rate (many per hour), short seizure durations (<10 s) and long seizure durations (approximately 10 min) with ambiguous offsets can complicate the analysis and result in poor correlation. There may be any number of unmarked events in rigorously marked records and it may be preferable to use records from non-epilepsy patients to compute the false positive rate. The any-overlap and overlap-integral comparison methods are complementary.

Significance: Correlation between expert human readers can be low on some records, which will complicate testing of seizure detection algorithms.

MeSH terms

Algorithms
Electroencephalography / statistics & numerical data*
Epilepsy, Complex Partial / diagnosis*
Epilepsy, Generalized / diagnosis*
Humans
Models, Statistical
Neurology / statistics & numerical data
Observer Variation
Sensitivity and Specificity