The measurement of volatile organic compounds (VOCs) in exhaled breath (breathomics) represents a promising biomarker matrix for airways disease, with early research indicating sensitivity to airway inflammation. A key aspect of analytical validity for any clinical biomarker is an understanding of the short-term repeatability of its measurements. We collected exhaled breath samples on 5 consecutive days from 14 subjects with severe asthma who had undergone extensive clinical characterisation. Principal component analysis of VOC abundance across all breath samples revealed no variance attributable to the day of sampling. Samples from the same patient clustered together, and there was some separation according to T2 inflammatory markers. The intra-subject and between-subject variability of each VOC was calculated across the 70 samples, identifying 30.35% of VOCs as erratic: variable between subjects but also variable within the same subject. Excluding these erratic VOCs from machine learning approaches produced no apparent loss of structure in the underlying data and no loss of relationship with salient clinical characteristics. Moreover, cluster evaluation by the silhouette coefficient indicated more distinct clustering. We describe the short-term repeatability of breath samples in a severe asthma population and corroborate their sensitivity to airway inflammation. We also describe a novel variance-based feature selection tool that, when applied to larger clinical studies, could improve machine learning model predictions.
Keywords: VOC; asthma; breathomics; repeatability; respiratory; severe asthma; volatile organic compounds.
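As an illustration of the variance-based feature selection described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that variability is summarised by the coefficient of variation and that a VOC counts as erratic when both its intra-subject and between-subject variability exceed a cut-off. The function names, the data layout, the cut-off values of 0.5 and the toy abundances are all hypothetical; the abstract does not specify the metric or thresholds used.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 if the mean is zero)."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0


def flag_erratic_vocs(samples, within_cut=0.5, between_cut=0.5):
    """Flag VOCs that vary both within and between subjects.

    samples: {subject_id: [{voc_name: abundance, ...}, ...]}, one dict per
             sampling day; every dict is assumed to contain every VOC.
    The cut-offs are hypothetical defaults, not values from the study.
    Returns {voc_name: (within_cv, between_cv, is_erratic)}.
    """
    vocs = sorted({voc for days in samples.values() for day in days for voc in day})
    result = {}
    for voc in vocs:
        # Repeated measurements of this VOC, grouped by subject.
        per_subject = [[day[voc] for day in days] for days in samples.values()]
        # Intra-subject variability: mean of the per-subject CVs.
        within = mean(coefficient_of_variation(vals) for vals in per_subject)
        # Between-subject variability: CV of the per-subject mean abundances.
        between = coefficient_of_variation([mean(vals) for vals in per_subject])
        result[voc] = (within, between, within > within_cut and between > between_cut)
    return result


# Toy data (invented for illustration): two subjects, two sampling days each.
toy = {
    "subject_A": [
        {"voc_stable": 10.0, "voc_erratic": 1.0},
        {"voc_stable": 10.0, "voc_erratic": 9.0},
    ],
    "subject_B": [
        {"voc_stable": 20.0, "voc_erratic": 40.0},
        {"voc_stable": 20.0, "voc_erratic": 2.0},
    ],
}
flags = flag_erratic_vocs(toy)
erratic = [voc for voc, (_, _, is_erratic) in flags.items() if is_erratic]
```

In this toy case only `voc_erratic` is flagged: `voc_stable` differs between subjects but is constant within each subject, so it would be retained as a potentially informative feature.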