Measurement of serum biomarkers by multiplex assays may be more variable as compared to single biomarker assays. Measurement error in these data may bias parameter estimates in regression analysis, which could mask true associations of serum biomarkers with an outcome. The Least Absolute Shrinkage and Selection Operator (LASSO) can be used for variable selection in these high-dimensional data. Furthermore, when the distribution of measurement error is assumed to be known or estimated with replication data, a simple measurement error correction method can be applied to the LASSO method. However, in practice the distribution of the measurement error is unknown and is expensive to estimate through replication both in monetary cost and need for greater amount of sample which is often limited in quantity. We adapt an existing bias correction approach by estimating the measurement error using validation data in which a subset of serum biomarkers are re-measured on a random subset of the study sample. We evaluate this method using simulated data and data from the Tucson Epidemiological Study of Airway Obstructive Disease (TESAOD). We show that the bias in parameter estimation is reduced and variable selection is improved.
Keywords: LASSO; bias correction; biomarkers; high-dimensional; measurement error.