Design of experiments (DOE) is an established method to allocate resources for efficient parameter space exploration. Model based active learning (AL) data sampling strategies have shown potential for further optimization. This paper introduces a workflow for conducting DOE comparative studies using automated machine learning. Based on a practical definition of model complexity in the context of machine learning, the interplay of systematic data generation and model performance is examined considering various sources of uncertainty: this includes uncertainties caused by stochastic sampling strategies, imprecise data, suboptimal modeling, and model evaluation. Results obtained from electrical circuit models with varying complexity show that not all AL sampling strategies outperform conventional DOE strategies, depending on the available data volume, the complexity of the dataset, and data uncertainties. Trade-offs in resource allocation strategies, in particular between identical replication of data points for statistical noise reduction and broad sampling for maximum parameter space exploration, and their impact on subsequent machine learning analysis are systematically investigated. Results indicate that replication oriented strategies should not be dismissed but may prove advantageous for cases with non-negligible noise impact and intermediate resource availability. The provided workflow can be used to simulate practical experimental conditions for DOE testing and DOE selection.
© 2024. The Author(s).