-
2D mapping of radiation dose and clonogenic survival for accurate assessment of in vitro X-ray GRID irradiation effects
Authors:
D. Arous,
J. L. Lie,
B. V. Håland,
M. Børsting,
N. F. J. Edin,
E. Malinen
Abstract:
Spatially fractionated radiation therapy (SFRT or GRID) is an approach to deliver high local radiation doses in an 'on-off' pattern. To better appraise the radiobiological effects from GRID, a framework to link local radiation dose to clonogenic survival needs to be developed. A549 (lung) cancer cells were irradiated in T25 (25 cm$^2$) flasks using 220 kV X-rays with an open field or through a tungsten GRID collimator with periodic 5 mm openings and 10 mm blockings. Delivered nominal doses were 2, 5, and 10 Gy. A novel approach for image segmentation was used to locate the centroid of surviving colonies in scanned images of the cell flasks. Gafchromic™ film dosimetry (GFD) and FLUKA Monte Carlo (MC) simulations were employed to map the dose distribution in the flasks at each surviving colony centroid. By fitting the linear-quadratic (LQ) function to clonogenic survival data for open field irradiation, the expected survival level at a given dose level was calculated. The expected survival level was then mapped together with the observed levels in the GRID-irradiated flasks. GFD and FLUKA MC gave similar dose distributions, with a mean peak-to-valley dose ratio of about 5. The LQ fit for open field irradiation gave $α = 0.16 \pm 0.04$ Gy$^{-1}$ and $β = 0.001 \pm 0.004$ Gy$^{-2}$. Using the image segmentation method, the mean absolute percentage deviation between observed and predicted survival in the (peak; valley) dose regions was (8; 10) %, (4; 41) %, and (3; 138) % for 2, 5, and 10 Gy, respectively. In conclusion, a framework for mapping of surviving colonies following GRID irradiation together with predicted survival levels from homogeneous irradiation was presented. For the given cell line, our findings indicate that GRID irradiation, especially at high peak doses, causes reduced survival compared to an open field configuration.
Submitted 30 May, 2022;
originally announced May 2022.
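The linear-quadratic prediction step described in the abstract above can be illustrated with a minimal sketch. It uses the reported point estimates of α and β and the reported peak-to-valley dose ratio of about 5. Treating the nominal dose as the peak dose, and the function names themselves, are assumptions made here for illustration only; this is not the authors' analysis code.

```python
import math

# Point estimates reported in the abstract (open-field LQ fit).
ALPHA = 0.16   # Gy^-1
BETA = 0.001   # Gy^-2
PVDR = 5.0     # approximate mean peak-to-valley dose ratio from the abstract


def lq_survival(dose_gy, alpha=ALPHA, beta=BETA):
    """Surviving fraction under the LQ model: exp(-alpha*D - beta*D^2)."""
    return math.exp(-alpha * dose_gy - beta * dose_gy ** 2)


def predicted_peak_valley(peak_dose_gy, pvdr=PVDR):
    """Predicted survival in peak and valley regions.

    Assumption (not from the source): the nominal dose is taken as the
    peak dose, and the valley dose is the peak dose divided by the PVDR.
    """
    valley_dose_gy = peak_dose_gy / pvdr
    return lq_survival(peak_dose_gy), lq_survival(valley_dose_gy)


if __name__ == "__main__":
    for nominal in (2.0, 5.0, 10.0):
        peak_sf, valley_sf = predicted_peak_valley(nominal)
        print(f"{nominal:4.1f} Gy: peak SF = {peak_sf:.3f}, valley SF = {valley_sf:.3f}")
```

Comparing such predicted levels with colony counts observed at the same local dose is the kind of deviation the abstract summarizes as a mean absolute percentage deviation.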
-
Gaussian Processes with Input Location Error and Applications to the Composite Parts Assembly Process
Authors:
Wenjia Wang,
Xiaowei Yue,
Benjamin Haaland,
C. F. Jeff Wu
Abstract:
In this paper, we investigate Gaussian process modeling with input location error, where the inputs are corrupted by noise. Here, the best linear unbiased predictor for two cases is considered, according to whether there is noise at the target unobserved location or not. We show that the mean squared prediction error converges to a non-zero constant if there is noise at the target unobserved location, and provide an upper bound of the mean squared prediction error if there is no noise at the target unobserved location. We investigate the use of stochastic Kriging in the prediction of Gaussian processes with input location error, and show that stochastic Kriging is a good approximation when the sample size is large. Several numeric examples are given to illustrate the results, and a case study on the assembly of composite parts is presented. Technical proofs are provided in the Appendix.
Submitted 1 March, 2021; v1 submitted 4 February, 2020;
originally announced February 2020.
-
A clustered Gaussian process model for computer experiments
Authors:
Chih-Li Sung,
Benjamin Haaland,
Youngdeok Hwang,
Siyuan Lu
Abstract:
The Gaussian process is one of the important approaches for emulating computer simulations. However, its stationarity assumption and its intractability for large-scale datasets limit its applicability in practice. In this article, we propose a clustered Gaussian process model which segments the input data into multiple clusters, in each of which a Gaussian process model is fitted. Stochastic expectation-maximization is employed to efficiently fit the model. In our simulations as well as a real application to solar irradiance emulation, our proposed method had smaller mean squared errors than its main competitors, with competitive computation time, and provided valuable insights from the data by discovering the clusters. An R package for the proposed methodology is provided in an open repository.
Submitted 5 November, 2020; v1 submitted 11 November, 2019;
originally announced November 2019.
-
Synthesizing simulation and field data of solar irradiance
Authors:
Furong Sun,
Robert B. Gramacy,
Benjamin Haaland,
Siyuan Lu,
Youngdeok Hwang
Abstract:
Predicting the intensity and amount of sunlight as a function of location and time is an essential component in identifying promising locations for economical solar farming. Although weather models and irradiance data are relatively abundant, these have yet, to our knowledge, to be hybridized on a continental scale. Rather, much of the emphasis in the literature has been on short-term localized forecasting. This is probably because the amount of data involved in a more global analysis is prohibitive with the canonical toolkit, the Gaussian process (GP). Here we show how GP surrogate and discrepancy models can be combined to tractably and accurately predict solar irradiance on time-aggregated and daily scales with measurements at thousands of sites across the continental United States. Our results establish short-term accuracy of bias-corrected weather-based simulation of irradiance, when realizations are available in real space-time (e.g., in future days), and provide accurate surrogates for smoothing in the more common situation where reliable weather data is not available (e.g., in future years).
Submitted 22 June, 2019; v1 submitted 13 June, 2018;
originally announced June 2018.
-
Emulating satellite drag from large simulation experiments
Authors:
Furong Sun,
Robert B. Gramacy,
Benjamin Haaland,
Earl Lawrence,
Andrew Walker
Abstract:
Obtaining accurate estimates of satellite drag coefficients in low Earth orbit is a crucial component in positioning and collision avoidance. Simulators can produce accurate estimates, but their computational expense is much too large for real-time application. A pilot study showed that Gaussian process (GP) surrogate models could accurately emulate simulations. However, cubic runtime for training GPs means that they could only be applied to a narrow range of input configurations to achieve the desired level of accuracy. In this paper we show how extensions to the local approximate Gaussian Process (laGP) method allow accurate full-scale emulation. The new methodological contributions, which involve a multi-level global/local modeling approach, and a set-wise approach to local subset selection, are shown to perform well in benchmark and synthetic data settings. We conclude by demonstrating that our method achieves the desired level of accuracy, besting simpler viable (i.e., computationally tractable) global and local modeling approaches, when trained on seventy thousand core hours of drag simulations for two real-world satellites: the Hubble space telescope (HST) and the gravity recovery and climate experiment (GRACE).
Submitted 22 June, 2019; v1 submitted 30 November, 2017;
originally announced December 2017.
-
Multi-Resolution Functional ANOVA for Large-Scale, Many-Input Computer Experiments
Authors:
Chih-Li Sung,
Wenjia Wang,
Matthew Plumlee,
Benjamin Haaland
Abstract:
The Gaussian process is a standard tool for building emulators for both deterministic and stochastic computer experiments. However, application of Gaussian process models is greatly limited in practice, particularly for large-scale and many-input computer experiments that have become typical. We propose a multi-resolution functional ANOVA model as a computationally feasible emulation alternative. More generally, this model can be used for large-scale and many-input non-linear regression problems. An overlapping group lasso approach is used for estimation, ensuring computational feasibility in a large-scale and many-input setting. New results on consistency and inference for the (potentially overlapping) group lasso in a high-dimensional setting are developed and applied to the proposed multi-resolution functional ANOVA model. Importantly, these results allow us to quantify the uncertainty in our predictions. Numerical examples demonstrate that the proposed model enjoys marked computational advantages. Data capabilities, both in terms of sample size and dimension, meet or exceed best available emulation tools while meeting or exceeding emulation accuracy.
Submitted 8 January, 2019; v1 submitted 20 September, 2017;
originally announced September 2017.
-
Controlling Sources of Inaccuracy in Stochastic Kriging
Authors:
Wenjia Wang,
Benjamin Haaland
Abstract:
Scientists and engineers commonly use simulation models to study real systems for which actual experimentation is costly, difficult, or impossible. Many simulations are stochastic in the sense that repeated runs with the same input configuration will result in different outputs. For expensive or time-consuming simulations, stochastic kriging (Ankenman et al., 2010) is commonly used to generate predictions for simulation model outputs subject to uncertainty due to both function approximation and stochastic variation. Here, we develop and justify a few guidelines for experimental design, which ensure accuracy of stochastic kriging emulators. We decompose error in stochastic kriging predictions into nominal, numeric, parameter estimation and parameter estimation numeric components and provide means to control each in terms of properties of the underlying experimental design. The design properties implied for each source of error are weakly conflicting and broad principles are proposed. In brief, space-filling properties "small fill distance" and "large separation distance" should balance with replication at distinct input configurations, with number of replications depending on the relative magnitudes of stochastic and process variability. Non-stationarity implies higher input density in more active regions, while regression functions imply a balance with traditional design properties. A few examples are presented to illustrate the results.
Submitted 8 August, 2018; v1 submitted 2 June, 2017;
originally announced June 2017.
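To make the role of replication in the abstract above concrete, here is a minimal sketch of a stochastic kriging predictor in its commonly stated form: the prediction is mu + k(x)^T (K + Sigma_eps)^{-1} (ybar - mu), where Sigma_eps is diagonal with entries sigma_i^2 / n_i (noise variance over number of replications at design point i). The Gaussian kernel, its fixed lengthscale, the known constant mean `mu`, and all function names are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(a, b, lengthscale=0.3, variance=1.0):
    """Gaussian (squared-exponential) covariance between rows of a and b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-sq_dists / (2.0 * lengthscale ** 2))


def stochastic_kriging_predict(x_design, y_bar, noise_var, n_reps, x_new, mu=0.0):
    """Predict the mean response at x_new from replicated noisy runs.

    x_design : (m, d) distinct input configurations
    y_bar    : (m,) sample means over replications at each configuration
    noise_var: (m,) simulation noise variance at each configuration
    n_reps   : (m,) number of replications at each configuration
    mu       : constant mean, assumed known here for simplicity
    """
    K = gaussian_kernel(x_design, x_design)
    # Variance of each sample mean shrinks with the number of replications.
    sigma_eps = np.diag(np.asarray(noise_var, dtype=float) / np.asarray(n_reps, dtype=float))
    k_new = gaussian_kernel(x_new, x_design)
    weights = np.linalg.solve(K + sigma_eps, y_bar - mu)
    return mu + k_new @ weights
```

With zero noise variance the predictor interpolates the sample means, and as the noise variance per replication grows the prediction shrinks toward `mu`; more replications at a point reduce the corresponding diagonal entry, which is one way to see the trade-off between replication and space-filling that the abstract describes.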
-
Potentially Predictive Variance Reducing Subsample Locations in Local Gaussian Process Regression
Authors:
Chih-Li Sung,
Robert B. Gramacy,
Benjamin Haaland
Abstract:
Gaussian process models are commonly used as emulators for computer experiments. However, developing a Gaussian process emulator can be computationally prohibitive when the number of experimental samples is even moderately large. Local Gaussian process approximation (Gramacy and Apley, 2015) was proposed as an accurate and computationally feasible emulation alternative. However, constructing local sub-designs specific to predictions at a particular location of interest remains a substantial computational bottleneck to the technique. In this paper, two computationally efficient neighborhood search limiting techniques are proposed, a maximum distance method and a feature approximation method. Two examples demonstrate that the proposed methods indeed save substantial computation while retaining emulation accuracy.
Submitted 26 November, 2016; v1 submitted 18 April, 2016;
originally announced April 2016.
-
A Simple Approach to Constructing Quasi-Sudoku-based Sliced Space-Filling Designs
Authors:
Diane Donovan,
Benjamin Haaland,
David J. Nott
Abstract:
Sliced Sudoku-based space-filling designs and, more generally, quasi-sliced orthogonal array-based space-filling designs are useful experimental designs in several contexts, including computer experiments with categorical in addition to quantitative inputs and cross-validation. Here, we provide a straightforward construction of doubly orthogonal quasi-Sudoku Latin squares which can be used to generate sliced space-filling designs which achieve uniformity in one and two-dimensional projections for both the full design and each slice. A construction of quasi-sliced orthogonal arrays based on these constructed doubly orthogonal quasi-Sudoku Latin squares is also provided and can, in turn, be used to generate sliced space-filling designs which achieve uniformity in one and two-dimensional projections for the full design and uniformity in two-dimensional projections for each slice. These constructions are very practical to implement and yield a spectrum of design sizes and numbers of factors not currently broadly available.
Submitted 19 February, 2015;
originally announced February 2015.
-
A Framework for Controlling Sources of Inaccuracy in Gaussian Process Emulation of Deterministic Computer Experiments
Authors:
Benjamin Haaland,
Wenjia Wang,
Vaibhav Maheshwari
Abstract:
Computer experiments have become ubiquitous in science and engineering. Commonly, runs of these simulations demand considerable time and computing, making experimental design extremely important in gaining high quality information with limited time and resources. Principles of experimental design are proposed and justified which ensure high nominal, numeric, and parameter estimation accuracy for Gaussian process emulation of deterministic simulations. The space-filling properties "small fill distance" and "large separation distance" are only weakly conflicting and ensure well-controlled nominal, numeric, and parameter estimation error, while non-stationarity requires a greater density of experimental inputs in regions of the input space with more quickly decaying correlation. This work will provide scientists and engineers with robust, rigorously justified, and practically useful overarching principles for selecting combinations of simulation inputs with high information content.
Submitted 11 May, 2017; v1 submitted 25 November, 2014;
originally announced November 2014.
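The two space-filling properties named in the abstract above, fill distance and separation distance, can be computed for a candidate design with a short sketch. The fill distance is approximated here over a finite grid of candidate points rather than the full input space, and the separation distance is taken as the minimum pairwise distance without the factor of one half that some references include. Both choices, and the helper names, are assumptions for illustration.

```python
import numpy as np


def separation_distance(design):
    """Smallest Euclidean distance between two distinct design points."""
    diffs = design[:, None, :] - design[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # ignore each point's distance to itself
    return dists.min()


def fill_distance(design, candidates):
    """Largest distance from any candidate point to its nearest design point.

    This approximates the supremum over the input space by a maximum
    over the supplied candidate set.
    """
    diffs = candidates[:, None, :] - design[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return dists.min(axis=1).max()


def unit_grid(points_per_axis, dim):
    """Regular grid on the unit hypercube [0, 1]^dim."""
    axes = [np.linspace(0.0, 1.0, points_per_axis)] * dim
    return np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, dim)


if __name__ == "__main__":
    design = unit_grid(3, 2)        # 9-point design on [0, 1]^2
    candidates = unit_grid(5, 2)    # coarse stand-in for the input space
    print("separation distance:", separation_distance(design))
    print("fill distance (approx.):", fill_distance(design, candidates))
```

A design with a small fill distance leaves no large unexplored regions, while a large separation distance keeps points from clustering; comparing the two numbers across candidate designs is one simple way to apply the principle stated in the abstract.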
-
Speeding up neighborhood search in local Gaussian process prediction
Authors:
Robert B. Gramacy,
Benjamin Haaland
Abstract:
Recent implementations of local approximate Gaussian process models have pushed computational boundaries for non-linear, non-parametric prediction problems, particularly when deployed as emulators for computer experiments. Their flavor of spatially independent computation accommodates massive parallelization, meaning that they can handle designs two or more orders of magnitude larger than previously. However, accomplishing that feat can still require massive supercomputing resources. Here we aim to ease that burden. We study how predictive variance is reduced as local designs are built up for prediction. We then observe how the exhaustive and discrete nature of an important search subroutine involved in building such local designs may be overly conservative. Rather, we suggest that searching the space radially, i.e., continuously along rays emanating from the predictive location of interest, is a far thriftier alternative. Our empirical work demonstrates that ray-based search yields predictors with accuracy comparable to exhaustive search, but in a fraction of the time - bringing a supercomputer implementation back onto the desktop.
Submitted 5 January, 2015; v1 submitted 29 August, 2014;
originally announced September 2014.
-
Accurate emulators for large-scale computer experiments
Authors:
Ben Haaland,
Peter Z. G. Qian
Abstract:
Large-scale computer experiments are becoming increasingly important in science. A multi-step procedure is introduced to statisticians for modeling such experiments, which builds an accurate interpolator in multiple steps. In practice, the procedure shows substantial improvements in overall accuracy, but its theoretical properties are not well established. We introduce the terms nominal and numeric error and decompose the overall error of an interpolator into nominal and numeric portions. Bounds on the numeric and nominal error are developed to show theoretically that substantial gains in overall accuracy can be attained with the multi-step approach.
Submitted 12 March, 2012;
originally announced March 2012.